Incident response without VPN access: a practical guide

It’s 2:47 a.m. The pager goes off. You roll over, open your laptop — and the VPN won’t connect. Either the concentrator is having a moment, or your ISP is doing something creative, or the on-call playbook from 2021 still assumes you’re at the office. Here’s how to keep responding anyway.

The core problem

VPN is load-bearing for most incident response runbooks. When it’s down, you can’t reach the affected system — and frequently the affected system is what is taking the VPN down. Fixing a failed VPN concentrator while ops is paging you is the opposite of a fast recovery.

The substitute: outbound-agent access

LynxTrac and similar outbound-tunnel tools don’t depend on your VPN, because the agent on the target host is already connected outbound to a relay. You authenticate to the relay via SSO and get a shell or a desktop regardless of your VPN state.

Practical consequence: if your VPN is down, you can still reach and recover the services that matter.
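To make the outbound model concrete, here is a minimal sketch of the connection direction only. It is not LynxTrac's actual protocol: the function names, the loopback relay, and the "agent ran:" reply format are all assumptions for illustration. The point it shows is that the target host never listens for inbound connections; it dials out, and the operator's request travels back over that existing connection.

```python
import socket
import threading


def run_agent(relay_addr):
    """Hypothetical agent on the target host: dials OUT to the relay."""
    with socket.create_connection(relay_addr) as conn:
        while True:
            data = conn.recv(1024)
            if not data:  # relay closed the connection
                break
            # Stand-in for "run the operator's request on this host".
            conn.sendall(b"agent ran: " + data)


def demo():
    # The relay is the only listener; the agent host opens no inbound port.
    relay = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    relay.bind(("127.0.0.1", 0))  # loopback stands in for a public relay
    relay.listen(1)

    agent = threading.Thread(
        target=run_agent, args=(relay.getsockname(),), daemon=True
    )
    agent.start()

    conn, _ = relay.accept()  # the agent's outbound connection arrives
    with conn:
        conn.sendall(b"uptime")  # operator request, forwarded to the agent
        reply = conn.recv(1024)
    relay.close()
    agent.join(timeout=2)
    return reply
```

A real relay would add authentication, encryption, and multiplexing on top of this; the sketch only illustrates why the operator's VPN state does not enter the picture.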

The runbook

1. Open the dashboard. You need monitoring data first; without context, you are flailing.
2. Confirm the alert. Is it a real outage or a noisy monitor? Five seconds spent here cost almost nothing and can save you from chasing a false alarm.
3. Get a shell. Click the affected host, get a terminal. You now have the same access you would have had over the VPN.
4. Collect before you fix. Grab logs, metrics, and the process tree. You will want them for the post-mortem.
5. Act. Run your remediation. Document what you did in the session (LynxTrac auto-captures the keystrokes anyway).
6. Verify. Watch the host for at least 5 minutes after the fix. Declaring recovery too early is a common cause of reopened incidents.
7. Hand off or sleep. Update the ticket, tag the on-call follow-up, go back to bed.
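The "Collect before you fix" and "Verify" steps are the two that are easiest to skip at 3 a.m., so they are worth scripting ahead of time. The sketch below is one way to do that, assuming a Linux host with `ps`, `df`, and `journalctl` available. The command set, the file names, and the `check` callable are hypothetical placeholders, not part of any LynxTrac tooling.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical evidence set; adjust the commands for your own hosts.
EVIDENCE_COMMANDS = {
    "processes.txt": ["ps", "auxf"],
    "disk.txt": ["df", "-h"],
    "journal.txt": ["journalctl", "-n", "500", "--no-pager"],
}


def collect_evidence(out_dir, commands=EVIDENCE_COMMANDS):
    """Run each command and save its output before any remediation."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=30
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of aborting the whole collection.
            body = f"failed to run {argv}: {exc}\n"
        (out / filename).write_text(body)
        written.append(filename)
    return written


def stayed_healthy(check, duration_s=300, interval_s=15, sleep=time.sleep):
    """Return True only if check() passes on every poll in the window."""
    for _ in range(duration_s // interval_s):
        if not check():
            return False  # one failed poll means recovery is not confirmed
        sleep(interval_s)
    return True
```

In use, `check` would be whatever health probe you already trust for the service (an HTTP status check, a queue-depth query), and the default `duration_s=300` matches the 5-minute watch in the runbook.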

What to watch

If your access depends on a single relay region, a relay-region outage breaks your response. LynxTrac relays run multi-region with automatic failover, but verify this on a non-incident day with a tabletop exercise.
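For the tabletop exercise, a small per-region probe gives you something concrete to run from the network you would actually be on during an incident. This is a minimal sketch: the region names and hostnames below are hypothetical placeholders (substitute whatever endpoints your vendor documents), and a plain TCP connect only shows that a relay is reachable, not that failover itself works.

```python
import socket

# Hypothetical relay endpoints; replace with the ones your vendor documents.
RELAY_REGIONS = {
    "us-east": ("relay-us-east.example.com", 443),
    "eu-west": ("relay-eu-west.example.com", 443),
}


def probe_regions(regions, timeout=3.0):
    """Return {region: True/False} for a plain TCP connect to each relay."""
    results = {}
    for name, (host, port) in regions.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[name] = True
        except OSError:  # refused, timed out, or DNS lookup failed
            results[name] = False
    return results
```

Calling `probe_regions(RELAY_REGIONS)` from your home connection, with the VPN deliberately off, is the closest cheap approximation of the 2:47 a.m. scenario.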

Also: the control plane is now part of your critical path. Treat it with the same uptime rigor you’d want for your status page.

The meta-lesson

Every piece of infrastructure in your incident response runbook is itself subject to incidents. The goal isn’t to remove dependencies — you can’t — it’s to make sure the dependencies are more reliable than what you’re responding to.

Outbound tunnels are not immune to outages. They are, empirically, much more reliable than self-hosted VPN concentrators, because the failure modes that plague concentrators (NAT traversal, IP rotation, client version drift) are simply not part of the model.

Try it yourself

LynxTrac is free forever for 2 servers: no credit card, no sales call. Start in under 2 minutes.
