Designing an RMM agent that doesn't slow systems down
Every RMM agent is a tax on the systems it manages. The best agents make that tax nearly free. Here are the design decisions that kept ours under 1% CPU and under 50 MB RSS at the median (p50) without dropping signal.
Principle 1: event-driven, not polled
Most legacy agents poll — every second, every 10 seconds, every minute. Polling wastes CPU on “nothing changed” and misses events that happen between polls.
Modern agents should be event-driven where the OS supports it:
- File changes → inotify (Linux), FSEvents (macOS), ReadDirectoryChangesW (Windows)
- Process changes → proc connector over netlink (Linux), ETW (Windows)
- Network state → netlink sockets on Linux (AF_NETLINK, e.g. RTM_NEWLINK messages)
- Metric snapshots → still polled, but at the minimum necessary rate
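The core of an event-driven loop is a blocking wait on file descriptors rather than a timer. Here is a minimal sketch, assuming the OS event sources are exposed as pollable descriptors (inotify and netlink sockets both are on Linux). A socketpair stands in for the real source so the example runs anywhere, and `wait_for_events` is a hypothetical name, not part of our agent.

```python
import selectors
import socket


def wait_for_events(sel, handle, max_events):
    """Block until a registered descriptor is readable, then dispatch to `handle`."""
    seen = 0
    while seen < max_events:
        # timeout=None parks the thread in the kernel: no wakeups and no CPU
        # until an event actually arrives.
        for key, _mask in sel.select(timeout=None):
            handle(key.data, key.fileobj.recv(4096))
            seen += 1
    return seen


if __name__ == "__main__":
    # Assumption: the socketpair simulates an OS event source such as an
    # inotify or netlink descriptor.
    source, agent_side = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(agent_side, selectors.EVENT_READ, data="file-change")
    source.send(b"/etc/hosts modified")
    wait_for_events(sel, lambda kind, payload: print(kind, payload), max_events=1)
```

The contrast with polling is the `timeout=None`: the loop runs only when the kernel has something to report.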
Principle 2: batch everything
One syscall per item is expensive. An agent that flushes each metric individually does 10-100x more work than one that batches.
Our rule: batch for at least 100ms, up to 1 second, before flushing. Users won’t notice the added latency; the CPU and network will.
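One way to read that rule is as two time bounds plus a size trigger: never flush before the minimum window, always flush by the maximum. The sketch below takes that reading. The `Batcher` class, the `max_items` threshold, and the exact flush policy are illustrative assumptions, not the shipped implementation. The clock is passed in so the behavior is easy to test.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Batcher:
    min_window: float = 0.1   # seconds; never flush sooner than this
    max_window: float = 1.0   # seconds; always flush once the batch is this old
    max_items: int = 500      # hypothetical size trigger (assumption)
    _items: List[Any] = field(default_factory=list)
    _first_at: Optional[float] = None

    def add(self, item: Any, now: float) -> Optional[List[Any]]:
        """Buffer one item; return a batch if it is time to flush."""
        if self._first_at is None:
            self._first_at = now
        self._items.append(item)
        return self.poll(now)

    def poll(self, now: float) -> Optional[List[Any]]:
        """Return the pending batch if a flush condition holds, else None."""
        if self._first_at is None:
            return None
        age = now - self._first_at
        full = len(self._items) >= self.max_items
        if age >= self.max_window or (full and age >= self.min_window):
            batch, self._items, self._first_at = self._items, [], None
            return batch
        return None
```

A caller would hand each returned batch to the transport in a single write, so a burst of several hundred metrics costs one flush instead of several hundred.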
Principle 3: compress before ship
Network bandwidth is usually more expensive than CPU cycles, often by an order of magnitude. Gzip everything that goes over the wire. The CPU cost is in the noise; on repetitive text payloads such as logs and JSON metrics, the bandwidth savings are 80-95%.
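A minimal sketch of compress-before-ship, using Python's standard `gzip` and `json` modules. The record shape and the `encode_batch` / `decode_batch` helper names are made up for illustration, and the ratio you see will depend on how repetitive your payloads are.

```python
import gzip
import json


def encode_batch(records):
    """Serialize a batch to compact JSON and gzip it; also return the raw size."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw, compresslevel=6), len(raw)


def decode_batch(payload):
    """Inverse of encode_batch, as the receiving side would run it."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


# Hypothetical metric records; real telemetry repeats keys and hostnames in
# much the same way, which is what makes it compress well.
records = [
    {"host": "web-01", "metric": "cpu.user", "ts": 1700000000 + i, "value": i % 7}
    for i in range(500)
]
payload, raw_size = encode_batch(records)
print(f"raw={raw_size} B, gzip={len(payload)} B, saved={1 - len(payload) / raw_size:.0%}")
```

Compressing a whole batch rather than individual records matters here: gzip needs repeated content within one stream to find redundancy.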
Principle 4: no background busy work
If nothing is happening, the agent should consume nothing. No heartbeats firing every second, no log scans running on empty files, no metric poll waking the CPU from idle.
Steady-state idle should read 0.0% in top. Anything above that is a bug.
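One way to avoid idle wakeups is a single timer queue that reports when the next task is due, so the main loop can sleep exactly that long instead of ticking every second. The following is a sketch under that assumption. `Scheduler`, `every`, `next_wakeup`, and `run_due` are hypothetical names, and intervals are assumed to be greater than zero.

```python
import heapq


class Scheduler:
    """Single timer queue: sleep until the earliest deadline, never on a fixed tick."""

    def __init__(self):
        self._queue = []  # entries: (due_time, seq, interval, task)
        self._seq = 0     # tie-breaker so tasks themselves are never compared

    def every(self, interval, task, now):
        """Schedule `task` to run every `interval` seconds, starting from `now`."""
        heapq.heappush(self._queue, (now + interval, self._seq, interval, task))
        self._seq += 1

    def next_wakeup(self):
        """Time of the next due task, or None if the agent can sleep indefinitely."""
        return self._queue[0][0] if self._queue else None

    def run_due(self, now):
        """Run every task whose deadline has passed; return how many ran."""
        ran = 0
        while self._queue and self._queue[0][0] <= now:
            _due, seq, interval, task = heapq.heappop(self._queue)
            task()
            # Reschedule relative to `now` so a long sleep does not trigger catch-up runs.
            heapq.heappush(self._queue, (now + interval, seq, interval, task))
            ran += 1
        return ran
```

In a loop built around a selector, `next_wakeup()` minus the current monotonic time becomes the `select` timeout, and `None` means block until an event arrives.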
Principle 5: share, don’t duplicate
Inside the agent, frameworks are shared: one serialization layer, one transport layer, one scheduler. Adding a new capability means re-using existing primitives, not stamping out another copy of the boilerplate.
This keeps the binary small and the behavior consistent.
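To make the idea concrete, here is a sketch in which a capability is nothing more than a function returning a dict, and every capability goes through the same serializer and the same transport. The `Agent` class, its method names, and the envelope layout are assumptions for illustration only.

```python
import json
from typing import Callable, Dict


class Agent:
    """One serializer and one transport, shared by every capability."""

    def __init__(self, send: Callable[[bytes], None]):
        self._send = send  # the single shared transport (assumed interface)
        self._collectors: Dict[str, Callable[[], dict]] = {}

    def register(self, name: str, collect: Callable[[], dict]) -> None:
        """Adding a capability means registering one function, nothing else."""
        self._collectors[name] = collect

    def collect_once(self) -> int:
        """Run every collector and ship the results in one shared envelope."""
        envelope = [
            {"capability": name, "data": collect()}
            for name, collect in self._collectors.items()
        ]
        self._send(json.dumps(envelope, separators=(",", ":")).encode("utf-8"))
        return len(envelope)
```

A new capability added this way brings no serialization or networking code of its own, which is where the size and consistency benefits come from.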
Principle 6: no GUI, ever
GUIs are enormous. Adding Qt, GTK, or (worst of all) Electron to an agent doubles or triples its footprint. If your agent has a UI, it’s not lightweight.
What we measured
Benchmark: 100 endpoints, typical IT workload, 30-day observation window.
- Our agent: p50 CPU 0.2%, p99 CPU 1.8% (during log bursts)
- Legacy Agent A: p50 CPU 2.1%, p99 CPU 8.4%
- Legacy Agent B: p50 CPU 1.5%, p99 CPU 12.1%
Memory:
- Our agent: p50 RSS 38 MB, p99 RSS 52 MB
- Legacy Agent A: p50 RSS 212 MB, p99 RSS 480 MB
These aren’t cherry-picked numbers; they’re steady-state under realistic load.
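If you want to run the same comparison on your own fleet, p50 and p99 can be computed from raw per-interval samples with Python's standard library. This is a sketch of the arithmetic only. The `p50_p99` helper and the choice of the "inclusive" interpolation method are assumptions, not a description of our benchmark tooling.

```python
import statistics


def p50_p99(samples):
    """Return the 50th and 99th percentiles of a list of numeric samples."""
    # quantiles(n=100) returns 99 cut points; index 49 is p50 and index 98 is p99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[49], cuts[98]


# Hypothetical CPU samples in percent, one per measurement interval.
cpu_samples = [0.1, 0.2, 0.2, 0.3, 0.2, 0.1, 1.7, 0.2, 0.2, 0.4]
median, tail = p50_p99(cpu_samples)
print(f"p50={median:.2f}%  p99={tail:.2f}%")
```

Reporting both values matters because a quiet median can hide bursts that only show up in the tail.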
Why this matters to operators
A heavy agent is an alerting target all its own. You end up monitoring the monitor. You get paged when the agent consumes too much memory. You schedule maintenance to restart the monitoring agent. That’s not an RMM — that’s a second job.
Light agents get out of the way and let operators focus on the real systems.
Try it yourself
LynxTrac is free forever for 2 servers: no credit card, no sales call. Start in under 2 minutes.