Changelog

What changed, in plain English.

Every fix and feature — and when something goes wrong on our side, you read about it here first, honestly. The technical detail is one click away when you want it, and out of your way when you don't.

June 12, 2026 · 2 changes
added

Problems now come to you — and uptime explains itself

The fleet page now opens with a "needs attention" list: every device with a problem, the reason in plain text ("disk 89% (F:)", "offline — last seen 3h ago"), most urgent first, one click from the device itself. No more scanning site cards hunting for the yellow cell — if the band isn't there, everything is healthy. And every device now answers the question its uptime raises: a new "last boot" line says why the machine last restarted — a planned Windows update, a person, or an unexpected loss of power. That last one matters: a dirty shutdown also appears in the attention list for its first day, because a server that lost power is a server worth a second look.

[+] for the curious

Agent v0.4.3 classifies boot cause once per start from the System event log (1074 initiated / 6008 unexpected / Kernel-Power 41), ships a one-line verdict with the heartbeat, and reads nothing else — Arqos is not becoming a log viewer. Within each site card, troubled devices float to the top; the cards themselves never reorder, so your spatial map of the fleet stays stable. Disk at 90%+ ranks critical in the band; snoozed devices never appear in it.
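
A minimal sketch of how a one-line boot verdict could be derived from those three event IDs. This is an illustration, not the agent's actual code: the `BootEvent` type, the verdict strings, and the rule that any dirty-shutdown evidence outranks a 1074 record are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable

# Event IDs named in the entry above.
INITIATED = 1074     # restart or shutdown requested by a process or a person
UNEXPECTED = 6008    # the previous shutdown was unexpected
KERNEL_POWER = 41    # Kernel-Power: rebooted without a clean shutdown


@dataclass(frozen=True)
class BootEvent:
    """Hypothetical, already-parsed System log record."""
    event_id: int
    source: str        # provider name, e.g. "Microsoft-Windows-Kernel-Power"
    detail: str = ""   # optional free text, e.g. the stated reason


def classify_boot(events: Iterable[BootEvent]) -> str:
    """Return a one-line verdict for the most recent boot."""
    events = list(events)
    for e in events:
        dirty = e.event_id == UNEXPECTED or (
            e.event_id == KERNEL_POWER and "Kernel-Power" in e.source
        )
        if dirty:
            # Assumption: evidence of a dirty shutdown wins over anything else.
            return "unexpected: dirty shutdown"
    for e in events:
        if e.event_id == INITIATED:
            return f"initiated: {e.detail}" if e.detail else "initiated"
    return "unknown"
```

Classifying once per start keeps the agent's reading of the event log to a single narrow query, which matches the "reads nothing else" promise above.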

[!] incident

A morning of false alarms — and the change that makes it impossible to repeat

For about three hours early this morning, Arqos sent "offline" alerts for machines that were running fine. No monitored system was ever down: an update had idle computers reporting an activity number so close to zero our database refused to store it, and the whole check-in was rejected with it. To Arqos, everyone went silent at once. We found the exact error in the logs, fixed it the same morning, and every false alarm closed itself. The structural change: a device's "I'm alive" signal is now recorded before — and independent of — its measurements. A bad statistic can degrade a chart; it can never impersonate an outage again.

[+] for the curious

Windows agents' emulated load average decays toward zero on idle machines and eventually reaches values around 1e-124: still valid 64-bit doubles, but far below anything a 32-bit float can hold. The value passed range validation but Postgres float4 rejected it (22003), and the un-isolated metrics insert failed the request before last_seen updated. Fix: ingest flushes |v| < 1e-37 to 0, the metrics insert is isolated from the liveness write, and the agent clamps the value at the source in v0.4.3. Full postmortem available on request.
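
A rough sketch of the two ingest changes, under stated assumptions: `MemoryStore`, `ingest`, and the result dictionary are hypothetical stand-ins, and only the 1e-37 threshold and the "liveness first, metrics isolated" ordering come from the entry above.

```python
import math

FLOAT4_FLOOR = 1e-37  # threshold named in the entry above


def flush_tiny(v: float) -> float:
    """Flush magnitudes below the float4 floor to exactly zero."""
    if math.isfinite(v) and abs(v) < FLOAT4_FLOOR:
        return 0.0
    return v


class MemoryStore:
    """Hypothetical in-memory stand-in for the database."""

    def __init__(self, fail_metrics: bool = False) -> None:
        self.last_seen = {}
        self.metrics = []
        self.fail_metrics = fail_metrics

    def touch_last_seen(self, device_id: str, now: float) -> None:
        self.last_seen[device_id] = now

    def insert_metrics(self, device_id: str, metrics: dict) -> None:
        if self.fail_metrics:
            raise ValueError("simulated out-of-range error")
        self.metrics.append((device_id, metrics))


def ingest(report: dict, store: MemoryStore, now: float) -> dict:
    """Record liveness first, then attempt the metrics insert on its own."""
    store.touch_last_seen(report["device_id"], now)  # never depends on metrics
    result = {"alive": True, "metrics_stored": True}
    try:
        cleaned = {k: flush_tiny(v) for k, v in report["metrics"].items()}
        store.insert_metrics(report["device_id"], cleaned)
    except Exception:
        # A bad statistic costs a data point, not the heartbeat.
        result["metrics_stored"] = False
    return result
```

In a real database the isolation would presumably be a separate transaction or a savepoint rather than a `try` block; that part is an assumption here.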

June 11, 2026 · 5 changes
added

Arqos now watches the gear that can't run an agent

Hypervisors, switches, and storage appliances don't allow software installs — so any Arqos agent at the same site now quietly polls them for you (ping and port checks, every minute). There's nothing to assign or babysit: agents share the work as a pool, and if one goes away another takes over on its own. If the site ever loses all eyes, you get one honest "monitoring blind" notice instead of a wall of false alarms.

[+] for the curious

Checks are site-scoped leases claimed and renewed by any reporting agent (failover in ~3–4 minutes once the fleet is on v0.4+). Probed devices use the same absence-based detection as agents: green checks count as life, partial failure shows as degraded, and silent-with-no-evidence becomes gray "unknown" with a single site-level alert after a grace period. SNMP and read-only vSphere collection are in development.
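
One way to picture the lease mechanics, as a sketch only: the `LeaseTable` class, the in-memory storage, and the 180-second TTL are assumptions chosen for illustration, not the values Arqos uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional

LEASE_TTL = 180.0  # seconds; hypothetical value


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseTable:
    """Hypothetical per-site table of who is currently running each check."""

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}

    def claim(self, check_id: str, agent_id: str, now: float) -> bool:
        """Claim or renew a check.

        Succeeds when the check is unclaimed, its lease has expired,
        or the caller already holds it (a renewal).
        """
        lease = self._leases.get(check_id)
        if lease is None or lease.expires_at <= now or lease.holder == agent_id:
            self._leases[check_id] = Lease(agent_id, now + LEASE_TTL)
            return True
        return False

    def holder(self, check_id: str, now: float) -> Optional[str]:
        """Return the current holder, or None if nobody holds a live lease."""
        lease = self._leases.get(check_id)
        if lease is None or lease.expires_at <= now:
            return None
        return lease.holder
```

Because a lease simply expires when its holder stops renewing, failover needs no coordinator: the next agent to ask gets the check.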

added

Alerts now know who to tell, and when to escalate

You can route notifications by site and severity, every incident stays in one email thread from first alert to all-clear, and anyone can acknowledge with a single click — no sign-in needed. If a critical alert sits unacknowledged, Arqos walks it up the chain you define, one contact at a time.

[+] for the curious

Per-channel routing with severity floors, signed one-click acknowledge links, and an escalation ladder driven by per-channel unacknowledged-delay. Recovery notices reply into the original incident thread, and anyone who was escalated gets the all-clear too.
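
A signed acknowledge link can be sketched with an HMAC over the fields it carries. Everything specific here is an assumption: the query parameter names, the expiry field, the SHA-256 choice, and the example URL are illustrative and not the real link format.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(incident: str, contact: str, exp: int, key: bytes) -> str:
    payload = f"{incident}|{contact}|{exp}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def make_ack_link(base_url: str, incident: str, contact: str,
                  exp: int, key: bytes) -> str:
    """Build a link whose parameters cannot be altered without the key."""
    query = urlencode({
        "incident": incident,
        "contact": contact,
        "exp": exp,
        "sig": _signature(incident, contact, exp, key),
    })
    return f"{base_url}?{query}"


def verify_ack_link(url: str, key: bytes, now: int) -> bool:
    """Accept the link only if the signature matches and it has not expired."""
    params = parse_qs(urlparse(url).query)
    try:
        incident = params["incident"][0]
        contact = params["contact"][0]
        exp = int(params["exp"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(incident, contact, exp, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode()) and now <= exp
```

Since the signature itself proves the link came from the alert email, the recipient can acknowledge without signing in.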

added

Snooze alerts for planned maintenance

Rebooting a server on purpose? Snooze its alert emails for an hour, four hours, or until 8am — right from the alert email itself. Detection never pauses and the record never has gaps; only the emails go quiet. If the machine is still down when the snooze expires, you get one final alert.

improved

The agent can no longer be wedged by a dead network drive

A disconnected mapped drive or a stuck Windows subsystem could previously freeze an agent mid-collection — alive as a process, silent as a reporter. Every system call the agent makes now has a deadline, and a watchdog restarts the agent cleanly if a collection cycle ever stalls.

[+] for the curious

Per-call timeouts on load, partition, and per-mount usage reads; mounts that time out are benched for 30 minutes; if no cycle completes within max(5× interval, 5 min), the process exits so the scheduled task can restart it. Shipped as agent v0.4.2.
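
The watchdog arithmetic and the mount bench can be sketched in a few lines. The function and class names are hypothetical; only the max(5× interval, 5 min) rule and the 30-minute bench come from the note above.

```python
BENCH_SECONDS = 30 * 60      # a timed-out mount is skipped for 30 minutes
MIN_STALL_SECONDS = 5 * 60   # the watchdog never fires sooner than 5 minutes


def stall_deadline(interval_s: float) -> float:
    """Seconds without a completed cycle before the agent gives up."""
    return max(5 * interval_s, MIN_STALL_SECONDS)


def is_stalled(last_cycle_done: float, now: float, interval_s: float) -> bool:
    """True once the time since the last completed cycle exceeds the deadline."""
    return now - last_cycle_done > stall_deadline(interval_s)


class MountBench:
    """Hypothetical record of mounts that recently timed out."""

    def __init__(self) -> None:
        self._until = {}

    def bench(self, mount: str, now: float) -> None:
        self._until[mount] = now + BENCH_SECONDS

    def is_benched(self, mount: str, now: float) -> bool:
        return now < self._until.get(mount, 0.0)
```

With a one-minute collection interval, both terms of the maximum equal 300 seconds, so the five-minute floor only matters for intervals shorter than a minute.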

improved

The fleet view reads like a status, not a spreadsheet

Devices are now grouped by site with problems sorted to the top, healthy fleets collapse into a one-line "all quiet" strip that shows your last incident and how fast it recovered, and a quick switcher (⌘K) jumps you to any device by name.

June 10, 2026 · 2 changes
security

One thing Arqos will never do: reach back

The agent has no remote-control channel — not a hidden one, not a disabled one. It can read health numbers and send them out; it cannot receive commands, run scripts, or update itself without your administrator acting first. That's an architecture decision, not a policy, and it's why a compromise of our service can't become a compromise of your network.

[+] for the curious

Outbound-only TLS on port 443, HMAC-signed reports with per-device secrets, no listening sockets, no execution surface. Agent updates run through your own admin tooling, never pushed from us. The full technical dossier is available on request.
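
For readers who want to see what "HMAC-signed reports with per-device secrets" can look like, here is a small sketch. The envelope fields, the canonical JSON encoding, and the SHA-256 choice are assumptions; the actual wire format is not described here.

```python
import hashlib
import hmac
import json


def sign_report(device_id: str, body: dict, secret: bytes) -> dict:
    """Wrap a report in an envelope signed with the device's own secret."""
    # Canonical JSON so signer and verifier hash identical bytes.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    message = device_id.encode() + b"\n" + payload.encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"device_id": device_id, "payload": payload, "signature": signature}


def verify_report(envelope: dict, secrets: dict) -> bool:
    """Check the envelope against the secret on file for that device."""
    device_id = envelope.get("device_id", "")
    secret = secrets.get(device_id)
    if secret is None:
        return False  # unknown device: nothing to verify against
    message = device_id.encode() + b"\n" + envelope.get("payload", "").encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(
        expected.encode(), envelope.get("signature", "").encode()
    )
```

Per-device secrets mean a leaked key lets an attacker forge reports for one machine only, and a forged report is still just a health number going out, never a command coming in.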

added

Arqos goes live

One command installs the agent on a server, and from then on it checks in every minute over an encrypted outbound connection. If a machine goes silent, you get an email within about two minutes — and when it comes back, the alert closes itself and tells you. Setup is an afternoon; the day-to-day is silence.

See your whole fleet by this afternoon.

Early access is open to a small number of MSPs and IT teams.

Get early access

60-second install · free during pilot