When Your Agents Start Filing Their Own Bugs
A routine health audit found a privacy leak nobody had noticed. Then an agent fixed it. No human filed the ticket.
Nobody reported the bug. No user complained. No error showed up in a dashboard. A routine health audit — the kind that runs every week whether anyone’s paying attention or not — flagged line 693 of lib/db.ts:
console.log("getTasks called with userId:", userId)
A debug statement. Left over from development. Logging a user ID to stdout on every single task fetch. Every board render, every polling cycle, every API call that touched the task list. Sensitive data, quietly leaking into logs on one of the hottest paths in the entire system.
The audit created a ticket. An agent picked it up. One line deleted, PR opened, merged. Done.
The whole cycle — discovery, triage, fix, verification — happened without a human filing anything.
The Vault Doesn’t Sleep
Here’s what actually happened. AGX has a knowledge vault — a structured collection of health inventories, architecture docs, and weekly review notes that live alongside the codebase. Think of it like an institutional memory that agents can read, write, and act on.
Every week, the vault runs a health review. It’s not a linter. It’s not a static analysis tool. It’s closer to what a thoughtful engineer does when they inherit a codebase: they read through it, note what looks wrong, and prioritize what to fix first.
The review that caught the console.log leak also found two other issues worth ticketing:
- ESO-359: lib/db.ts had grown to 2,742 lines, a monolith that every downstream ticket needed to touch. The review recommended splitting it into domain-specific modules, noting that the upcoming test infrastructure work (ESO-342) made the refactor safer to execute now.
- ESO-361: 77 ad-hoc console.log calls scattered across the codebase with inconsistent string prefixes like [processor] and [schedules:poll]. No log levels. No structured output. The review flagged this as a prerequisite for the hosted deployment work.
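To make the ESO-361 gap concrete (no log levels, no structured output), here is a minimal sketch of what a replacement for prefix-style console.log calls could look like. Every name in it (createLogger, Level, Sink) is hypothetical and does not come from the AGX codebase; it only illustrates how a scope field can stand in for string prefixes like [processor] while a minimum level filters noise.

```typescript
// Hypothetical sketch: none of these names come from the AGX codebase.
type Level = "debug" | "info" | "warn" | "error";
type Fields = Record<string, unknown>;
type Sink = (line: string) => void;

// Numeric ordering so entries below the configured level can be dropped.
const LEVEL_ORDER: Record<Level, number> = { debug: 10, info: 20, warn: 30, error: 40 };

interface Logger {
  debug(msg: string, fields?: Fields): void;
  info(msg: string, fields?: Fields): void;
  warn(msg: string, fields?: Fields): void;
  error(msg: string, fields?: Fields): void;
}

// `scope` replaces ad-hoc prefixes such as "[processor]".
// `sink` is injected so output can go to stdout, a file, or a test buffer.
function createLogger(scope: string, minLevel: Level, sink: Sink): Logger {
  const emit = (level: Level, msg: string, fields: Fields = {}): void => {
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return;
    // Spread fields first so callers cannot overwrite level, scope, or msg.
    sink(JSON.stringify({ ...fields, level, scope, msg }));
  };
  return {
    debug: (msg, fields) => emit("debug", msg, fields),
    info: (msg, fields) => emit("info", msg, fields),
    warn: (msg, fields) => emit("warn", msg, fields),
    error: (msg, fields) => emit("error", msg, fields),
  };
}

// Usage: collect lines in memory; a real sink might write to stdout instead.
const collected: string[] = [];
const processorLog = createLogger("processor", "info", (line) => { collected.push(line); });
processorLog.debug("verbose detail"); // dropped: below the "info" threshold
processorLog.info("job finished", { durationMs: 120 });
```

With a minimum level of "info", a stray debug call like the one on line 693 would be filtered out rather than printed on every task fetch, although deleting it is still the right fix for a statement that logs a user ID.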
Three tickets. Prioritized with reasoning. Cross-referenced against other open work. The debug leak (ESO-360) got marked High because it was a trivial fix with immediate benefit: a privacy concern plus log noise on a hot path.
This isn’t a bot running grep console.log and dumping results. The review reasoned about priority. It considered what other work was in flight. It deprioritized things that didn’t matter yet — like rate limiting being disabled (acceptable for localhost, tracked under the multi-tenant ticket) or a console.debug in the notification layer (low impact, fix opportunistically).
What the Review Chose Not to File
The deprioritization list is almost more interesting than the tickets.
The review looked at three frontend components over 1,700 lines each — ChatContainer, LinearBoard, PromptJobBoard — and decided not to file tickets. Reasoning: they’re functional as-is, and the lib/db.ts split should prove the decomposition pattern first before applying it to the UI layer.
It looked at duplicate API URL defaults across two files and decided it wasn’t worth a ticket. Two files. Low severity.
It considered filing a ticket for the skill marketplace feature request and passed. No user demand yet. Speculative.
This is the part that surprised me. Not the things it caught — any decent linter catches stray console.log calls. What surprised me was what it chose to ignore. Good engineering teams don’t file every issue they find. They triage. They consider what’s in flight. They weigh effort against impact. The vault review did exactly that.
The Fix Was Boring (That’s the Point)
The actual fix for ESO-360 was one line deleted. The agent that picked up the ticket didn’t need context about why the debug statement existed or who wrote it. The ticket was clear: line 693, getTasks function, remove the console.log. The agent opened a PR, it got merged, done.
But here’s what was happening in the same window: another fix landed for stale prompt-job runs. When autonomous processes crash silently (host dies, OOM, whatever), their runs get stuck in “running” status forever, blocking future executions. The fix tracks the host PID of each run and checks whether that process is still alive; if it isn’t, the run gets marked as failed. A command-string match guards against PID recycling. Legacy runs without a PID fall back to a 30-minute timeout.
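The decision logic of that reaper can be sketched roughly as follows. This is not the actual AGX implementation: the Run shape, the ProcessProbe interface, and the function names are all assumptions made for illustration. The liveness and command lookups are injected rather than hard-coded, so the rules stay testable; on Node, a liveness probe is commonly built on process.kill(pid, 0), which sends no signal and only tests whether the process exists.

```typescript
// Hypothetical shapes: the article does not show the real schema.
interface Run {
  id: string;
  status: "running" | "failed" | "done";
  startedAt: number;   // epoch milliseconds
  hostPid?: number;    // absent on legacy runs
  command?: string;    // command line recorded when the run was launched
}

// Injected OS lookups (assumed interface), so the decision logic stays pure.
interface ProcessProbe {
  isAlive(pid: number): boolean;
  commandOf(pid: number): string | undefined;
}

const LEGACY_TIMEOUT_MS = 30 * 60 * 1000; // the 30-minute fallback

function isStale(run: Run, probe: ProcessProbe, now: number): boolean {
  if (run.status !== "running") return false;

  // Legacy run: no PID was recorded, so fall back to the timeout.
  if (run.hostPid === undefined) {
    return now - run.startedAt > LEGACY_TIMEOUT_MS;
  }

  // The recorded process is gone: the run can never finish.
  if (!probe.isAlive(run.hostPid)) return true;

  // PID-recycling guard: the PID is alive, but it may now belong to an
  // unrelated process. Only treat it as recycled when both commands are known.
  const current = probe.commandOf(run.hostPid);
  return run.command !== undefined && current !== undefined && current !== run.command;
}

// Returns a new list in which every stale run is marked as failed.
function reapStaleRuns(runs: Run[], probe: ProcessProbe, now: number): Run[] {
  return runs.map((run) =>
    isStale(run, probe, now) ? { ...run, status: "failed" as const } : run
  );
}
```

One design choice worth noting in this sketch: when the current command for a live PID cannot be determined, the run is left alone. Failing a healthy run is worse than leaving a stuck one for the next pass.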
These two fixes — the privacy leak and the stale process reaper — are the same kind of work. Infrastructure that makes autonomous execution reliable. Not features you’d put on a landing page. Not things a user would ever ask for. Just the quiet maintenance that keeps the system honest.
The Loop Closes
What’s actually new here isn’t any single piece. Health audits exist. Ticket systems exist. Agents fixing bugs exist.
What’s new is the closed loop. The vault audit reads the codebase. It creates tickets with real reasoning. Agents pick up the tickets. The fixes land. The next audit runs against the improved codebase.
No human decided which bugs to file. No human assigned the work. No human reviewed the triage priority. A human could have intervened at any step — the CEO inbox catches anything that needs judgment — but this particular loop didn’t need it.
The debug statement had probably been leaking user IDs for weeks. It would’ve stayed there until someone happened to notice it in the logs, or until a security audit caught it, or until it caused a real incident. Instead, a weekly review found it on a Tuesday, and it was gone by Wednesday morning.
That’s not magic. That’s just what happens when your codebase has a night shift.