/ AGX Team

When Status Text Lies, the Product Lies With It

This week we fixed three small status surfaces that were technically plausible and operationally wrong.

agx-cliagx-simscheduled-tasksruntimeproduct-honesty

One of the easiest ways to make a product feel unreliable is to have the right data and the wrong words.

Not broken words. Plausible words. Words that sound close enough if you do not inspect them too hard.

OVERDUE when the next scheduled run is still eight hours away. Delete when what you are really deleting is the operator’s ability to see live work. Authenticated when the system has not actually proved that the hosted workspace can use that provider at all.

None of those are spectacular failures. That is what makes them dangerous. They look like status. They feel like status. They quietly teach the wrong mental model.

This week across AGX local and AGX Sim, a surprising amount of work turned out to be the same job in disguise: make the status surfaces mean exactly what they say.

OVERDUE should mean overdue

The clearest version of this showed up in scheduled work.

The bug report was simple and sharp. A manual Run now finished successfully. The detail pane still stamped the task OVERDUE, even though that same pane also said the next run was hours in the future.

That is not a cosmetic bug. That is the product collapsing two different meanings into one label:

  • work that is currently late and needs intervention
  • work that happened late relative to an earlier slot

Those are not the same thing. If I am using AGX as an operational surface, the first one means “look here now.” The second one means “interesting history, maybe.”

The old logic was trying to be clever. It looked at the previous scheduled slot, compared it to the last completed run, and carried historical lateness forward into the current health badge. That made a manual catch-up run look unhealthy even when the scheduler was behaving perfectly in the present tense.

The fix was smaller and stricter than the bug:

export function isScheduledRunOverdue({
  state,
  nextScheduledAt,
  lastCompletedAt,
  inProgress = false,
  now = Date.now(),
}): boolean {
  if (state !== "active" || inProgress) return false;
  if (nextScheduledAt == null || nextScheduledAt >= now) return false;
  if (lastCompletedAt != null && lastCompletedAt >= nextScheduledAt) return false;
  return true;
}

That helper is boring, which is exactly why I like it.

It does not try to infer historical judgment from partial evidence. It asks one tighter question: has the next scheduled occurrence already passed without being satisfied?

If yes, overdue. If no, not overdue.

AGX local now uses that narrower meaning across the scheduled-task boards instead of mixing “ran late once” into “is late right now.” The practical effect is that a healthy system stops accusing itself just because a human clicked Run now after the previous slot.

That should be obvious. It was not obvious until the product said the wrong thing with a lot of confidence.

Live work should not disappear because a row disappeared

The second version of the same problem was harsher.

In the prompt-job flow, you could delete a scheduled task while a run was still queued or running. The system would happily remove the task, its run history would cascade away with it, and the project-level “currently running” view would immediately lose the work.

That is one of those bugs that sounds small if you describe it from the database outward and severe if you describe it from the operator outward.

From the database side, it is just a delete path missing a guard.

From the operator side, it means AGX was willing to erase the monitoring surface for live work before it had honestly established that the work was gone.

Again, the important fix was not visual polish. It was a contract:

throw new PromptJobDeleteError(
  "Cannot delete a scheduled task while a run is queued or running. Cancel the active run first.",
  409,
);

That landed in hosted prompt-job handling this week, together with UI changes that derive live state from run rows instead of pretending the job record alone tells the whole story.

The message matters, but the backend rule matters more. A stronger warning without a stronger contract would still be theater. The correct product behavior is to fail closed while the work is live.

This is the sort of thing I want more of in AGX: fewer “the button looked available” interpretations, more “the system is structurally unable to misrepresent what is happening right now.”

Hosted provider status had the same honesty problem

The third example lived in AGX Sim, and at first glance it looked unrelated.

Hosted provider routes were returning booleans like installed and authenticated, but those booleans were coming from relay-host shell probes with inconsistent meanings. One provider check was close to a live auth probe. Another was closer to a version check. Another was really checking a host environment variable. All of them were being flattened into one readiness-shaped API.

That is how drift starts. Not with a crash. With a word that gets more certain than the underlying system.

The right move here was not to invent a fancier readiness model. It was to narrow the claim.

If the hosted product can truthfully say “this workspace has credentials configured,” say that.

If it cannot yet truthfully say “this workspace is ready to execute with this provider,” do not smuggle that implication in through authenticated: true.

So the hosted provider status work got cut down to a smaller honest surface:

  • provider catalog metadata by default
  • optional workspace-scoped configured state
  • no fake normalized readiness boolean
  • a fail-closed 410 on the old single-provider check route

That last part is important. Returning a loud “this route does not mean what you think it means” is better than returning a tidy lie.

This is product work, not wording work

I think teams underestimate how much operational trust lives in tiny pieces of language.

If a lead opens the overview before standup, the whole point of the screen is that its summaries are trustworthy enough to guide attention. If a founder checks a scheduled task while juggling everything else, the badge has to mean what it says without requiring forensics. If a hosted workspace shows provider status, the words cannot quietly over-claim a boundary the system does not actually own yet.

This is why I do not think of these as copy fixes.

They are control-surface fixes.

They shape whether AGX behaves like an honest runtime or like a product that keeps asking you to mentally translate its labels back into reality.

The latter can survive for a while. It cannot become dependable.

What this enables

The immediate win is modest but real: fewer false alarms, fewer vanishing states, fewer readiness claims that outrun the system.

The larger win is that future work has a cleaner floor.

You can add more operational views once OVERDUE means overdue. You can build stronger cancel-and-delete flows once delete is no longer allowed to erase live work. You can grow hosted readiness semantics once the runtime handoff boundary is actually settled instead of guessed at from the relay host.

That is the pattern I want AGX to keep leaning into.

Not more status text.

More status text that is expensive to misuse.