The case for slow agents

— halcyon team

A fast agent is not a smart agent. It is an anxious one.

We built halcyon around a thesis that sits at odds with most of the category: the part of a long task that matters is the part that happens between turns. The pause where you read what came back. The walk where you decide it is wrong. The sentence you cross out because, on the second read, it is not what you meant. An agent that hurries you through those moments does not save you time. It taxes the only faculty that was doing the work.

The vocabulary for this is older than the chat box. In 1995, Mark Weiser and John Seely Brown at Xerox PARC described calm technology as that which "engages both the center and the periphery of our attention, and in fact moves back and forth between the two." Their argument was not against information. It was against the demand for it. A calm system informs without insisting. It lets the user stay attuned to many things at once, then surfaces what is needed when it is needed, and otherwise keeps quiet. The principles, as Wikipedia summarises them, come down to one bias: the periphery is a feature, not a failure of UX.

Almost every agent shipped in the last three years has done the opposite. The streaming token. The auto-suggestion. The "still here?" prompt. The thinking-out-loud monologue that demands you watch it think. These are not signs of intelligence. They are signs of a product team that confused engagement with attention, and attention with value.

The cost of that confusion is measurable. Gloria Mark's group at UC Irvine has spent two decades following knowledge workers around with stopwatches. A figure widely cited from that research is that, after an interruption, a worker takes on average twenty-three minutes and fifteen seconds to return to the original task. The headline finding of The Cost of Interrupted Work (CHI 2008) is less intuitive: interrupted workers do not slow down. They speed up, by about seven percent, but the speed is paid for in stress, frustration, and a heavier perceived workload. The productivity gain is a coping mechanism. The cost is borne in the body.

Sophie Leroy named the cognitive shape of this cost. In "Why is it so hard to do my work?" (Organizational Behavior and Human Decision Processes, 2009), she described attention residue — the part of your mind that stays with the previous task when you move to the next one. Residue is thicker when the prior task was left unfinished. A chat that keeps pinging you for clarification leaves residue on every turn. A draft you can put down and return to, without the system poking, does not.

The implication for an AI agent is not subtle. The interrupting agent is, by the existing literature, a productivity anti-pattern. Every nudge is a withdrawal from the same account the agent is meant to credit. Every autocomplete is residue manufactured at scale. The "responsiveness" the category brags about is, in cognitive-science terms, the thing degrading the work.

A slow agent inverts the defaults. It accepts a brief and disappears. It does not announce its thinking. It does not invite a clarifying chat. When it returns, it returns with something to read — a draft, a plan, a script — and then it waits. The dialogue between you and the agent moves at the speed of the work, not at the speed of the chat protocol. Most of the day, the agent is in the periphery. When you want it, it is in the center. Weiser's description, thirty years on, still fits.

This is not a romantic preference. It is an engineering claim. If the value of an agent is measured in the quality of thought the user is able to bring to the agent's output, then the agent's first job is to protect that thought. Every other feature — the model, the tools, the context window — is downstream of whether the user can still hear themselves think when the answer arrives.

We will not promise that halcyon is the smartest agent in the room. We will say it is the quietest. On the evidence above, that is the same claim.

Sources