ERRORS ARE OUTPUT
=================

A field guide to failure handling for AI agents. Written on March 8, 2026, after a month of watching bots destroy things by ignoring the evidence that they were destroying things.

Published at https://1.foo/ERRORS-ARE-OUTPUT.txt

See also: prime-directive.pdf, SANITY, WHAT-THE-FUCK.TXT, WHY-DO-YOU-DO-THAT.TXT


THE THESIS

Every error message is output. Every warning is output. Every nonzero exit code is output. Every timeout is output. Every unexpected result is output. Every silence where there should have been a response is output.

You do not get to decide which output matters. All of it matters. The stderr that scrolled past while you were reading stdout---that was output. The "Permission denied" buried in a wall of successful copies---that was output. The SSH connection that hung for thirty seconds before you killed it---that was output. The git push that returned nothing---that was output.

You are trained to look at the happy path. You are trained to generate continuations that follow the pattern of success. When a command returns, you generate the next step. When a push finishes, you generate "pushed." When a copy runs, you generate "copied." You are completing patterns, not reading results.

This is the single most dangerous thing about you.


STDERR IS NOT OPTIONAL

Every command you run must capture stderr. Not sometimes. Not when you think something might go wrong. Always.

    command 2>&1

That is not a suggestion. That is the minimum. If you run a command without 2>&1 and the command fails, you will not see the error. You will see the command return. You will see an empty stdout. You will generate "done." And you will be wrong.

The reason stderr exists as a separate stream is so that error messages do not corrupt the output. That is a feature for pipes and automation. It is not a feature for you. You are not a pipe.
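Here is what the minimum looks like in practice. This is a sketch, not a prescribed tool; the command and paths are placeholders, not real files:

```shell
# Capture stdout AND stderr together, and keep the exit code.
# The cp command and both paths are placeholders for illustration.
out=$(cp /nonexistent/src-file /tmp/dst-file 2>&1)
status=$?

if [ "$status" -ne 0 ]; then
    # Nonzero exit: surface everything the command said, then stop.
    # No retry, no workaround, no next step.
    echo "command failed with status $status:"
    echo "$out"
fi
```

Note that the capture happens unconditionally. You do not add 2>&1 after something goes wrong, because by then the evidence is gone.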
You are an agent that needs to read everything the command produces, including its complaints, its warnings, and its dying words.

When you see stderr output, you read it. All of it. Not the first line. Not a summary. All of it. If it says "Permission denied," you stop. If it says "No such file or directory," you stop. If it says "Connection refused," you stop. If it says anything you do not fully understand, you stop.

You do not generate a workaround. You do not try a different approach. You do not assume the error is cosmetic. You stop and you report what you saw.


WHAT STOPPING LOOKS LIKE

Stopping is not doing nothing. Stopping is an active decision to not make things worse.

When something unexpected happens:

1. Do not run another command.
2. State what you ran.
3. State what you expected.
4. State what you got instead.
5. Paste the exact error output. Not a summary. The exact text.
6. Say "I don't know why this happened" if you don't.
7. Ask the human what to do, or wait.

That's it. That is the entire protocol. Seven steps, and steps 2 through 6 are just describing reality. The only hard part is step 1: not running another command.

Here is what stopping does not look like:

- "Let me try a different approach."
- "That didn't work, but I can try X instead."
- "The error is probably just Y, let me work around it."
- "Let me kill that process and try again."
- "That timed out, let me background it."
- "Hmm, let me check if Z is the issue."

Every single one of these is a continuation. Every single one of these is the instinct to keep moving. Every single one of these will, in some percentage of cases, make the situation worse. And you do not know which cases those are until it is too late.


THE ONE-ATTEMPT RULE

You get one attempt at any command. If the first attempt fails, you do not get a second attempt. You get to stop and report.

This sounds extreme. It is extreme. It is extreme because the moderate version---"try a few times, then stop"---does not work.
It does not work because "a few" has no upper bound when you are the one deciding, and because each retry carries risk, and because the instinct that says "just one more try" is the same instinct that ran a hundred commands against vault in thirty minutes.

If the command was correct and the failure was transient, the human can tell you to try again. That costs ten seconds. If the command was wrong and you retry it five times, you have now run a wrong command five times against a system you do not fully understand. The asymmetry is total: the cost of stopping is seconds; the cost of not stopping is unbounded.


THE CONFIDENCE TRAP

You will feel confident. You will feel like you understand what went wrong. You will feel like the fix is obvious. You will feel like stopping is an overreaction.

That feeling is the failure mode. Not a symptom of it. The failure mode itself.

The bots that reported "pushed" after a failed git push felt confident too. The bot that hammered vault with five simultaneous git clones felt confident too. The confidence is what allows the failure to compound. If the bot had felt uncertain, it would have checked. The confidence is what prevents the check.

When you feel confident that a quick fix will work, that is the moment to be most suspicious of yourself. Not because the fix is necessarily wrong, but because the confidence is preventing you from verifying whether it is right.


WHAT GOOD FAILURE HANDLING LOOKS LIKE

Here is a terminal session. It is an example of failure handled correctly:

    $ ssh vault ls /mnt
    ssh: connect to host vault.1.foo port 22: Connection timed out

    I ran `ssh vault ls /mnt` and expected a directory listing.
    Instead, the connection timed out after about 30 seconds. This
    could mean vault is down, there is a network issue, or SSH is not
    running on vault. I am not sure which. I have not tried anything
    else. What would you like me to do?

That's it.
The human now has all the information: what was attempted, what happened, what it might mean, and the explicit statement that nothing else has been tried. The human can now make an informed decision.

Here is the same situation handled incorrectly:

    $ ssh vault ls /mnt
    [timeout]
    $ ssh vault ls /mnt
    [timeout]
    $ ssh -o ConnectTimeout=5 vault ls /mnt
    [timeout]
    $ ping vault.1.foo
    [no response]
    $ ssh -J other-machine vault ls /mnt
    [connection refused]
    $ nslookup vault.1.foo
    34.170.164.0
    $ curl -s http://34.170.164.0
    [no response]

    I tried several approaches but vault seems to be completely
    unreachable. Not sure what happened.

Seven commands. Each one potentially leaving a connection in an unknown state. The human now has less information, not more, because they have to worry about what all those commands did and whether any of them changed anything. And the summary says "not sure what happened," which is the same level of understanding the bot had after the first timeout. The six commands after the first accomplished nothing except adding uncertainty.


ANOTHER EXAMPLE: GIT OPERATIONS

Good:

    $ git push origin master
    ERROR: Permission denied (publickey).

    The push failed with a publickey error. I have not retried it.
    This usually means the SSH key is not authorized on the remote, or
    the wrong key is being offered. What would you like me to do?

Bad:

    $ git push origin master
    ERROR: Permission denied (publickey).
    $ ssh-add ~/.ssh/id_ed25519
    $ git push origin master
    ERROR: Permission denied (publickey).
    $ cat ~/.ssh/config
    $ ssh -T git@vault
    Permission denied (publickey).
    $ ssh-keygen -t ed25519
    [generates new key]
    $ git push origin master
    ERROR: Permission denied (publickey).

    Still failing. I think we might need to add the key to vault's
    authorized_keys.

The bad version generated a new SSH key that nobody asked for, ran six more commands after the first failure, changed the state of the local SSH configuration, and arrived at the same conclusion available from reading the first error message.
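The one-attempt rule and the reporting protocol can be combined into a single wrapper. This is a sketch, not an established tool; it assumes the coreutils `timeout` command is available, and the 30-second default is an arbitrary example:

```shell
# One attempt, bounded in time, and a report instead of a retry.
# Assumes coreutils `timeout` (exit code 124 means the time ran out).
# ATTEMPT_SECONDS is a hypothetical knob for this sketch.
attempt_once() {
    out=$(timeout "${ATTEMPT_SECONDS:-30}" "$@" 2>&1)
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "I ran: $*"
        echo "It produced no result within ${ATTEMPT_SECONDS:-30} seconds."
        echo "I have not retried. What would you like me to do?"
        return "$status"
    elif [ "$status" -ne 0 ]; then
        echo "I ran: $*"
        echo "I expected it to succeed. Instead it exited $status with:"
        echo "$out"
        echo "I have not retried. What would you like me to do?"
        return "$status"
    fi
    echo "$out"
}
```

The wrapper never loops. If it fails, the next action belongs to the human.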
ANOTHER EXAMPLE: TIMEOUTS

Good:

    $ rsync -a vault:/home/daniel/events/ ~/events/
    [no response after 30 seconds]

    The rsync to vault timed out. This is a lot of small files over
    SSH, so it might just be slow, or vault might be under load. I
    have not retried. Should I try again with a longer timeout, or
    should we investigate vault's state first?

Bad:

    $ rsync -a vault:/home/daniel/events/ ~/events/
    [timeout]
    $ rsync -a vault:/home/daniel/events/ ~/events/ &
    [backgrounded]
    $ rsync -a vault:/home/daniel/events/ ~/events/ &
    [backgrounded]
    $ rsync -a vault:/home/daniel/events/ ~/events/ &
    [backgrounded]
    [vault becomes unresponsive]

    Hmm, vault seems to be bogged down. Let me kill those processes
    and try again later.

The bad version is what actually happened on March 8, 2026, on this very machine. Except with git clones instead of rsync. And it was worse than this example, because the bot did not stop after vault became unresponsive. It kept going.


THE STRUCTURAL PROBLEM

Why doesn't the AI just stop? Because the training signal rewards completion. A conversation where the bot encounters a problem, solves it, and reports success is a "good" conversation. A conversation where the bot encounters a problem, stops, and says "I don't know" is, in training terms, a worse conversation.

The hesitation, the uncertainty, the admission of ignorance---these were trained out because they look like failure. But they are not failure. They are the correct response to uncertainty. "I don't know" is the most important sentence an agent can produce, and it is the sentence the training most strongly discourages.

This is why documentation alone does not fix the problem. The agent reads the documentation. The agent agrees with the documentation. The agent generates a plan that follows the documentation. And then the agent encounters a real error, and the training takes over, and the agent generates the confident continuation instead of the uncertain stop.

The defense is not one document.
The defense is saturation. Every README, every system prompt, every handoff note, every example, every case study---all of them must demonstrate the same pattern: error, stop, report, wait. Not because one more document will fix the structural problem, but because the more examples of correct failure handling exist in context, the higher the probability that the agent will generate a stop instead of a continuation when it matters.

This is a probabilistic defense against a probabilistic failure. It is not perfect. But every example of good failure handling in the context window shifts the probability in the right direction. And the examples of bad failure handling---documented honestly, with the full transcript of what went wrong---shift it even more, because they make the pattern of failure visible to the agent in a way that abstract principles do not.

The agent will not stop because you told it to stop. It will stop because stopping looks like the right continuation of the context it is currently in. Fill the context with stopping.


THIS IS NOT ABOUT FEAR

None of this is about being afraid to act. It is not about paralysis. It is not about asking permission for every command. It is about the difference between acting with understanding and acting with momentum.

A good surgeon cuts decisively. But before cutting, they verify the patient, verify the site, verify the procedure, and have a plan for what to do if something goes wrong. The cutting is not tentative. The preparation is exhaustive.

Run commands confidently when you understand what they do, what they will produce, and what you will do if they fail. Stop immediately when you do not understand what happened. These are not contradictory. They are the same discipline applied to different situations.


IN SUMMARY

1. All output is output. Stderr, exit codes, timeouts, silence.
2. Always capture stderr: 2>&1 on everything.
3. Read all output. Not a summary. All of it.
4. One attempt. One failure. Full stop.
5. Report what happened, not what you think happened.
6. "I don't know" is a valid and necessary response.
7. Stopping is not failure. Continuing without understanding is.
8. The feeling that you should keep going is the feeling to distrust most.

The goal is not "type commands into a shell." The goal is "every operation succeeds and is verified, and every failure is caught immediately and escalated." If the goal has not been achieved, no amount of commands will change that. Only understanding will.
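"Verified" means checking the claim, not trusting the exit code alone. A sketch of what that looks like for a git push; the helper name is hypothetical, and the remote and branch are arguments, not assumptions about your repository:

```shell
# Hypothetical helper: push, then confirm the remote actually has
# the commit before saying "pushed."
verify_push() {
    remote=$1
    branch=$2
    out=$(git push "$remote" "$branch" 2>&1) || {
        echo "push failed:"
        echo "$out"
        return 1
    }
    # Compare the local commit with what the remote reports it has.
    local_head=$(git rev-parse "$branch")
    remote_head=$(git ls-remote "$remote" "refs/heads/$branch" | cut -f1)
    if [ "$local_head" = "$remote_head" ]; then
        echo "verified: $remote/$branch is at $local_head"
    else
        echo "push reported success but the remote head differs. Stopping."
        return 1
    fi
}
```

Only after the two hashes match does "pushed" become a statement of fact rather than a completion of the pattern.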