Codex and Claude are way too defensive. I think this is a good time to talk about defensive programming.
Say I believe some scenario is impossible, and that if it somehow occurs there will be an error – a console error, a request failure, something noisy – but life will go on.
This is actually good. I am probably not wrong, so there is no reason to complicate my code. If I am wrong, great! Through failing fast (and good observability) I will discover my wrongness and we will all be better off for it. The effects of being wrong in software are cumulative and sometimes fatal, so we want to uncover wrongness early.
Building a good understanding of how data actually flows through your system is important. You should not just guess. You also should not defend against everything by default. You should actually check, be confident that you know, and be eager to discover that you are wrong.
The worst thing is taking a wild guess at how some unexpected edge case should be handled when you have no idea why it would have happened, or what the downstream implications of your handling will be. It is routine for coding agents to mishandle, downgrade to a warning, or completely swallow errors that reveal critical misunderstandings and corresponding design problems in your (their?) software.
Coding agents love defensive programming. There could be many reasons for this, but two come to mind:
– They just don’t want it to crash, like the early JS mentality.
– They don’t want to “miss an edge case”, perhaps reflective of a lot of training data produced by people who didn’t want to “miss an edge case”.
When you vibe code (not agentically engineer, or whatever we’re calling it) and everything looks amazing, how much of the implementation is just failing quietly because of defensive programming? Perhaps it helps explain the early-euphoria-hard-crash we saw many vibe coders go through.
Instruct your coding agents to fail fast and loud.
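A minimal sketch of the difference in TypeScript. The function and all names here are mine, purely illustrative:

```typescript
// Defensive: the "impossible" input is quietly papered over.
// The caller gets a made-up value and the misunderstanding survives.
function parseQuantityDefensive(raw: string): number {
  const n = Number(raw);
  return Number.isNaN(n) ? 0 : n; // silently invents a quantity of 0
}

// Fail fast and loud: if the impossible happens, crash where the wrongness is,
// with enough context to diagnose it (and to show up in your observability).
function parseQuantity(raw: string): number {
  const n = Number(raw);
  if (Number.isNaN(n)) {
    throw new Error(`parseQuantity: expected a numeric string, got ${JSON.stringify(raw)}`);
  }
  return n;
}
```

The defensive version never crashes, and that is exactly the problem: the invented 0 flows downstream and the wrongness compounds quietly.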
Author: Eli Mydlarz
Tokens/sec is the easiest optimisation https://lnkd.in/gVZarwKM
You are at the keyboard with your coding agent, building the most important thing as fast and as well as you can. But the coding agent is slow, and you want to be as productive as possible, so you do the things you know how to do: get the agent running overnight (effort to enable increased autonomy), parallelise (effort to enable separation of work and integration of separate git trees). This actually doesn’t help with your goal of getting the most important thing done as fast and as well as you can. It lengthens feedback loops and increases cognitive overhead – all the old cycle time vs throughput arguments apply.
So what can you actually do to achieve your goal? More tokens/sec! Just keep doing what you’re doing, but faster. If it’s fast enough for you to stay directly engaged with your highest priority work at your preferred level of abstraction, you will find it very satisfying.
I don’t want you to think I’m for or against Ralph loops etc. I’m exploring and learning like everybody. But I think we are missing the easiest optimisation, and I do worry that we are introducing unnecessary complexity as workarounds instead.
So now that I can finally get more tokens/sec (Cerebras team, help me!) I’ll go back from Claude Code CLI to Codex CLI for a little bit, and lean back into ADDD (Agentic Dictator Driven Development) and see how I like it at this speed.
Soon I’ll try to talk about _refinement_ more. We are all very focused on initial dev which is very exciting at the moment, but the euphoria of a quick AI build can sometimes be short-lived.
I do a lot of experiments, and I know I should share them more. My latest one is a small CLI tool for scheduling agent runs with GitHub Actions.
It writes GitHub Actions config for you, leveraging your existing OpenCode config. It’s simple but enables a lot:
- You could push target test trees only, and trigger an agent that diffs between target and actual and then implements.
- You could run a test review and improvement agent every hour, in a busy trunk-based codebase.
- You could implement really sloppily, and trigger an agent that examines your implementation for intention, writes tests to formalise that, then uses them as the basis for test-driven reimplementation of your code.
All of this just by writing an OpenCode agent (one .md file) and running the Tender CLI.
I’m not using it – for now I’m still an ADDD (Agentic Dictator Driven Development) practitioner. I also never liked automatic commits, but maybe Tender is me starting to let go of that.
If you want to play with it and use NPM, you can run it at repo root with npx @susu-eng/tender.
You can also ask your coding agent to explore the CLI and take care of it for you. Just tell it to run npx @susu-eng/tender --help. This was my first agent-first UI, and it was fun watching Claude figure out how to operate it (trial, error, actually reading instructions, success).
This was only an experiment, implemented 100% by Codex – please use it carefully.
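To make the shape concrete, here is a sketch of the kind of workflow such a tool might generate for the hourly test-review example. Everything below is my illustration – the cron schedule syntax is standard GitHub Actions, but the agent invocation and all names are assumptions, not Tender’s actual output:

```yaml
# Illustrative only: a scheduled workflow that runs an agent every hour.
name: hourly-test-review
on:
  schedule:
    - cron: "0 * * * *" # top of every hour (GitHub Actions cron syntax)
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder for however your OpenCode agent is actually invoked.
      - name: Run the test review agent
        run: npx opencode run --agent test-review # assumed invocation
```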
The style of TDD I learned, practised and taught might seem a bit heavy by today’s standards, but I am still a believer in its virtues. For example: its very repetitive expression of intention makes it almost impossible for coding agents to look at a piece of code without knowing what it’s supposed to do and not do under every expected set of conditions.
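A tiny illustration of what that repetition buys. The clamp function and every case here are my own hypothetical example: each expected condition is stated explicitly and separately, so the intention survives alongside the code.

```typescript
// Example function under test (illustrative, not from the post)
function clamp(n: number, lo: number, hi: number): number {
  if (lo > hi) throw new Error(`clamp: lo (${lo}) must not exceed hi (${hi})`);
  return Math.min(Math.max(n, lo), hi);
}

// One explicit, repetitive statement of intention per expected condition.
// An agent reading these knows exactly what clamp must and must not do.
const cases: Array<[string, () => boolean]> = [
  ["returns n unchanged when within range", () => clamp(5, 0, 10) === 5],
  ["clamps up to lo when n is below range", () => clamp(-3, 0, 10) === 0],
  ["clamps down to hi when n is above range", () => clamp(99, 0, 10) === 10],
  ["treats the boundaries themselves as in range", () => clamp(10, 0, 10) === 10],
];
for (const [intention, check] of cases) {
  if (!check()) throw new Error(`FAILED: ${intention}`);
}
```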
I can write reliable software automatically, so what’s next?
I’ve been talking to some like-minded folks in similar situations, and current business models around software delivery no longer make much sense to us.
Will we have studios that have mastered AI dev turning out software for clients much more cheaply than before? The agents are already faster than clients can make decisions about what they want.
Will we build speculatively, and then sell to people who believe in the software enough to GTM with it?
Will we build whole integrated development, deployment, and GTM platforms? Some of these already exist in somewhat disappointing fashion, but the tech is there to execute well on it now.
Will we just build products we believe in for ourselves? The cost of trying is very low.
I’m super interested in what’s next.
Tight feedback loops are critical DX for eXtreme Programmers, and they help coding agents immensely.
I use the same thing I learned way back at ThoughtWorkers University – a single command that comprehensively verifies that the build is green. I expose it as top-level DX, which for my current project is pnpm green. My agents call this frequently.
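For reference, “green” on a pnpm project can be one aggregated script. This fragment is a sketch of a typical setup, not my actual project’s configuration – the sub-commands are assumptions:

```json
{
  "scripts": {
    "lint": "eslint .",
    "typecheck": "tsc --noEmit",
    "test": "vitest run",
    "green": "pnpm lint && pnpm typecheck && pnpm test"
  }
}
```

Agents (and humans) then only ever need to remember one command: pnpm green.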
When you feel an urge to do manual testing, debug an error yourself, or anything else that isn’t automated, stop and instead improve “green” until you feel safe again.
Make green as fast as possible – coding agents can be way faster than humans, so waiting 30 seconds for an integration test is a huge delay relatively speaking. The test pyramid is back!
Good DX will get you very far, even before you get into workflow-land and start building quality-focused processes deeply into your coding agents.
I’ve been baking some favoured ways of working into my coding agent as project DX, OpenCode config, skills and rules for a little while. I can already work much faster this way than I could by myself, or with Antigravity. Reliability is high, output is good, and I’m working at a level of abstraction that I really enjoy. My mentality therein is something like ADDD https://lnkd.in/gsAizijA (thanks Obie Fernandez for the link).
I also have fine-grained control over planning and behaviour using test-trees as contracts. The codebase is well controlled, tested, and documented, benefitting from good adherence to my preferred practices.
Once people get to a stage they are happy with, I see them optimise by running Ralph loops overnight. Maybe it’s not for me. I don’t want to work asynchronously just because the agent is too slow or unreliable, I want the agent to be faster. I want to go as fast as I can think at my current level of abstraction, which I am really enjoying.
Parallelisation also doesn’t sound great – all the old points about cycle time over throughput still apply. I don’t want to architect for easier parallel dev (that’s a compromise we’ve made too much in the past already), or have to integrate a bunch of work trees, or increase my mental load with things that are lower priority anyway, or start new work before learning from the last piece of work.
For now I can think way faster than GPT 5.2 codex high can work (on this project, with my workflow), so I’m on the hunt for more tokens per second. I’m not the first – I’m waiting for faster coding plans to drop and reportedly they will sell out in minutes. When I can continue working this way at a few thousand tokens per second, it will be an absolute delight.
Once you’ve got it working, the easiest optimisation is probably going faster.
When you work with coding agents, organically teaching your colleagues gets replaced by systems thinking.
Coding models don’t learn (yet), but you can build the lessons you want to teach into the model’s developer experience, context engineering, workflow orchestration, instructions, and so on. All of that makes up your “agent”, and most of it is radically improvable.
I see people flooding projects with markdown files (sure, I have some too 😂), but it feels like a local optimum to me:
– There are better ways for coding agents to learn
– There are more impactful ways to improve your agent’s own DX
– There are even better ways to document your software
I’m happy we are speed-running the journey to good software engineering practices (it gets faster every time!), but that also means rehashing some of the old debates – debates that were already settled in my circles.
What’s happening to Tailwind is a leading indicator of the challenges that will face all SaaS companies (https://lnkd.in/gdBk4vfR)
Coding agents aren’t super helpful with operations yet, so SaaS businesses whose value is delivered before runtime – like a developer-facing library – are more vulnerable to replacement today.
As coding agents get better at operability and actual operations, and the rest of the business becomes as AI-aware as developers, even runtime SaaS margins will become indefensible.
I love the incredible learning opportunities pair programming creates. So it’s a little sad when I encounter a teachable moment, but my pair is a coding agent who can’t learn.
Working with a human engineer, such moments are very exciting. My pair would benefit from learning, I’d feel good about teaching them, and we’d all benefit from the human engineer being more capable going forward.
With current coding agents, there’s little point trying to teach lessons conversationally. Next session, that guy will still be a dummy! We don’t have our usual organic, lovely mechanism for teaching, learning, and growing together. Ouch.
What do we get instead? A free lesson for anybody who thinks leadership is all about performance management. Instead of making your agent smarter or more motivated – because you can’t – you step back and ask how you can make it more successful.
What mechanisms exist in your little agentic org for ensuring that strategy makes its way down to the fine details? What context does the agent need, exactly? How do new threads learn about the codebase quickly? How do you define, communicate and enforce rules? How do you improve the developer experience for your agent? How is work best scoped and defined for it?
When you take individual competence off the table – not because it doesn’t matter, but because you don’t control it – you are forced into systems thinking.
PS: If you want to talk about experiential learning for coding agents then DM me, I am very interested.
