How Claude Changes My Code
If you ask how Claude Code + early 2026 frontier models change how I code, the answer is simple: I don't code at all anymore, Claude does. At most you can say I produce code, in the same sense that a company produces products. The shift is seismic, but so complete that there isn't really anything to analyze. Yet it has overshadowed what I think is a more interesting shift, one still in progress: what code I produce.
For example, I no longer produce glue code. Inference is the universal glue—I simply prompt Claude to manage the inputs and outputs of various programs itself. Sometimes it writes a script to do so, sometimes not, but at any rate those scripts are never committed, because it's more reliable (in case the programs change or my next use hits different edge cases) to have Claude repeat this process rather than committing a fixed script. If anything goes wrong, I (realistically, Claude) update the documentation and do it again.
I produce huge amounts of code that I never intend to push. In some cases this is a function of more rapid prototyping—what would previously have been an early prototype I would (hopefully) replace over the next few weeks becomes something I adjust beyond recognition over the course of a few hours. But other times it's a function of the radically different economics of post-LLM coding. One morning I built a React app, front- and back-end with a number of bells and whistles, just to iterate more quickly on a single red-teaming exercise.
My projects—even these ephemeral ones—are a lot more featureful than before. The long backlog of things I once categorized as "P3 - If Time Allows" and never got to, I can now queue up while Claude works on a larger piece, reasonably confident it will one-shot all of them. This also means I only write project-sized tickets, if at all—anything smaller is either worth doing immediately or will be overcome by events before I get to it.
Another huge shift is that much of the code I produce is prompts and orchestration for LLMs. This is easy to miss because it happened at the same time Claude started writing all of my code. But the implications are large. I don't think about this code in terms of a correct/incorrect binary but in terms of gradations of quality of output, percentage reliability, retries, and the number and types of fallbacks necessary.
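To make that way of thinking concrete, here is a minimal sketch of what "retries and fallbacks" might look like in orchestration code. Everything here is hypothetical—`call_with_fallbacks`, the `score` function, and the threshold are illustrative names I've made up, not an API from the post or any real library—but the shape is the point: output quality is a dial, and retries and fallback models are the knobs.

```python
def call_with_fallbacks(prompt, models, score, threshold=1.0, retries=2):
    """Try each model in order, up to `retries` times each.

    Returns the first output whose `score` clears `threshold`;
    if none does, returns the best-scoring output seen (or None).
    """
    best = None
    for model in models:
        for _ in range(retries):
            output = model(prompt)
            quality = score(output)
            if quality >= threshold:
                return output  # good enough; stop escalating
            if best is None or quality > best[0]:
                best = (quality, output)  # keep best-so-far as a fallback
    return best[1] if best is not None else None
```

In practice the `score` function might itself be an LLM judge, and the model list might escalate from a cheap model to an expensive one—which is exactly why correctness here is a distribution to manage rather than a binary to assert.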
A large part of this code consists of prompts, and I find that these prompts are the only "code" I ever manually edit. One reason for this is that, since an LLM is the most powerful primitive out there, these are usually the highest-leverage parts of the code. But another is that I think letting LLMs write these entirely introduces a kind of degeneracy. The models map my intent into a sub-region of the probability space, with certain biases and restrictions, and the more iterations of inference I put between my prompts and the output, the more pronounced this effect becomes.
The critical stylistic dichotomy between the pieces of code I produce is no longer whether they are human-written, but whether they are intended to be read by humans. By lines of code, the majority is not. When I do read it, the code looks very strange—tons of verbose, machine-stream-of-mind comments, tests no human would want or need, convenience features like list comprehensions eschewed in favor of more explicit constructions. I tend to leave this alone, trusting that the model has learned through RL to write in a way that makes it easiest to continue building on. When I do give it a second pass, I treat it like context compaction—giving the model some guidance on which paths we have decided to abandon and letting it remove the code, tests, and comments this renders irrelevant.
The parts that are meant to be read require significant rework to be made understandable. Tests must be reorganized so the functionally important ones can be distinguished from implementation details, and those that were useful only during incremental generation deleted. I have Claude rewrite the comments entirely, focusing on what a reader coming to the code for the first time needs to know rather than on its thoughts at the moment of writing.
PR descriptions—which I consider an extension of code—are where I focus most of my attention, both because they are the most likely to be read and because they are the fastest way to catch myself up with Claude's work. I will either have Claude rewrite its original description using similar guidelines as I use for comments and then do my own second pass, or I will completely replace it with an outline and have Claude fill in the details.
These edits and descriptions are analogous to the follow-up prompts I send in a Claude Code session—instead of steering a single rollout, I'm steering the meta-trajectory of all agentic rollouts that my team and I will generate on this codebase. And this work will probably follow the trend in within-session work, becoming less frequent and higher-level as models improve.