Complexity is the ceiling: software design in the age of AI coding

AI has made writing code faster than ever. The harder work is understanding a system and changing it without breaking it. That has not gotten cheaper, and it now decides how much you can hand to a machine.

Introduction

In 1987, in an essay called “No Silver Bullet,” Fred Brooks predicted that no tool or technique would bring a tenfold gain in software productivity within a decade [1]. The decades since have largely proven him right, and the reason is that his argument never rested on the technology of its day. Brooks split the difficulty of building software into two kinds. Accidental complexity is the incidental effort our tools impose: syntax, boilerplate, plumbing. Essential complexity is what the problem itself demands: working out what the system must do, and designing a structure that holds up as it grows. Tools, he argued, only ever chip away at the accidental. The essential is left untouched, and the essential is most of the work.

AI coding assistants are the most effective attack on accidental complexity yet. They write a function or scaffold a whole test suite in seconds, and they have made the mechanical parts of programming cheaper than ever. That has encouraged a conclusion repeated often enough to sound obvious: code is cheap now, so the code itself barely matters. Describe what you want, let the model generate it, and when something breaks, change the description and regenerate.

The software educator Matt Pocock recently made a version of the counterargument in a conference talk, and it matches what I see in my own work [2]. I lead AI engineering at a legal-research company, where I build with these tools every day, and in a real codebase the “code is cheap” conclusion does not hold up. Writing code is cheap. Understanding it and changing it without breaking something else is not, and a model has to understand a codebase before it can safely modify one. The complexity of a system is therefore the ceiling on how much of it you can delegate to a machine. Rather than making software design optional, AI raises the cost of neglecting it.

The cost that didn’t go away

When people say code is cheap now, what they mean is that it is cheap to write. But writing was never where the expense lived. The expensive part of software is everything that comes after the first version works: making sense of it later, and changing it without breaking something you weren’t even looking at.

John Ousterhout gives this expense a precise name. Complexity, in his definition, is anything about the structure of a system that makes it hard to understand or modify [3]. It shows up as change amplification, where a small change forces edits in many places at once; as cognitive load, the sheer amount a developer must hold in mind to touch the code safely; and as unknown unknowns, where you cannot even tell which parts a change might affect. None of these has anything to do with typing speed. They are all about comprehension, and comprehension is exactly what generating code faster does not buy.
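
Change amplification is the easiest of the three to see in code. The sketch below is a minimal illustration, not drawn from any real system: the function names and the 21 percent rate are invented for the example. In the first pair of functions a single decision is copied as a bare literal, so changing it means finding every copy; in the second pair the decision lives in one place.

```python
# Illustrative only: the function names and the 0.21 rate are assumptions,
# not taken from any real system.

# Amplified: the rate is a bare literal repeated at each use, so changing it
# means finding and editing every copy.
def invoice_total_amplified(net: float) -> float:
    return round(net * 1.21, 2)


def net_from_gross_amplified(gross: float) -> float:
    return round(gross / 1.21, 2)


# Contained: the decision lives in one named place, and a rate change is a
# one-line edit.
VAT_RATE = 0.21


def invoice_total(net: float) -> float:
    return round(net * (1 + VAT_RATE), 2)


def net_from_gross(gross: float) -> float:
    return round(gross / (1 + VAT_RATE), 2)
```

Both versions return the same numbers today. The difference only shows up on the day the rate changes, which is why typing speed never captures it.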

AI moves this arithmetic in the wrong direction. A model produces far more code than a person, and far faster. That means more surface area to understand and more places a single change can reach, all of it competing for the same working memory. The comprehension burden also doubles, because now two parties have to understand the system: the model, which must grasp it well enough to change it correctly, and you, who must grasp both the system and the model’s changes well enough to trust them. “Code is cheap” is half true. It is the dangerous half.

AI is a tactical programmer

Ousterhout draws a sharp line between tactical and strategic programming [3]. The tactical programmer optimizes for getting the current task working and moves on. The strategic programmer spends extra effort keeping the structure of the system clean, so the next change is cheaper and safer. Tactical work is faster today and more expensive every day after.

A language model, left to its own defaults, is a relentlessly tactical programmer. It is trained and prompted to produce code that runs, not code a colleague will be glad to inherit. So it duplicates a block rather than factoring out the shared idea, adds another parameter instead of rethinking an interface, and reaches for a local fix that works in isolation and quietly worsens the whole. The Pragmatic Programmer calls this drift software entropy: each change made without regard for the design of the system nudges it further toward disorder [4].

This drift is starting to show up in the data. A 2026 study examined more than 300,000 AI-authored commits in over 6,000 public repositories, running static analysis before and after each change to measure what the model actually introduced [5]. More than fifteen percent of those commits added at least one new issue, and of all the issues found, nearly nine in ten were code smells: structural problems that compile and pass their tests but make the code harder to understand and change. The code works while the design quietly degrades. That is accidental complexity accruing one commit at a time, and it is exactly the cost a model optimizing for a passing result will not charge itself. Someone has to supply the strategic layer the model does not, and that someone is the engineer.

Deep modules are the control surface

If complexity is the problem, the most useful instrument Ousterhout offers against it is the deep module: a unit with a simple interface that hides a powerful implementation [3]. The idea predates the name. In 1972, David Parnas argued that a system should be divided not according to the steps of its computation but according to the decisions each part can hide from the rest, so that a change inside one module need not ripple out across the others [6]. Information hiding is the whole point, and depth is what makes it work.

That same depth turns out to decide how much you can safely hand to AI. A deep module hands you two things at once: a contract small enough to hold in your head, and an implementation you can delegate. You specify the interface, let the model fill in the body, and review what matters at the boundary: its contract, its invariants, its tests, and any risk-sensitive internals, without having to reconstruct every implementation detail. The module becomes a kind of gray box: you scrutinize its edges and the parts that carry real risk, and let the rest stay complex inside.
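
As a rough sketch of what such a boundary can look like, consider a small text index. Everything here is hypothetical: the class name `TextIndex`, its two methods, and the contract in the docstring are assumptions made for illustration. The point is the shape: a two-method interface you can hold in your head, with tokenization and the inverted index hidden behind it.

```python
import re
from collections import defaultdict


class TextIndex:
    """Hypothetical deep module: two public methods, everything else hidden.

    Contract: search(query) returns the sorted ids of the documents that
    contain every word of the query, ignoring case and punctuation.
    An empty query matches nothing.
    """

    def __init__(self) -> None:
        # Internal detail: maps each token to the ids of documents containing it.
        self._postings = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        for token in self._tokenize(text):
            self._postings[token].add(doc_id)

    def search(self, query: str) -> list[str]:
        tokens = self._tokenize(query)
        if not tokens:
            return []
        # .get avoids creating empty entries in the defaultdict on a miss.
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(hits)

    @staticmethod
    def _tokenize(text: str) -> list[str]:
        # Internal detail: lowercase runs of ASCII letters and digits.
        return re.findall(r"[a-z0-9]+", text.lower())


index = TextIndex()
index.add("a", "Breach of contract, damages.")
index.add("b", "Contract formation")
print(index.search("CONTRACT damages"))
```

A reviewer can check the docstring contract with a handful of calls like the one above and never read `_tokenize`. The body could be rewritten, by a person or a model, without any caller noticing, as long as the contract still holds.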

A shallow design takes that option away. When behavior is spread thin across many small modules with leaky interfaces, there is no boundary to verify against, and understanding any change means tracing it through all of them. That cost falls on you and on the model at the same time. In practice, an agent does its best work inside a well-bounded module, where the task is legible and the contract is clear, and its worst work in tangled code, where it cannot tell what depends on what and makes things subtly worse while appearing to help. The structure of the codebase, far more than the cleverness of the prompt, sets the size of the job you can safely give away.

Complexity is the ceiling

The pieces fit together. The cleaner a system is, the more an agent can do in it without supervision, and the better the feedback it gets while doing so, because strong types and tests at clean interfaces tell a model immediately when it has gone wrong. The Pragmatic Programmer’s rule holds for people and machines alike: the rate of feedback is your speed limit [4]. A messy system slows that feedback down while the model speeds the damage up.
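
To make the feedback point concrete, here is a minimal sketch of a contract check at an interface. The contract (remove duplicates, keep the first occurrence, preserve order), the helper `contract_violations`, and both implementations are hypothetical, invented for this example. The second implementation stands in for a plausible tactical shortcut: it returns the right elements but loses their order.

```python
from typing import Callable

# Hypothetical contract: drop duplicates, keep the first occurrence of each
# item, and preserve the original order.
Dedupe = Callable[[list[str]], list[str]]


def contract_violations(impl: Dedupe) -> list[str]:
    """Run impl against the contract cases; an empty result means it passed."""
    cases = [
        ([], []),
        (["a", "b", "a"], ["a", "b"]),
        (["b", "a", "b", "c"], ["b", "a", "c"]),
    ]
    failures = []
    for given, expected in cases:
        got = impl(list(given))  # pass a copy so impl cannot mutate the case
        if got != expected:
            failures.append(f"{given!r}: expected {expected!r}, got {got!r}")
    return failures


def dedupe_ordered(items: list[str]) -> list[str]:
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def dedupe_shortcut(items: list[str]) -> list[str]:
    # Plausible quick fix: correct elements, but the original order is lost.
    return sorted(set(items))


for impl in (dedupe_ordered, dedupe_shortcut):
    print(impl.__name__, contract_violations(impl))
```

The shortcut passes the first two cases and fails the third, and the failure message says exactly which input broke. That is the kind of immediate, specific signal a clean interface can give an agent, and a tangled one cannot.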

The evidence that this ceiling is real has started to arrive, and some of it is counterintuitive. In one early-2025 randomized controlled trial, METR had sixteen experienced open-source developers complete tasks in large, mature repositories they knew well, with and without AI assistance [7]. The developers expected the tools to speed them up by about a fifth; measured against the clock, the tools slowed them down by nearly as much. On a complex system that someone already understands deeply, the cost of steering and correcting the model outweighed the speed of its output. METR frames this as a snapshot of one moment, and its own later data is harder to read and may show more speedup [7]. The point is not that AI always slows people down, but that the complexity of the system governs whether it helps.

At industry scale, the finding has only sharpened. Google’s 2025 DORA report, drawn from developers now adopting AI at near-universal rates, frames the technology as an amplifier: it lifts throughput and performance where a team’s engineering foundations are strong, and magnifies instability (more change failures and more rework) where they are weak [8]. The teams that benefit are the ones whose systems and practices were already in good shape.

The risk turns sharpest when there is no boundary to check against and the engineer trusts the output anyway. A Stanford study found that developers given an AI assistant wrote less secure code than those without one, and, more troubling, were more confident their code was secure [9]. Output you have not verified is not finished, and confidence is not verification. None of this means AI fails to help. It means the help is bounded by the quality of the system it works inside, and by the engineer’s willingness to do the design and the review that the model cannot do for itself.

Invest in design every day

The conclusion is not that AI is overhyped, or that any of this is new. The skills that decide the outcome are the ones the field has been writing down for half a century, from Parnas in 1972 to Ousterhout today. What has changed is the price of ignoring them. When code was expensive to write, a tangled system mostly slowed people down. Now that code is cheap to generate, a tangled system caps the leverage of an unusually powerful tool, while a clean one compounds it.

That places the engineer’s real work where it has always sat, one level above the code itself. Kent Beck’s practice of incremental design, putting a little into the structure of the system continuously rather than saving it for occasional rewrites, is the right discipline for an era in which a machine produces the lines [10]. The model is a fast tactical programmer, and it needs someone thinking strategically above it. The teams that get the most out of AI will not be the ones that generate the most code. They will be the ones whose systems stay simple enough that a machine can move quickly through them without breaking them. Design has become the limiting reagent, and it is the part of the work that is still ours.

References

[1] Frederick P. Brooks Jr., “No Silver Bullet: Essence and Accidents of Software Engineering,” Computer 20(4), 1987, pp. 10-19.
[2] Matt Pocock, “It Ain’t Broke: Why Software Fundamentals Matter More Than Ever,” keynote at AI Engineer Europe, 2026. https://www.youtube.com/watch?v=v4F1gFy-hqg
[3] John Ousterhout, A Philosophy of Software Design, 2nd ed., Yaknyam Press, 2021.
[4] David Thomas and Andrew Hunt, The Pragmatic Programmer: Your Journey to Mastery, 20th Anniversary Edition, Addison-Wesley, 2019.
[5] “Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild,” 2026. arXiv:2603.28592
[6] David L. Parnas, “On the Criteria To Be Used in Decomposing Systems into Modules,” Communications of the ACM 15(12), 1972, pp. 1053-1058. https://dl.acm.org/doi/10.1145/361598.361623
[7] METR, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity,” 2025. arXiv:2507.09089. See also METR, “We Are Changing Our Developer Productivity Experiment Design,” 2026. https://metr.org/blog/2026-02-24-uplift-update/
[8] DORA / Google, “State of AI-Assisted Software Development,” 2025. https://cloud.google.com/resources/content/2025-dora-ai-assisted-software-development-report
[9] Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh, “Do Users Write More Insecure Code with AI Assistants?” ACM CCS 2023. arXiv:2211.03622
[10] Kent Beck and Cynthia Andres, Extreme Programming Explained: Embrace Change, 2nd ed., Addison-Wesley, 2004.

Story by Rilton Franzone

Rilton Franzone has built production software since he was sixteen, from AI research tooling at CatalyzeX to the a16z-backed fintech Clutch. He now leads AI engineering at midpage.ai, a legal-research company used by more than 10,000 litigators and 300 law firms in the United States.


Provided by Rilton Franzone
