Most engineering leaders cannot answer the one question their CFO is about to ask: “Can you prove this AI spend is changing outcomes, not just activity?”
Every December, roadmaps get locked, budgets get approved, and board decks are polished until everything looks precise and under control. Underneath, many CTOs and VPs are still working with partial visibility. They have a feel for their teams, but not a reliable view of how work moves through the system, how AI is really changing delivery, or where time and money actually go.
For a while, that was survivable. Experience, pattern recognition, and cheap capital covered the gaps. You could hire around bottlenecks, overstaff critical teams, or quietly pivot away from the messiest parts of the system. Then AI showed up and became the perfect distraction. Pilots, PoCs, Copilot seats, and “AI initiatives” created visible activity and bought time.
In 2026, that grace period ends. Boards and CFOs are shifting from “show me you are experimenting” to “show me measurable impact, this year.” Not because they stopped believing in AI, but because the market no longer rewards vague promises. Every AI dollar will need a traceable path to productivity, quality, or customer value.
The moment of exposure
If you run engineering, you probably recognise this scene. You present a slide with AI highlights. Adoption is up. Developers say they like the tools. You share a few anecdotes about faster coding and smoother reviews. Then the CFO asks a simple question: “Exactly how is this budget changing output and outcomes?”
Typical answers lean on:
- AI adoption numbers and licenses
- Time saved on coding tasks
- Roadmaps of what will be possible “once we fully roll this out.”
What is almost always missing is a clear breakdown of:
- Where AI is actually used across the SDLC
- How much capacity it actually frees in practice
- How that time is being redirected to customer-facing work, quality, or strategic initiatives
- Whether AI is improving system behaviour, not just individual speed
So the conversation slips back to learning curves, compounding benefits, and talent attraction. All true, but too soft for a tough budget review, and in 2026 it will not be enough.
Why 55% faster does not mean 55% more
AI vendors love task-level numbers. A coding task completed 55 percent faster looks impressive on a slide. But once you zoom out to teams and systems, the picture changes.
Large datasets across thousands of developers show a consistent pattern:
- Roughly half of team members say AI is improving team productivity by 10 percent or less, with a non-trivial share seeing no measurable improvement at all
- Only a minority reports the 25 to 50 percent gains that show up in case studies
Field experiments find that developers do complete more tasks with AI. Still, the gains are much smaller than the “55 percent faster” headlines suggest once you account for real-world complexity, debugging AI output, and integration work. And when you zoom out again to delivery metrics across teams, some organisations see throughput flatten or even dip as AI usage grows, because changesets get larger, integration risk increases, and coordination overhead rises.
The pattern is simple: task-level efficiency does not automatically become system-level productivity. Short bursts of time saved get chopped up by meetings, support work, and context switching. Developers need long, uninterrupted blocks for deep work, but most of their day is fragmented. Even if AI shaves 20-30 minutes off a task, that time easily dissolves into Slack, reviews, and incident pings instead of turning into meaningful new output.
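A back-of-the-envelope model makes the gap visible. The numbers below are illustrative assumptions, not benchmarks, but the shape of the result holds across a wide range of inputs:

```python
# Back-of-the-envelope model: headline task-level savings vs. what survives a
# fragmented day. All inputs are illustrative assumptions, not benchmarks.

minutes_saved_per_task = 25       # assume AI shaves ~20-30 min off a task
ai_assisted_tasks_per_week = 10   # assumed volume of AI-assisted tasks
conversion_rate = 0.3             # assumed share of saved minutes that lands
                                  # in blocks long enough for deep work

headline_hours = minutes_saved_per_task * ai_assisted_tasks_per_week / 60
realised_hours = headline_hours * conversion_rate

work_week_hours = 40
print(f"Headline saving: {headline_hours:.1f} h/week "
      f"({headline_hours / work_week_hours:.0%} of a {work_week_hours}-hour week)")
print(f"Realised saving: {realised_hours:.1f} h/week "
      f"({realised_hours / work_week_hours:.0%})")
```

With those assumptions, a headline saving of roughly 10 percent of the week shrinks to about 3 percent of actual new output.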
The problem is not the tools. It is the lack of a system for where the “extra” capacity goes.
The real productivity question for 2026
Most organisations still frame AI productivity in terms of speed: more story points, more tickets, higher deployment frequency. That misses the bigger question:
How much of our engineering capacity goes to net new value versus maintenance, incidents, and rework, and is AI improving that mix?
High-level benchmarks are blunt but helpful. On average, about 45 percent of developer time is spent on maintenance, minor enhancements, and bug fixes rather than on genuinely new, customer-facing work. If AI helps you produce more code inside an unchanged system, you risk:
- Shipping features faster with the same defect rate
- Adding new surface area while technical debt quietly compounds
- Making teams busier without creating the product or the business meaningfully better
That is how you end up with impressive local metrics and a leadership team that still feels like engineering is slowing down.
Two moves that turn AI from hype into compounding gains
If you want to walk into a 2026 budget conversation with objective evidence, you need to be deliberate about how AI-driven time savings are used. Two moves matter.
1. Reinvest micro savings into quality and future capacity, not just speed
AI is already good at boilerplate, tests, documentation, and simple refactors. The trap is treating the saved time as unstructured “extra” capacity that disappears into the noise. Instead:
- Reserve recurring time specifically for quality work: refactoring, test coverage, documentation, security improvements
- Keep a visible, prioritised list of high-interest technical debt and refactor targets
- Use AI to accelerate those tasks so that even 20-30 minute windows chip away at the backlog
When teams systematically reduce technical debt and improve tests around critical flows, they cut future incidents and rework. Over a year, that frees more capacity for new work than shaving a few minutes off each ticket ever will.
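To make that trade-off concrete, here is a deliberately simple model. Only the 45 percent maintenance share comes from the benchmark cited earlier; every other number is an assumption for illustration:

```python
# Illustrative model: a uniform speedup versus reinvesting saved time in
# reducing maintenance load. Only the ~45% maintenance share comes from the
# benchmark above; every other number is an assumption.

maintenance_share = 0.45            # maintenance, minor enhancements, bug fixes
new_value_share = 1 - maintenance_share

# Scenario A: AI makes every task ~10% faster but the system is unchanged.
# The mix stays the same; you just push more volume through it.
uniform_speedup = 0.10
print(f"A: new-value share stays at {new_value_share:.0%}; "
      f"volume up roughly {uniform_speedup:.0%} everywhere")

# Scenario B: the same saved time goes into debt hot spots and test coverage,
# and that (assumed) cuts next quarter's maintenance load by 15%.
maintenance_reduction = 0.15
reduced_maintenance = maintenance_share * (1 - maintenance_reduction)
print(f"B: new-value share rises to {1 - reduced_maintenance:.0%} of capacity")
```

The point is not the exact figures, but that only scenario B changes the mix the CFO cares about.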
2. Point AI at the ugly, high-friction work that commonly blows up roadmaps
The biggest productivity wins are not in everyday code generation. They are in:
- Framework or language migrations
- Large-scale legacy refactors
- Systematic security vulnerability remediation
- Architecture simplification and platform consolidation
These activities steal weeks or months of capacity and stall strategic initiatives. Using AI to understand legacy code faster, propose refactoring plans, generate migration scaffolding, and highlight recurring failure patterns can dramatically compress timelines for this work.
In parallel, there is real leverage upstream in the problem space. Teams that reach higher levels of AI adoption report better gains when they:
- Use AI to clarify requirements and user stories
- Summarise customer feedback and support tickets
- Explore alternative solution approaches earlier
That reduces wasted builds and focuses effort on changes customers actually care about. The most significant gains do not come from replacing human creativity, but from amplifying it and aiming it at better-defined problems.
You can be elite on DORA and still waste 45% of your capacity
DORA metrics are not the enemy. Deployment frequency, lead time, MTTR, and change failure rate remain among the best signals we have for delivery performance. The risk is mistaking them for the whole picture.
It is entirely possible to:
- Deploy many times per day
- Recover quickly from failures
- Maintain a low change failure rate
and still:
- Burn nearly half of engineering time on maintenance and bug fixes
- Ship features that do not move product or revenue metrics
- Exhaust teams with constant pressure and hidden after-hours clean-up
Leading organisations are already expanding their scorecard to include:
- Customer-facing changes shipped and adopted
- Time and cost by value stream or product
- Ratio of new work to maintenance, support, and incidents
- Developer experience signals, such as focus time and satisfaction
In 2026, the question in the boardroom will shift from “Are we elite on DORA?” to “How much of our capacity is going into things customers notice, and is AI improving that mix or not?” To answer that cleanly, DORA is necessary but not sufficient. You need a way to connect AI usage, workflow, quality, and business outcomes across the system.
Engineering intelligence as the new operating layer
This is where engineering intelligence platforms move from nice-to-have to mandatory. The organisations that win in 2026 will not get there with one more AI tool or one more disconnected dashboard. They will get there by pulling together data they already have but rarely use into one coherent view:
- Git and code review activity
- Issue trackers and planning tools
- Signals about AI usage across the SDLC
From there, leaders can answer the questions that actually matter:
- How is engineering time really allocated by product, initiative, and work type?
- What does “before and after” look like for teams that adopted AI heavily?
- Where does flow break: planning, development, review, testing, release, or operations?
- Which teams are stuck in reactive work, and which consistently deliver high-impact, customer-visible changes?
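A first pass at those answers does not require an exotic pipeline. As a minimal, illustrative sketch, assuming you can export completed issues with a work-type label and a cycle-time figure (the field names are hypothetical, not any specific tool's schema), the allocation question looks like this:

```python
# Illustrative sketch: a first-pass allocation view from an issue-tracker
# export. Field names (work_type, cycle_time_days) are assumptions, not any
# specific tool's schema.
from collections import defaultdict

issues = [
    # {"work_type": "feature", "cycle_time_days": 4.0},
    # ... one record per completed issue, exported from your tracker
]

time_by_type = defaultdict(float)
for issue in issues:
    time_by_type[issue["work_type"]] += issue["cycle_time_days"]

total = sum(time_by_type.values()) or 1.0
new_work = time_by_type.get("feature", 0.0)

print("Allocation by work type:")
for work_type, days in sorted(time_by_type.items(), key=lambda kv: -kv[1]):
    print(f"  {work_type:<12} {days / total:.0%}")
print(f"New work vs everything else: {new_work / total:.0%} "
      f"vs {(total - new_work) / total:.0%}")
```

Group the same records by product or initiative, or by teams with heavy AI usage, and you have the start of the before-and-after view.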
Instead of defending AI spend with anecdotes, you walk in with:
- A baseline of throughput, quality, and allocation before AI
- A clear trend line after AI, including areas where AI helped and where it created new friction
- Specific decisions you made as a result, such as redirecting licenses, changing processes, or rebalancing teams
That is the difference between “we believe in AI” and “here is how AI changed our delivery engine in measurable ways.”
A checklist to be ready for 2026
To be ready for the more complex questions coming next year, use this planning cycle to do four things.
- Measure your baseline – Track where time goes today: new features, maintenance, incidents, rework. Capture DORA metrics, as well as customer-facing changes and defect trends.
- Instrument AI adoption properly – Look beyond license counts. Track which teams actually use AI, for what kinds of work, and watch what happens to lead time, failures, and incidents in those areas (see the sketch after this list).
- Decide how you will reinvest AI time – Pick one or two big quality levers, such as refactoring hot spots or increasing tests around critical flows. Block time for them, and support teams in using AI to go faster on those specific tasks.
- Choose one flagship, high-friction initiative – Take a migration, refactor, or remediation effort that usually drags on and make it your test case for using AI plus engineering intelligence to compress time and reduce risk.
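For the second item, the comparison itself is simple once the data is tagged. A minimal sketch, assuming you can export merged changes with a lead time and an incident flag, and that you know each team's rollout date (all field names and dates are hypothetical):

```python
# Illustrative sketch: lead time and change failure rate before and after an
# AI rollout, per team. Record fields (team, merged_at, lead_time_days,
# caused_incident) and rollout dates are assumptions about your own exports.
from datetime import date
from statistics import mean

AI_ROLLOUT = {"payments": date(2025, 3, 1)}   # assumed per-team rollout dates

changes = [
    # {"team": "payments", "merged_at": date(2025, 5, 2),
    #  "lead_time_days": 3.5, "caused_incident": False},
    # ... one record per merged change
]

def summarise(records):
    if not records:
        return None
    return {
        "lead_time_days": round(mean(r["lead_time_days"] for r in records), 1),
        "change_failure_rate": round(mean(r["caused_incident"] for r in records), 2),
    }

for team, rollout in AI_ROLLOUT.items():
    team_changes = [c for c in changes if c["team"] == team]
    before = summarise([c for c in team_changes if c["merged_at"] < rollout])
    after = summarise([c for c in team_changes if c["merged_at"] >= rollout])
    print(team, "before:", before, "after:", after)
```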
Do this, and you will not just have “AI activity” to show in 2026. You will have a credible, data-backed story from AI spend to business outcomes.
Who thrives in engineering leadership in 2026
The leaders who thrive next year will not be the ones with the flashiest AI demos or the loudest “AI strategy” slide. They will be the ones who:
- Know where they sit on the AI adoption curve, beyond anecdotes
- Have honest visibility into how their engineering system behaves, not just how busy it looks
- Use AI to fix fundamentals like ownership, workflows, and quality
- Answer hard questions with numbers instead of narratives
Engineering intelligence platforms are a key part of that shift. They give you the hard data to show where time and money go, how AI is really changing delivery, and whether your current pace is sustainable. The shift to data-backed engineering leadership is happening either way.
The gap in 2026 will be between teams still guessing and teams that can prove, in detail, how their engineering organisation works.