Would you use code in production that no developer has read and understood? Let’s say it passes all automated tests, security scanning tools, code quality tools, etc… but it was not only written entirely by AI, it was also reviewed entirely by AI.

Just pausing here to allow any readers having a strong reaction to rant and rave and pound the table.

Come on, Leaf! AI gets so much wrong! AI…

  • will disable or delete tests to get them to “pass”
  • may write code that “works” but is unmaintainable
  • could be writing insecure code or even put in some sort of backdoor
  • doesn’t understand the larger ecosystem in which it’s coding
  • might be making questionable ethical choices
  • could have some horrible error with disastrous consequences
  • will lie to you (“yeah, it can handle 2000 users”) or make stuff up (hallucination)

And if nobody has even read the code, then how will we fix it when something’s wrong at 2 am?

Something about this isn’t adding up for me, though.

These are not new problems

We’ve seen all of these problems before. Human developers do all of the above.

And for my first ever on-call shift as a developer, I got a call in the middle of the night about a production failure in an application I had never even used and didn’t realize I was supposed to support. So it’s possible for programmers to troubleshoot code they have never seen before.

Human code review can catch a lot of these problems, if it’s done well and not just a glance and a thumbs up. But even human review misses the mark sometimes.

[Image: a cat sitting in front of a computer monitor]

"Looks good to me." Photo by Volodymyr Dobrovolskyy on Unsplash

To what extent can a well-trained AI meet, or even exceed, the quality of human code review? I’m not saying that the tools you and I are using, in the way you and I are using them, are capable of human-equivalent review right now. I wouldn’t just tell AI “hey, review this code” and assume the results are great.

But is it possible that some developers currently have a sufficiently robust AI process that their AI code review is as good or better than human review?

And if AI can both write and review code, what’s the role of the human developer?

An AI development process to consider

Take a look at Bryan Finster’s post about Leading Agentic Teams. (“Agents” are autonomous AI programs that can do a lot of work without needing constant human intervention the way an AI chatbot usually does.)

Bryan is one of the voices behind MinimumCD.org, a guide to key practices for teams adopting Continuous Delivery. He is writing for an audience of developers, so I’ll translate his main points for the non-developers reading.

Here’s his plan:

  1. “Set the mission. Three to five sentences. What you’re building and why, not how.” - Instead of leaping into telling the agents what to code, he explains the larger goals.

  2. “Plan with the team. Have the agent generate a plan. Give it feedback. Agree on the approach.”

  3. “Define ‘done’ first. Gherkin features. Prune anything that doesn’t belong.” - Gherkin is a plain-English format for describing how software should behave, used to drive automated tests. He’s saying to write the tests first, checking for the features you want. (There’s a small example right after this list.)

    Of course all the tests will fail if you don’t have any code yet. That’s fine! The idea is that you then code until the tests all pass. This is a form of test-driven development (TDD).

  4. “Build focused specialists with an orchestrator. Review agents for tests, architecture, naming, domain patterns, coordinated by an agent with overall context.” - He’s suggesting setting up agents as specialists in various aspects of software development: a virtual team of experts. The “orchestrator” is another agent that acts as a manager, making sure the specialists are all working together. (There’s a sketch of this after the list, too.)

  5. “Let the team code. Define what the code should do and let them deliver it.” - “The team” here is his group of AI specialists.

  6. “Validate outcomes, not activity. If the tests pass and the architecture is clean, ship it.” - This is the controversial part. Step 6 isn’t “now review all that code yourself to make sure it’s right.” He’s training the AI to understand what “right” means and having the AI check everything for him. And then he’s trusting the AI.

  7. “Run your pipeline. Pre-commit checks, all agents, all analysis. Every time.” - He is reminding people to take advantage of the automated quality process they have built, and not skip it.
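
For the non-developers, here’s roughly what step 3 looks like in practice. This is a hypothetical example: the Gherkin scenario is plain English (in a real project, a tool like behave or pytest-bdd would connect it to running code), and the password-reset feature and the request_reset function are made up for illustration. The point is the order: the test comes first, then just enough code to make it pass.

```python
# A hypothetical Gherkin scenario. In a real project this would live in
# its own .feature file, and a tool such as behave or pytest-bdd would
# connect it to the code:
#
#   Feature: Password reset
#     Scenario: User requests a reset link
#       Given a registered user with email "ada@example.com"
#       When she requests a password reset
#       Then she receives a link to reset her password
#
# Test-driven development: the test is written FIRST. Run on its own,
# it fails, because request_reset() doesn't exist yet.
def test_reset_link_is_issued():
    link = request_reset("ada@example.com")
    assert link.startswith("https://")
    assert "token=" in link

# The implementation is written SECOND: just enough to make the test
# pass. (A real version would look up the user and send an email.)
import secrets

def request_reset(email: str) -> str:
    token = secrets.token_urlsafe(16)
    return f"https://example.com/reset?token={token}"

if __name__ == "__main__":
    test_reset_link_is_issued()
    print("Test passes.")
```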
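
And here’s a bare-bones sketch of step 4’s “specialists with an orchestrator” idea. To be clear, this is not Bryan’s actual setup, and real agent frameworks are far more involved; the reviewer prompts are illustrative and call_model() is a hypothetical stand-in for a real LLM call. It just shows the shape: several narrowly focused reviewers, fanned out and gathered up by one coordinating agent.

```python
# A bare-bones sketch of "focused specialists with an orchestrator".
# The prompts are illustrative, and call_model() is a hypothetical
# placeholder for a real LLM API call.

REVIEWERS = {
    "tests": "Review only the tests. Do they cover every agreed Gherkin feature?",
    "architecture": "Review only the structure. Does it fit the agreed design?",
    "naming": "Review only the names. Do they match the domain language?",
}

def call_model(instructions: str, code_change: str) -> str:
    # A real system would send these instructions plus the code change to
    # an LLM. This placeholder just echoes enough to show the data flow.
    return f"(pretend feedback, per instructions: {instructions!r})"

def orchestrate_review(code_change: str) -> dict[str, str]:
    # The orchestrator fans one code change out to every specialist and
    # gathers their focused feedback into a single report.
    return {role: call_model(prompt, code_change) for role, prompt in REVIEWERS.items()}

if __name__ == "__main__":
    for role, feedback in orchestrate_review("def add(a, b): return a + b").items():
        print(f"{role}: {feedback}")
```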

About step 6, he says:

After I build my automated quality process, I don’t look at the code anymore. I can’t. It’s dangerous… If I manually review every line, I become the constraint. Everything waits on me. That means batching up more work before each review, which means larger deliveries, which means more risk that we’re building the wrong thing even if we’re building it the right way. Manual code review recreates the same bottleneck that manual QA gates create: it feels like quality, but it slows feedback and increases batch size. If the tests prove the code does what it should, the architecture agents confirm the structure is clean, and the static analysis passes, what exactly am I looking for by reading the code?

He does suggest reviewing the code after it is in production, to continuously improve the automated quality checking system.

It’s a very different way of thinking than what many of us may be used to. It would be interesting to see if we could eventually build a level of trust in the automated quality checks.

Now, Bryan doesn’t say what he’s working on. Is he working on high-stakes production code, or something much less risky? We don’t know.

But the way he’s working sounds beneficial even if you still have a human review the code. Whether or not a human reviews every line, setting up an AI “team” the way Bryan describes sounds like an interesting way forward.

But what’s the role of the developer?

If AI is writing the code, and maybe even reviewing it, what do human developers still do?

As developers, we’ve sometimes tried to delegate anything that isn’t writing code to our colleagues.

Understanding the customer or the business? That’s for our product owner, designer, or business analyst. Testing? That’s QA. Architecture, security, accessibility, maybe even performance… can’t others handle those things? For many teams, support and maintenance are delegated to others, too. Even communicating we seem to want to leave to our scrum experts, project managers, and people leaders.

I get it. I’ve done it too. It’s as though we want others to perfectly express the software need so we can translate that into code exactly as it was described to us, and then we’re done.

The code, for many of us, is where the fun is, where the beauty is, where the craft is. As I said last week, it’s the magical incantation that makes the computer go.

But we humans aren’t just code-generating machines. We’re creators, responsible for making sure our creation meets the needs at hand.

In Bryan’s scenario, notice that he hasn’t simply replaced coding and reviewing code with copying and pasting someone else’s requirements document into a chat with an AI. He understands and explains the mission, helps the AI create a suitable plan, and provides clarity about quality. And his discipline in how code is delivered helps the work go more smoothly.

First, let’s expand and elevate our role. Our craft is changing. There’s a good post about it here: We mourn our craft, by Nolan Lawson, and a lovely response to it here: The five stages of losing our craft, by Andrew Murphy.

Elevating our role involves working with colleagues to understand and clarify what is needed and why, so that we can provide the mission to the AI team and correctly evaluate whether it has been accomplished. We still need to evaluate the plan the AI has made, to make sure we’re going in the right direction and our instructions were communicated clearly. Our role also includes defining what good looks like for non-functional requirements such as security, so we can make sure those requirements are met.

Or, from Andrew Murphy:

The engineers who are doing well right now… stopped identifying primarily as “people who write code.” They started identifying as people who solve problems and provide value to users. Code is one of the tools they use to do that.

They got comfortable directing the work instead of doing the work, without treating that as a demotion… And they found that their years of experience make them better at using AI tools, not worse, because they know what good looks like.

Second, let’s evaluate how code is delivered in our organizations. From the end of Bryan’s blog post:

If you have the discipline of continuous delivery, of test-driven development, of defining what you’re building before you build it, an agentic team gives you a dramatic improvement. If you don’t have that discipline, you’ll struggle. You’ll generate garbage faster and spend all your time cleaning it up.

Developers have the chance to lead the way with practices that have always been a good idea. The 2025 DORA State of AI-assisted Software Development Report shows that these practices are essential for successful AI adoption.

Developers, if you are already working with practices like test-driven development and continuous delivery, great! Continue to do that when working with AI.

If you aren’t already working this way, now may be a good time to look into these concepts and experiment with them on your team. Good places to start include the book Accelerate by Dr. Nicole Forsgren et al., the DORA Capabilities model, and MinimumCD.org.

Third, the role of the developer involves evaluating the tradeoffs that AI can’t assess without the context we have. For example, when is AI-written code good enough for your particular product at your particular organization? Or, when are you better off delivering something flawed to production so it can be continuously improved based on real feedback from real users?

Right now, someone is shaking their head to say: “No, Leaf, you don’t understand. Because of the nature of our business, we can’t deliver something flawed to production.” Sorry to be the bearer of bad news, but no matter the nature of your business, everything ever delivered to production is flawed. The question isn’t whether to ship flaws; it’s deciding which flaws matter most in your context, and we need human judgment to make that call.

Finally, developers will need more and better communication with colleagues. Management needs to understand how we’re working with AI and what tradeoffs we’re making. Non-developers using AI to solve problems will need help making their creations production-ready. And connecting with other developers becomes even more important, both for continued learning and for finding meaning in work that suddenly involves a lot more conversations with AI.

The human skills you’ve been taught to ignore in favor of more “technical” skills are the ones that will serve you best. Skills like collaboration, communication, handling ambiguity, creativity, discernment, and judgment were always helpful as you grew in your career as a developer. They’re only more essential now in the world of AI.

A quick note about Beyond Writing Code

It turns out that writing a polished blog post every week is time-consuming in a way that may be preventing me from writing other content. Like… the book proposal. Or the book.

And this has been a full week, so this piece is less polished than I’d like, not to mention a day and a half late.

I am cautiously blocking off most weekday afternoons for writing only. That might help. If not, then I might start posting every other week. We’ll see!

Drop me a note

I would love to hear from you. Hit reply and let me know what’s on your mind and how this week’s message landed with you.

I read every message and reply when I can!