The Next Evolutionary Step

May 9, 2025

The recent releases of Claude Code and OpenAI Codex caught my attention, but not for the reasons you might expect. I was puzzled by the enthusiasm they’ve generated. These are, after all, terminal tools. In an era where graphical IDEs dominate, with tools like Cursor, Windsurf, and VS Code evolving toward ever more seamless integrations, like ‘Design Mode’, why are we getting excited about terminal-based AI coding tools that feel like a step backwards?

This is what comes to mind when I think of a natural-language-based CLI - an old Commodore 64 text adventure game that I used to play as a kid.

A move back to the basics seems odd but intriguing, so I decided to give them a try and see if I could figure out what all the fuss was about.

Claude Code

Setting it up was a breeze. Just follow the instructions here and you’ll be up and running in no time.

You’ll need an Anthropic account with some credit on it before you can use Claude Code. I loaded mine with $5 USD, which was enough to build a couple of simple apps.

In a weird way, it feels like I’m playing Questprobe Featuring Spider-Man but instead of asking it to go north, I’m asking it to write me an app.

From the simple prompt, “Can you create a web-based version of the text adventure game called Questprobe Featuring Spider-Man from 1984 - https://www.squakenet.com/game/questprobe-featuring-spider-man/ ?”, Claude Code generated a basic but functional web-based version of Questprobe Featuring Spider-Man in less than a minute.

I really appreciated how it created a plan and then systematically ticked off each step, all from such a simple prompt. By default, it stops and asks you to confirm each significant step - but it also gives you the option to confirm and “not ask again this session”.

Well hello dear friend, vigilance decrement. I knew you’d show up sooner or later.

It’s far too enticing to select that option. After all, how much reviewing are you really doing in the terminal window? This reminds me of something I’ve noticed about code reviews throughout my career - they’re one of those activities that engineers know are valuable (especially for learning about the codebase and seeing how others solve problems), but rarely enjoy doing. In most teams, you’ll find engineers having to repeatedly ask - sometimes even beg - their colleagues to review their code. It’s not surprising really; we engineers love creating things, and reviewing someone else’s code feels about as far from creation as writing tests does. Both are essential for quality, but they don’t scratch that builder’s itch.

OpenAI Codex

Setting it up is straightforward - just follow the instructions here. As with Claude Code, you’ll need an OpenAI account with some credit before you can use it.

I asked Codex to analyse a really old side project of mine - an iPad game called Easter Egg Hunt written in Objective-C back in 2013 or so. The prompt was simple: “Can you describe the purpose of the app?”

[Figure: OpenAI Codex Easter Egg Hunt description]

Within seconds, it produced a remarkably accurate description of the app - from the three themed environments (Garden, Beach, or Snow) to the core gameplay of eggs periodically “popping” up in random locations. It even caught implementation details like the particle effects and chimes every five eggs, the in-app purchase system for unlocking scenes, and the use of Localytics for analytics. All this without ever seeing the app running or having any screenshots - just pure code comprehension.

Encouraged, I pushed a little further. “Draw some high-value diagrams using Mermaid that would help a new engineer understand this codebase.”

The results were genuinely impressive. Codex generated a series of clear, thoughtful visuals - from class relationship diagrams showing the core game structure to sequence diagrams illustrating the gameplay flow. These diagrams captured relationships that would take hours for a human to extract from unfamiliar code.

[Figure: OpenAI Codex Easter Egg Hunt class diagram]

[Figure: OpenAI Codex Easter Egg Hunt sequence diagram]

Think about onboarding - in minutes, a new engineer could understand a codebase that would normally take days to unravel.

While these CLI tools might seem like a step backward, I was starting to see glimpses of something more interesting emerging.

Beyond The Terminal Interface

For developers used to modern IDEs, returning to the terminal might feel like a blast from the past. But focusing on the interface misses something crucial.

These tools aren’t here to replace our IDEs - they represent the next natural evolutionary step in development. A step where AI can be embedded throughout our entire workflow.

Both OpenAI Codex and Claude Code can run in headless mode, meaning without an interactive session, which allows them to be integrated into pipelines, scheduled tasks, and automated testing suites. Each offers a non-interactive mode designed for exactly this kind of programmatic use.
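To make the idea of programmatic use a little more concrete, here is a minimal sketch of wrapping a one-shot, non-interactive call in Python. It assumes the `claude` executable is on your PATH and that `-p` (print mode) is the non-interactive flag in your installed version of Claude Code, so check the CLI help before relying on it. The function names, the `runner` parameter, and the timeout value are my own illustrative choices, not part of either tool.

```python
import subprocess
from typing import Callable


def build_headless_command(prompt: str, executable: str = "claude") -> list[str]:
    """Build the argument list for a one-shot, non-interactive run.

    Assumption: `-p` is the print (non-interactive) flag of the installed CLI.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return [executable, "-p", prompt]


def run_headless(
    prompt: str,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout_seconds: int = 300,
) -> str:
    """Run the prompt through the CLI and return its stdout as text.

    `runner` is injectable so the wrapper can be exercised in tests
    without the real CLI (or any credit) being available.
    """
    result = runner(
        build_headless_command(prompt),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        timeout=timeout_seconds,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `run_headless("Can you describe the purpose of the app?")` from a CI job would then return the answer as a plain string that later steps can store or post somewhere. Keeping the command construction separate from the execution makes it easy to swap in a different tool or flag if the CLI changes.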

This capability transforms these AI assistants from mere copilots into programmable units of intelligence that can be woven seamlessly into our larger systems. The implications of this shift are significant.

Imagine this:

After your build, test and deploy steps, get Claude Code to generate a series of diagrams for your codebase which it can then safely stash away in your documentation repository as a snapshot-in-time.
You know those pesky flaky tests that keep causing your pipeline to break but aren’t always easy to fix (even if you did have time to look at them)? Why not get OpenAI Codex to help you fix them automatically?
Either tool could perform a pre-post-mortem analysis - evaluating the combined impact of changes and assessing risks across all dimensions (operational incidents, security vulnerabilities, change reversibility) before they become real problems.
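The first of those ideas, the snapshot-in-time diagrams, could be sketched roughly like this. Treat it as an illustration under assumptions rather than a recipe: the `diagrams/<date>.md` layout, the function names, and the `generate` callable (any function that sends a prompt to the assistant and returns its text output) are all hypothetical. The prompt is the same one I used with Codex earlier.

```python
from datetime import date
from pathlib import Path
from typing import Callable

DIAGRAM_PROMPT = (
    "Draw some high-value diagrams using Mermaid that would help "
    "a new engineer understand this codebase."
)


def snapshot_path(docs_root: Path, day: date) -> Path:
    """Return a dated file path of the form <docs_root>/diagrams/YYYY-MM-DD.md."""
    return docs_root / "diagrams" / f"{day.isoformat()}.md"


def write_diagram_snapshot(
    generate: Callable[[str], str],
    docs_root: Path,
    day: date,
) -> Path:
    """Ask the assistant for diagrams and store them as a dated snapshot.

    `generate` is a hypothetical hook: in a pipeline it would wrap a
    headless call to the AI tool; in tests it can be a simple stub.
    """
    content = generate(DIAGRAM_PROMPT)
    if not content.strip():
        # Fail loudly rather than committing an empty snapshot.
        raise ValueError("assistant returned no diagram content")
    target = snapshot_path(docs_root, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```

A post-deploy step would call `write_diagram_snapshot` with the documentation repository's checkout as `docs_root` and then commit the new file. Passing the date in explicitly, instead of reading the clock inside the function, keeps the step reproducible and easy to test.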

What we’re seeing is just the beginning. Think about all the repetitive, time-consuming steps in our development lifecycle - both inner loop activities like coding and outer loop activities like reviews and documentation. These tools are opening up possibilities to automate not just the coding itself, but also those friction-heavy outer loop activities that often slow us down. As Abi Noda, CEO and cofounder of DX, recently said on a Gartner podcast episode that asked the question “Does Developer Experience Really Matter?”:

We tend to see more outer loop friction points than inner loop. So code reviews, CI, and release processes tend to surface as greater areas of friction than the inner loop.

We now have the tools to address these outer loop friction points, automating away the steps we know add value but don’t enjoy doing. While these tools are still relatively simple, they represent a natural evolution in the AI-driven development era - one that unlocks opportunities far beyond just writing code.

Final Thoughts

When I first encountered these terminal-based tools, I couldn’t help but wonder if we were taking a step backward. But that initial reaction missed something important: this isn’t about the interface - it’s about programmability.

Just as git’s command-line interface enabled countless automation workflows, these AI assistants can be invoked programmatically to work silently in the background - from reviewing code to generating documentation to managing releases.

This is how evolution often works - not in dramatic leaps, but in subtle shifts that fundamentally change how we work. Software engineering isn’t dead, but it is changing. The next step is yours to take: look at your development workflow, identify the friction points, and consider how to weave these capabilities into your pipeline.