What It's Really Like Using an AI Coding Assistant

January 25, 2025

I’ve now spent a couple of months using Windsurf fairly regularly and I thought I’d share some of my experiences with it.

To set the scene, I’m a software engineer with over 20 years of experience. I have a degree in Computer Science and Math and spent the first half of my career in hands-on individual contributor roles across 7 industries and 4 countries. I would generally describe myself as a C# .NET backend dev but I’ve built my fair share of frontends and even some mobile apps. I’ve hopped back and forth between management and engineering roles over the last decade. I haven’t written code as part of my day job for about 5 years now. I still enjoy building software myself and have plenty of grand ideas that I’d love to bring to life, but sadly it just takes too long.

I’ve been using Windsurf to help me write R and Python scripts for my research. Besides that, I’ve been tinkering with agentic AI frameworks like CrewAI (more Python), building a couple of small web-based games (JavaScript) and experimenting with mobile apps (React Native). These experiences have taught me a lot about working with AI coding assistants.

Before I dive into my specific experiences, it’s worth noting that as with any new technology, it takes time to learn how to use it well. There’s simply nothing as effective as real hands-on experience and deliberate practice to build skills. Learning how to collaborate with an AI coding assistant is no different. In fact, initial insights from my Master’s research show that there is a strong correlation between frequency of use and developer productivity, with daily users reporting significantly higher productivity than infrequent users.

Figure: Productivity by Frequency of Use. Daily users report significantly higher productivity gains compared to occasional users.

With that in mind, let me share what I’ve discovered about these tools - both the parts that have delighted me and the parts that have occasionally frustrated me. My experience aligns well with what other developers have reported: there’s a lot to love, but also some important quirks to be aware of.

The Good Parts

It’s like a fabulously helpful pair programmer

Using Windsurf really does feel like you have an endlessly patient and supportive pair programmer by your side. Interactions feel natural, like you’re actually collaborating on a piece of work with another human being. It helps you think through problems, suggests improvements, and catches potential issues before they become bugs.

It’s great at automating away toil and repetitive tasks

Have you ever had to go through and tediously update heaps of files or strings just for consistency’s sake? Sometimes you can get away with a simple find + replace, but sometimes that just doesn’t cut it. Windsurf can pick up on these sorts of patterns and rip through them automatically, systematically updating every occurrence in a file or string. This can save you a lot of time!

I should mention that sometimes it doesn’t pick up on all the instances, so it’s a good idea to double check its work - but it certainly takes most of the grunt work away from you.

It can run scripts, see the results and react autonomously

When using Windsurf’s Cascade feature in ‘Write’ mode, it will not only create and edit source code, but it will also offer to run it for you. You can configure it to run certain commands automatically, but otherwise it will just pause and ask for your confirmation. If you let it run your script, it will see any output or logs that would’ve been piped out to the terminal. If there are any errors in that output, it will automatically attempt to fix them. If not, it may try to summarise what it saw and provide insights or suggestions.

You see what I mean when I say it’s like having a fantastic pair programmer by your side?
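
To make that run, observe, react cycle a bit more concrete, here is a rough Python sketch of the general idea. It is not how Cascade is implemented: `run_and_react` and the `propose_fix` callback are hypothetical names made up for illustration, and the "fix" is simply whatever function you pass in. The loop is capped at a few attempts so it can’t spin forever.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_and_react(
    source: str,
    propose_fix: Callable[[str, str], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[bool, str, str]:
    """Run `source` as a Python script and react to failures.

    `propose_fix` is a hypothetical stand-in for the assistant: it receives
    the current source and the captured output, and returns a revised source
    (or None to give up). Returns (succeeded, final_source, last_output).
    """
    output = ""
    for _ in range(max_attempts):
        # Run the script in a child process and capture everything that
        # would normally be printed to the terminal.
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
        )
        output = result.stdout + result.stderr
        if result.returncode == 0:
            return True, source, output
        # The run failed: hand the output back and ask for a revision.
        revised = propose_fix(source, output)
        if revised is None:
            break
        source = revised
    return False, source, output
```

Calling `run_and_react('pritn("hi")', lambda src, out: src.replace("pritn", "print"))` fails on the first run, applies the toy fix, and succeeds on the second.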

It’s great for prototyping and experimenting with unfamiliar languages

As I mentioned earlier, I’ve been using Windsurf to help me write R and Python scripts - languages that I’ve barely ever used before. In the past, I didn’t steer clear of trying new technologies, but it would just take me a long time to get anything done. Windsurf has made it so much easier to prototype and experiment with code, supporting you in a way that reduces cognitive load and gives you a sense of confidence.

However, this ease of use comes with an important caveat: you still need a solid foundation in programming concepts. These tools are incredibly powerful, but they’re more like highly capable assistants that help you implement your ideas, rather than complete replacements for programming knowledge. They can help you learn a new language’s syntax or framework’s API, but they can’t teach you fundamental programming principles from scratch.

This leads to an interesting challenge: while these tools make it easier to try new things, it’s hard to gauge how much you’re actually learning. I can write R and Python code much faster now, but I’m not sure I could write particularly good code in either language without continued AI assistance. It’s a bit like having training wheels that you’re not sure you should take off.

This is exactly what Addy Osmani describes in his article about the “70% problem”. The initial progress feels almost like magic - you’re writing code in languages you barely know! But then reality kicks in: without underlying experience and expertise, you may find yourself stuck at that 70% mark, unable to tackle the more complex challenges that require deeper understanding.

The Annoying Parts

It can get stuck in an endless loop of incorrect suggestions

Sometimes you’ll ask it to write some code and it won’t get it right. Then you’ll tell it that that didn’t work and it will try something else, which also doesn’t work. This back and forth can go on for a while, and often it can lead you right back to the original incorrect suggestion. As you can imagine, this can get quite frustrating!

Since Windsurf has this agentic nature, it can also end up in this sort of loop itself - recognising that it’s made a mistake and trying to fix it repeatedly. The biggest downside of this is that it can end up making a lot of unnecessary changes to your code which you then need to try and unpick.

My advice here is to recognise when it’s going down one of these rabbit holes and ask it to stop, undo all the changes it’s made, go back to the start of the task, take the time to think about the problem, use chain-of-thought reasoning, make a plan and start again. It can also help to tell it to make the smallest change possible. This is often enough to help it find a better solution more quickly.

It’s not great with transitive relationships in scripts

I’m not sure if it’s the way I’m structuring my R scripts or a limitation of the tool, but say I’ve got an R script (analysis.R) which loads another (setup.R), which in turn loads another (helper.R). I expect Windsurf to know that it can access and should use functions defined in helper.R from analysis.R, but it often doesn’t and instead creates new functions directly in analysis.R. I have to keep reminding it to reuse the functions defined in helper.R.
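
For illustration, the chain described above can be made explicit with a few lines of code, which gives you a list of scripts to mention in a reminder prompt. This is only a sketch in Python (not R), assuming the scripts load each other with literal `source("...")` calls. `transitive_sources` and the in-memory `files` dictionary are hypothetical names for this example, not anything Windsurf provides.

```python
import re
from typing import Dict, List

# Matches literal calls such as source("setup.R") or source('setup.R').
# Assumption: paths are plain string literals, not built dynamically.
SOURCE_CALL = re.compile(r"""\bsource\(\s*["']([^"']+)["']""")


def transitive_sources(entry: str, files: Dict[str, str]) -> List[str]:
    """Return every script reachable from `entry` via source() calls.

    `files` maps a script name to its text. Scripts are listed in the order
    they are discovered; the entry script itself is excluded.
    """
    seen: List[str] = []
    stack = [entry]
    while stack:
        name = stack.pop()
        if name in seen or name not in files:
            continue
        seen.append(name)
        # Push in reverse so scripts are visited in the order they are sourced.
        stack.extend(reversed(SOURCE_CALL.findall(files[name])))
    return seen[1:]


files = {
    "analysis.R": 'source("setup.R")\nresult <- clean(df)',
    "setup.R": 'library(dplyr)\nsource("helper.R")',
    "helper.R": "clean <- function(df) df",
}
print(transitive_sources("analysis.R", files))
```

With the three scripts from the example, the result lists `setup.R` followed by `helper.R`, which is exactly the chain the assistant tends to lose track of.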

The Surprising Parts

It makes mistakes, but knows it too

Like humans, it makes mistakes. However, every now and then you see evidence of the underlying agentic nature of Windsurf kicking in. It will make some changes, realise that it’s made a mistake, and fix it all by itself. This is probably one of the most impressive things I’ve seen it do.

It doesn’t question you and takes (most of) your suggestions seriously, even when it was right

I’ve noticed that it will generally trust you even when you’re wrong. For example, if it suggests some code and you ask whether that is the most efficient way to do it, even if you know that it is, it will second-guess itself and suggest a different, less efficient approach instead.

What I tend to do when I ask it a question like “is this really the best way to do this?” is add “challenge me if you disagree”. This seems to help it assess my question more freely, rather than just agreeing with me and finding another, potentially less ideal, approach because I dared ask.

The model you’re using really matters

Windsurf allows you to choose between three models: GPT-4o, Claude 3.5 Sonnet and Cascade Base. I’ve tried GPT-4o and Claude 3.5 Sonnet primarily, and I’ve got to say that Claude 3.5 Sonnet beats GPT-4o for the sort of coding tasks I’ve been using it for, hands down. Interestingly, I’m a huge fan of GPT-4o via ChatGPT for everyday tasks, but for coding, Claude 3.5 Sonnet is my go-to choice now.

If you want structured outputs, give it structured prompts

Think of using these tools like talking to another developer, perhaps someone early on in their career. If you give them vague requirements, chances are that what they’ll build won’t be quite what you were after. If you give them clear and concise specifications, there’s a better chance of getting what you want.

The same thing applies when using AI coding assistants. The more specific you can be, the more likely the outcome will match your expectations.
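
As a rough illustration of what "structured" could mean in practice, here is a small Python sketch that turns a goal, some constraints and some acceptance criteria into a consistently laid out prompt. The `TaskSpec` class and its field names are assumptions made up for this example. Nothing here is a format that Windsurf or any model requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSpec:
    """A hypothetical container for the parts of a clear request."""

    goal: str
    constraints: List[str] = field(default_factory=list)
    acceptance_criteria: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the spec as plain text, skipping any empty sections."""
        lines = [f"Goal: {self.goal}"]
        if self.constraints:
            lines.append("Constraints:")
            lines.extend(f"- {c}" for c in self.constraints)
        if self.acceptance_criteria:
            lines.append("Acceptance criteria:")
            lines.extend(f"- {a}" for a in self.acceptance_criteria)
        return "\n".join(lines)


spec = TaskSpec(
    goal="Add a median column to the summary table",
    constraints=["Reuse the functions defined in helper.R"],
    acceptance_criteria=["analysis.R runs without errors"],
)
print(spec.to_prompt())
```

The point is less the code than the habit: stating the goal, the constraints and how you’ll judge the result leaves much less room for guesswork.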

Keep changesets small and commit often - source control is your friend

Keeping changes small and focused is a fundamental best practice in software development, regardless of your tools. However, it becomes even more critical when using AI coding assistants because they can generate code so quickly that you’ll have hundreds of changes before you realise it. The larger the changeset, the harder it is to spot mistakes and the more time-consuming the review process becomes.

I’m not alone in noticing this challenge. The 2024 DORA Report found that AI adoption is actually negatively impacting software delivery performance, contrary to what many expected. Their research suggests that the dramatic increase in productivity and code generation speed is leading to larger changesets, which runs counter to DORA’s principle that small batch sizes are essential for stability.

Figure: 2024 DORA Report Finding. AI adoption appears to be negatively impacting software delivery throughput and stability.
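
One simple way to act on the "keep changesets small" advice is to check how big your uncommitted changes have become before asking the assistant for more. The Python sketch below parses the one-line summary printed by `git diff --shortstat`. The function names and the 200-line threshold are arbitrary assumptions for illustration, not a recommendation from the DORA report.

```python
import re


def changeset_size(shortstat: str) -> int:
    """Total lines inserted plus deleted, parsed from `git diff --shortstat`.

    Example input: " 3 files changed, 120 insertions(+), 45 deletions(-)".
    Either count may be missing from git's output, and the output is empty
    when there are no changes.
    """
    counts = re.findall(r"(\d+) (?:insertion|deletion)", shortstat)
    return sum(int(n) for n in counts)


def should_commit(shortstat: str, max_lines: int = 200) -> bool:
    """True once the pending changes reach `max_lines` (an arbitrary limit)."""
    return changeset_size(shortstat) >= max_lines


print(changeset_size(" 3 files changed, 120 insertions(+), 45 deletions(-)"))
```

You could feed it the real output with something like `subprocess.run(["git", "diff", "--shortstat"], capture_output=True, text=True).stdout` and treat a `True` result as a nudge to review and commit before continuing.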

So what does the future hold?

Research clearly shows that AI coding assistants boost developer productivity, but their impact goes far beyond simple efficiency gains. These tools are enabling a renaissance in personal software development. By handling routine tasks and reducing the cognitive load of working with unfamiliar technologies, they free us up to focus on what truly matters - creating innovative software that delights users and makes us proud.

Figure: My Windsurf Profile. Statistics from two months of AI-assisted development.

To be honest, it’s been a while since I spent so much time coding. Not for a lack of ideas - I’ve got plenty of those - but for lack of time. Building anything useful takes time, precious time. But with AI coding assistants, I feel supercharged. The barrier between having an idea and bringing it to life has never been lower, especially for experienced developers who can guide these tools effectively.

Yet this new era of software development brings its own unique challenges. How do we effectively provide AI assistants with enough context about our codebase? How should we store and version control the prompts or specifications we use to guide code generation? And what about the explosion in the number of repositories being created, now that the barrier to creating new projects is so much lower? These represent entirely new problems that we’ll need to solve as this technology matures, and I’ll be writing more about each of these areas as I uncover and learn more about them.

If you haven’t tried an AI coding assistant yet, I strongly encourage you to give it a go. Pick a tool that suits you best and get started. Start small, be patient as you learn how to work with them effectively, and most importantly - have fun exploring what’s possible. Remember that these tools are currently the worst they’ll ever be - just two years ago, what we now take for granted seemed almost impossible.

The future of software development is being reshaped right now, and it’s an incredibly exciting time to be part of it.