Some Thoughts on Copilot and AI for Software Development

I have been using GitHub Copilot for software development for the past year, my company is now starting to use the corporate version, and I’m even getting Microsoft’s training.  Using AI for software development is clearly the biggest change to happen to the industry since the internet, and it is important to continually evaluate how useful and effective we find new tools to be.  My results range from what a friend described as “autocomplete for my brain” to actively making my job harder, my work slower, and (if not checked) my code buggier.

The benefits of Copilot (I will be using “Copilot” and “AI for software development” more or less synonymously in this post, though not entirely…I assume you are an adult and can figure it out) are clear to many of us who have used it for any extended period of time.  I do not know if it reads my clipboard or is just good at guessing, but its ability to know that I want to add a specific variable to a string or a log message is uncanny.  When debugging a problem, I have even seen it know exactly what I want to log the moment I click on a specific line, before I do anything.  Then, once I verify the problem is what I believed it to be and click on a different line, it knows the correct fix.  I have really had some mind-blowing moments.

Another thing AI can be great for is refactoring.  A lot of refactoring isn’t “hard” so much as “long and tedious” where it is easy to miss something here or there and break functionality.  In my experience Copilot does a good job of helping refactor without missing things – something I have always struggled with when using say a VSCode extension for refactoring.  I fully admit my experience using extensions is limited as every time I have tried to use them, I get frustrated and just say “fuck it, I’ll do it myself!”

An additional area where I see a benefit using AI tools is when there is something I don’t quite know how to do.  For instance, when I want to unit test a piece of code, but I am not sure how to build the scaffolding needed for the unit test to function – literally the worst part of writing code – many times Copilot can help at least get me started down the right path.

This, however, leads us directly into my first major problem with using AI.  When we talk about unit testing, AI really wants to help you do it.  It can be great for quickly generating scaffolding.  The problem arises with what checks the tests actually perform.  When you ask AI to unit test code for you, it usually just tests that the code does what you programmed it to do.  This isn’t just unhelpful, it is actively harmful.  Much of the point of writing tests is to catch problems.  If the test simply checks that the code does what you programmed it to do, that is a bad test.  We need tests that catch the mistakes in what we programmed.  Otherwise we have not just a false sense of security but an actively bad sense of security.

Following that problem is the one I have with AI writing code for me.  When I write code, I write code to do something, and I know what I intended it to do because I wrote it to do that.  Many times AI wants to take the wheel and complete the logic for me.  Though this is nice in theory, reading and understanding code you didn’t write takes longer, and is harder, than just writing it yourself.  I have had experiences where Copilot has written 3-4 lines, or even an entire block of logic, and analyzing it to verify that it is doing what I want, and doing it correctly, actually takes longer than just writing it myself.  Basically it is like I have to review a PR – the other worst part of development.  And yet, we all look to StackOverflow…and some of us looked to help files and books before the internet existed.  Sometimes it can be useful in that it might show me a path I may have forgotten.  Still, many times its code is not quite correct, and even when it is correct, using it was harder and took longer than writing my own.  If it is incorrect…how many developers are actually taking the time to analyze this code properly?  Well, how many currently do it in PRs?  YMMV.

I have also found code reuse to be another issue with AI-generated code.  Though I don’t believe that every line has to be as DRY as possible, I do believe that code reuse is important.  Moving functionality that does the same thing to a single location helps prevent bugs because the “same” logic isn’t rewritten multiple times.  It also makes updating logic easier because it only exists in one place.  I have found that AI isn’t super helpful for this.  It doesn’t suggest I refactor out code from before because I am doing the same thing here.  It doesn’t suggest I build some kind of factory instead of generating the same objects over and over.  It doesn’t know or care that I’m doing this.

The last area where I have found AI to be a large problem is learning new things.  And yet, I do think it can help you learn new skills.  It can – possibly – be an amazing instructor.

I recently decided to learn some Rust for fun.  After about an hour of work I found I had to completely turn off Copilot.  Having autocomplete try to help me with everything when I know nothing just means I learn nothing.  I keep hitting autocomplete and nothing ever cements in my brain.  What can be helpful is when I get stuck, using the Copilot chat to try to help.  This way it never tries to help me until I ask it to. I can ask for help on how to do something new – help is never bad. Blindly listening is totally unhelpful.

In general I have found AI to be helpful in MY workflow – but I have reservations about other developers, most especially junior developers, using it.  I worry that people will not fully vet the code that it gives them.  I worry that people will generate tests that make them believe their code is safer than it is.  I worry that people won’t actually LEARN.  That said – did we not have the same concerns with things like StackOverflow?  I think we can all admit SO has become less useful over the last 5 to 10 years.  I rarely even care about results that are more than a year old.  Issue trackers, forums, and conferences are much better sources of information.  Perhaps AI is what is needed to replace legacy tools like SO.  What I do know for sure is that if my 8-year-old self had something like Copilot when I first started coding, I would have loved it, just as I loved the QBasic help files when I was that kid.