004: LLM-enabled rewriting is the new refactoring
I write code for a living. I've also been following the development of LLMs, especially their coding abilities. It has become apparent that LLMs are now capable of coding non-trivial solutions, as evidenced by the rise of LLM autocomplete in IDEs like GitHub Copilot, as well as agentic AI developers like Devin. The barrier to writing new lines of code continues to shrink. With LLMs becoming more and more capable (especially powerful open-weight models like Llama 3 70B or DeepSeek-Coder-V2), it is inevitable that we see a new kind of developer workflow, one that leverages the cheap intelligence at scale that LLMs provide. I want to talk about one such workflow that may emerge: rewriting is the new refactoring.
Rewriting has long gotten a bad rap, as it can be an expensive, time-intensive process. Often, you have to halt feature development entirely while rewriting, all in the hope that the rewrite will be worth it. Rewrites are usually initiated when something is perceived as too difficult to achieve with the current codebase, whether that be performance, additional features, or something else. As such, rewrites carry high expectations from the start, promising to overshadow the previous codebase in one or more aspects, and for that same reason they are rarely undertaken: unless a rewrite will deliver sufficient value over the previous codebase, it is heavily discouraged. But LLMs change that.
LLMs will make rewrites common. When an LLM can produce thousands of lines of code and exercise them against countless test scenarios, what prevents a developer from rewriting on every new feature request? Well, there are a couple of things.
One is that the developer has to trust the code written by an LLM. It is a pretty common sentiment that code written by LLMs is not good code, and I agree. In addition, LLMs may introduce vulnerabilities, alter existing code they are not supposed to touch, and cause a whole slew of other problems. But as agentic systems and the models underpinning them become more widespread and reliable, these problems will be solved.
The other thing is that LLMs are a jagged intelligence. They can often code Twitter clones and Hacker News clones in one shot, because there are so many permutations of these tutorials online! There are plenty of Hacker News clones in every conceivable language. Does this mean LLMs are incapable of solving new and difficult problems? Right now, largely yes, but with the first taste of o1 and the newfound interest in scaling test-time compute with search, that may not stay true for long. Regardless, anyone who's used Claude Sonnet 3.5 (3.6 now?) knows that it is a great coding model, especially inside a system that provides it with relevant context, so it's not true that LLMs are only good for the basic, repetitive code seen online. At least, it feels like Claude can handle a non-trivial amount of complexity.
The last thing I want to share is that rewriting will not only become more common, but will also happen more frequently at smaller scales. Rewriting is often associated with rewriting from scratch. But with LLMs, version control, and some well-written tests, we can selectively rewrite entire sections of a codebase as we develop, reducing technical debt. I have experienced this personally while developing some of my own tools, rewriting entire sections of a tool within hours as my idea of what I want becomes clearer.
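To make that concrete, here is a minimal sketch of what a test-guarded rewrite can look like. Everything here is hypothetical, just for illustration: a small `slugify` helper whose implementation an LLM is free to regenerate wholesale, and a few characterization tests that pin down the behavior I actually care about.

```python
# A sketch of a test-guarded rewrite. The `slugify` helper below is a made-up
# example: the implementation is the part an LLM can regenerate wholesale,
# while the tests underneath are the contract that stays fixed.
import re

def slugify(raw: str) -> str:
    """Current implementation -- a rewrite replaces this body, not the tests."""
    slug = re.sub(r"[^a-z0-9]+", "-", raw.lower())
    return slug.strip("-")

# Characterization tests pin the behavior we care about. Run them (e.g. with
# pytest) after each regeneration; if they fail, version control makes it
# trivial to discard the attempt.
def test_known_inputs():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  leading and trailing  ") == "leading-and-trailing"
    assert slugify("multiple---dashes") == "multiple-dashes"
    assert slugify("") == ""

def test_idempotent():
    # A property worth pinning: slugifying twice changes nothing.
    assert slugify(slugify("Some Title Here")) == slugify("Some Title Here")
```

The loop is cheap: branch, ask the model for a fresh implementation, run the tests, and either merge or throw the branch away. The tests, not the old code, are what you commit to.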