LLMs Are Kryptonite for Legacy Code (But Don't Let Them Touch It)

09 Apr, 2025

There's a myth floating around that large language models (LLMs) like Sonnet or GPT are mainly good for greenfield, one-off projects. It's true they're fantastic for brainstorming new ideas, prototyping quickly, or scaffolding out fresh codebases. But what's surprising, and actually much more exciting, is how effectively you can deploy them against legacy code.

I've spent a fair bit of my life deep in legacy code, as a freelancer but also just out of curiosity. When ChatGPT came out, I was working on legacy WordPress/WooCommerce code, which included the joyous experience of building new admin views and migrating vast amounts of tangled data into a clean data lake using dbt. Along the way, I've developed some strategies to leverage LLMs as "software archeologists," making the task of untangling old code less painful.

Here's the practical workflow that works wonders:

Step 1: Let the LLM Document Everything

Today, we've got impressive agentic editors and powerful models available. Point them at your codebase, and they quickly map out the landscape: where functions are defined, what frameworks and tools are in play, and even which versions you're running. It's frankly remarkable. If you want to try it, just ask "what is this?" in an empty Cursor chat.

Better yet, you don't have to rely solely on the initial output. Chatting with a model allows you to dig deeper, ask pointed questions, and then synthesize that interaction into coherent, human-readable documentation with a clear purpose.

Step 2: Instrument and Log Everything

Logs are your best friends. If you're dealing with legacy systems, robust logging and instrumentation are mandatory. LLMs are great at "transforming language", so they can easily add missing log statements or rewrite messy ones into a structured format. Aim for consistency and coverage.
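As a minimal sketch of what consistent instrumentation could look like, here is a Python decorator that emits one JSON line per call. The logger name, the record fields, and the `legacy_discount` function are all made up for illustration; a real legacy system will need its own conventions.

```python
import functools
import json
import logging

# Hypothetical logger name; use whatever your system already has.
logger = logging.getLogger("legacy")


def logged(func):
    """Emit one JSON log line per call: function name, arguments, result or error."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {
            "function": func.__name__,
            "args": repr(args),
            "kwargs": repr(kwargs),
        }
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            # Log the failure in the same shape as a success, then re-raise
            # so the behavior of the wrapped function does not change.
            record["error"] = f"{type(exc).__name__}: {exc}"
            logger.error(json.dumps(record))
            raise
        record["result"] = repr(result)
        logger.info(json.dumps(record))
        return result

    return wrapper


@logged
def legacy_discount(price, percent):
    # Stand-in for a real legacy function.
    return price - price * percent / 100
```

The wrapper only observes: it returns the same result and re-raises the same exception, so the wrapped function behaves as before.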

Once you have structured logs, LLMs can digest them easily, turning raw data into something actionable. Because they are great at matching patterns in text, they can be used to correlate an error trace to the code, quickly pinpoint which function returned a problematic value, or just create scripts to reproduce the issue. For example, it is now quite easy to create tailored fuzzing scaffolds, something that used to take me a huge amount of effort.
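To make the "correlate an error to the code" part concrete, here is a small sketch that counts errors per function from JSON log lines. It assumes a made-up log format with `function` and `error` fields; the sample lines are invented, not taken from a real system.

```python
import json
from collections import Counter


def errors_by_function(log_lines):
    """Count logged errors per function name, most frequent first."""
    counts = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured yet
        if isinstance(entry, dict) and "error" in entry:
            counts[entry.get("function", "<unknown>")] += 1
    return counts.most_common()


# Invented sample data in the assumed format.
sample_logs = [
    '{"function": "apply_coupon", "error": "KeyError: \'code\'"}',
    '{"function": "sync_stock", "result": "ok"}',
    "PHP Notice: something unstructured",
    '{"function": "apply_coupon", "error": "KeyError: \'code\'"}',
    '{"function": "sync_stock", "error": "TimeoutError: upstream"}',
]

for name, count in errors_by_function(sample_logs):
    print(f"{name}: {count} error(s)")
```

A ranked list like this is a reasonable thing to hand to a model along with the relevant source files when asking it for a reproduction script.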

Step 3: Generate Accurate Mocks from Your Logs

Mocks built from real logs are usually closer to reality than the original specs or documentation. You can of course build them for improved testing, but it is also straightforward to tweak them intentionally to expose subtle edge cases and uncover behaviors that standard unit or integration tests might miss.
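Here is a rough sketch of a replay-style mock built from logged interactions, with a helper for the intentional tweaks. The log format (`method`, `path`, `status`, `body`), the `ReplayMock` class, and the sample data are assumptions for the sake of the example.

```python
import copy
import json


class ReplayMock:
    """Serve responses that were recorded in the logs."""

    def __init__(self, recorded_lines):
        self._responses = {}
        for line in recorded_lines:
            entry = json.loads(line)
            key = (entry["method"], entry["path"])
            # The last recorded response for an endpoint wins.
            self._responses[key] = {"status": entry["status"], "body": entry["body"]}

    def request(self, method, path):
        try:
            # Return a copy so tests cannot corrupt the recording.
            return copy.deepcopy(self._responses[(method, path)])
        except KeyError:
            raise LookupError(f"no recorded response for {method} {path}") from None

    def with_override(self, method, path, **body_changes):
        """Return a new mock whose recorded body has some fields replaced."""
        clone = copy.deepcopy(self)
        clone._responses[(method, path)]["body"].update(body_changes)
        return clone


# Invented sample recording in the assumed format.
recorded = [
    '{"method": "GET", "path": "/orders/1", "status": 200, '
    '"body": {"id": 1, "total": "19.90", "status": "completed"}}',
]

mock = ReplayMock(recorded)
# Edge case: what does the calling code do when the total is missing?
edge_case = mock.with_override("GET", "/orders/1", total=None)
```

The override returns a new mock instead of mutating the recording, so the realistic baseline and the edge-case variants can live side by side in one test suite.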

Step 4: Let Logs Inform Your API and UI Tools

Instead of reverse-engineering legacy APIs from spaghetti code, use recorded requests and responses as your baseline. These logged interactions are the ground truth of how your system actually behaves, and you can directly turn these logs into API clients, admin dashboards, or support UIs.
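One way to start, sketched below, is to summarize which fields and types each endpoint actually returned. The log format and sample lines are invented for illustration; the output is only a starting point for a typed client or a dashboard, not a full schema.

```python
import json
from collections import defaultdict


def infer_response_fields(recorded_lines):
    """Map each endpoint to the field names and value types seen in its responses."""
    fields = defaultdict(lambda: defaultdict(set))
    for line in recorded_lines:
        entry = json.loads(line)
        endpoint = f'{entry["method"]} {entry["path"]}'
        for name, value in entry["body"].items():
            fields[endpoint][name].add(type(value).__name__)
    # Convert to plain dicts with sorted type lists for stable output.
    return {
        endpoint: {name: sorted(types) for name, types in sorted(names.items())}
        for endpoint, names in fields.items()
    }


# Invented sample recording: note the inconsistent type of "total".
recorded = [
    '{"method": "GET", "path": "/orders", "body": {"id": 1, "total": "19.90"}}',
    '{"method": "GET", "path": "/orders", "body": {"id": 2, "total": 5.0, "note": null}}',
]

for endpoint, shape in infer_response_fields(recorded).items():
    print(endpoint, shape)
```

Fields that show up with more than one type, or only in some responses, are exactly the quirks a spec tends to omit and a new client has to handle.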

Step 5: Use Logs to Create Cleaner APIs and Database Views

You can use logged API interactions as a blueprint for designing newer, cleaner APIs. Similarly, database views built on top of old, dirty schemas can isolate your new code from the chaos below. This allows you to gradually clean up your data and APIs, while letting other systems keep using the old schema.
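A small sketch of the view idea, using an in-memory SQLite database so it runs anywhere. The `legacy_orders` table, its column names, and the `wc-` status prefix handling are made up for the example; a real legacy schema will have its own mess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical dirty legacy table: untrimmed emails, totals stored
    -- as text, inconsistent status spellings.
    CREATE TABLE legacy_orders (
        ID INTEGER PRIMARY KEY,
        cust_email TEXT,
        order_total TEXT,
        status TEXT
    );
    INSERT INTO legacy_orders VALUES
        (1, ' Alice@Example.com ', '19.90', 'wc-completed'),
        (2, 'bob@example.com', '5', 'COMPLETED'),
        (3, 'carol@example.com', '12.50', 'wc-cancelled');

    -- New code reads from the view; old systems keep using the table.
    CREATE VIEW orders_clean AS
    SELECT
        ID AS order_id,
        lower(trim(cust_email)) AS customer_email,
        CAST(order_total AS REAL) AS total,
        replace(lower(status), 'wc-', '') AS status
    FROM legacy_orders;
    """
)

rows = conn.execute(
    "SELECT order_id, customer_email, total, status "
    "FROM orders_clean ORDER BY order_id"
).fetchall()
for row in rows:
    print(row)
```

Because the view is just a query, the cleanup rules can be tightened over time without touching the underlying table.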

Step 6: Migrate and Clean Data with Log-Driven Scripts

When it's time for the big data migration, logs again come in handy. They're your reliable guide for creating migration scripts, ensuring that data cleanup is precise and reflective of your actual system state.
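One possible shape for this, sketched under assumptions: before migrating, check that the cleanup mapping covers every value the logs show the system actually producing. The log format, the `STATUS_MAP` contents, and the helper names are invented for illustration.

```python
import json

# Hypothetical cleanup mapping from legacy status values to clean ones.
STATUS_MAP = {
    "wc-completed": "completed",
    "COMPLETED": "completed",
    "wc-cancelled": "cancelled",
}


def observed_values(log_lines, field):
    """Collect every value of `field` seen in logged response bodies."""
    values = set()
    for line in log_lines:
        body = json.loads(line).get("body", {})
        if field in body:
            values.add(body[field])
    return values


def unmapped_values(log_lines, field, mapping):
    """Values that occur in the logs but have no entry in the mapping."""
    return sorted(observed_values(log_lines, field) - set(mapping))


def migrate_row(row, mapping):
    """Return a cleaned copy of the row; raises KeyError on unmapped values."""
    cleaned = dict(row)
    cleaned["status"] = mapping[row["status"]]
    return cleaned


# Invented sample log lines in the assumed format.
sample_logs = [
    '{"path": "/orders/1", "body": {"status": "wc-completed"}}',
    '{"path": "/orders/2", "body": {"status": "COMPLETED"}}',
    '{"path": "/orders/3", "body": {"status": "on-hold"}}',
]

missing = unmapped_values(sample_logs, "status", STATUS_MAP)
if missing:
    print("extend the mapping before migrating:", missing)
```

Failing loudly on an unmapped value is a deliberate choice here: a migration that silently guesses is harder to trust than one that stops and asks.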

Why Never Let LLMs Directly Touch Legacy Code?

Legacy code, especially the spaghetti variety, has hidden dependencies, misleading naming conventions, and subtle behaviors embedded deeply. LLMs are powerful pattern-matchers but can easily be tripped up by inconsistencies or seemingly trivial naming issues. Trust me: subtle bugs introduced by an overeager LLM can become nightmarish to debug.

The solution? Use LLMs strategically, as guides or analysts, not editors. Let them help you deeply understand the existing system until you replace it piece by piece with something clean, well-tested, and sane.

Bonus: The Joy of Software Archeology

One of the coolest roles I've discovered for LLMs is as software archeologists. Feed them terse git commit histories or messy design docs, and they'll piece together surprisingly coherent narratives around past architectural decisions or the evolution of certain features. It's a bit magical, honestly.

Final Thoughts

Leveraging LLMs in legacy code doesn't mean giving them the keys to your production system. It means using them to amplify your understanding, documentation, mocks, and ultimately your confidence.

So next time you're drowning in legacy chaos, remember: the kryptonite is already at hand—just don't let it touch your actual legacy code.

💻 🤖 ✨ Interested in coding with AI and LLMs? Follow me @ProgramWithAI.

🧠 ❤️ 🖥 Connect with me on Mastodon @mnl@hachyderm.io or check my writing on dev.to/wesen.
