LLMs - A smarter way to modernise legacy code?
08 September 2025
Software leaders are staring at a long tail of legacy code: older languages, unsupported frameworks and test suites that no longer reflect how teams build. But what would it mean if migrations could be faster, safer and more cost-effective than ever before?

For years, the only option was an expensive, slow and highly manual migration. Large Language Models (LLMs) change that equation. When combined with disciplined engineering and automation, they can compress timelines, reduce risk and make migrations economically attractive without compromising code quality. At Instil, we’ve partnered with multiple customers to ship LLM‑assisted migrations in two broad categories:
Application code translation - for example, moving services from one language/runtime to another as part of a platform strategy.
Test modernisation - for example, upgrading from an outdated testing framework to a modern one to improve stability and confidence.
Why LLMs work for code migration
LLMs excel at pattern recognition and structured transformation when the task can be described with clear before/after intent. Migrations at scale are largely about consistency rather than invention. By feeding models the right context (domain conventions, examples, architectural constraints) and surrounding them with validation and automation, we can refactor thousands of files with a predictable level of quality and reserve scarce engineer time for the small minority of complex cases.
A recent public example illustrates the point. In their Medium article, Airbnb described how they migrated roughly 3.5k React component tests from Enzyme to React Testing Library with an LLM‑driven pipeline. The team combined a step‑based workflow, retry loops and rich prompts that pulled in related files and high‑quality examples. They completed the bulk of the work in weeks rather than the year they’d previously estimated, while preserving test intent and coverage.
The high‑level workflow
Every organisation and codebase is different, but effective workflows tend to follow a similar pattern.
Assessment and scoping
We start by mapping the migration surface area and the unit of work (e.g., a single test file, a controller, a service). From a shallow analysis and a sample migration, we classify files by complexity, identify blockers (framework gaps, API differences, flaky tests) and define success criteria: compile, pass tests, meet lint/type rules and preserve behaviour.
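To make that concrete, here is a minimal sketch of how a scoping pass might record each unit of work and the gates it must clear. The categories, fields and thresholds are illustrative assumptions, not a fixed model.

```typescript
// Hypothetical scoping model: each unit of work is classified and the gates it
// must pass are recorded up front.
type Complexity = "simple" | "moderate" | "complex";

interface MigrationUnit {
  path: string;          // e.g. a single test file, controller or service
  complexity: Complexity;
  blockers: string[];    // framework gaps, API differences, flaky tests, ...
}

interface SuccessCriteria {
  compiles: boolean;
  testsPass: boolean;
  lintAndTypeRulesPass: boolean;
  behaviourPreserved: boolean;
}

// A deliberately crude heuristic for illustration; a real pipeline would
// calibrate this against the sample migration.
function classify(linesOfCode: number, externalImports: number): Complexity {
  if (linesOfCode < 150 && externalImports <= 3) return "simple";
  if (linesOfCode < 500) return "moderate";
  return "complex";
}
```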
Design the target and the rules
LLMs are only as good as the target they aim at. We codify the destination: language constructs, library choices, architectural boundaries and testing style. We build concise migration guidelines and a small set of best practice examples that show desired before/after transformations. These are used in prompts and in automated validation.
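As an illustration, the destination rules and curated examples can be captured as plain data that both the prompts and the validation steps consume. The shape below is a hypothetical sketch rather than a prescribed schema.

```typescript
// Hypothetical structure for the migration guidelines and best-practice examples
// that are injected into prompts and reused by automated validation.
interface MigrationGuidelines {
  targetLanguage: string;        // e.g. "TypeScript 5.x"
  approvedLibraries: string[];   // libraries the generated code may use
  architecturalRules: string[];  // boundaries the output must respect
  testingStyle: string;          // e.g. "assert on behaviour, not implementation details"
}

interface BeforeAfterExample {
  description: string;  // what the example demonstrates
  before: string;       // snippet in the legacy style
  after: string;        // the desired target-style equivalent
}

const guidelines: MigrationGuidelines = {
  targetLanguage: "TypeScript 5.x",
  approvedLibraries: ["@testing-library/react"],
  architecturalRules: ["no direct data access from controllers"],
  testingStyle: "assert on observable behaviour, not implementation details",
};
```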
Build the automation pipeline
Rather than “chatting” with a model, we run a deterministic pipeline that treats each file like a job moving through stages:
Transform - invoke the LLM with the file, neighbouring context and examples.
Validate - compile, run linters/formatters, execute relevant tests and perform static analysis checks.
Repair - if validation fails, feed errors back to the model and retry within a bounded loop.
This state‑machine approach makes outcomes observable, tunable and highly parallelisable.
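A minimal sketch of that per-file loop is shown below; the transform, validate and repair functions are stubs standing in for the real model call and toolchain, and the retry budget is an illustrative assumption.

```typescript
// Illustrative per-file state machine: transform -> validate -> repair, with a
// bounded retry budget rather than an open-ended loop.
type Outcome = { status: "migrated" | "needs-human"; attempts: number };

async function migrateFile(
  source: string,
  context: string[],          // neighbouring files, examples, guidelines
  maxAttempts = 3,
): Promise<Outcome> {
  let candidate = await transform(source, context);        // LLM call
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const errors = await validate(candidate);              // compile, lint, tests, static analysis
    if (errors.length === 0) return { status: "migrated", attempts: attempt };
    candidate = await repair(candidate, errors, context);  // feed errors back to the model
  }
  return { status: "needs-human", attempts: maxAttempts }; // escalate instead of looping forever
}

// Stubs standing in for the real integrations.
declare function transform(source: string, context: string[]): Promise<string>;
declare function validate(candidate: string): Promise<string[]>;
declare function repair(candidate: string, errors: string[], context: string[]): Promise<string>;
```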
Provide rich, local context
For the long tail of tricky files, we expand context: related modules, sibling tests, common fixtures, and team‑specific patterns. The goal is to help the model preserve intent and idioms, not just syntax. We keep prompts simple but context‑dense and we curate which related files matter.
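One way to gather that context is to resolve a handful of well-known neighbours for each file. The lookup rules below are hypothetical and would be tuned to each codebase’s layout.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical context gathering: pull in the sibling test, shared fixtures and
// the local module entry point so the prompt carries the idioms the model
// should preserve, not just the file being migrated.
function gatherContext(filePath: string): string[] {
  const dir = path.dirname(filePath);
  const base = path.basename(filePath, path.extname(filePath));
  const candidates = [
    path.join(dir, `${base}.test.ts`),  // sibling test
    path.join(dir, "__fixtures__"),     // shared fixtures
    path.join(dir, "index.ts"),         // local module entry point
  ];
  return candidates.filter((candidate) => fs.existsSync(candidate));
}
```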
Human‑in‑the‑loop by design
Engineers stay in control. We stage output as pull requests, tag riskier diffs for review and route genuinely hard cases to humans early. Observability (dashboards, per‑stage success rates, cost/latency) lets teams focus their attention where it counts. The aim is to automate the common path and make human time surgical.
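As one example, riskier diffs can be flagged automatically so reviewers see them first. The signals below are illustrative assumptions rather than a fixed rule set.

```typescript
// Hypothetical review triage: score each migrated diff and surface the riskier
// ones to reviewers first.
interface MigratedDiff {
  path: string;
  linesChanged: number;
  touchesSharedModule: boolean;  // e.g. common fixtures or utilities
  repairAttempts: number;        // how many retries the pipeline needed
}

function needsCloseReview(diff: MigratedDiff): boolean {
  return diff.linesChanged > 200 || diff.touchesSharedModule || diff.repairAttempts >= 2;
}
```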
What good looks like
A well-managed workflow doesn’t chase perfect automation; it aims for high automation coverage with predictable fallbacks. In practice, we see:
A large first pass completes automatically and safely.
Focused tuning lifts coverage further with diminishing returns.
The remainder is finished quickly by engineers using LLM‑generated diffs as a baseline.
Crucially, quality gates (types, linters, tests, behaviour checks) are non‑negotiable. Velocity rises because engineers spend less time on repetitive work, not because we relax standards.
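A minimal sketch of those gates wired into the pipeline might look like the following, assuming a typical TypeScript toolchain; the exact commands will differ from project to project.

```typescript
import { execSync } from "node:child_process";

// Every migrated change must clear each gate; a failure routes the file back to
// the repair stage or to an engineer, never around the gate.
const gates = [
  "npx tsc --noEmit",  // type checks
  "npx eslint .",      // lint rules
  "npx jest --ci",     // tests
];

for (const gate of gates) {
  execSync(gate, { stdio: "inherit" });  // a non-zero exit throws and stops the run
}
```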
Typical use cases we deliver
Language/runtime moves (e.g., JavaScript → TypeScript) to align with platform strategy, reduce licensing or runtime costs, or unlock performance gains and wider hiring pools (a short illustration follows this list).
Framework upgrades (e.g., test frameworks) where API differences make hand‑migration prohibitively slow.
Test suite modernisation to improve developer experience and CI stability, shifting teams to practices that better match current frameworks.
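As a small, contrived illustration of the first category, here is a JavaScript function and the TypeScript shape it might be migrated to. The example is ours, not taken from a customer codebase.

```typescript
// Before (JavaScript):
//   function totalPrice(items) {
//     return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
//   }

// After (TypeScript): identical behaviour, but the contract is now explicit
// and checked by the compiler.
interface LineItem {
  price: number;
  quantity: number;
}

function totalPrice(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}
```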
Getting started
The best on‑ramp is a bounded pilot on a meaningful, representative slice of the codebase. That proves the pipeline, quantifies automation coverage and exposes real‑world blockers. From there, treat migration as a product: build a roadmap, publish metrics and expand.
If you’re considering a migration, Instil can help you design the pipeline, set the guardrails and deliver results at scale. The combination of LLMs and mature engineering practices lets you trade slow, risky rewrites for a controlled, observable workflow that actually accelerates the business.

Chris van Es
Head of Technology