AI in 2025: The terminal becomes cool again
Here's a look back at some of the biggest AI advancements in 2025 - along with our predictions for 2026.
2025 was quite a year. Looking back at where we started in January and where we find ourselves now, it's almost hard to believe it’s only been twelve months. From the way development teams ship code to how enterprises consume AI, 2025 moved at a tremendous pace (which was sometimes hard to keep up with!). Here's our take on the trends, tools and transformations that defined the year, and what they mean as we head into 2026.
Enter Claude Code
Perhaps no shift has been more profound for software teams than the emergence of agentic coding tools that live in the terminal. Claude Code led this charge, becoming generally available in May and reaching $1 billion in revenue within just six months. The tool doesn't try to replace your IDE – it enhances your workflow by leveraging your existing CLI tools, deployment scripts, MCP servers (more on that later!) and skills.
What makes these tools genuinely transformative isn't just code generation. It's the ability to hand off entire workflows – triaging issues, writing code, running tests and submitting pull requests – all from natural language commands. As one developer from the Claude Code team put it in The Pragmatic Engineer, the philosophy behind Claude Code is to let people "feel the model as raw as possible" rather than cluttering the experience with unnecessary scaffolding.
The competition hasn't sat idle. OpenAI followed with Codex CLI, Google launched Gemini CLI, AWS has Kiro CLI, Cursor has their own CLI… and so on.
Here's what we've observed at Instil: teams with strong engineering practices benefit most from these tools. Test-driven development has become more important than ever – writing tests first, then having the agent implement a first pass, produces remarkably reliable results. The teams struggling are those trying to use agents as a shortcut around good practices rather than as an accelerant for them.
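To make that concrete, here's a minimal sketch of the tests-first workflow in Python. The `billing.apply_discount` function is a made-up example – the point is that the humans write and own the specification, then the agent is asked to make the suite pass:

```python
# test_billing.py – written by a human before the agent gets involved.
# billing.apply_discount doesn't exist yet; the agent's job is to
# implement it until this suite passes.
import pytest

from billing import apply_discount


def test_discount_is_applied_to_total():
    assert apply_discount(total=100.0, percent=10) == pytest.approx(90.0)


def test_zero_percent_leaves_total_unchanged():
    assert apply_discount(total=100.0, percent=0) == pytest.approx(100.0)


def test_negative_discount_is_rejected():
    with pytest.raises(ValueError):
        apply_discount(total=100.0, percent=-5)
```

The same tests then become the first line of review for whatever the agent produces.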
Anthropic's own research backs this up. Surveying their engineers, they found that 27% of Claude-assisted work consists of tasks that simply wouldn't have been done otherwise – “papercut fixes”, exploratory work and nice-to-have tooling that previously fell off the priority list. Engineers reported using Claude in 60% of their work, with a 50% productivity boost, but critically most said they can only fully delegate 0–20% of their work. The rest requires active collaboration, verification and, most importantly, human review.
A glimpse of multi-agent orchestration
At GitHub Universe 2025 in October, GitHub announced Agent HQ and we're genuinely excited about where this is heading. The concept is simple: rather than forcing developers to choose between competing AI agents, GitHub is positioning itself as the orchestration layer that unites them all.
Agent HQ introduces Mission Control, a unified command centre accessible across GitHub, VS Code, mobile and the CLI. From here, you can assign work to multiple agents simultaneously and track progress across tasks. GitHub have stated that agents from Anthropic, OpenAI, Google and xAI will be integrated into the platform and made available as part of existing Copilot subscriptions.
The enterprise governance features are particularly interesting. Custom agents defined via prompt files at the organisation or repository level allow teams to encode their standards and create a new generation of tools capable of reviewing code for security issues, analysing test quality or even suggesting missing tests.
We see Agent HQ as the beginning of something bigger. Today, it's about managing coding agents. Tomorrow, it could be the standard interface for orchestrating any kind of AI-powered work across software engineering teams. The infrastructure being built now will matter enormously.
From vibe coding to AI-assisted engineering
Remember when Andrej Karpathy coined "vibe coding" back in February 2025? The term took off so quickly it became Collins Dictionary's Word of the Year. The original concept was playful - fully give in to the vibes, embrace experimentation and forget that the code even exists.
But as the year progressed, the industry learned some hard lessons. An August survey of 18 CTOs found 16 had experienced production disasters directly caused by AI-generated code. Whilst vibe coding may be great for demos and prototypes, AI-assisted engineering treats coding agents as a pairing partner – enhancing and augmenting an engineer’s skills rather than replacing them.
Small models, big impact
If there's one technical trend that excited us most this year, it's the rise of small language models (SLMs) for agentic applications. A paper from NVIDIA researchers made a compelling case that small language models are the future of agentic AI, as most agent tasks are "repetitive, scoped and non-conversational" – perfectly suited to specialised, efficient models rather than massive general-purpose ones.
Way back in May 2024 (which is a lifetime in the GenAI space...), Microsoft reported high success rates deploying their own Phi-3 model for internal use in data centre supply chain fulfilment, demonstrating a real-world application of SLMs. Epic Systems have also reportedly been leveraging Phi-3 in clinical environments where cloud-hosted models are not an option.
Earlier this year, Google's Gemma 3n arrived as one of the first multimodal on-device small language models, supporting text, image, video and audio inputs. Gemini Nano and Apple Intelligence models have been available on Google and Apple devices for a while, but we're now starting to see these models integrated more and more into core applications (e.g. web browsers and messaging), with their APIs opening up to developers.
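For developers who want to experiment, running a small model locally has become refreshingly simple. Here's a rough sketch using the Ollama Python client – the `gemma3:1b` model tag and the ticket-classification prompt are purely illustrative, and any small model you've pulled locally will do:

```python
# A minimal sketch of calling a small language model running entirely on
# the local machine, via the Ollama Python client. Assumes Ollama is
# installed and a small model has been pulled (the tag is illustrative).
import ollama

response = ollama.chat(
    model="gemma3:1b",
    messages=[
        {
            "role": "user",
            "content": "Classify this support ticket as bug, feature or question: "
                       "'The export button does nothing when I click it.'",
        }
    ],
)

print(response["message"]["content"])
```

Nothing leaves the machine, which is exactly the property that matters for the secure-environment use cases below.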
This is going to be a particularly interesting area to monitor in 2026. The potential is huge, particularly for agentic workflows in secure environments, for broadening access in developing economies and for sparking another wave of innovation.
The protocol that won
When Anthropic introduced the Model Context Protocol in November 2024, it was solving a genuine problem: how do AI models connect to external tools and data without building bespoke integrations for every system? A year later, MCP has become the de facto standard.
The adoption story is remarkable. OpenAI officially adopted MCP in March, integrating it across the ChatGPT desktop app, Agents SDK and Responses API. Google confirmed MCP support for Gemini in April. Microsoft released Playwright-MCP for browser automation and MCP connectors for the M365 Copilot ecosystem. The MCP Registry now lists nearly 2,000 servers.
What makes MCP powerful is its simplicity. Built on JSON-RPC 2.0, it provides a universal interface for managing tools, accessing knowledge and sharing prompts. Atlassian, Stripe, GitHub, Notion, Hugging Face and Postman all have official MCP servers. If you're building an AI application that needs to talk to external systems, MCP is a good option. OpenAI have even announced that their app ecosystem is built on top of MCP.
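To show how little ceremony is involved, here's a minimal MCP server built with the FastMCP helper from the official Python SDK. The tool itself is an illustrative stand-in for whatever internal system you'd actually want to expose:

```python
# A minimal MCP server exposing a single tool over stdio, using the
# official Python SDK. The tool body is a placeholder – a real server
# would call into your ticketing system's API.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-lookup")


@mcp.tool()
def get_ticket_status(ticket_id: str) -> str:
    """Return the current status of a support ticket."""
    return f"Ticket {ticket_id} is open and awaiting triage."


if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport used by most MCP clients
```

Any MCP-capable client – Claude Code, the ChatGPT desktop app, an IDE agent – can then discover and call get_ticket_status like any other tool.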
It’s worth noting that Google also released their Agent-to-Agent (A2A) protocol for cross-agent communication, and both protocols may evolve to complement each other. But for grounding models in private data and enterprise tools, MCP won 2025.
The curious case of Nano Banana
We can't write a 2025 review without mentioning Nano Banana – Google's gloriously named image editing model that launched in August and had people genuinely going bananas. Built on Gemini 2.5 Flash, it became the top-rated image editing model in the world, enabling transformations that maintain character consistency across edits.
In November, Google followed up with Nano Banana Pro, built on Gemini 3 Pro, adding studio-quality precision, multilingual text rendering and the ability to work with up to 14 different images while maintaining character consistency. Turn yourself into a figurine, blend photos seamlessly, try on different hairstyles – the viral potential was off the charts.
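For developers, the same model is exposed through the Gemini API. The sketch below uses the google-genai Python SDK; the model name and response handling reflect our reading of the current documentation and may differ slightly in your SDK version:

```python
# A rough sketch of an image edit via the google-genai Python SDK.
# Assumes GEMINI_API_KEY is set in the environment; the model name and
# prompt are illustrative.
from google import genai
from PIL import Image

client = genai.Client()

source = Image.open("portrait.jpg")
response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # Nano Banana's API-facing model name
    contents=[source, "Turn this person into a collectible figurine on a desk."],
)

# Image output comes back as inline binary data alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("figurine.png", "wb") as f:
            f.write(part.inline_data.data)
```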
Why mention it in a serious industry recap? Because Nano Banana drove meaningful adoption growth for Gemini. Some reports show that Gemini's monthly active users jumped 30% from August to November, with Nano Banana specifically cited as a driver. Nano Banana has also made its way into the likes of NotebookLM, which is now capable of producing detailed visuals in slide decks covering complex topics. Fun features matter.
When AI improves AI
DeepMind's AlphaEvolve, unveiled in May, represents something genuinely new – an AI system that can discover and optimise algorithms, including the algorithms used to train itself.
The practical impacts at Google are already substantial. AlphaEvolve discovered scheduling heuristics that continuously recover 0.7% of Google's worldwide compute resources. It proposed circuit modifications for an upcoming TPU that passed robust verification. Most remarkably, it found optimisations to matrix multiplication that accelerated Gemini's training by 1%.
This feels like a preview of where AI development is heading: systems that improve their own foundations. And whilst we were researching this article, Google released a preview of AlphaEvolve as a managed service available via GCP.
Looking forward
Predictions can be hit or miss, but here's what we're watching in 2026:
Agentic AI goes mainstream. The infrastructure is being built now – Agent HQ, MCP, skills, CLI agents – for AI systems that can take sustained action over hours or days. Expect to see production deployments where agents handle significant workflows with human oversight rather than human execution.
The "full-stack AI engineer" emerges. The skills gap is shifting. Engineers who can effectively collaborate with AI tools, knowing when to delegate and when to intervene will be increasingly valuable. The best performers won't be those who resist AI or those who blindly accept its output but those who've developed intuition for productive collaboration.
Edge AI becomes practical. Small language models running on-device – with all the privacy and latency benefits that implies – will move from demos to production. We expect to see voice assistants, industrial applications and consumer products that work without cloud connectivity.
Consolidation and standards. The current fragmentation (multiple protocols, competing agent platforms, incompatible tools) will likely consolidate. Anthropic’s decision to create open standards for MCP and skills has paid off massively. GitHub's Agent HQ approach suggests the orchestration layer is stabilising. The newly announced Agentic AI Foundation will hopefully also bring further standardisation across the big players.
Workforce questions intensify. McKinsey's recent survey found 32% of respondents expect workforce decreases of 3% or more in the coming year due to AI, while 13% expect increases. The reality is likely more nuanced with roles changing rather than disappearing, new skills becoming essential and old ones becoming automated. Companies that invest in reskilling and thoughtful transition will navigate this better than those that don't.
Final thoughts
2025 was the year AI went from impressive to essential. Not in a hype-cycle way, but in the practical sense of becoming embedded in daily workflows across industries. The tools work. The infrastructure exists.
At Instil, we've spent the year helping customers navigate this transition – building AI-powered applications, integrating agentic capabilities, making sense of rapidly evolving options. What we've learned is that the organisations succeeding aren't the ones chasing every new model release. They're the ones thinking carefully about which problems AI can actually solve, building robust practices around AI collaboration and investing in their people alongside their technology.
The technology will keep advancing. The real work is figuring out how to use it well, sustainably.