Copilot has entered the usage-based era. What can engineers do to reduce costs?

1 June 2026

So, another one bites the dust - GitHub is the latest frontier model provider to announce a major shift in pricing. The days of all-you-can-eat consumption are over, and engineers now need to pay attention to basic unit economics and model inference costs.


As of today, 1 June, GitHub Copilot moves to usage-based billing - announced just five days after we argued that the era of unlimited developer AI spend was ending. The timing rather made the point for us. Subscription prices haven't changed, but the included allowance is now metered against token usage, and across a busy engineering team it no longer goes very far.

What's changed

The headline figures are reassuring and a little misleading. Copilot Business holds at $19 per user a month and Enterprise at $39, with the individual Pro tiers unchanged too. What's gone is the premium request unit. In its place, each seat now carries a monthly allowance of GitHub AI Credits worth the same as the subscription ($19 of credits per Business user, $39 per Enterprise user). That allowance is metered by token consumption: input, output and cached tokens all count, each charged at the model's published API rate.
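To make the metering concrete, here is a minimal sketch of how a single request might draw down a Business seat's allowance. The per-million-token rates are made-up placeholders rather than any provider's published prices, and the `Rates` and `request_cost` names are hypothetical; only the $19 allowance comes from the announcement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices in US dollars per million tokens."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


def request_cost(rates: Rates, input_tokens: int, output_tokens: int,
                 cached_tokens: int) -> float:
    """Dollar cost of one request when all three token types are metered."""
    return (
        input_tokens * rates.input_per_m
        + output_tokens * rates.output_per_m
        + cached_tokens * rates.cached_per_m
    ) / 1_000_000


# Placeholder rates for illustration only; substitute the model's real rates.
EXAMPLE_RATES = Rates(input_per_m=3.0, output_per_m=15.0, cached_per_m=0.3)

# Monthly credits per Copilot Business seat, in dollars.
BUSINESS_ALLOWANCE = 19.0

cost = request_cost(EXAMPLE_RATES, input_tokens=20_000,
                    output_tokens=2_000, cached_tokens=100_000)
remaining = BUSINESS_ALLOWANCE - cost
print(f"request cost: ${cost:.2f}, allowance left: ${remaining:.2f}")
```

At these assumed rates the request costs about 12 cents. The point is less the number than the shape: every token type now has a price, so the size of the context you send matters as much as the answer you get back.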

Code completions and Next Edit suggestions stay free and don't touch the allowance, so everyday autocomplete is unaffected. But that credit pool will not stretch far across a team leaning on the agentic features, and a second meter now runs alongside it - Copilot code review and remote sessions both consume GitHub Actions runner minutes today, billed at the standard rate.

Business and Enterprise customers do get a three-month grace period of enhanced included usage through August, and credits pool across the organisation rather than being stranded per seat. But once that runway ends, the variable line on the bill is real.

Why it changed

The old pricing model was generous to the point of being mis-priced. Copilot charged one premium request per turn, where a turn covers every inference call made until the agent hands control back to the user. In simple use cases, that is one model call. In agentic use cases, a single turn can fan out into dozens of model calls, tool invocations, file reads and edits before it comes back to you - and the whole lot cost exactly one premium request.

That was a wonderful deal while agents were a novelty and a structural problem once they became the default way teams work. Token-based billing realigns the price with what the agent actually consumes, which is why the teams leaning hardest on agents are the ones who will feel it the most.
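A back-of-the-envelope sketch shows the size of the gap. Under the old model, both turns below would have cost one premium request each. Under token metering, the agentic turn pays for every call, and each call re-sends a context that keeps growing. The rates, token counts and the `turn_cost` helper are all illustrative assumptions, and the sketch ignores cheaper cached-token pricing, which would soften the result somewhat.

```python
# Hypothetical rates in US dollars per million tokens; not real published prices.
INPUT_RATE = 3.0
OUTPUT_RATE = 15.0


def turn_cost(model_calls: int, base_context: int, growth_per_call: int,
              output_per_call: int) -> float:
    """Dollar cost of one turn when every model call inside it is metered.

    Each call re-sends the context so far, which grows by `growth_per_call`
    tokens as tool results and file contents accumulate.
    """
    total = 0.0
    for call in range(model_calls):
        input_tokens = base_context + call * growth_per_call
        total += (input_tokens * INPUT_RATE
                  + output_per_call * OUTPUT_RATE) / 1_000_000
    return total


# A simple chat turn: one model call.
simple = turn_cost(model_calls=1, base_context=5_000,
                   growth_per_call=0, output_per_call=500)

# An agentic turn: 30 model calls with the context growing on each one.
agentic = turn_cost(model_calls=30, base_context=5_000,
                    growth_per_call=2_000, output_per_call=500)

print(f"simple turn:  ${simple:.4f}")
print(f"agentic turn: ${agentic:.4f} ({agentic / simple:.0f}x the simple turn)")
```

With these assumptions the simple turn comes to a little over two cents and the agentic turn to a little over three dollars, more than a hundred times as much for what used to be the same single unit of billing.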

What it means

As we argued a few weeks ago, intelligent routing across models is where things are ultimately headed. But this future hasn't shipped yet. In the meantime, the practical response sits closer to hand:

  • Manage the context you send.
  • Use the cost levers your harness already ships with.
  • Put an enforceable budget between your team and the providers.
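As a taste of the third point, here is a minimal sketch of a budget gate that sits in front of any model call. It assumes you can estimate a request's cost before sending it; `BudgetGuard`, `BudgetExceeded` and the ten-seat pool are hypothetical names and numbers for illustration, not part of Copilot or any provider's API.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the monthly limit."""


class BudgetGuard:
    """Tracks spend against a pooled monthly limit and refuses overruns."""

    def __init__(self, monthly_limit_usd: float) -> None:
        self.monthly_limit_usd = monthly_limit_usd
        self.spent_usd = 0.0

    def authorise(self, estimated_cost_usd: float) -> None:
        """Record the spend, or raise if it would breach the limit."""
        if self.spent_usd + estimated_cost_usd > self.monthly_limit_usd:
            raise BudgetExceeded(
                f"would spend ${self.spent_usd + estimated_cost_usd:.2f} "
                f"against a limit of ${self.monthly_limit_usd:.2f}"
            )
        self.spent_usd += estimated_cost_usd


# Assumed example: ten Business seats pooling $19 of credits each.
guard = BudgetGuard(monthly_limit_usd=10 * 19.0)

guard.authorise(120.0)  # accepted
guard.authorise(60.0)   # accepted, $180 spent so far

try:
    guard.authorise(25.0)  # would take spend to $205, over the $190 pool
    blocked = False
except BudgetExceeded as exc:
    blocked = True
    print(f"blocked: {exc}")
```

The key property is that the check happens before the spend rather than on next month's invoice. A real deployment would also need persistence, per-team limits and a monthly reset.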

I will cover these in more detail in my next article on how to tame your token bill.

If you are staring at a developer AI bill that is climbing faster than you can explain to a board, that is precisely the conversation we are having with CTOs across financial services right now. We are happy to have it with you too.

Chris van Es

Head of AI

AI