For most of 2025 and the first half of 2026, AI tokens behaved like a cheap, almost infinite resource. Tokens are the units that meter how much work an AI model does on a given request, and the basis for what AI vendors charge per request. Meta treated them that way too, building parts of its internal engineering culture around how many tokens its staff burned through each week. That worldview has now run into a wall: Google has informed Meta it can no longer keep up with demand and imposed usage limits on Meta's access to Google's Gemini AI models, according to a Financial Times report that cited three people familiar with the matter.
The restrictions have disrupted and delayed some of Meta's internal AI projects and remain in place, the FT's anonymous sources told the paper, as relayed by CNBC and The Next Web. Both Meta and Google declined to comment on the FT report. The cause, according to those sources, is straightforward: Google could not provision enough AI compute to satisfy Meta's appetite.
That shortfall is severe enough that even Google, one of the largest builders of AI infrastructure, has been forced to rent capacity from elsewhere. Earlier this month, Google agreed to pay SpaceX roughly $920 million per month to access data-center compute operated by xAI, the Elon Musk-founded AI lab that SpaceX acquired in 2025. A company that runs one of the world's largest networks of AI servers is now writing a nine-figure monthly check to a smaller rival owned by a rocket company to keep its own customers served.
The episode also exposes how fragile the industry's recent internal fads have become. Throughout early 2026, Gizmodo reported, some tech employers, including Meta, had begun evaluating engineers partly by how many large-language-model tokens they consumed, an internal metric that only makes sense in a world of effectively unlimited supply. With Google now rationing access, Meta has reportedly shifted its internal messaging: be more efficient with tokens, treat them as a finite resource rather than a performance lever.
The deeper story is not that two giants of consumer AI had a spat. It is that the AI race has moved past the point where model quality alone decides who wins. Compute, the raw data-center capacity needed to train and run large AI systems, has become the binding constraint, and the management practices that grew up around cheap tokens are the first things to break under rationing.
What to watch: whether Meta accelerates plans to diversify its AI suppliers or brings more of its own training capacity online, whether Google's reliance on rented SpaceX and xAI compute becomes a recurring line item rather than a one-off, and whether the rest of the industry quietly abandons token-burn as an engineer evaluation metric before someone has to publicly explain why it stopped working.