The Inference Report

May 13, 2026
From the Wire

The real story today isn't what the AI companies are announcing. It's the infrastructure arms race and the collateral damage it's creating. Google is pitching Googlebooks and agentic Gemini; Anthropic is expanding into legal services; OpenAI launched Daybreak for cyber defense. These are product launches dressed as strategy. What actually matters is what's happening underneath: compute costs are driving decisions that reshape entire industries, and the winners are those who control the pipes, not the applications running through them.

The compute constraint is creating a two-tier market. Google and SpaceX are literally talking about putting data centers in orbit because ground-based costs have become a limiting factor for AI training and inference at scale. Meanwhile, on Earth, Grant County Public Utility District is seizing farmland through eminent domain to build transmission lines for data centers, and New Jersey towns are banning data centers altogether after chaotic public meetings. CME is launching futures contracts on GPU rental prices, turning compute capacity into a commodity that traders can hedge. This isn't sustainable growth; it's extraction. The companies that win won't be the ones with the best models. They'll be the ones that lock in compute supply first. Microsoft's renegotiated deal with OpenAI gives Microsoft first rights to new models while OpenAI gains the freedom to sell elsewhere, but that only matters if OpenAI can actually access the chips and power it needs. It can't. Neither can anyone else.
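The mechanics behind a GPU futures hedge are worth spelling out. A minimal sketch, assuming hypothetical prices and a buyer locking in a quarterly rental rate; the figures and the `hedge_pnl` helper are illustrative assumptions, not details of the CME product:

```python
# Hypothetical sketch of hedging GPU rental cost with a long futures position.
# All prices and quantities are illustrative assumptions, not contract terms.

def hedge_pnl(spot_at_expiry: float, futures_entry: float,
              gpu_hours: float) -> float:
    """Gain on a long futures position as spot rental rates rise."""
    return (spot_at_expiry - futures_entry) * gpu_hours

# A buyer who needs 10,000 GPU-hours next quarter locks in $2.50/hour today.
futures_entry = 2.50
gpu_hours = 10_000

# If spot rises to $3.00/hour, the futures gain offsets the higher bill.
spot = 3.00
rental_cost = spot * gpu_hours                     # 30,000 paid at spot
gain = hedge_pnl(spot, futures_entry, gpu_hours)   # 5,000 futures gain
net_cost = rental_cost - gain                      # 25,000, the locked-in rate
print(net_cost == futures_entry * gpu_hours)       # True: price risk removed
```

The point of the instrument is exactly this symmetry: whoever rents compute can fix their cost, and whoever sells it can fix their revenue, which is what turns capacity into a tradable commodity.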

The second-order effects are already visible. Amazon employees are "tokenmaxxing" on internal AI tools not because the work is valuable but because the metric exists and creates status. GitLab is warning customers that their monthly bills for AI-enabled developer tools will rise from hundreds of dollars per seat to thousands within a year, a structural shift driven entirely by the volume of compute that agents consume. A Microsoft benchmark called DELEGATE-52 found that 19 large language models are error-prone and unreliable at multi-step tasks outside of Python programming, yet the industry is shipping agents into healthcare, legal services, and cyber defense anyway. Medicare's new ACCESS payment model creates the first governmental mechanism to reimburse AI agents for patient monitoring between visits, but that reimbursement will only accelerate adoption of systems we know aren't trustworthy at complex reasoning. A teenager died after ChatGPT recommended a drug combination. Anthropic, meanwhile, is warning investors against secondary trading platforms, a sign the company is bracing for scrutiny. The infrastructure is being built faster than the systems using it are being tested, and the incentives are all wrong.
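The billing shift GitLab is describing is simple arithmetic once agents enter the loop. A back-of-envelope sketch, with token prices and volumes that are hypothetical assumptions chosen only to show the order-of-magnitude jump, not any vendor's actual rates:

```python
# Hypothetical back-of-envelope: why agentic tools multiply per-seat bills.
# The token price and volumes below are illustrative assumptions.

PRICE_PER_M_TOKENS = 10.0  # assumed blended $/million tokens


def monthly_cost(tokens_per_task: int, tasks_per_day: int,
                 workdays: int = 22) -> float:
    """Monthly per-seat spend at the assumed token price."""
    tokens = tokens_per_task * tasks_per_day * workdays
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS


# Chat-style completion: a few thousand tokens per request.
chat = monthly_cost(tokens_per_task=10_000, tasks_per_day=50)    # $110/seat

# Agent loop: each task fans out into dozens of model calls, each re-sending
# context, tool output, and intermediate reasoning.
agent = monthly_cost(tokens_per_task=500_000, tasks_per_day=50)  # $5,500/seat

print(f"chat ~ ${chat:,.0f}/seat, agent ~ ${agent:,.0f}/seat")
```

Nothing about the model got more expensive in this sketch; the bill scales with how many tokens the workflow pushes through it, which is exactly the "structural shift" vendors are flagging.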

Sloane Duvall