The labs are converging on agents as the next distribution layer for AI, and the infrastructure conversation is shifting from model capability to token economics. OpenAI is shipping developer tooling (sandbox execution, SDK hardening, and educational content on prompting) that assumes agents will be the primary interface for building. Google DeepMind is adding granular audio control to Gemini 3.1 Flash TTS, moving speech from generic to directed. Hugging Face is publishing work on agent failure modes and releasing HoloTab as a browser companion, positioning itself as the platform for agent analysis and deployment. GitHub is embedding agents into the developer workflow via Copilot CLI.

The infrastructure story is equally revealing. NVIDIA is reframing the data center as a token factory and arguing that cost per token is the only metric that matters, a clean way to commoditize inference and lock in margin on volume. IBM is monetizing risk by packaging the agentic attack surface as a new cybersecurity assessment. Anthropic is hiring for automated alignment research while appointing a pharma executive to its long-term benefit trust, signaling institutional maturity but also that the company's governance structure is calcifying before product-market fit is proven. The pattern across all of this: infrastructure vendors are preparing for a world where agents consume tokens at scale, while application layers compete on whose sandbox, whose harness, and whose reasoning framework captures developer attention first. The real estate grab is happening now.
Sloane Duvall
A curated reference of models from major AI labs, with open/closed-weight status, input modalities, and context window size. American labs tend toward closed-weight models, while Chinese labs tend toward open-weight models.
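A reference like this is naturally represented as structured records that can be filtered by weight status or modality. The sketch below is an assumption about the schema: the field names, the `ModelEntry` type, and the example entry are illustrative, not drawn from the underlying table.

```python
from dataclasses import dataclass, field

@dataclass
class ModelEntry:
    """One row of the model reference. Schema is an illustrative assumption."""
    name: str                      # model identifier
    lab: str                       # publishing lab
    open_weights: bool             # True = open-weight, False = closed-weight
    modalities: list[str] = field(default_factory=lambda: ["text"])
    context_window: int = 0        # context window size, in tokens

# Hypothetical example entry -- placeholder values, not facts from the table.
example = ModelEntry(
    name="example-model",
    lab="Example Lab",
    open_weights=True,
    modalities=["text", "image"],
    context_window=128_000,
)

def open_weight_models(entries: list[ModelEntry]) -> list[ModelEntry]:
    """A typical query over such a reference: keep only open-weight models."""
    return [e for e in entries if e.open_weights]
```

A schema like this makes the open/closed split queryable, e.g. `open_weight_models(entries)` returns the subset of rows with open weights.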