The Inference Report

March 20, 2026
AI Labs — News

OpenAI is building internal infrastructure to catch misalignment in its own agents while acquiring Astral to scale its Python tooling, a paired move that signals where the company sees both its immediate risk surface and its commercial moat. The acquisition accelerates Codex: OpenAI is doubling down on code as the primary interface between human intent and AI execution, betting that developer lock-in through superior tooling matters more than open-source competition. Meanwhile, GitHub is shipping coordinated multi-agent workflows designed to stay "inspectable and predictable," a direct answer to the misalignment-monitoring problem OpenAI described the same day. Both companies are racing to make agent orchestration legible enough to be audited and controlled. Hugging Face introduced SPEED-Bench for speculative decoding, a performance optimization that has nothing to do with safety or capability but everything to do with inference cost, suggesting that the real competitive pressure among labs has shifted from model weights to operational efficiency. NVIDIA and AMD are both chasing the same workloads: NVIDIA is pushing VR streaming at 90 fps on GeForce NOW, while AMD is optimizing GEMM tuning for LLM inference and weather forecasting. The hardware vendors are no longer waiting for labs to define the use case; they are shipping solutions that pull demand. MIRI's $6M fundraising push and a documentary premiere signal that safety-focused organizations are fighting for narrative and capital in an environment where builders are moving faster than governance frameworks can follow, and that the battle is being fought through public attention, not technical papers.

Sloane Duvall

AI Labs — Models

A curated reference of models from major AI labs, listing open or closed weight status, input modalities, and context window size. American labs tend toward closed-weight models, while Chinese labs tend toward open-weight models.

US: Amazon
Closed Weights
  • Amazon: Nova 2 Lite
    Text, Vision, Video, Files; 1M context
  • Amazon: Nova Premier 1.0
    Text, Vision; 1M context
Open Weights

None

US: Anthropic
Closed Weights
  • Anthropic: Claude Haiku 4.5
    Vision, Text; 200K context
  • Anthropic: Claude Opus 4
    Vision, Text, Files; 200K context
  • Anthropic: Claude Opus 4.1
    Vision, Text, Files; 200K context
  • Anthropic: Claude Opus 4.5
    Files, Vision, Text; 200K context
  • Anthropic: Claude Opus 4.6
    Text, Vision; 1M context
  • Anthropic: Claude Sonnet 4
    Vision, Text, Files; 200K context
  • Anthropic: Claude Sonnet 4.5
    Text, Vision, Files; 1M context
  • Anthropic: Claude Sonnet 4.6
    Text, Vision; 1M context
Open Weights

None

US: Google DeepMind
Closed Weights
  • Google: Gemini 2.5 Flash
    Files, Vision, Text, Audio, Video; 1M context
  • Google: Gemini 2.5 Flash Lite
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 2.5 Flash Lite Preview 09-2025
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 2.5 Pro
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 2.5 Pro Preview 05-06
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 2.5 Pro Preview 06-05
    Files, Vision, Text, Audio; 1M context
  • Google: Gemini 3 Flash Preview
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 3 Pro Preview
    Text, Vision, Files, Audio, Video; 1M context
  • Google: Gemini 3.1 Flash Lite Preview
    Text, Vision, Video, Files, Audio; 1M context
  • Google: Gemini 3.1 Pro Preview
    Audio, Files, Vision, Text, Video; 1M context
  • Google: Gemini 3.1 Pro Preview Custom Tools
    Text, Audio, Vision, Video, Files; 1M context
  • Google: Nano Banana (Gemini 2.5 Flash Image)
    Vision, Text; 33K context
  • Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
    Vision, Text; 66K context
  • Google: Nano Banana Pro (Gemini 3 Pro Image Preview)
    Vision, Text; 66K context
Open Weights
  • Google: Gemma 3n 4B
    Text; 33K context
US: Meta
Closed Weights

None

Open Weights
  • Meta: Llama 4 Maverick
    Text, Vision; 1M context
  • Meta: Llama 4 Scout
    Text, Vision; 328K context
  • Meta: Llama Guard 4 12B
    Vision, Text; 164K context
US: OpenAI
Closed Weights
  • OpenAI: GPT Audio
    Text, Audio; 128K context
  • OpenAI: GPT Audio Mini
    Text, Audio; 128K context
  • OpenAI: GPT-4.1
    Vision, Text, Files; 1M context
  • OpenAI: GPT-4.1 Mini
    Vision, Text, Files; 1M context
  • OpenAI: GPT-4.1 Nano
    Vision, Text, Files; 1M context
  • OpenAI: GPT-4o Audio
    Audio, Text; 128K context
  • OpenAI: GPT-5
    Text, Vision, Files; 400K context
  • OpenAI: GPT-5 Chat
    Files, Vision, Text; 128K context
  • OpenAI: GPT-5 Codex
    Text, Vision; 400K context
  • OpenAI: GPT-5 Image
    Vision, Text, Files; 400K context
  • OpenAI: GPT-5 Image Mini
    Files, Vision, Text; 400K context
  • OpenAI: GPT-5 Mini
    Text, Vision, Files; 400K context
  • OpenAI: GPT-5 Nano
    Text, Vision, Files; 400K context
  • OpenAI: GPT-5 Pro
    Vision, Text, Files; 400K context
  • OpenAI: GPT-5.1
    Vision, Text, Files; 400K context
  • OpenAI: GPT-5.1 Chat
    Files, Vision, Text; 128K context
  • OpenAI: GPT-5.1-Codex
    Text, Vision; 400K context
  • OpenAI: GPT-5.1-Codex-Max
    Text, Vision; 400K context
  • OpenAI: GPT-5.1-Codex-Mini
    Vision, Text; 400K context
  • OpenAI: GPT-5.2
    Files, Vision, Text; 400K context
  • OpenAI: GPT-5.2 Chat
    Files, Vision, Text; 128K context
  • OpenAI: GPT-5.2 Pro
    Vision, Text, Files; 400K context
  • OpenAI: GPT-5.2-Codex
    Text, Vision; 400K context
  • OpenAI: GPT-5.3 Chat
    Text, Vision, Files; 128K context
  • OpenAI: GPT-5.3-Codex
    Text, Vision, Files; 400K context
  • OpenAI: GPT-5.4
    Text, Vision, Files; 1M context
  • OpenAI: GPT-5.4 Mini
    Files, Vision, Text; 400K context
  • OpenAI: GPT-5.4 Nano
    Files, Vision, Text; 400K context
  • OpenAI: GPT-5.4 Pro
    Text, Vision, Files; 1M context
  • OpenAI: o3
    Vision, Text, Files; 200K context
  • OpenAI: o3 Deep Research
    Vision, Text, Files; 200K context
  • OpenAI: o3 Pro
    Text, Files, Vision; 200K context
  • OpenAI: o4 Mini
    Vision, Text, Files; 200K context
  • OpenAI: o4 Mini Deep Research
    Files, Vision, Text; 200K context
  • OpenAI: o4 Mini High
    Vision, Text, Files; 200K context
Open Weights
  • OpenAI: gpt-oss-120b
    Text; 131K context
  • OpenAI: gpt-oss-20b
    Text; 131K context
  • OpenAI: gpt-oss-safeguard-20b
    Text; 131K context
US: xAI
Closed Weights
  • xAI: Grok 3
    Text; 131K context
  • xAI: Grok 3 Beta
    Text; 131K context
  • xAI: Grok 3 Mini
    Text; 131K context
  • xAI: Grok 3 Mini Beta
    Text; 131K context
  • xAI: Grok 4
    Vision, Text; 256K context
  • xAI: Grok 4 Fast
    Text, Vision; 2M context
  • xAI: Grok 4.1 Fast
    Text, Vision; 2M context
  • xAI: Grok 4.20 Beta
    Text, Vision; 2M context
  • xAI: Grok 4.20 Multi-Agent Beta
    Text, Vision; 2M context
  • xAI: Grok Code Fast 1
    Text; 256K context
Open Weights

None

FR: Mistral AI
Closed Weights
  • Mistral: Codestral 2508
    Text; 256K context
  • Mistral: Devstral Medium
    Text; 131K context
  • Mistral: Mistral Large 3 2512
    Text, Vision; 262K context
  • Mistral: Mistral Medium 3
    Text, Vision; 131K context
  • Mistral: Mistral Medium 3.1
    Text, Vision; 131K context
  • Mistral: Mistral Small Creative
    Text; 33K context
Open Weights
  • Mistral: Devstral 2 2512
    Text; 262K context
  • Mistral: Devstral Small 1.1
    Text; 131K context
  • Mistral: Ministral 3 14B 2512
    Text, Vision; 262K context
  • Mistral: Ministral 3 3B 2512
    Text, Vision; 131K context
  • Mistral: Ministral 3 8B 2512
    Text, Vision; 262K context
  • Mistral: Mistral Small 3.2 24B
    Vision, Text; 128K context
  • Mistral: Mistral Small 4
    Text, Vision; 262K context
  • Mistral: Voxtral Small 24B 2507
    Text, Audio; 32K context
IL: AI21 Labs
Closed Weights

None

Open Weights
  • AI21: Jamba Large 1.7
    Text; 256K context
CN: Alibaba (Qwen)
Closed Weights
  • Qwen: Qwen Plus 0728
    Text; 1M context
  • Qwen: Qwen Plus 0728 (thinking)
    Text; 1M context
  • Qwen: Qwen3 Coder Flash
    Text; 1M context
  • Qwen: Qwen3 Coder Plus
    Text; 1M context
  • Qwen: Qwen3 Max
    Text; 262K context
  • Qwen: Qwen3 Max Thinking
    Text; 262K context
  • Qwen: Qwen3.5 Plus 2026-02-15
    Text, Vision, Video; 1M context
  • Qwen: Qwen3.5-Flash
    Text, Vision, Video; 1M context
Open Weights
  • Qwen: Qwen2.5 Coder 7B Instruct
    Text; 33K context
  • Qwen: Qwen2.5 VL 32B Instruct
    Text, Vision; 128K context
  • Qwen: Qwen3 14B
    Text; 41K context
  • Qwen: Qwen3 235B A22B
    Text; 131K context
  • Qwen: Qwen3 235B A22B Instruct 2507
    Text; 262K context
  • Qwen: Qwen3 235B A22B Thinking 2507
    Text; 131K context
  • Qwen: Qwen3 30B A3B
    Text; 41K context
  • Qwen: Qwen3 30B A3B Instruct 2507
    Text; 262K context
  • Qwen: Qwen3 30B A3B Thinking 2507
    Text; 131K context
  • Qwen: Qwen3 32B
    Text; 41K context
  • Qwen: Qwen3 8B
    Text; 41K context
  • Qwen: Qwen3 Coder 30B A3B Instruct
    Text; 160K context
  • Qwen: Qwen3 Coder 480B A35B
    Text; 262K context
  • Qwen: Qwen3 Coder Next
    Text; 262K context
  • Qwen: Qwen3 Next 80B A3B Instruct
    Text; 262K context
  • Qwen: Qwen3 Next 80B A3B Thinking
    Text; 131K context
  • Qwen: Qwen3 VL 235B A22B Instruct
    Text, Vision; 262K context
  • Qwen: Qwen3 VL 235B A22B Thinking
    Text, Vision; 131K context
  • Qwen: Qwen3 VL 30B A3B Instruct
    Text, Vision; 131K context
  • Qwen: Qwen3 VL 30B A3B Thinking
    Text, Vision; 131K context
  • Qwen: Qwen3 VL 32B Instruct
    Text, Vision; 131K context
  • Qwen: Qwen3 VL 8B Instruct
    Vision, Text; 131K context
  • Qwen: Qwen3 VL 8B Thinking
    Vision, Text; 131K context
  • Qwen: Qwen3.5 397B A17B
    Text, Vision, Video; 262K context
  • Qwen: Qwen3.5-122B-A10B
    Text, Vision, Video; 262K context
  • Qwen: Qwen3.5-27B
    Text, Vision, Video; 262K context
  • Qwen: Qwen3.5-35B-A3B
    Text, Vision, Video; 262K context
  • Qwen: Qwen3.5-9B
    Text, Vision, Video; 256K context
CN: ByteDance
Closed Weights
  • Seed: Seed 1.6
    Vision, Text, Video; 262K context
  • Seed: Seed 1.6 Flash
    Vision, Text, Video; 262K context
  • Seed: Seed-2.0-Lite
    Text, Vision, Video; 262K context
  • Seed: Seed-2.0-Mini
    Text, Vision, Video; 262K context
Open Weights

None

CN: DeepSeek
Closed Weights

None

Open Weights
  • DeepSeek: DeepSeek V3 0324
    Text; 164K context
  • DeepSeek: DeepSeek V3.1
    Text; 33K context
  • DeepSeek: DeepSeek V3.1 Terminus
    Text; 164K context
  • DeepSeek: DeepSeek V3.2
    Text; 164K context
  • DeepSeek: DeepSeek V3.2 Exp
    Text; 164K context
  • DeepSeek: DeepSeek V3.2 Speciale
    Text; 164K context
  • DeepSeek: R1 0528
    Text; 164K context
CN: MiniMax
Closed Weights
  • MiniMax: MiniMax M1
    Text; 1M context
  • MiniMax: MiniMax M2-her
    Text; 66K context
Open Weights
  • MiniMax: MiniMax M2
    Text; 197K context
  • MiniMax: MiniMax M2.1
    Text; 197K context
  • MiniMax: MiniMax M2.5
    Text; 197K context
  • MiniMax: MiniMax M2.7
    Text; 205K context