NVIDIA is positioning infrastructure for the shift from inference to continuous execution. Its Vera CPU targets a specific architectural gap: agentic workloads demand sustained multi-core performance and memory bandwidth that differ sharply from the batch-processing patterns that dominated the previous generation of AI compute. This is not a marginal optimization. It signals that NVIDIA sees the bottleneck moving from raw throughput to the ability to keep all cores active under load, which is the computational signature of systems that reason and act rather than respond to discrete queries.

AWS and Anthropic, meanwhile, are making regional and organizational moves that suggest a different priority: distribution and market presence rather than core infrastructure innovation. AWS is emphasizing startup engagement and local geographic expansion, while Anthropic is opening in Seoul ahead of Computex, a positioning play for Asia-Pacific mindshare. The divergence is telling. NVIDIA is betting the next phase of competition happens at the silicon level, where agentic demands create new CPU requirements. AWS and Anthropic are betting it happens at the layer above, where access, trust, and regional footprint matter more than who optimized the memory bus.
Sloane Duvall
A curated reference of models from major AI labs, with open/closed weight status, input modalities, and context window size. American labs tend toward closed-weight models, while Chinese labs tend toward open-weight models.