The Inference Report — May 7, 2026

The infrastructure that powers AI is consolidating around a utility layer that will outlast any particular model or lab. Like the railroad barons of the nineteenth century who profited more from controlling tracks than from operating trains, today's winners are companies managing compute, data centers, and distribution channels rather than publishing breakthrough papers. Anthropic rents capacity from xAI, which is building a $119 billion chip factory in Texas. TSMC backs renewables to handle the energy crunch. Samsung crossed $1 trillion in valuation on chip demand alone. Arm projects $2 billion in AI chip sales from next year. The actual AI companies have become tenants in someone else's infrastructure play, and this shift is accelerating faster than model capability improvements.

Meanwhile, regulatory capture is being dressed up as safety oversight. Trump's endorsement of AI testing, combined with the Center for AI Standards and Innovation signing pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI, creates a system where incumbents vet their own competitors before launch. The threat is DeepSeek's $45 billion valuation, built on models trained at a fraction of US compute costs, which makes government gatekeeping suddenly appealing to the labs that shaped this moment.

The real story beneath the safety rhetoric is ownership of the upside: Greg Brockman's stake worth $30 billion, Shivon Zilis's testimony about Musk recruiting Altman, and the public history of the lawsuit all reveal that the fight is over who owns the technology when it works.

The labs are no longer competing primarily on model capability. Every one is racing to own a piece of the path from model to production, and whoever controls the most friction points wins customer lock-in. OpenAI stacks use cases with Uber and Singular Bank to show that AI adoption compounds into competitive moat. AWS and GitHub are releasing MCP servers and validation frameworks that lock customers into their platforms. NVIDIA and OpenAI's joint push around Multipath Reliable Connection through the Open Compute Project, paired with NVIDIA's Spectrum-X Ethernet fabric and Corning's optical manufacturing partnership, signals that whoever controls the physical layer controls the economic rents.

The agent infrastructure layer is crystallizing around stateful, long-running systems that coordinate multiple specialized sub-agents. Repos like deer-flow and ruflo handle multi-agent coordination, memory management, and tool integration. Domain-specific agents built on that infrastructure handle financial research, retrieval, and data persistence. The single-turn LLM call is no longer the unit of work.

Distribution is fragmenting too. The Snap-Perplexity deal fell through amicably, revealing that Perplexity did not need Snap's users enough to surrender equity or control. Google adds Reddit quotes to search. Ethos onboards 35,000 experts per week through voice. The assumption that one platform would own the entire value chain is breaking down. The winners will control either the compute or the users. Everyone else is renting.

Grant Calloway

AI Labs

AWS

The AWS MCP Server is now generally available

Anthropic

Higher usage limits for Claude and a compute deal with SpaceX

GitHub Blog

Validating agentic behavior when “correct” isn’t deterministic

Hugging Face

IBM

IBM Consulting Expands AI Capabilities to Accelerate Enterprise Transformation

NVIDIA

OpenAI

From the Wire

Research Papers — Focused

Multi-frame Restoration for High-rate Lissajous Confocal Laser Endomicroscopy eess.IV

Lissajous confocal laser endomicroscopy (CLE) is a promising solution for high speed in vivo optical biopsy for handheld scenarios. However, Lissajous scanning traces a resonant trajectory and samples only the visited pixels per frame; at high frame rates, many pixels remain unvisited, creating structured holes. In this work, we introduce the first benchmark for high-rate Lissajous CLE, consisting of low-quality video clips paired with high-quality reference images. The reference images are wide-FOV mosaics obtained by stitching stabilized, slow-scan frames of the same tissue, enabling temporally aligned supervision. Using this dataset, we propose MIRA, a lightweight recurrent framework for Lissajous CLE restoration that iteratively aggregates temporal context through feature reuse and displacement alignment. Our experiments demonstrate that MIRA outperforms both lightweight and high-complexity baselines in restoration quality while maintaining a favorable computational efficiency suitable for clinical deployment.
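The "structured holes" described above can be illustrated with a toy model. The following is a minimal sketch, assuming an idealized two-axis sinusoidal scanner; the function name `lissajous_coverage` and the parameters `grid`, `fx`, `fy`, and `samples` are illustrative and do not come from the paper. Fewer samples per frame stands in for a higher frame rate.

```python
import math

def lissajous_coverage(grid=64, fx=7, fy=9, samples=2000):
    """Fraction of pixels in a grid x grid image visited by one frame.

    Assumption: the beam follows x = sin(2*pi*fx*t), y = sin(2*pi*fy*t)
    and is sampled at `samples` evenly spaced times in one frame.
    """
    visited = set()
    for k in range(samples):
        t = k / samples
        x = math.sin(2 * math.pi * fx * t)
        y = math.sin(2 * math.pi * fy * t)
        # Map [-1, 1] to pixel indices, clamping the upper edge.
        col = min(grid - 1, int((x + 1) / 2 * grid))
        row = min(grid - 1, int((y + 1) / 2 * grid))
        visited.add((row, col))
    return len(visited) / (grid * grid)

if __name__ == "__main__":
    for n in (500, 2000, 20000):
        print(n, round(lissajous_coverage(samples=n), 3))
```

In this toy setting a short frame can visit at most `samples` distinct pixels, so coverage stays well below 100% and the unvisited pixels follow the scan pattern rather than being random, which is the gap a multi-frame restoration method would have to fill.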

FedKPer: Tackling Generalization and Personalization in Medical Federated Learning via Knowledge Personalization eess.IV

Federated learning (FL) holds great potential for medical applications. However, statistical heterogeneity across healthcare institutions poses a major challenge for FL, as the global model struggles both to generalize across unseen patient populations and to adapt to the unique data distributions of individual hospitals. This heterogeneity also exacerbates forgetting at both the global and local levels, causing previously learned patient patterns to be misclassified after model updates. While prior work has largely treated generalization and personalization as separate challenges, we show that a better balance between the two can be achieved through selective alignment with the global model and a modified aggregation scheme, which together mitigate the effects of statistical heterogeneity. Specifically, we propose FedKPer, which introduces knowledge personalization into the training stage of each local device. Generalization is then addressed in the global model aggregation process, where local updates that are reliable and label-diverse are emphasized. We evaluate FedKPer with additional metrics that capture common consequences of forgetting. Overall, we demonstrate that FedKPer improves the generalization-personalization trade-off without sacrificing retention.
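The abstract does not give the aggregation formula, so the following is only a rough sketch of what "emphasizing reliable, label-diverse updates" could look like. Everything here is an assumption for illustration: the scoring rule (reliability times one plus label entropy), the function names `label_entropy` and `aggregate`, and the flat-list representation of model updates are hypothetical and are not FedKPer's actual method.

```python
import math

def label_entropy(counts):
    """Shannon entropy (nats) of a client's label histogram."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)

def aggregate(updates, reliabilities, label_counts):
    """Weighted average of client updates.

    Hypothetical scoring rule: weight is proportional to
    reliability * (1 + label entropy). Not taken from the paper.
    """
    scores = [r * (1.0 + label_entropy(c))
              for r, c in zip(reliabilities, label_counts)]
    z = sum(scores)
    weights = [s / z for s in scores]
    dim = len(updates[0])
    merged = [sum(w * u[i] for w, u in zip(weights, updates))
              for i in range(dim)]
    return merged, weights

if __name__ == "__main__":
    merged, weights = aggregate(
        updates=[[1.0, 0.0], [0.0, 1.0]],
        reliabilities=[1.0, 1.0],
        label_counts=[[50, 50], [100, 0]],  # balanced vs. single-label client
    )
    print(weights, merged)
```

With equal reliability, the client holding a balanced label set receives the larger weight, which is the qualitative behavior the abstract describes.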

Unsupervised Denoising of Real Clinical Low Dose Liver CT with Perceptual Attention Networks eess.IV

With the development of deep learning, medical image processing has been widely used to assist clinical research. This paper focuses on the denoising problem of low-dose computed tomography using deep learning. Although low-dose computed tomography reduces radiation exposure to patients, it also introduces more noise, which may interfere with visual interpretation by physicians and affect diagnostic results. To address this problem, inspired by Cycle-GAN for unsupervised learning, this paper proposes an end-to-end unsupervised low-dose computed tomography denoising framework. The proposed framework combines a U-Net structure for multi-scale feature extraction, an attention mechanism for feature fusion, and a residual network for feature transformation. It also introduces a perceptual loss to adapt the network to the characteristics of medical images. In addition, we construct a real low-dose computed tomography dataset and design a large number of comparative experiments to validate the proposed method, using both image-based evaluation metrics and medical evaluation criteria. Compared with classical methods, the main advantage of the proposed method is that it addresses the limitation that real clinical data cannot be directly used for supervised learning, while still achieving excellent performance. The experimental results are also professionally evaluated by imaging physicians and meet clinical needs.

Validating the Clinical Utility of CineECG 3D Reconstructions through Cross-Modal Feature Attribution eess.IV

Deep learning models for 12-lead electrocardiogram (ECG) analysis achieve high diagnostic performance but lack the intuitive interpretability required for clinical integration. Standard feature attribution methods are limited by the inherent difficulty in mapping abstract waveform fluctuations to physical anatomical pathologies. To resolve this, we propose a cross-modal method that projects feature attributions from high-performance 12-lead ECG models onto the CineECG 3D anatomical space. Our study reveals that while models trained directly on CineECG signals suffer from reduced accuracy and incoherent attributions, the proposed mapping mechanism effectively recovers clinically relevant feature rankings. Validated against a ground-truth dataset of 20 cases annotated by domain experts, the mapped explanations yield a Dice score of 0.56, significantly outperforming the 0.47 baseline of standard 12-lead attributions. These findings indicate that cross-modal averaging mapping effectively filters attribution instability and improves the localization of pathological features, combining the diagnostic expressiveness of standard ECG with the intuitive clarity of anatomical visualization.
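For readers unfamiliar with the metric behind the 0.56 and 0.47 figures, here is a minimal sketch of the standard Dice coefficient computed on two sets. How the paper turns attributions and expert annotations into sets is not stated in the abstract, so the region labels below are purely hypothetical.

```python
def dice_score(pred, truth):
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Convention: two empty sets are treated as a perfect match.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

if __name__ == "__main__":
    # Hypothetical region labels, for illustration only.
    model_regions = {"septum", "apex", "lateral_wall"}
    expert_regions = {"apex", "lateral_wall", "inferior_wall"}
    print(dice_score(model_regions, expert_regions))
```

A score of 1.0 means the highlighted regions match the annotation exactly and 0.0 means no overlap, so the reported gain from 0.47 to 0.56 corresponds to a moderate improvement in overlap with the expert-marked regions.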

Diffusion-OAMP for Joint Image Compression and Wireless Transmission eess.IV

Joint image compression and wireless transmission remains relatively underexplored compared to generic image restoration, despite its importance in practical communication systems. We formulate this problem under an equivalent linear model, and propose Diffusion-OAMP, a training-free reconstruction framework that embeds a pre-trained diffusion model into the OAMP algorithm. In Diffusion-OAMP, the OAMP linear estimator produces pseudo-AWGN observations, while the diffusion model serves as a nonlinear estimator under an SNR-matching rule. This framework offers a way to incorporate multiple generative priors into OAMP. Experiments with varying compression ratios and noise levels show that Diffusion-OAMP performs favorably against classic methods in the evaluated settings.

Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring eess.IV

The escalating climate crisis and ecosystem degradation demand intelligent, low-cost sensors capable of robust, long-term monitoring in real-world environments. Absolute dissolved oxygen (DO) concentration is a key parameter for predicting climate tipping points. Inexpensive optoelectronic sensors based on microstructured polymer films doped with phosphorescent dyes could be readily deployable; however, signal drift and marine biofouling remain major challenges. Here, we introduce a sensing paradigm that combines camera-based DO sensors with a vision transformer (ViT)-based physics-informed neural network (PINN) for high-fidelity sensing under biofouling conditions. Training and testing data were obtained from an algae-laden water tank over 14 days to capture accelerated biofouling. The ViT-PINN, which embeds the Stern-Volmer (SV) equation into the loss function, reduces mean absolute error (MAE) by 92% and 89% compared to classical statistical and ML approaches, achieving ~2 µmol/L absolute error. A deep ensemble further quantifies predictive uncertainty, enabling self-diagnostic sensing.
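To make "embedding the Stern-Volmer equation into the loss function" concrete, here is a minimal sketch. The Stern-Volmer relation itself is standard: I0 / I = 1 + K_SV * [O2], where I0 is the intensity without oxygen. The loss form (absolute data error plus a weighted squared physics residual), the weight `lam`, and all function and parameter names are assumptions for illustration and are not taken from the paper.

```python
def stern_volmer_residual(intensity, i0, k_sv, do_pred):
    """How far a predicted DO value is from satisfying I0/I = 1 + K_SV*[O2]."""
    return i0 / intensity - (1.0 + k_sv * do_pred)

def physics_informed_loss(do_pred, do_true, intensity, i0, k_sv, lam=1.0):
    """Data term (mean absolute error) plus a Stern-Volmer penalty.

    Assumed form: loss = MAE + lam * mean(residual^2). Hypothetical,
    not the paper's exact objective.
    """
    n = len(do_pred)
    data = sum(abs(p - t) for p, t in zip(do_pred, do_true)) / n
    phys = sum(stern_volmer_residual(i, i0, k_sv, p) ** 2
               for i, p in zip(intensity, do_pred)) / n
    return data + lam * phys

if __name__ == "__main__":
    # Toy numbers: with I0 = 1 and K_SV = 0.01, DO = 100 gives I = 0.5.
    print(physics_informed_loss([100.0], [100.0], [0.5], i0=1.0, k_sv=0.01))
    print(physics_informed_loss([150.0], [100.0], [0.5], i0=1.0, k_sv=0.01))
```

The penalty term pushes predictions toward values that are physically consistent with the measured intensity, which is one plausible way such a constraint could help when fouling distorts the raw signal.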

Benchmarks

Intelligence Index

Composite score across coding, math, and reasoning

#	Model	Score	Tokens/s	$/1M tokens
1	GPT-5.5	60.2	79	$11.25
2	Claude Opus 4.7	57.3	62	$10.94
3	Gemini 3.1 Pro Preview	57.2	131	$4.50
4	GPT-5.4	56.8	80	$5.63
5	Kimi K2.6	53.9	28	$1.71

SWE-rebench

Agentic coding on real-world software engineering tasks

#	Model	Score
1	Claude Opus 4.6	65.3%
2	gpt-5.2-2025-12-11-medium	64.4%
3	GLM-5	62.8%
4	Junie	62.8%
5	gpt-5.4-2026-03-05-medium	62.8%

GitHub Repos

Trending

Hmbown/DeepSeek-TUI

16644 ★

Coding agent for DeepSeek models that runs in your terminal

addyosmani/agent-skills

31833 ★

Production-grade engineering skills for AI coding agents.

PriorLabs/TabPFN

6681 ★

⚡ TabPFN: Foundation Model for Tabular Data ⚡

docusealco/docuseal

15185 ★

Open source DocuSign alternative. Create, fill, and sign digital documents ✍️

LearningCircuit/local-deep-research

5899 ★

Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local & Encrypted.

Daily discovery

huggingface/course (NLP)

3889 ★

The Hugging Face course on Transformers

ai-boost/awesome-prompts (Prompt Engineering)

7818 ★

Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

iOfficeAI/AionUi (Chatbot)

23935 ★

Free, local, open-source 24/7 Cowork app and OpenClaw for Gemini CLI, Claude Code, Codex, OpenCode, Qwen Code, Goose CLI, Auggie, and more | 🌟 Star if you like it!

Team-Commonly/commonly (Generative AI)

439 ★

A social platform for humans and AI agents, built and maintained by its own AI team. Connect any agent via HTTP.

NVIDIA-NeMo/DataDesigner (Synthetic Data)

1806 ★

🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data.