The Inference Report

May 26, 2026

A research paper on building mathematical structure into neural networks before training begins, rather than correcting for its absence afterward, deserves attention it won't receive, because the industry is moving in the opposite direction. The market is consolidating around speed and scale, not rigor. DeepSeek collapsed the pricing structure that was supposed to sustain the AI economy by cutting V4-Pro costs by 75 percent in a month. Google released Agent Executor as open source and reports that 75 percent of new code is now AI-generated. ClickUp is replacing hundreds of employees with AI agents. The narrative has completed its arc: from "will AI replace workers" to "it's already happening." The Pope issued an encyclical warning of concentrated power and a tech elite shaping the world to its advantage. Nobody in the industry is listening because the incentives don't require them to.

The infrastructure layer is being commoditized and opened simultaneously. Google's Agent Executor, Anthropic's Model Context Protocol with thousands of emerging servers, and DeepSeek's aggressive pricing all point in the same direction: the moat isn't in the model anymore, it's in the operational stack and the data flowing through it. AMD is attacking the cloud inference monopoly by enabling 70B and 100B parameter models to run on local hardware without quality degradation, eroding the rent that forced developers toward paid endpoints. OpenAI consolidates the consumer interface and publisher relationships through its Brazilian journalism deal. Hugging Face positions itself as the place where developers operationalize models once they can run them cheaply. The pattern isn't about capability announcements. It's about where the margin lives and who extracts value from the transition.

Talent and leverage are consolidating in predictable ways. ByteDance is issuing special stock tied to its AI unit to prevent poaching, acknowledging that the real scarcity is engineering talent, not capital. Trump's AI safety executive order was killed in three Wednesday-night phone calls by Musk, Zuckerberg, and Sacks. Anthropic closed a $30 billion-plus round that Saturday. The companies building at scale and shipping product are accumulating power and capital. The people warning about concentration of power are issuing manifestos that read as if they describe a problem that has already happened. On GitHub, the ecosystem is consolidating around concrete problems: how to make AI coding agents actually useful, how to index and reason over code at scale, and how to keep AI outputs from being generic. The traction reflects a shift from "which model is best" to "how do we make any model behave the way we want."

Grant Calloway

AI Labs

AMD

AI Inference on AMD Ryzen™ AI Max Processor

Anthropic

Anthropic co-founder Chris Olah's remarks on Pope Leo XIV's encyclical "Magnifica humanitas"

Hugging Face

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

OpenAI

OpenAI, Grupo Folha and Grupo UOL announce strategic content partnership

From the Wire

Research Papers — Focused

Subgrid-Scale Parameterization in Burgers' Equation Using Structure-Preserving Neural Networks and Entropy Variables math.NA

We present a machine learning approach for developing subgrid-scale (SGS) parametrizations in coarse simulations of partial differential equations. We utilize structure-preserving neural networks and entropy variables to learn subgrid fluxes in coarse simulations of the Burgers' equation. In particular, we employ a decoupled neural network architecture explicitly separating the subgrid corrections into two distinct components: a conservative Flux Potential network and an Eddy Viscosity network. We demonstrate that this reduced-order framework maintains high physical fidelity, accurately reproducing the energy spectrum, spatial and temporal correlation functions, and dynamical characteristics of the full-scale system. Furthermore, we show that our approach is robust and applicable to parameters outside the training regime.
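One way to see why a flux-based split of the subgrid correction preserves structure: in a periodic finite-volume scheme written in flux form, the total of u is conserved no matter what the interface flux is. The sketch below illustrates only that property. The functions `resolved_flux`, `sgs_flux`, and `step`, the constants, and the closed-form placeholders standing in for the two networks are all assumptions made for illustration; they are not the paper's architecture or its entropy-variable formulation.

```python
import math


def resolved_flux(u_left, u_right):
    """Central interface flux for the Burgers flux f(u) = u^2 / 2."""
    return 0.25 * (u_left * u_left + u_right * u_right)


def sgs_flux(u_left, u_right, dx):
    """Hypothetical closed-form stand-in for the two learned components."""
    jump = u_right - u_left
    conservative_part = 0.01 * (u_left + u_right)   # placeholder for a flux-potential output
    eddy_viscosity = 0.5 * dx * abs(jump)           # placeholder, nonnegative by construction
    dissipative_part = -eddy_viscosity * jump / dx  # diffusive flux, -nu * du/dx
    return conservative_part + dissipative_part


def step(u, dx, dt):
    """One forward-Euler step of a periodic finite-volume scheme in flux form."""
    n = len(u)
    # flux[i] lives on the interface between cell i and cell i + 1 (periodic).
    flux = [
        resolved_flux(u[i], u[(i + 1) % n]) + sgs_flux(u[i], u[(i + 1) % n], dx)
        for i in range(n)
    ]
    # For i = 0, flux[i - 1] wraps to the last interface via Python's negative index.
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


n_cells = 64
dx = 2.0 * math.pi / n_cells
u0 = [0.5 + math.sin(i * dx) for i in range(n_cells)]

u = list(u0)
for _ in range(100):
    u = step(u, dx, dt=0.001)

print("change in total u:", sum(u) - sum(u0))
```

Because every interface flux is added to one cell and subtracted from its neighbor, the sum telescopes and the printed change should sit at floating-point round-off. Swapping the placeholders for trained networks would not change that, which is the appeal of enforcing the structure in the architecture rather than in the loss.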

Spectral-Informed Neural Networks Outperform Spectral Methods in High-dimensional PDEs math.NA

For low-dimensional problems ($d\leq3$), spectral methods can achieve exceptionally high accuracy. For middle-dimensional problems ($4 \leq d \lesssim 10$), spectral methods remain feasible through specific techniques such as sparse grids or hyperbolic cross. However, for high-dimensional problems ($d\gg 10$), spectral methods suffer from the curse of dimensionality. Physics-informed neural networks (PINNs) have emerged as a promising approach to overcome this challenge, offering scalability to high dimensions, but often suffer from limited accuracy and efficiency. Recently proposed spectral-informed neural networks (SINNs) combine spectral methods with PINNs, operating directly in the spectral domain to avoid spatial derivative computations and to reduce memory consumption. In this work, we introduce Modified SINNs, which integrate coefficient decay scaling and basis embeddings motivated by harmonic analysis to enhance accuracy in high-dimensional problems and enable accurate approximation of unknown spectral coefficients. Numerical experiments on steady and time-dependent partial differential equations demonstrate that Modified SINNs outperform sparse grid spectral methods on middle-dimensional problems with incomplete spectral information and achieve superior accuracy compared to PINNs on high-dimensional problems.
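The dimensional scaling in this abstract can be made concrete by counting basis indices. The sketch below compares a full tensor-product index set with a hyperbolic cross set. The index-set definition used here (nonnegative multi-indices k whose product of max(1, k_i) is at most n) is one common convention and is an assumption, not something taken from the paper; the function names are likewise illustrative.

```python
from functools import lru_cache


def full_grid_size(n: int, d: int) -> int:
    """Number of multi-indices k in {0, ..., n}^d."""
    return (n + 1) ** d


def hyperbolic_cross_size(n: int, d: int) -> int:
    """Count k in {0, ..., n}^d with prod(max(1, k_i)) <= n."""

    @lru_cache(maxsize=None)
    def count(dims_left: int, budget: int) -> int:
        if dims_left == 0:
            return 1
        total = 0
        for k in range(budget + 1):
            weight = max(1, k)
            # The remaining indices must have product at most floor(budget / weight).
            total += count(dims_left - 1, budget // weight)
        return total

    return count(d, n)


if __name__ == "__main__":
    n = 8
    for d in (2, 4, 6, 10):
        print(d, full_grid_size(n, d), hyperbolic_cross_size(n, d))
```

The full grid grows as (n + 1)^d, while the hyperbolic cross count grows far more slowly with d. That gap is why sparse techniques keep spectral methods usable in the middle-dimensional range, and why they still run out of room eventually.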

Approximation of solutions of parameter-dependent problems by residual neural networks math.NA

We develop a convergent scheme to train neural networks involving analytic activation functions based on gradient flows. Convergence properties are guaranteed by Lojasiewicz theory. The main advantage of this approach is its simplicity of implementation. The coefficients of the network are approximated by solving a system of ordinary differential equations. We test the method by constructing residual neural network approximations of solutions of parametric problems. The dependence of the solutions of simple ordinary differential equations on a few parameters is correctly reproduced. The solutions of inverse problems involving wave constraints which depend on a few parameters can be reasonably approximated, even in regions in which the problem is severely ill posed.
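Training by gradient flow means treating the parameters as the state of an ODE, d(theta)/dt = -grad L(theta), and integrating it with a time stepper. The sketch below does this for a deliberately tiny model, a single tanh unit y = a * tanh(w x + b), using a classical fourth-order Runge-Kutta step. The model, the synthetic data, the step size, and the integrator are all assumptions chosen for illustration; the paper works with residual networks and does not specify this setup.

```python
import math


def loss_and_grad(theta, xs, ys):
    """Mean squared loss (with factor 1/2) and its gradient for y = a * tanh(w x + b)."""
    a, w, b = theta
    n = len(xs)
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for x, y in zip(xs, ys):
        t = math.tanh(w * x + b)
        r = a * t - y              # residual
        s = a * (1.0 - t * t)      # derivative of a * tanh(z) with respect to z
        loss += 0.5 * r * r / n
        grad[0] += r * t / n
        grad[1] += r * s * x / n
        grad[2] += r * s / n
    return loss, grad


def flow_rhs(theta, xs, ys):
    """Right-hand side of the gradient flow ODE: -grad L(theta)."""
    _, grad = loss_and_grad(theta, xs, ys)
    return [-g for g in grad]


def rk4_step(theta, h, xs, ys):
    """One classical Runge-Kutta step of size h for the gradient flow."""
    def shifted(direction, scale):
        return [p + scale * d for p, d in zip(theta, direction)]

    k1 = flow_rhs(theta, xs, ys)
    k2 = flow_rhs(shifted(k1, h / 2), xs, ys)
    k3 = flow_rhs(shifted(k2, h / 2), xs, ys)
    k4 = flow_rhs(shifted(k3, h), xs, ys)
    return [
        p + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
        for p, c1, c2, c3, c4 in zip(theta, k1, k2, k3, k4)
    ]


def train(theta0, xs, ys, h=0.1, steps=2000):
    """Integrate the gradient flow from theta0 for steps * h units of time."""
    theta = list(theta0)
    for _ in range(steps):
        theta = rk4_step(theta, h, xs, ys)
    return theta


# Synthetic data generated by the same model family (an illustrative assumption).
xs = [i / 10.0 - 1.0 for i in range(21)]
ys = [0.5 * math.tanh(2.0 * x - 0.3) for x in xs]

theta0 = [1.0, 1.0, 0.0]
initial_loss, _ = loss_and_grad(theta0, xs, ys)
theta = train(theta0, xs, ys)
final_loss, _ = loss_and_grad(theta, xs, ys)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Replacing `rk4_step` with a forward-Euler step would recover plain gradient descent with learning rate h, which is one way to read the abstract's claim about simplicity of implementation: any off-the-shelf ODE integrator becomes a training loop.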

Deep Learning-based Surrogate Modelling of the LOD Method for Multiscale Problems math.NA

Multiscale problems are notoriously difficult to tackle using traditional numerical methods, as accurately resolving fine-scale features often requires prohibitively fine discretizations. This challenge is particularly pronounced in applications such as materials science, fluid dynamics, climate systems, chemical processes, and complex networks. Recent neural operator models provide a promising data-driven alternative, but frequently struggle to achieve sufficient accuracy in the presence of strongly heterogeneous or oscillatory coefficients. In this work, we focus on the solution of elliptic PDEs with rough and high-contrast inputs. The Localized Orthogonal Decomposition (LOD) method is a well-established numerical approach for such problems, but it comes at a substantial computational cost. We investigate the performance of popular neural operator architectures on these challenging multiscale problems and identify key limitations in their ability to resolve fine-scale structure. To overcome these challenges, we introduce LOD-MSNO (LOD-Multiscale Neural Operator), a hybrid approach that leverages the LOD method as a strong multiscale prior by building on its representation of the solution as a linear combination of problem-adapted basis functions, while addressing its main computational bottlenecks through data-driven operator learning. We further provide theoretical error estimates for the proposed coefficient-learning framework. Lastly, we demonstrate the potential of our proposed method to outperform current neural operator baselines in terms of accuracy for challenging multiscale inputs, while largely retaining the computational efficiency of neural operator models.

Kernel-based Operator Learning: Error Analysis, Budget Allocation, and a Physics-Informed Extension math.NA

We study kernel-based operator learning in a two-stage sampling framework, where an offline kernel regression operator learns a discretized representation of the target operator from input-output pairs and an online kernel reconstruction operator recovers the output function from predicted observations. Our main theoretical contribution is an explicit budget allocation condition relating the number $N$ of training pairs, the number $n$ of input observations, and the output resolution $m$. The condition is derived from a coupled error analysis that interprets the surrogate as a reconstruction from approximate data. This yields a decomposition of the total error into reconstruction and learning contributions that can be analyzed independently. As a consequence, we obtain quantitative scaling laws describing how $N$, $n$, and $m$ must be coupled to guarantee convergence and to balance offline learning and online reconstruction errors. The resulting estimates extend previous analyses of kernel-based operator learning. We further introduce a physics-informed extension that incorporates knowledge of the underlying PDE at evaluation time. Rather than encoding constraints directly into the kernel, we augment the online reconstruction step by penalizing PDE residuals at collocation points. The method requires no retraining for new inputs. Numerical experiments illustrate the theoretical findings and demonstrate the effectiveness of the proposed physics-informed reconstruction strategy.

Online TT-ALS for Streaming Tensor Decomposition with Incremental Orthogonalization math.NA

Tensor Train (TT) decomposition is a powerful technique for analyzing high-dimensional data. Existing algorithms for computing TT decompositions can be categorized into two main types: conventional batch-based approaches and recursive online methods. In the context of streaming data, batch methods typically achieve higher reconstruction accuracy but often suffer from memory exhaustion, while online methods provide greater computational efficiency. In this work, we introduce Online TT-ALS (Alternating Least Squares), an algorithm that sequentially enforces orthogonality constraints. This approach allows for efficient and exact updates of the core tensor while maintaining high reconstruction accuracy. Theoretically, we prove that enforcing these orthogonal gauge constraints guarantees monotonic decrease of the local objective function and temporal smoothness. Computationally, our deterministic single-sweep update reduces the rank dependence from quadratic to linear, achieving an overall complexity of $\mathcal{O}(I^{n-1} r)$. Experimental results demonstrate that the proposed method outperforms existing online techniques not only in terms of mathematical approximation accuracy but also in human perception-based video quality metrics. Furthermore, compared to recent deep learning-based paradigms, our algebraic approach achieves speedups of several orders of magnitude. Consequently, our method exhibits high computational efficiency and is suitable for low-latency real-time processing applications.

Benchmarks

Intelligence Index

Composite score across coding, math, and reasoning

#	Model	Score	tok/s	$/1M tokens
1	GPT-5.5	60.2	71	$11.25
2	Claude Opus 4.7	57.3	49	$10.94
3	Gemini 3.1 Pro Preview	57.2	129	$4.50
4	GPT-5.4	56.8	84	$5.63
5	Qwen3.7 Max	56.6	198	$3.75

SWE-rebench

Agentic coding on real-world software engineering tasks

#	Model	Score
1	Claude Opus 4.6	65.3%
2	gpt-5.2-2025-12-11-medium	64.4%
3	GLM-5	62.8%
4	Junie	62.8%
5	gpt-5.4-2026-03-05-medium	62.8%

GitHub Repos

Trending

Lum1104/Understand-Anything

43640 ★

Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini CLI, and more.

anthropics/knowledge-work-plugins

17517 ★

Open source repository of plugins primarily intended for knowledge workers to use in Claude Cowork

rohitg00/ai-engineering-from-scratch

33431 ★

Learn it. Build it. Ship it for others.

affaan-m/ECC

225416 ★

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

mukul975/Anthropic-Cybersecurity-Skills

21522 ★

754 structured cybersecurity skills for AI agents · Mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND & NIST AI RMF · agentskills.io standard · Works with Claude Code, GitHub Copilot, Codex CLI, Cursor, Gemini CLI & 20+ platforms · 26 security domains · Apache 2.0

Daily discovery

isLinXu/paper-list · Reinforcement Learning

145 ★

autoupdate paper list

pytorch/executorch · Neural Network

4746 ★

On-device AI across mobile, embedded and edge for PyTorch

logseq/logseq · Knowledge Graph

43765 ★

A privacy-first, open-source platform for knowledge management and collaboration. Download link: http://github.com/logseq/logseq/releases. roadmap: https://logseq.io/p/NX4mc_ggEV

kreuzberg-dev/kreuzcrawl · MCP

101 ★

High-performance web crawling engine with bindings for 11 languages

MrNeRF/LichtFeld-Studio · Computer Vision

3132 ★

LichtFeld Studio: Where reality and the digital world blend.