The Inference Report

June 28, 2026

Compute scarcity, not capability, is now the binding constraint reshaping the AI industry. Google's rationing of Gemini access to Meta, Anthropic's export restrictions on Mythos accelerating Asian independence, and the persistent stability of SWE-rebench leaderboards all point to a market where the frontier has stopped advancing because the infrastructure cannot keep pace. Elon Musk's orbital data center pitch and Paul Meade's move from Apple to OpenAI's hardware team reflect where capital sees the real bottleneck, yet even these supply-side solutions arrive too late to prevent fragmentation. U.S. labs face a structural problem: they can no longer control distribution globally, meaning the winners will be those who own the compute, not those who promise the most capable models.

The practical consequences are already visible across three domains. In labor markets, Shenzhen's robotaxi expansion is displacing drivers today, not in some speculative future. In high-stakes decision-making, Claude's ability to synthesize medical data into actionable insights shows AI moving from research artifact to operational tool. And in knowledge infrastructure itself, AI-generated images are now convincing enough to corrupt scientific journals, meaning the systems meant to validate truth are under active pressure. These shifts share a common thread: AI is transitioning from a capability question to a deployment and trust question.

Developer infrastructure reflects this reality plainly. GitHub trending shows two tiers: practical tools like Qdrant, dbt-core, and cognee that solve genuine bottlenecks in data pipelines and agent memory systems, gaining adoption because they work; and speculative agent frameworks that trend because they might work, not because they do. The honest infrastructure play is in memory systems and vector databases with clear APIs and measurable performance. The agent layer remains noisy and speculative because it promises to replace developer judgment at scale, a claim that depends on capabilities that do not yet exist reliably. What matters to watch is whether the infrastructure layer stabilizes into a standard before the rules do, and whether fragmentation along geographic and trust lines creates separate AI markets that never reconverge.

Grant Calloway

AI Labs

No lab headlines.

From the Wire

Research Papers: Focused

Parametric Generalized Adaptive Moment Features (PG-AMF) for Bearing Fault Diagnosis and Machine Health Monitoring eess.SP

Accurate fault diagnosis of rolling element bearings in rotating machinery is considered essential for ensuring industrial safety and enabling predictive maintenance. Conventional statistical feature-based methods rely on predefined descriptors, whose diagnostic sensitivity is constrained by fixed configurations and limited adaptability across varying fault conditions. Although deep learning approaches offer strong representational capacity, their effectiveness is often restricted by high data requirements and reduced interpretability. In this work, a parametric adaptive feature extraction framework is proposed, in which feature characteristics are learned directly from data rather than being manually specified. Multiple complementary representations are extracted from vibration signals, including absolute features capturing signal energy distribution, signed moment features reflecting waveform asymmetry, and AC-coupled moment features emphasizing dynamic fluctuations, while interactions between multiple sensor channels are modeled through a structured fusion mechanism to enhance fault representation. The proposed approach is evaluated on a benchmark gearbox bearing dataset comprising five health conditions, including normal operation and multiple fault types. Improved classification performance is observed compared to conventional methods, with consistent results under cross-validation, indicating strong generalization capability. Additionally, enhanced feature separability is demonstrated through clearer clustering patterns in low-dimensional projections. The learned representations effectively capture a wide range of signal characteristics, supporting both improved diagnostic performance and practical applicability in industrial monitoring systems.
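The three feature families named in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so the definitions below are assumptions chosen for illustration, not the paper's: each feature is a generalized moment with an adjustable exponent `p`, standing in for the parameter that PG-AMF is said to learn from data. The function names are hypothetical.

```python
import math

# Illustrative only: the exponent p plays the role of a learnable parameter.
# These formulas are assumptions, not the definitions used in the paper.

def absolute_moment(x, p):
    """Mean of |x|^p: grows with overall signal energy."""
    return sum(abs(v) ** p for v in x) / len(x)

def signed_moment(x, p):
    """Mean of sign(x) * |x|^p: non-zero when the waveform is asymmetric."""
    return sum(math.copysign(abs(v) ** p, v) for v in x) / len(x)

def ac_coupled_moment(x, p):
    """Mean of |x - mean(x)|^p: ignores the DC offset, keeps fluctuations."""
    mean = sum(x) / len(x)
    return sum(abs(v - mean) ** p for v in x) / len(x)

if __name__ == "__main__":
    window = [1.0, -1.0, 3.0, 1.0]  # toy vibration window
    for p in (1.0, 2.0, 3.5):
        print(p, absolute_moment(window, p),
              signed_moment(window, p), ac_coupled_moment(window, p))
```

In a learned version, `p` would be tuned per feature against a classification loss rather than fixed by hand; with `p = 2` the AC-coupled moment reduces to the population variance.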

State-Specific Respiratory Signatures for Affective and Stress Recognition: Interpretable Respiratory Markers, Autocorrelation Lags, and Compact CNN Models eess.SP

Respiratory activity is a direct and interpretable physiological channel for wearable stress and affective-state recognition, yet many studies emphasize classification accuracy without identifying which respiratory properties separate different states. This work reframes RESP-based recognition as a joint predictive and explanatory problem. Using the chest respiratory channel of the WESAD dataset, we analyze 60 s windows under leave-one-subject-out validation and combine two complementary branches: compact raw-signal one-dimensional convolutional neural networks (1D-CNNs) and physically grouped handcrafted respiratory signatures. The primary application task is binary stress versus non-stress detection, while baseline, stress, amusement, and meditation are additionally analyzed in a one-vs-rest setting to reveal state-specific respiratory markers. The feature space is organized into respiratory timing, breath-to-breath variability, waveform statistics, spectral/time-frequency descriptors, and autocorrelation/nonlinear predictability descriptors, with the raw 60 s signal treated as a sixth representation for the CNN branch. We introduce autocorrelation transition lags (Zpm/Zmp) as interpretable markers of respiratory correlation scale and separately evaluate exploratory FEG-Pro/Lyapunov-like descriptors. In the final CNN refit setting, the raw-signal model achieved the strongest stress-vs-rest performance, with accuracy 96.72 percent, macro-F1 95.30 percent, and MCC 90.61 percent. In contrast, compact feature models were stronger for baseline, with MCC 65.34 percent, amusement, with MCC 35.69 percent, and especially meditation, with MCC 88.65 percent. These results show that CNNs are most useful for the practical stress detector, whereas interpretable respiratory signatures provide stronger and more physiologically transparent state-specific markers for several non-stress conditions.
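The abstract reports accuracy, macro-F1, and MCC side by side. A minimal sketch of how the three differ on an imbalanced binary task such as stress versus non-stress is given below. The labels are invented toy data, not WESAD results, and the helper names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def macro_f1(tp, tn, fp, fn):
    """Unweighted mean of the F1 scores of the positive and negative classes."""
    f1_pos = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    f1_neg = 2 * tn / (2 * tn + fp + fn) if (2 * tn + fp + fn) else 0.0
    return (f1_pos + f1_neg) / 2

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # toy labels: 3 stress windows, 5 non-stress
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    counts = confusion_counts(y_true, y_pred)
    print(accuracy(*counts), macro_f1(*counts), mcc(*counts))
```

On this toy data accuracy is 0.75 while MCC is about 0.47, which is why MCC is the more conservative figure when one class dominates.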

Inverse Design of Compact and Wideband Inverted Doherty Power Amplifiers Using Deep Learning eess.SP

This paper presents a deep learning-assisted methodology for the inverse synthesis of a compact, wideband inverted Doherty power amplifier (PA). Convolutional neural networks (CNNs) and genetic algorithms (GAs) are jointly employed to generate pixelated Doherty combiner networks that integrate load modulation, impedance matching, power combining, and phase compensation into a single structure. As a proof of concept, we design and fabricate a GaN HEMT Doherty PA with a pixelated output combiner. The prototype achieves a measured peak drain efficiency of 51%-63% and a 6-dB back-off efficiency of 48%-54% over 1.9-2.5 GHz. Within the same frequency range, the measured output power is 44 ± 0.3 dBm. Furthermore, with digital predistortion (DPD) applied, the prototype circuit demonstrates an adjacent channel leakage ratio (ACLR) better than -53.2 dBc.

Integrated Sensing and Communications for Real-time Avatar Control in XR over 5G eess.SP

Extended Reality (XR) presents a challenging use case for 5G and 6G networks, requiring high data rates and low-latency communication to deliver a truly immersive experience. Moreover, in order to seamlessly translate physical actions to the virtual world, accurate gesture recognition and pose estimation are required. Current XR interaction solutions based on handheld controllers and cameras cannot easily capture full-body poses, inhibit the free use of hands, and require good visibility and a clear line of sight. In this work, we propose a multimodal sensing architecture for XR that combines 5G millimeter-wave (mmWave) integrated sensing and communication (ISAC) and surface electromyography (sEMG) signals. 5G mmWave ISAC can not only deliver content wirelessly to the head-mounted display (HMD); the same communication signals can also be used to derive coarse body-level gestures and poses of the user, to support real-time avatar control. For fine-grained finger-level gestures, our architecture leverages lightweight sEMG sensors that capture forearm muscle activity. To illustrate the need for both modalities, we present evaluations of both sensing technologies. At the body level (5G), our architecture relies on power-per-beam-pair (PPBP), which can be computed from standard beam management or beam sweeping procedures of the 5G NR standard. PPBP-based sensing achieves 82.2 ± 5.9% average accuracy when evaluated on users not seen during training. For fine-grained finger-level interactions, we show that sEMG carries strong discriminative information, achieving consistently promising performance across different movement settings. Thus, combining the two modalities enables multi-scale gesture recognition, at the body level via existing 5G signals and at the finger level via lightweight sEMG sensors, forming a complete XR framework.

Low-rank Updates in Slowly Time-varying Graphs for Spatial-Temporal Signal Interpolation eess.SP

A crucial assumption in graph signal processing (GSP) is the existence of an underlying graph that captures the pairwise similarities between nodes, allowing filters to be designed based on this graph for tasks such as denoising. For spatial-temporal data in which node-to-node similarities evolve over time, a static spatial graph is insufficient. In this paper, to represent slowly time-varying pairwise relationships, we model the change between two consecutive adjacency matrices, $P = W^{(2)} - W^{(1)}$, as a low-rank matrix. Specifically, given an initial adjacency matrix $W^{(1)}$ at time $t=1$, we jointly interpolate a signal $x_2$ and estimate $W^{(2)}$ at $t=2$ using both a graph signal smoothness prior for $x_2$ and a low-rank prior on $P$. We alternate optimization steps. With $W^{(2)}$ fixed, $x_2$ is interpolated by solving a linear system. Conversely, holding $x_2$ fixed, $W^{(2)}$ is updated via proximal gradient descent (PGD). The proximal mapping of the rank term $\Gamma(W^{(2)} - W^{(1)})$ is approximated in linear time using a fast orthogonal matching pursuit (OMP) algorithm that selects a sparse combination of atoms from a dictionary $\mathcal{R}$ formed by the outer products of $W^{(1)}$'s eigenvectors. We unroll iterations of our algorithm into layers to build a lightweight neural network for limited data-driven parameter tuning. Experiments show that our joint optimization achieves better signal interpolation compared to existing time-varying graph models.
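The signal-interpolation half of the alternation (graph fixed, signal unknown) can be sketched with a standard graph-smoothness formulation. The abstract does not state the exact linear system, so the objective here is an assumption: minimize the squared error on the sampled nodes plus `mu` times the Laplacian quadratic form, which leads to the system `(S + mu * L) x = S y`, with `S` the diagonal sampling mask. The function names, the toy graph, and `mu` are all hypothetical; the low-rank graph update via PGD and OMP is not shown.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def interpolate_signal(W, samples, mu):
    """Interpolate a graph signal from partial samples.

    W       : symmetric adjacency matrix (list of lists)
    samples : dict mapping node index to observed value
    mu      : weight of the smoothness prior x^T L x, with L = D - W
    """
    n = len(W)
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        degree = sum(W[i])
        for j in range(n):
            laplacian_ij = (degree if i == j else 0.0) - W[i][j]
            A[i][j] = mu * laplacian_ij
    for i, y in samples.items():
        A[i][i] += 1.0  # data-fidelity term on sampled nodes only
        b[i] = y
    return solve_linear_system(A, b)

if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with unit weights; node 1 is unobserved.
    W = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
    print(interpolate_signal(W, {0: 0.0, 2: 2.0}, mu=0.1))
```

On the path graph the unobserved middle node lands halfway between its neighbors, while the sampled endpoints are pulled slightly toward each other by the smoothness term.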

PROTECT-90: A Fault Dataset for Power System Protection eess.SP

The increasing interest in data-driven methods for power system protection is accompanied by a lack of standardized, publicly available high-voltage waveform datasets that enable transparent and reproducible evaluation. To address this gap, this paper introduces the PROTECT-90 dataset, an open electromagnetic transient (EMT)-simulated reference benchmark for high-voltage fault studies with consistent digital-fault-recorder-like measurements, publicly released with this work. The dataset comprises 9,022 physically consistent short-circuit simulation episodes generated on a standardized 90 kV double-line topology with systematically documented domain randomization of grid operating points, line parameters, and fault conditions. For each episode, synchronized three-phase voltage and current waveforms are recorded at eight measurement locations and released together with structured, machine-readable metadata describing fault type, fault location, inception time, and operating conditions. All modeling assumptions, parameter ranges, and data-generation procedures are explicitly documented to ensure transparency and cross-study comparability. By combining physically grounded EMT simulation, balanced scenario coverage, and open accessibility, PROTECT-90 establishes a standardized foundation for reproducible benchmarking of protection-oriented signal processing and learning-based methods.

Benchmarks

Intelligence Index

Composite score across coding, math, and reasoning

#	Model	Score	tok/s	$/1M
1	Claude Fable 5	59.9	0	$20.00
2	Claude Opus 4.8	55.7	57	$10.00
3	GPT-5.5	54.8	82	$11.25
4	Claude Opus 4.7	53.5	54	$10.00
5	GPT-5.4	51.4	166	$5.63
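One way to read the score and price columns together is index points per dollar. The ratio is an illustrative metric introduced here, not one the index itself reports, and it ignores throughput entirely. The rows are copied from the table above.

```python
# Rows copied from the Intelligence Index table: (model, score, USD per 1M tokens).
INDEX_ROWS = [
    ("Claude Fable 5", 59.9, 20.00),
    ("Claude Opus 4.8", 55.7, 10.00),
    ("GPT-5.5", 54.8, 11.25),
    ("Claude Opus 4.7", 53.5, 10.00),
    ("GPT-5.4", 51.4, 5.63),
]

def score_per_dollar(rows):
    """Return (model, score / price) pairs sorted from best to worst value."""
    ratios = [(model, score / price) for model, score, price in rows]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for model, ratio in score_per_dollar(INDEX_ROWS):
        print(f"{model}: {ratio:.2f} index points per dollar")
```

By this measure the ordering roughly inverts: the lowest-scoring model in the top five offers the most points per dollar, and the top-scoring model the fewest.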

SWE-rebench

Agentic coding on real-world software engineering tasks

#	Provider	Entry	Type	Score
1	OpenAI	gpt-5.5-2026-04-23-xhigh	Model	62.7% ± 0.91%
2	Junie	Junie	Agent	61.6% ± 0.64%
3	OpenAI	Codex	Agent	60.4% ± 1.37%
4	Anthropic	Claude Code	Agent	59.6% ± 1.98%
5	OpenAI	gpt-5.5-2026-04-23-medium	Model	58.9% ± 0.78%
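Because each score carries a ± margin, neighboring ranks may not be distinguishable. The sketch below checks whether two entries' ranges overlap. It assumes the ± value is a symmetric half-width around the score; the table does not say whether it is a standard error or a confidence interval, so treat the result as a rough reading rather than a significance test.

```python
# (entry, score in percent, reported ± margin), copied from the SWE-rebench table.
ENTRIES = [
    ("gpt-5.5-2026-04-23-xhigh", 62.7, 0.91),
    ("Junie", 61.6, 0.64),
    ("Codex", 60.4, 1.37),
    ("Claude Code", 59.6, 1.98),
    ("gpt-5.5-2026-04-23-medium", 58.9, 0.78),
]

def ranges_overlap(a, b):
    """True if [score - margin, score + margin] of a and b intersect."""
    a_low, a_high = a[1] - a[2], a[1] + a[2]
    b_low, b_high = b[1] - b[2], b[1] + b[2]
    return a_low <= b_high and b_low <= a_high

if __name__ == "__main__":
    for upper, lower in zip(ENTRIES, ENTRIES[1:]):
        verdict = "overlap" if ranges_overlap(upper, lower) else "separated"
        print(f"{upper[0]} vs {lower[0]}: {verdict}")
```

Under that assumption every adjacent pair overlaps, while first place and fifth place do not.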

GitHub Repos

Trending

simplex-chat/simplex-chat

14198 ★

SimpleX - the first messaging network operating without user identifiers of any kind - 100% private by design! iOS, Android and desktop apps 📱!

commaai/openpilot

62161 ★

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.

IceWhaleTech/CasaOS

35918 ★

CasaOS - A simple, easy-to-use, elegant open-source Personal Cloud system.

ripienaar/free-for-dev

124301 ★

A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev

google-labs-code/design.md

22588 ★

A format specification for describing a visual identity to coding agents. DESIGN.md gives agents a persistent, structured understanding of a design system.

Daily discovery

qdrant/qdrant (Neural Network)

32711 ★

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

argilla-io/distilabel (RLHF)

3303 ★

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

ongridio/ongrid (Chatbot)

329 ★

An ops AI Agent that understands your infrastructure, finds the root cause, and fixes it — right from Slack, Telegram, Lark or DingTalk.

Extremesarova/ds_resources (Deep Learning)

148 ★

Data Science Resources for interview preparation and learning

yunncheng/MMRL (Multimodal)

116 ★

[CVPR 2025 & IJCV2026] Official PyTorch Code for "MMRL: Multi-Modal Representation Learning for Vision-Language Models" and its extension "MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models".