The Inference Report

June 27, 2026

The US government has moved from theoretical concern about frontier AI to operational control over its deployment, and the effect is immediate fragmentation. Two weeks apart, the Trump administration ordered Anthropic to take Mythos offline for foreign users, then asked OpenAI to delay GPT-5.6's general release. Both complied. No law, no formal process, just government requests that major AI labs treat as directives. OpenAI's public pushback claiming restrictions "shouldn't be the norm" rings hollow when the company is simultaneously releasing GPT-5.6 to a government-vetted subset of users. The administration granted Anthropic permission to distribute its model to over 100 US companies and agencies, which suggests the concern is less about capability leakage and more about who gets to decide who uses what. OpenAI's announcement that it's building Jalapeño, its custom inference chip with Broadcom, looks less like technical independence and more like insurance against future restrictions, a way to own the supply chain when the government controls the distribution chain.

This control mechanism is now forcing a territorial reorganization of AI infrastructure. South Korea is training half a million soldiers as drone operators. China's Tencent is embedding DeepSeek's models into WeCom, its enterprise collaboration tool. Europe is leveraging Trump's protectionist posture to build its own stack, explicitly flagging AWS and Azure as gatekeepers under the Digital Markets Act. Instead of a unified global AI infrastructure centered on US companies and Nvidia chips, nation-states and blocs are building their own stacks, training their own workforces, and using regulatory leverage to carve out protected markets. This isn't competition; it's de facto sanctions wrapped in procurement policy.

The enterprise layer is quietly reorganizing around a parallel logic: that public cloud AI is too expensive and too exposed, and that custom infrastructure beats generic platforms. Enterprises are merging their OLTP and OLAP storage to feed AI agents real-time operational data. Microsoft is turning Windows into an AI operating system, promising unmetered local inference so companies can run models for free on their own hardware. Apple is raising prices up to 25% to cover memory costs, which means AI is now a line item in hardware budgets. The companies building their own chips (OpenAI, Google, Apple, SpaceX) are not trying to compete with Nvidia in general-purpose compute. They're optimizing for inference at scale, which is where the margin and control live. On the benchmarks, the top performers remain locked in place: OpenAI's gpt-5.5-2026-04-23-xhigh model holds first at 62.7% on SWE-rebench, with Claude Fable 5 leading the broader Artificial Analysis Intelligence Index at 59.9. The stability suggests the top agents have reached a plateau, or that evaluation resolution cannot detect sub-point improvements. Meanwhile, the real developer momentum has moved past building agent frameworks to solving production plumbing: converting documents into LLM-parseable formats, giving agents sensory input and execution capability, and bundling tools into prescribed setups that treat AI as a team member with defined roles. This is the economics of vertical integration in a world where government can shut down your API access with a phone call.

Grant Calloway

AI Labs

AI21 Labs

Token spend isn’t going down. You need more than naive routing to manage it

AMD

Anthropic

Anthropic Economic Index report: Cadences (Economic Research)

Google

Accelerating Gemini Nano models on Pixel with frozen Multi-Token Prediction

IBM

IBM, Red Hat, and Deloitte Announce Lightwell Collaboration to Help Strengthen Open Source Software Supply Chain Trust

OpenAI

Previewing GPT-5.6 Sol: a next-generation model

From the Wire

Research Papers: Focused

HiLSVA: Design and Evaluation of a Human-in-the-Loop Agentic System for Scientific Visualization cs.HC

Large language model (LLM) agents enable natural language interaction for scientific visualization (SciVis). Still, prior systems have essentially prioritized autonomy over human analytical control, thereby limiting transparency and human oversight. We present HiLSVA, a human-in-the-loop agentic system that supports mixed-initiative SciVis workflows. HiLSVA integrates a plan-first multi-agent architecture with explicit human oversight, stepwise provenance tracking, and learn-at-test-time adaptation from user feedback. The system supports fluid handoff between humans and agents through both natural language and direct manipulation of visualizations, while sandboxed execution ensures safe, reproducible workflows. In doing so, HiLSVA reframes agentic SciVis as a collaborative process that augments, rather than replaces, human analytical reasoning. We evaluate HiLSVA through representative case studies and a controlled user study with twelve participants of varying expertise across multiple autonomy settings. Results show that mixed-initiative interaction improves task completion, user control, and workflow transparency across different levels of user expertise, while revealing a tradeoff between execution efficiency and human oversight. These findings highlight the importance of human-centered design in agentic SciVis and guide the development of future collaborative visualization systems. We encourage readers to explore our demo video, case studies, and source code at https://hilsva.github.io/.

AI Healthcare Chatbots as Information Infrastructure: A Large-Scale Study of User-Reported Breakdowns cs.HC

AI healthcare chatbots are increasingly used to support health information seeking and self-management, yet their performance and impact on users remain to be studied. This study examines over 15,000 user reviews from 59 AI healthcare chatbot apps to explore how these systems function in everyday informational and emotional contexts. Topic modeling and interpretive analysis identify three recurring breakdowns: access barriers and service unreliability, user experience and interaction quality, and billing and customer support issues. Privacy and security concerns are associated with the most negative experiences. By framing AI healthcare chatbots as information infrastructures, our findings highlight how failures in access, usability, and trust affect users, offering actionable insights for designers, policymakers, and information professionals aiming to improve digital health systems.

Proactive Systems in HCI and AI: Concepts, Challenges, and Opportunities cs.HC

The last few years have seen a significant rise in interest in highly autonomous and proactive systems, fueled by advances in AI: systems that anticipate user needs, take initiative, and act without explicit user input. Such systems span a wide range of applications, from smart lighting that adapts to user activity to assistive robots that plan actions in advance to intelligent thermostats that learn routines and adjust environments proactively. Despite this breadth, the concept of proactivity remains loosely defined and inconsistently applied across research and practice. Current usage of the term often conflates fundamentally different system behaviors. For instance, simple reminders or recommendation systems are frequently labeled as proactive, even though underlying mechanisms and intentions differ significantly. This conceptual ambiguity limits our ability to systematically design, compare, and evaluate proactive systems. Moreover, existing methodologies for design and evaluation are largely rooted in reactive interaction paradigms, failing to address the unique challenges posed by proactive behavior, including timing, appropriateness, user control, transparency, and trust. This multidisciplinary workshop aims to establish a clearer and more rigorous foundation for understanding proactive systems. We bring together researchers and practitioners from Human-Computer Interaction, AI, and related fields to (1) develop a shared conceptualization of proactivity, (2) identify gaps and limitations in current design and evaluation approaches, and (3) co-create human-centered guidelines and research directions for future systems. Through interactive discussions and collaborative activities, the workshop seeks to map key challenges and opportunities, ultimately advancing robust and consistent frameworks for designing and evaluating proactive technologies.

FUTO Swipe: Layout-Agnostic Neural Swipe Decoding cs.HC

Neural swipe decoders are typically tied to the keyboard they were trained on, requiring a new corpus and training run for each layout. In this report, we document our approach toward training models that can function on any contiguous mobile keyboard layout. At each point along the swipe, our encoder predicts whether the user is indicating a character and where on the keyboard that character lies. The keyboard layout is supplied at inference time and used to map the spatial and temporal prediction to a logit at each key, rather than being learned during training. Training neural models requires substantial data, but public swipe data is limited, particularly for non-QWERTY layouts. We release swipe.futo.org, the largest MIT-licensed swipe corpus we are aware of, containing over 1M donated swipes from more than 12k donor sessions. To generalize beyond the English QWERTY layout, we apply geometric augmentations to both the swipe trajectory and the keyboard layout at every training step, forcing the model to make predictions based on characteristics of the swipe gesture rather than the training layout. The model generalizes to layouts absent from training, in some cases more accurately than the layout it was trained on. This combines the layout-flexibility of an algorithmic decoder with the accuracy of a neural model. Trained models are publicly available.
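The idea of supplying the keyboard layout at inference time can be illustrated with a minimal sketch. The abstract does not say how a predicted location is turned into per-key logits, so the distance-based scoring below is an assumption, and the layouts, coordinates, function names, and `temperature` value are all hypothetical. The point it shows is only that the same spatial prediction decodes to different characters once a different layout is passed in, with no retraining.

```python
import math

# Hypothetical layouts: key label -> (x, y) centre in normalised coordinates.
# These coordinates are illustrative and are not taken from the FUTO report.
QWERTY_ROW = {"q": (0.05, 0.2), "w": (0.15, 0.2), "e": (0.25, 0.2)}
AZERTY_ROW = {"a": (0.05, 0.2), "z": (0.15, 0.2), "e": (0.25, 0.2)}


def key_logits(predicted_xy, layout, temperature=0.01):
    """Score each key by negative squared distance to the predicted point.

    Assumption: this scoring rule is a stand-in for whatever mapping the
    real decoder uses; only the "layout as an input" idea is from the source.
    """
    px, py = predicted_xy
    return {
        key: -((px - kx) ** 2 + (py - ky) ** 2) / temperature
        for key, (kx, ky) in layout.items()
    }


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(logits.values())
    exps = {key: math.exp(value - peak) for key, value in logits.items()}
    total = sum(exps.values())
    return {key: value / total for key, value in exps.items()}


def most_likely_key(predicted_xy, layout):
    """Return the key with the highest probability under the given layout."""
    probs = softmax(key_logits(predicted_xy, layout))
    return max(probs, key=probs.get)


# One spatial prediction, decoded against two different layouts.
prediction = (0.06, 0.21)
print(most_likely_key(prediction, QWERTY_ROW))  # q
print(most_likely_key(prediction, AZERTY_ROW))  # a
```

Because the layout is an argument and not a learned parameter, swapping `QWERTY_ROW` for `AZERTY_ROW` changes the decoded character while the predicted location stays fixed.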

Explainable Control Framework (XCF) based on Fuzzy Model-Agnostic Explanation and LLM Agent-Supported Interface cs.HC

Increasing demand for precise and reliable control in complex scenarios has led to the development of increasingly sophisticated controllers, including data-driven approaches employing closed box models and mathematically rigorous yet complex designs. This complexity highlights the need for explainable control that can provide human-understandable insights into controller behavior. In this paper, an explainable control framework (XCF), along with supporting algorithms and a user interface, is proposed to explain how controllers determine their control actions and their underlying working mechanism. The novel contributions of this work are threefold: First, the XCF is designed to provide model-agnostic explanations for controllers in closed-loop systems and can optionally refine local explanations by system response dynamics. Second, a novel explanation method, hierarchical fuzzy model-agnostic explanation for control systems (HFMAE-C), is proposed based on the designed framework. The HFMAE-C employs a fuzzy logic system to approximate the controller's behavior and system dynamics, providing sample-, local-, domain-, and universe-level explanations via IF-THEN rules revealing the controller's decision logic and salience values quantifying the contribution of system states to control actions. Third, a large language model agent-supported user interface is developed to automatically analyze user requirements, select appropriate algorithms, interpret the generated explanations into a natural language report, and provide interactive consultation. Case studies on an inverted pendulum system and Turtlebot obstacle avoidance demonstrate the effectiveness of the proposed method through simulated user experiments and quantitative comparisons with mainstream explainable control approaches.

The impact of generative artificial intelligence on academic development of Chinese students in humanities and social sciences cs.HC

Generative artificial intelligence (GenAI) is reshaping learning in higher education, with particularly pronounced implications for the humanities and social sciences (HSS), where learning outcomes are commonly expressed through written and interpretive forms that align closely with GenAI's capabilities. Yet, systematic evidence on the educational impacts of GenAI on HSS students remains limited. Addressing this gap, this study draws on a large-scale survey of HSS students in China to examine its role in academic development. Guided by relevant learning theories, this study focuses on four dimensions: patterns of use, effects on learning processes and academic performance, challenges associated with GenAI use, and preferred approaches to curricular integration. We found that more than half perceived enhanced learning motivation, independent thinking and creativity, although a substantial minority reported little change or even decline. Comparatively, a notably larger majority reported academic performance gains, although these gains may partly reflect limitations in conventional assessment practices. The study identifies variations in perceived learning and performance improvements among students with differing durations of GenAI experience, along with observable disciplinary differences and modest gender differences. While an overwhelming majority valued the importance of ethical considerations, only slightly more than half were satisfied with privacy protection. Limited accuracy and overreliance emerged as the most pressing concerns reported by students. Students favored partial or optional curricular integration supported by practice-oriented training, and widely recognized GenAI's significance for their future professional development. Grounded in student perspectives, this study offers evidence-based recommendations for the responsible and pedagogically meaningful integration of GenAI.

Benchmarks

Intelligence Index

Composite score across coding, math, and reasoning

#	Model	Score	tok/s	$/1M
1	Claude Fable 5	59.9	0	$20.00
2	Claude Opus 4.8	55.7	60	$10.00
3	GPT-5.5	54.8	83	$11.25
4	Claude Opus 4.7	53.5	57	$10.00
5	GPT-5.4	51.4	163	$5.63

SWE-rebench

Agentic coding on real-world software engineering tasks

#	Vendor	Name	Type	Score
1	OpenAI	gpt-5.5-2026-04-23-xhigh	Model	62.7% ± 0.91%
2	Junie	Junie	Agent	61.6% ± 0.64%
3	OpenAI	Codex	Agent	60.4% ± 1.37%
4	Anthropic	Claude Code	Agent	59.6% ± 1.98%
5	OpenAI	gpt-5.5-2026-04-23-medium	Model	58.9% ± 0.78%
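
The editorial's point about evaluation resolution can be checked roughly against the SWE-rebench scores above. The sketch below assumes each ± value is a symmetric half-width around the score; the table does not say whether it is a standard error or a confidence interval, so treat the result as indicative only. Function names are illustrative.

```python
# Scores and ± margins copied from the SWE-rebench table.
# Assumption: each margin is a symmetric half-width around the score.
entries = [
    ("gpt-5.5-2026-04-23-xhigh", 62.7, 0.91),
    ("Junie", 61.6, 0.64),
    ("Codex", 60.4, 1.37),
    ("Claude Code", 59.6, 1.98),
    ("gpt-5.5-2026-04-23-medium", 58.9, 0.78),
]


def intervals_overlap(a, b):
    """True when the [score - margin, score + margin] ranges intersect."""
    _, score_a, margin_a = a
    _, score_b, margin_b = b
    return abs(score_a - score_b) <= margin_a + margin_b


def adjacent_overlaps(rows):
    """Compare each entry with the one ranked directly below it."""
    return [
        (upper[0], lower[0], intervals_overlap(upper, lower))
        for upper, lower in zip(rows, rows[1:])
    ]


for upper, lower, overlap in adjacent_overlaps(entries):
    verdict = "overlap" if overlap else "separated"
    print(f"{upper} vs {lower}: {verdict}")

# First place against fifth place, for contrast.
print("top vs fifth overlap:", intervals_overlap(entries[0], entries[4]))
```

Under that assumption every adjacent pair overlaps, while first and fifth place do not, which is consistent with a leaderboard whose neighbouring ranks are hard to tell apart.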

GitHub Repos

Trending

simplex-chat/simplex-chat

12994 ★

SimpleX - the first messaging network operating without user identifiers of any kind - 100% private by design! iOS, Android and desktop apps 📱!

google-labs-code/design.md

21770 ★

A format specification for describing a visual identity to coding agents. DESIGN.md gives agents a persistent, structured understanding of a design system.

commaai/openpilot

61895 ★

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.

kunchenguid/no-mistakes

3614 ★

git push no-mistakes

grafana/grafana

75023 ★

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Daily discovery

Comfy-Org/ComfyUI_frontend (Generative AI)

1861 ★

Official front-end implementation of ComfyUI

promptslab/Awesome-Prompt-Engineering (Text-to-Speech)

6104 ★

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

he-yufeng/GitSense (LLM)

101 ★

AI-powered open source contribution finder and repo radar

invergent-ai/surogate (Fine-tuning)

803 ★

Full-Stack Development Platform for Building Reliable Agents

lobehub/lobe-ui (Chatbot)

2064 ★

🍭 Lobe UI - an open-source UI component library for building AIGC web apps