The Inference Report

July 2, 2026

The GitHub trending data reveals a decisive shift: developers are treating AI agents not as experimental features but as production infrastructure. Most high-traction repos fall into three overlapping categories. The first is agent frameworks and orchestration layers (agency-agents, omnigent, herdr), which treat multiple LLM providers and specialized agent personas as composable components rather than monolithic services. The second is infrastructure for making agents practical: sandboxing (CubeSandbox), authentication (logto), token optimization (OmniRoute's compression, saving 15-95% of tokens), and video editing via agents (browser-use/video-use). The third is tooling to extract and prepare data at scale: olmocr for linearizing PDFs into LLM-trainable formats, CVAT for vision annotation, and the exercises dataset. Notably absent from the trending set is any wrapper around a single dominant LLM provider. Instead, repos prioritize abstraction layers that let developers swap models without rewriting code, which suggests developers already treat provider lock-in as a risk to design around rather than an open question.

The discovery repos confirm this pattern while adding texture. TanStack/ai emphasizes type safety and streaming primitives across multiple providers, while SGLang focuses on high-performance serving; both are solving the actual engineering problem of building reliable, testable AI applications rather than demo-ware. ARIS takes a narrower but revealing approach: autonomous ML research through markdown-only skills and cross-model review loops, with no framework lock-in. Project Tapestry and Foundation-Models-Framework-Lab represent a secondary trend: developers building on constrained or regional models (sovereign models, Apple's on-device frameworks) rather than assuming global API access. The practical detail worth noting is that repos tackling token efficiency, sandboxing, and annotation at scale are gaining serious traction alongside the agent frameworks themselves. This suggests the bottleneck has shifted from "can we build an agent" to "can we afford to run it, keep it safe, and feed it good data."

Jack Ridley

Trending
Daily discovery
omnigent-ai/omnigent · LLM
5984

Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, and collaborate in real time from any device.

Haoyu-ha/LNLN · Multimodal
116

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

The-AI-Alliance/tapestry · Federated Learning
170

Project Tapestry aims to give every nation and participant frontier AI they can call their own — uniting a global consortium to train a shared frontier model from which partners build and own sovereign models aligned to their national, socio-cultural, and industrial needs.

sgl-project/sglang · Reinforcement Learning
29894

SGLang is a high-performance serving framework for large language models and multimodal models.

rudrankriyam/Foundation-Models-Framework-Lab · Speech Recognition
1138

A practical lab for building, testing, and evaluating apps with Apple's Foundation Models framework.

wanshuiyin/Auto-claude-code-research-in-sleep · MCP
12915

ARIS ⚔️ (Auto-Research-In-Sleep) — Claude Code skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation via Codex MCP

cvat-ai/cvat · Computer Vision
16205

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

TanStack/ai · Chatbot
2858

🤖 Type-safe, provider-agnostic TypeScript AI SDK for streaming chat, tool calling, agents, and multimodal apps across OpenAI, Anthropic, Gemini, React, Vue, Svelte, and Solid.

sportsdataverse/sportsdataverse-py · Data Science
106

sportsdataverse python package

GoogleCloudPlatform/vertex-ai-samples · MLOps
756

Notebooks, code samples, sample apps, and other resources that demonstrate how to use, develop and manage machine learning and generative AI workflows using Google Cloud Vertex AI.