The Inference Report

May 14, 2026

The GitHub ecosystem is consolidating around two distinct problems: making AI agents practical, and making them controllable. Agent-focused repos dominate trending, but they split into infrastructure and skills. On the infrastructure side, repos like trycua/cua and activepieces/activepieces tackle the actual hard problem: how do you let an AI agent interact with a desktop or trigger workflows without it going rogue or hallucinating its way into production? These aren't frameworks asking you to rewrite your stack. They're sandboxes, benchmarks, and orchestration layers that treat agent control as a first-class concern. The skills layer (K-Dense-AI/scientific-agent-skills, mattpocock/skills, danielmiessler/Personal_AI_Infrastructure) packages domain knowledge as reusable components. This reflects where real development effort is going: not building agents from scratch, but teaching them to do specific work reliably. That's a maturation signal.

Separately, there's a strong current in local, on-device computation. supertone-inc/supertonic runs multilingual text-to-speech natively via ONNX. Blaizzy/mlx-embeddings handles vision and language embeddings on Mac hardware without cloud dependencies. These aren't about performance theater. They address real constraints for production systems: latency, privacy, and cost. Meanwhile, rasbt/LLMs-from-scratch and github/spec-kit sit at opposite ends of the spectrum: one teaches you how LLMs actually work from first principles, the other automates spec-driven development workflows. Both are getting serious traction, suggesting developers want either deep understanding or high-level automation, not much in between. On the agent side, the infrastructure plays are winning because agents are only useful if they don't become liability machines.

Jack Ridley

Trending
Daily discovery
activepieces/activepieces | AI Agents
22180

AI Agents & MCPs & AI Workflow Automation • (~400 MCP servers for AI agents) • AI Automation / AI Agent with MCPs • AI Workflows & AI Agents • MCPs for AI Agents

debpalash/OmniVoice-Studio | Speech Recognition
878

The open-source ElevenLabs alternative for local voice cloning, design, create, dubbing and dictation Desktop App

Blaizzy/mlx-embeddings | RAG
381

MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.

vllm-project/vllm-ascend | LLM
2077

Community maintained hardware plugin for vLLM on Ascend

lightly-ai/lightly-train | Object Detection
1456

All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.

Wakals/CoVT | Multimodal
353

Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"

thu-nics/MoA | Model Compression
156

[CoLM'25] The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>

WZDTHU/TiM | Diffusion Models
146

Transition Models

invoke-ai/InvokeAI | Image Generation
27188

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.

tensorforger/FluxRT | Diffusion Models
195

Real-time stream editing pipeline powered by the FLUX.2-klein-4B model, optimized for consumer GPUs