The Inference Report

March 24, 2026

The trending set splits cleanly into two tribes with opposite philosophies. One group treats agents as orchestrators for existing systems: browser-use automates web tasks by giving LLMs a DOM interface, n8n-mcp lets Claude build workflows in an established automation platform, and deer-flow provides a harness with sandboxes and memory to coordinate multi-step work. These tools assume agents will operate within known boundaries and existing infrastructure. The other tribe treats agents as autonomous executors operating in hostile or unstructured environments: pentagi performs penetration testing without human intervention, TradingAgents coordinates financial decision-making across multiple models, and MoneyPrinterV2 automates online monetization. Both categories are gaining traction, but they're solving different problems. The orchestration tools reduce friction for developers building agent workflows; the autonomous tools are betting that agents can replace human judgment in specific domains.

What's notable is the infrastructure layer emerging beneath both. LocalAI, Ray, and ONNX Runtime aren't agents themselves but run the models agents depend on, and their star counts suggest developers are serious about running inference locally or at scale. The discovery repos reinforce this: screenpipe captures local context for agents to reason about, metorial manages MCP connections at platform scale, and Speech-AI-Forge adds voice as another modality agents can operate through. Meanwhile, the skills and plugin ecosystems around Claude Code and Obsidian indicate that developers are standardizing on agent capabilities as discrete, teachable units rather than monolithic systems. This is pragmatic. It means you can compose agents from parts instead of adopting a complete framework, and you can run them on your own hardware without vendor lock-in.

Jack Ridley

Daily discovery
epfml/disco · Federated Learning
182

DISCO is a code-free and installation-free browser platform that allows any non-technical user to collaboratively train machine learning models without sharing any private data.

screenpipe/screenpipe · Computer Vision
17513

screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, all yours.

hyunwoongko/nanoRLHF · RLHF
174

nanoRLHF: from-scratch journey into how LLMs and RLHF really work.

metorial/metorial-platform · MCP
203

The engine powering hundreds of thousands of MCP connections 🤖 🔥

microsoft/onnxruntime · Deep Learning
19647

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

onnx/neural-compressor · Model Compression
100

Model compression for ONNX

mudler/LocalAI · LLM
44302

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

lenML/Speech-AI-Forge · Text-to-Speech
1387

🍦 Speech-AI-Forge is a project developed around TTS generation models, implementing an API server and a Gradio-based WebUI.

ray-project/ray · Reinforcement Learning
41828

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

apocas/restai · Transformers
481

RESTai is an open-source AIaaS (AI as a Service) platform. Supports many public and local LLMs served through Ollama, vLLM, etc. Precise embeddings usage and tuning. Built-in image generation (Dall-E, SD, Flux) and dynamic loading generators.