The Inference Report

April 5, 2026

The GitHub trends this week show two distinct movements: developers building practical infrastructure for LLM deployment, and a surge in tools that let anyone run capable AI models locally. The infrastructure play is clear across repos like Microsoft's agent-framework and block/goose, which abstract away the complexity of orchestrating multi-step AI workflows. These aren't thin wrappers around API calls; they handle state management, tool execution, and error recovery across different LLM backends. onyx-dot-app/onyx takes a different angle, packaging chat and RAG capabilities into a self-hosted platform that works with any LLM and positioning itself as the open alternative to proprietary AI platforms. What connects these projects is a shared bet that the bottleneck is no longer model quality but infrastructure maturity.

The local-first movement is equally pronounced. MLX-VLM lets you fine-tune vision-language models on Mac hardware without cloud dependencies. AutoRAG focuses on the unglamorous but necessary work of evaluating and optimizing RAG pipelines, which suggests teams are past the honeymoon phase and now tuning production systems. dstack addresses a real pain point: provisioning GPU compute across fragmented hardware ecosystems without vendor lock-in. Meanwhile, siddharthvaddem/openscreen strips the complexity out of demo creation, eliminating the subscription friction that keeps people in proprietary tools. The pattern here isn't about raw capability but about control and cost. Developers are voting with their forks for systems they can run, modify, and own, particularly where cloud pricing or vendor lock-in has gotten in the way. The discovery repos reinforce this: defradb's peer-to-peer database model, AutoRAG's evaluation rigor, and dstack's multi-cloud provisioning all solve operational problems that emerge only after you've deployed something to production.

Jack Ridley

Trending
Daily discovery
sourcenetwork/defradb (Edge AI)
869

DefraDB is a Peer-to-Peer Edge-First Database. It's the core data storage system for the Source Ecosystem, built with IPLD, LibP2P, CRDTs, and Semantic open web properties.

darkdevil3610/100-AI-Machine-learning-Deep-learning-Computer-vision-NLP (Data Science)
147

100+ AI Machine learning Deep learning Computer vision NLP Projects with code

jim60105/docker-whisperX (Speech Recognition)
425

Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)

Marker-Inc-Korea/AutoRAG (AutoML)
4682

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

dkozlov/awesome-knowledge-distillation (Model Compression)
3833

Awesome Knowledge Distillation

AceDataCloud/Nexior (Image Generation)
351

Consumer AI app for chat, image generation, video generation, and music creation powered by Ace Data Cloud APIs.

dstackai/dstack (Fine-tuning)
2080

Control plane for agents and engineers to provision compute and run training and inference across NVIDIA, AMD, TPU, and Tenstorrent GPUs—on clouds, Kubernetes, and bare-metal clusters.

f/prompts.chat (Prompt Engineering)
157484

f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

Comfy-Org/ComfyUI_frontend (Generative AI)
1735

Official front-end implementation of ComfyUI

microsoft/presidio (Transformers)
7490

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.