Stars
🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents
Self-referential self-improving agents that can optimize for any computable task
"OpenSpace: Make Your Agents: Smarter, Low-Cost, Self-Evolving" -- Community: https://open-space.cloud/
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding code links.
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
🚀 The fast, Pythonic way to build MCP servers and clients.
Various templates I use in Obsidian (Dataview, Templater, QuickAdd)
DFlash: Block Diffusion for Flash Speculative Decoding
[ICLR'26] Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
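The speculative-decoding entries above all share one core loop: a cheap draft model proposes several tokens, and the target model verifies them, accepting the longest agreeing prefix. A minimal sketch of that verify loop, using hypothetical toy "models" (greedy functions over token IDs, not real LLMs):

```python
# Toy sketch of the speculative-decoding verify loop. Both "models" are
# hypothetical stand-ins: deterministic rules mapping a prefix to a next token.
def draft_model(prefix):
    # Cheap draft: last token plus one, mod 10 (illustrative only).
    return (prefix[-1] + 1) % 10

def target_model(prefix):
    # "Expensive" target: same rule, except it caps tokens at 5, so the
    # two models disagree once the draft proposes anything above 5.
    return min((prefix[-1] + 1) % 10, 5)

def speculative_step(prefix, gamma=4):
    """Draft gamma tokens, then keep the longest target-verified prefix."""
    drafted, cur = [], list(prefix)
    for _ in range(gamma):
        tok = draft_model(cur)
        drafted.append(tok)
        cur.append(tok)
    accepted, cur = [], list(prefix)
    for tok in drafted:
        verdict = target_model(cur)
        accepted.append(verdict)
        if verdict != tok:   # first disagreement: keep target's token, stop
            break
        cur.append(tok)      # agreement: this token came "for free"
    return accepted

print(speculative_step([0]))  # → [1, 2, 3, 4]: all four drafts verified
print(speculative_step([5]))  # → [5]: draft proposes 6, target rejects
```

When the draft agrees, several tokens are emitted per target-model call; on disagreement the step still makes progress by taking the target's own token.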
WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups over vLLM-optimized baselines.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Persist and reuse KV Cache to speedup your LLM.
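The KV-cache-reuse idea above can be sketched in a few lines: key cached states by a hash of the token prefix, and on a new request look up the longest already-computed prefix so only the tail needs fresh prefill. All names here are hypothetical, and a string stands in for real KV tensors:

```python
import hashlib

# Minimal sketch of prefix KV-cache reuse (hypothetical API; a string
# stands in for the real per-layer KV tensors).
class PrefixKVCache:
    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tokens):
        # Hash the token-ID prefix (toy: IDs assumed < 256).
        return hashlib.sha256(bytes(tokens)).hexdigest()

    def put(self, tokens, state):
        self._store[self._key(tokens)] = state

    def longest_cached_prefix(self, tokens):
        """Return (hit_length, cached_state) for the longest stored prefix."""
        for n in range(len(tokens), 0, -1):
            state = self._store.get(self._key(tokens[:n]))
            if state is not None:
                return n, state
        return 0, None

cache = PrefixKVCache()
cache.put([1, 2, 3], "kv-for-123")
hit_len, state = cache.longest_cached_prefix([1, 2, 3, 4, 5])
print(hit_len, state)  # → 3 kv-for-123: only tokens 4 and 5 need prefill
```

Real systems key on token blocks rather than whole prefixes, but the lookup-longest-prefix idea is the same.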
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in the companion repository "Diffusion-LM-Papers".
The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Chen Chen, Lei Chen, Xianzhi Yu, Wulong Liu, Jianye HAO, Mingx…
System-level intelligent router for Mixture-of-Models across cloud, data center, and edge
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
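The parallel-decoding idea behind the diffusion-LLM entries above can be illustrated with a toy loop: each step commits every masked position whose predicted confidence clears a threshold, instead of one token per step as in autoregressive decoding. The `predict` scorer below is entirely hypothetical, not the paper's method:

```python
# Toy sketch of confidence-thresholded parallel decoding (illustrative only;
# the real Fast-dLLM method also adds KV caching for diffusion LLMs).
def parallel_decode(seq, predict, threshold=0.8):
    """Fill every None slot whose predicted confidence >= threshold,
    repeating until the sequence is complete. Returns (seq, num_steps)."""
    steps = 0
    while None in seq:
        steps += 1
        for i, tok in enumerate(seq):
            if tok is None:
                token, conf = predict(seq, i)
                if conf >= threshold:
                    seq[i] = token
    return seq, steps

def predict(seq, i):
    # Hypothetical scorer: confident on even slots immediately, and on odd
    # slots once an adjacent slot has been filled in.
    neighbor_known = (i > 0 and seq[i - 1] is not None) or \
                     (i + 1 < len(seq) and seq[i + 1] is not None)
    conf = 0.9 if (i % 2 == 0 or neighbor_known) else 0.5
    return i, conf  # toy prediction: slot index as the token

print(parallel_decode([None] * 4, predict))  # → ([0, 1, 2, 3], 1)
```

All four tokens land in a single step here, which is the faster-than-autoregressive effect these projects target; a real model's confidences would spread commits over a few steps.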
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
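H2O's heavy-hitter observation can be sketched as a simple eviction rule: under a fixed cache budget, keep the tokens with the largest accumulated attention scores plus a small window of the most recent tokens. This is an illustrative toy, not the paper's implementation:

```python
# Toy sketch of heavy-hitter KV eviction in the spirit of H2O (illustrative
# scores; real systems accumulate attention weights per token online).
def evict_kv(acc_scores, budget, recent=2):
    """Return sorted indices of tokens to keep: the `recent` newest tokens,
    plus the highest-scoring older tokens up to `budget` total."""
    n = len(acc_scores)
    recent_idx = list(range(max(0, n - recent), n))
    remaining = budget - len(recent_idx)
    older = [i for i in range(n) if i not in recent_idx]
    # Heavy hitters: older tokens with the largest accumulated attention.
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:remaining]
    return sorted(heavy + recent_idx)

scores = [0.9, 0.1, 0.05, 0.8, 0.2, 0.3]
print(evict_kv(scores, budget=4))  # → [0, 3, 4, 5]: heavy hitters + recents
```

Everything outside the returned index set would be dropped from the KV cache, shrinking memory while keeping the tokens that attention actually concentrates on.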
An autonomous agent that conducts deep research on any data using any LLM provider
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration into existing products, with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …
The agent engineering platform. Available in TypeScript!
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Instruct-tune LLaMA on consumer hardware