


Fast state-of-the-art image and video segmentation in portable C/C++

C++ 294 25 Updated Apr 10, 2026

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 721 45 Updated May 7, 2026

Portable C++17 implementation of ACE-Step 1.5 AI Music Generator using GGML. Text + lyrics in, stereo 48kHz MP3 or WAV out. Runs on CPU, CUDA, ROCm, Metal, Vulkan.

C++ 289 46 Updated May 6, 2026

A C++17 single-file header-only wrapper for llama.cpp

C++ 22 Updated May 4, 2026

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

TypeScript 2,050 188 Updated May 6, 2026

A free, open source, and extensible speech-to-text application that works completely offline.

Rust 21,310 1,761 Updated Apr 30, 2026

A cosy home for your LLMs.

Swift 1,250 65 Updated Apr 27, 2026

Audio playback and capture library written in C, in a single source file.

C 6,740 554 Updated May 6, 2026

Local LLM-assisted text completion for Qt Creator.

C++ 58 5 Updated Apr 16, 2026

Simple GUI around whisper.cpp for voice-to-text on Linux

Python 70 14 Updated Apr 22, 2026

Local LLM-assisted text completion for Qt Creator.

C++ 66 11 Updated Apr 16, 2026

MLPerf Client is a benchmark for Windows, Linux and macOS, focusing on client form factors in ML inference scenarios.

C++ 84 6 Updated Apr 20, 2026

Low-latency AI engine for mobile devices & wearables

C 4,715 370 Updated May 7, 2026

Emacs package for LLM-assisted code/text completion

Emacs Lisp 42 2 Updated Nov 12, 2025

Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk

C++ 3,851 289 Updated May 8, 2026

The application performs real-time inference on audio from an ALSA capture device

C++ 39 1 Updated Jun 19, 2025

TTS support with GGML

C++ 240 30 Updated Oct 5, 2025
Python 539 57 Updated Apr 18, 2026

LLM plugin for interacting with llama-server models

Python 30 6 Updated May 28, 2025

Running any GGUF SLMs/LLMs locally, on-device in Android

Kotlin 807 136 Updated Apr 25, 2026

DINOv2 inference engine written in C/C++ using ggml and OpenCV.

C++ 92 7 Updated May 6, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 27,207 1,983 Updated Jan 9, 2026

Real-time webcam demo with SmolVLM and llama.cpp server

HTML 5,549 896 Updated May 12, 2025

📎 Clippy, now with some AI

TypeScript 1,275 68 Updated Nov 15, 2025

Command to convert from color text (ANSI or 256) to image.

Go 257 23 Updated Apr 27, 2026

Simple frontend for LLMs built in react-native.

TypeScript 2,386 206 Updated May 7, 2026

Speech-to-text transcription VST3/ARA plugin

TypeScript 59 7 Updated Apr 13, 2026
Python 18 1 Updated Jul 12, 2025