Stars
JoyMed: A Leading Medical Foundation Model with Adaptive Reasoning
🦞 Production-oriented OpenClaw memory architecture: daily increments, weekly refinement, idempotent deduplication, and watchdog stability protection.
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
Mac Mouse Fix - Make Your $10 Mouse Better Than an Apple Trackpad!
Your all-in-one port for papers, citations, and research insights.
Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning
[CVPR2026 🎉] Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation.
A 2D customized lip-sync model for high-fidelity real-time driving.
WebRTC/RTSP/RTMP/HTTP/HLS/HTTP-FLV/WebSocket-FLV/HTTP-TS/HTTP-fMP4/WebSocket-TS/WebSocket-fMP4/GB28181/SRT/STUN/TURN server and client framework based on C++11
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
Scalable and memory-optimized training of diffusion models
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A feature-rich command-line audio/video downloader
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
The best OSS video generation models, created by Genmo
WebRTC and ORTC implementation for Python using asyncio