Stars
MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest model, MiroThinker-1.7, achieves 74.0 and 75.3 on BrowseComp and BrowseComp Zh, respectively.
[CVPR'26] LEAD: Minimizing Learner–Expert Asymmetry in End-to-End Driving
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
Awesome In-Context RL: A curated list of In-Context Reinforcement Learning
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
A high-throughput and memory-efficient inference and serving engine for LLMs
[REALM25 @ ACL25] - "StateAct" Official Paper Repo (SOTA LLM Agent)
A benchmark for evaluating learning agents based on just language feedback
NVIDIA Linux open GPU kernel module source
Official PyTorch implementation for "Large Language Diffusion Models"
An open-source library for GPU-accelerated robot learning and sim-to-real transfer.
[ICLR 2023 Oral] The official implementation of SQL and EQL in "Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization"
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
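《代码随想录》 (Code Thoughts) LeetCode practice guide: a recommended order for working through 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the hard parts, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so learning algorithms is no longer bewildering! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀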
List of Tech Company OAs. Save your time from finding them all over the internet.
Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
Continual reinforcement learning baselines: experiment specifications, implementation of existing methods, and common metrics. Easily extensible to new methods.
The Research Tree - A playground for research at the intersection of Continual, Reinforcement, and Self-Supervised Learning.
Author's PyTorch implementation of BCQ for continuous and discrete actions
A library for generative social simulation
Official implementation for our paper Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning (NeurIPS 2023)
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry