Why Your Agent Probably Needs Skills
Kiln has added agent skills support! Here's what skills are, why progressive disclosure works, and how to use them in your agents.
Introducing Kiln Prompt Optimizer
Kiln will now automatically find the ideal prompt for your task, guided by your own evals. Performance gains that blow away manual optimization, and even beat fine-tuning.
The Requirements Layer Your AI System is Missing
Most AI teams think they have a testing problem. But the real issue is they aren't defining what "good" actually means. Learn how specs can transform your AI development workflow.
Kiln Eval Builder & AI Assistant for Evals and Synthetic Data
Introducing Kiln Eval Builder — a guided assistant that helps you build accurate evals, synthetic data, and judge prompts in minutes instead of hours.
RAG Isn't One Size Fits All: Here's How to Tune It for Your Use Case
Great RAG comes from a tight iteration loop. Learn how to systematically improve each layer of your RAG system using Kiln and LanceDB.
New in Kiln: RAG Q&A Evals, Tool Use Evals, Reranking, Semantic Chunking, an MCP server, and more!
Our new v0.23.0 app release includes powerful new features for RAG, evals, and more!
Introducing Kiln Agent Builder
Kiln now supports building agentic systems in as little as 5 minutes. Also new is Local RAG: build RAG systems without any third-party services.
New in Kiln: Search Tools/RAG, Document Library, and Synthetic Data V3
Introducing our biggest release ever! Build state-of-the-art RAG systems in minutes, with drag-and-drop!
New in Kiln: Tool Calling and MCP
Kiln v0.20.1 adds tool calling, MCP support, and more!
End to End AI Project with Evals, Synthetic Data, and Fine-Tuning
A video walkthrough of a Kiln project from start to finish. The project we build is open source and available on GitHub.
I wrote 2000 LLM test cases so you don't have to
How thousands of LLM test cases make Kiln AI easier to use.
Kiln Release v0.18.1: Evals V2, Synthetic Data V2, Kiln Issues, New Models, and more!
Our new release is our biggest ever! Read about how we improved evals and synthetic data, and introduced a whole new concept: Kiln Issues.
Many Small Evals Beat One Big Eval, Every Time
Evaluating AI products is notoriously difficult. In this article we walk through how to make it approachable, maintainable, and iterative using many small evals instead of one large eval.
Distillation in Practice: Ablating Gemma 3 27B with synthetic data from Sonnet 4
A hands-on experiment fine-tuning Gemma 3 27B using synthetic data from Sonnet 4, achieving performance that rivals GPT-4o on specific tasks while being smaller and cheaper to run.
When Fine-Tuning Actually Makes Sense: A Developer's Guide
Most teams are curious about fine-tuning and how it might help their AI products, but they don't know what to expect from the process, how to measure success, or where to start.
Kiln v0.16.0 Released
Qwen 3, Gemma 3, Evals Guides, and Fine-Tuning Updates!
Fine-Tuning for Prompt Engineers
Learn how to fine-tune LLMs without coding in under 20 minutes. Turn existing prompts into custom models that are faster, cheaper, and often better than prompting larger models.
Kiln v0.13.1 Released
Vertex, Azure, QwQ, Gemma, Hugging Face, W&B, Anthropic, Import CSV, and More!
Kiln v0.12.1 Released
Powerful Evaluation Toolkit & Support for Distilling Sonnet 3.7 Thinking
Video Guide: Create LLM Evals in under 20 minutes
Easily create your own LLM evals in minutes. Use powerful evaluation methods like LLM-as-Judge and G-eval from Kiln's UI without coding.
Kiln v0.11.1 Released
A new release, faster synthetic data, bug fixes, and Kiln news.
Kiln Hits 1,000 GitHub Stars
That was fast. Thank you for the support!
Fine-Tuning Video Guide: Distill DeepSeek R1 into Llama 8B
Create your own reasoning model in under 30 minutes with Kiln AI.
The Vector Institute for AI Research has Adopted Kiln
The Vector Institute, a research lab founded by Geoff Hinton, has started using Kiln for their AI research.
Guide: Fine Tune 9 LLM Models in 18 Minutes
A step-by-step guide to easily fine-tune from scratch with Kiln. No coding required.