Kiln AI Blog

Notes on building AI that actually works.

Why Your Agent Probably Needs Skills

Mar 18, 2026

Kiln has added agent skills support! Here's what skills are, why progressive disclosure works, and how to use them in your agents.

Introducing Kiln Prompt Optimizer

Feb 23, 2026

Kiln will now automatically find the ideal prompt for your task, guided by your own evals. Performance gains that blow away manual optimization, and even beat fine-tuning.

The Requirements Layer Your AI System is Missing

Feb 9, 2026

Most AI teams think they have a testing problem. But the real issue is they aren't defining what "good" actually means. Learn how specs can transform your AI development workflow.

Kiln Eval Builder & AI Assistant for Evals and Synthetic Data

Feb 4, 2026

Introducing Kiln Eval Builder — a guided assistant that helps you build accurate evals, synthetic data, and judge prompts in minutes instead of hours.

RAG Isn't One Size Fits All: Here's How to Tune It for Your Use Case

Dec 1, 2025

Great RAG comes from a tight iteration loop. Learn how to systematically improve each layer of your RAG system using Kiln and LanceDB.

New in Kiln: RAG Q&A Evals, Tool Use Evals, Reranking, Semantic Chunking, an MCP server, and more!

Nov 14, 2025

Our new v0.23.0 app release includes powerful new features for RAG, evals, and more!

Introducing Kiln Agent Builder

Oct 14, 2025

Kiln now supports building agentic systems in as little as 5 minutes. Also new is Local RAG: build RAG systems without any third-party services.

New in Kiln: Search Tools/RAG, Document Library, and Synthetic Data V3

Sep 23, 2025

Introducing our biggest release ever! Build state-of-the-art RAG systems in minutes, with drag-and-drop!

New in Kiln: Tool Calling and MCP

Sep 9, 2025

Kiln v0.20.1 adds tool calling, MCP support, and more!

End to End AI Project with Evals, Synthetic Data, and Fine-Tuning

Aug 6, 2025

A video walkthrough of a Kiln project from start to finish. The project we build is open source and available on GitHub.

I wrote 2000 LLM test cases so you don't have to

Jul 21, 2025

How thousands of LLM test cases make Kiln AI easier to use.

Kiln Release v0.18.1: Evals V2, Synthetic Data V2, Kiln Issues, New Models, and more!

Jul 17, 2025

Our new release is our biggest ever! Read about how we improved evals, synthetic data, and a whole new concept: Kiln Issues.

Many Small Evals Beat One Big Eval, Every Time

Jun 27, 2025

Evaluating AI products is notoriously difficult. In this article we walk through how to make it approachable, maintainable, and iterative using many small evals instead of one large eval.

Distillation in Practice: Ablating Gemma 3 27B with synthetic data from Sonnet 4

May 30, 2025

A hands-on experiment fine-tuning Gemma 3 27B using synthetic data from Sonnet 4, achieving performance that rivals GPT-4o on specific tasks while being smaller and cheaper to run.

When Fine-Tuning Actually Makes Sense: A Developer's Guide

May 28, 2025

Most teams are curious about fine-tuning and how it might help their AI products, but they don't know what to expect from the process, how to measure success, or where to start.

Kiln v0.16.0 Released

May 17, 2025

Qwen 3, Gemma 3, Evals Guides, and Fine-Tuning Updates!

Fine-Tuning for Prompt Engineers

Mar 21, 2025

Learn how to fine-tune LLMs without coding in under 20 minutes. Turn existing prompts into custom models that are faster, cheaper, and often better than prompting larger models.