We're hiring! Come build with us

Working at Zep

Join a high-agency team taking on unsolved challenges at the frontier of AI.

Our Vision

We're Building AI's Context Engineering Foundation

Context engineering is the art of providing AI with the right information at the right time. For AI agents to deliver personalized, accurate experiences, they need systematic access to user preferences, business data, and temporal relationships beyond static facts.

We're developing the foundational infrastructure that orchestrates context retrieval and assembly from user memory and business data. This context engineering foundation will power the next generation of AI applications that truly understand users and business scenarios.

Backed by Leading Investors

Root Ventures
Y Combinator
Engineering Capital
Team at Zep

Great Healthcare

Platinum medical, dental, and vision insurance

Compensation

Highly competitive salary and equity compensation. 401(k) plan with employer matching. Unlimited PTO.

Flexible WFH Policy

Flexible in-office culture in San Francisco. Remote work options with periodic travel to San Francisco if based outside the Bay Area.

Cell Phone Benefits

Monthly stipend toward your mobile plan.

Open Roles

Open positions from Work at a Startup.


Senior Applied Research Engineer

San Francisco, United States / Remote (US)

Full-time · Engineering · Backend · 6+ years
$180K - $250K
1.00% - 1.50% equity
Amazon Web Services (AWS) · C++ · Go · Python · Rust · Torch/PyTorch · LLMs · US citizen/visa only

Zep is the memory and context layer for AI agents. As a Senior Applied Research Engineer, you'll explore novel approaches to memory, context, and context generation, then own those ideas all the way to production.

This is a research role with a hard applied bent. We're not hiring ML researchers chasing publications. We're hiring engineers who can run rigorous experiments, train and evaluate models, and ship the result as production code our customers depend on.

What you'll do

  • Explore novel approaches to memory, context, and context generation. Define the problem, run the experiments, ship the result.
  • Own research to production end-to-end: dataset creation and curation, experiment design, evaluation, training and finetuning, and production deployment.
  • Train, finetune, and evaluate models on Zep's domain. Build the eval harnesses that catch regressions before they ship.
  • Work with our model serving stack to operate inference at low latency and reasonable cost on AWS.

What we're looking for

  • 6+ years of production engineering with a strong backend systems background. You've shipped services with real throughput and latency requirements.
  • Master's in Computer Science or equivalent.
  • Strong research skills: methodology, dataset creation and curation, experiment design, and evaluation. You can frame an open problem and design experiments that actually answer the question.
  • Hands-on experience with model finetuning. Working familiarity with transformer architectures, training and finetuning workflows, and evaluation. PyTorch and OpenAI Triton for experimentation.
  • Working experience with model serving technologies: vLLM, SGLang, or Triton Inference Server. You've operated inference in production.
  • Python, plus high proficiency in one of Rust, C++, or Go. You can work in critical-path code and on performance. Python-only is not enough.
  • Hands-on AWS experience in production: deployments, monitoring, scaling, cost and reliability tradeoffs.

Nice to have

  • Published or open-source work in retrieval, memory systems, or LLM evaluation.

Tech stack: Python, Rust/C++/Go, PyTorch, vLLM/SGLang, AWS.

This role is probably NOT a fit if:

  • You're an ML researcher or model trainer who hasn't shipped research to production.
  • Your background is primarily Python application work without lower-level systems experience.
  • You haven't operated production backend systems with real latency or throughput requirements.

Interview Process

We respect your time and keep our interview process tight and focused.

Screening Call (w/ Daniel, our Founder) → Team Calls (2-3 hours back-to-back, may include a presentation) → Decision Call (Daniel, again)

Senior AI Engineer

San Francisco, United States / Remote (US)

Full-time · Engineering · Backend · 6+ years
$180K - $250K
1.00% - 1.50% equity
Amazon Web Services (AWS) · Go · Python · TypeScript · LLMs · AI Agents · US citizen/visa only

Zep is the memory and context layer for AI agents. As a Senior AI Engineer, you'll build low-latency backend systems, operate them in production on AWS, and ship LLM-powered capabilities our customers depend on.

You'll have the opportunity to work on Graphiti (25K+ GitHub stars), Zep’s popular open-source context graph framework.

This is a senior backend role centered on running LLM workloads at significant scale. We're not hiring ML researchers or data scientists. We're hiring engineers who have already lived through the messy reality of taking an LLM application from demo to production.

What you'll do

  • Ship product features end-to-end across backend services, APIs, data flows, and the supporting UI where it makes sense.
  • Build and operate LLM-powered systems: extraction pipelines, evaluation harnesses, and reliability improvements running at scale.
  • Contribute to system design for new components. Write the code, document the decisions, iterate.
  • Improve production quality across performance, observability, and operational runbooks on AWS.

What we're looking for

  • 6+ years of production engineering with a strong backend systems background. You've shipped services with real throughput and latency requirements.
  • Master's in Computer Science or equivalent.
  • Go and Python experience in real systems. You can work in critical-path code and on performance.
  • Hands-on AI agent and LLM application experience. You've shipped a non-trivial agentic system to production. Not a prototype, not a thin wrapper over a chat-completion API. We expect concrete examples: multi-turn agent loops with tool calling, retrieval and context pipelines you tuned against real failures, eval harnesses you built to catch regressions, or production memory and state systems for agents.
  • Working familiarity with the agent ecosystem: at least one of LangGraph, Google ADK, Mastra, or other agent SDKs, vector stores, and eval tooling.
  • Extremely comfortable with spec-driven agent coding and coding harnesses, and with guiding agents to build complex products.
  • Hands-on AWS experience in production: deployments, monitoring, scaling, cost and reliability tradeoffs.

Nice to have

  • TypeScript experience for frontend or SDK work.

Tech stack: Go, Python, TypeScript, AWS.

This role is probably NOT a fit if:

  • Your LLM experience is single-turn chat completions or RAG-as-a-feature.
  • Your background is primarily in ML research or model training rather than shipping agent systems in production.
  • You haven't operated production backend systems with real latency or throughput requirements.

Interview Process

We respect your time and keep our interview process tight and focused.

Screening Call (w/ Daniel, our Founder) → Team Calls (2-3 hours back-to-back, may include a presentation) → Decision Call (Daniel, again)

Lead Forward Deployed Engineer

San Francisco, United States / Remote (US)

Full-time · Engineering · Full stack · 6+ years
$175K - $250K
0.50% - 1.50% equity
Python · TypeScript · LLMs · AI Agents · US citizen/visa only

Zep is the memory and context layer for AI agents. As Lead Forward Deployed Engineer, you'll embed with customer engineering teams to integrate Zep into their production agent systems: diagnosing context-quality failures, designing memory architectures around their data, and shipping the integrations that make their agents actually work in the wild.

This is an applied AI engineering role with a customer surface. We're not looking for ML researchers or data scientists. We're looking for engineers who have already lived through the messy reality of taking an agent from demo to production.

What you'll do

  • Own end-to-end delivery for strategic deployments: scope, design, build, rollout, stabilize.
  • Embed with customer engineers to integrate Zep into real systems: data, APIs, auth, infra.
  • Ship production code: integrations, reference implementations, performance and reliability fixes.
  • Help level up the FDE function: coach newer FDEs on execution, review designs and code when useful, and capture repeatable patterns.

What we're looking for

  • 6+ years of production engineering. You can own both architecture and implementation, and you've shipped systems that real customers depend on.
  • Hands-on AI agent / LLM application experience. You've shipped a non-trivial agentic system to production. That is, not a prototype, not a thin wrapper over a chat-completion API. We expect concrete examples: multi-turn agent loops with tool calling, retrieval and context pipelines you tuned against real failures, eval harnesses you built to catch regressions, or production memory and state systems for agents.
  • Working familiarity with the agent ecosystem: at least one of LangChain / LlamaIndex / model-provider SDKs, vector stores (pgvector, Pinecone, Weaviate), and eval tooling (Braintrust, LangSmith, custom harnesses).
  • Experience across diverse customer technology stacks and cloud platforms (AWS or GCP). Proficiency with Docker and networking fundamentals.
  • Fast debugging and strong operational instincts in complex, real-world environments.
  • Leadership through hands-on work; excellent communication for customer sessions and coaching junior engineers.

Tech stack: Python, TypeScript, AWS or GCP, Docker.

This role is probably NOT a fit if:

  • Your LLM experience is single-turn chat completions or RAG-as-a-feature.
  • You're an ML researcher or model trainer looking to move into agents — this role is for engineers already deep in agent production.
  • You haven't worked directly with customers on integration or delivery.

Interview Process

We respect your time and keep our interview process tight and focused.

Screening Call (w/ Daniel, our Founder) → Team Calls (2-3 hours back-to-back, may include a presentation) → Decision Call (Daniel, again)

Our Process

At Zep, we believe in moving quickly when we spot talent.

1. Introductory Video Call: A short call with Daniel, our founder.

2. Team Interview: An opportunity to assess how well you fit into our collaborative, team-focused environment.

3. Final Interview with our CEO: A one-on-one discussion about your role, goals, and potential contributions to Zep's growth.