Jay Kreps
Mountain View, California, United States
12K followers
500+ connections
About
I am a co-founder and CEO at Confluent, a company built around real-time data streams and…
Publications
-
Building a Replicated Logging System with Apache Kafka
Very Large Data Base Endowment Inc. (VLDB Endowment)
Apache Kafka is a scalable publish-subscribe messaging system whose core architecture is a distributed commit log. It was originally built at LinkedIn as its centralized event-pipelining platform for online data integration tasks. Over the past years of developing and operating Kafka, we have extended its log-structured architecture into a replicated logging backbone for much wider application scopes in distributed environments. In this abstract, we discuss our design and engineering experience in replicating Kafka logs for various distributed data-driven systems at LinkedIn, including source-of-truth data storage and stream processing.
Other authors
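For a concrete feel of the replicated-log abstraction the abstract describes, here is a minimal sketch of a client appending to a Kafka log and waiting for the write to be replicated, using the third-party kafka-python client; the broker address and topic name are assumptions for illustration.

```python
# Minimal sketch (assumes the third-party kafka-python client and a
# broker at localhost:9092; the topic name "events" is hypothetical).
from kafka import KafkaProducer

# acks='all' asks the leader to wait until the write reaches all
# in-sync replicas before acknowledging, trading latency for the
# durability a replicated log is meant to provide.
producer = KafkaProducer(bootstrap_servers="localhost:9092", acks="all")

# Appends are ordered per partition; each record gets a monotonically
# increasing offset in that partition's commit log.
future = producer.send("events", b"user-42 viewed profile user-7")
metadata = future.get(timeout=10)  # blocks until the broker acknowledges
print(metadata.topic, metadata.partition, metadata.offset)

producer.flush()
```
-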
Serving Large-scale Batch Computed Data with Project Voldemort
FAST 2012
Project Voldemort is a general-purpose distributed storage and serving system inspired by Amazon's Dynamo. We present a novel pipeline for computing, deploying, and serving massive read-only data sets that we have integrated into Voldemort. This pipeline builds on the inherent fault tolerance and horizontal scalability of the Dynamo architecture to solve a common problem: performing massive data loads into an online system without impacting serving performance. The data generation is done offline using Hadoop, and our system effectively bridges the gap between batch-oriented clusters and real-time serving systems. As a production system at LinkedIn, this has helped us rapidly build out various data-intensive social products that are computed offline, and then publish the multi-TB result data to live production throughout the day.
Other authors
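The core pattern the paper describes, computing immutable data offline and then swapping it into the serving path without disturbing reads, can be sketched in plain Python. The ReadOnlyStore class, file layout, and paths below are hypothetical illustrations, not Voldemort's actual on-disk format.

```python
# Conceptual sketch of the batch-build-then-swap pattern; all names
# and layouts here are hypothetical, not Voldemort's real format.
import os

class ReadOnlyStore:
    def __init__(self, data_dir: str):
        self.data_dir = data_dir
        self.current = os.path.join(data_dir, "current")  # symlink to live version

    def build_offline(self, version: str, records: dict) -> str:
        """Write an immutable, pre-sorted data set. In production this
        step would run as a Hadoop job, not on the serving node."""
        path = os.path.join(self.data_dir, version)
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "data"), "w") as f:
            for key in sorted(records):
                f.write(f"{key}\t{records[key]}\n")
        return path

    def swap(self, new_version_path: str) -> None:
        """Atomically point serving at the new version; readers never
        see a half-loaded store, so serving latency is unaffected."""
        tmp = self.current + ".tmp"
        os.symlink(new_version_path, tmp)
        os.replace(tmp, self.current)  # atomic rename on POSIX

store = ReadOnlyStore("/tmp/voldemort-demo")
v1 = store.build_offline("version-1", {"alice": "profile-a", "bob": "profile-b"})
store.swap(v1)
```
-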
Kafka: A Distributed Messaging System for Log Processing
NetDB 2011
Log processing has become a critical component of the data pipeline for consumer internet companies. We introduce Kafka, a distributed messaging system that we developed for collecting and delivering high volumes of log data with low latency. Our system incorporates ideas from existing log aggregators and messaging systems, and is suitable for both offline and online message consumption. We made quite a few unconventional yet practical design choices in Kafka to make our system efficient and scalable. Our experimental results show that Kafka has superior performance when compared to two popular messaging systems. We have been using Kafka in production for some time and it is processing hundreds of gigabytes of new data each day.
Other authors
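One design choice the abstract alludes to, consumers pulling from the log and tracking their own offsets so that offline and online readers can coexist, might look roughly like this with the kafka-python client; the topic and group names are assumptions for illustration.

```python
# Minimal sketch of a pull-based consumer (assumes kafka-python and a
# broker at localhost:9092; topic and group names are hypothetical).
from kafka import KafkaConsumer

# Each consumer group tracks its own position in the log, so an
# offline batch job and an online consumer can read the same topic
# independently, each at its own pace.
consumer = KafkaConsumer(
    "app-logs",
    bootstrap_servers="localhost:9092",
    group_id="metrics-aggregator",
    auto_offset_reset="earliest",  # replay from the start of the retained log
)

for record in consumer:
    # Offsets make consumption restartable: a consumer that crashes can
    # resume from its last committed position instead of losing data.
    print(record.partition, record.offset, record.value)
```
-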
Aerial LiDAR Data Classification Using Support Vector Machines
Third International Symposium on 3D Data Processing, Visualization, and Transmission
We classify 3D aerial LiDAR scattered height data into buildings, trees, roads, and grass using the support vector machine (SVM) algorithm. To do so we use five features: height, height variation, normal variation, LiDAR return intensity, and image intensity. We also use only LiDAR-derived features to organize the data into three classes (the road and grass classes are merged). We have implemented and experimented with several variations of the SVM algorithm with soft-margin classification to allow for the noise in the data. We have applied our results to classify aerial LiDAR data collected over approximately 8 square miles. We visualize the classification results along with the associated confidence using a variation of the SVM algorithm that produces probabilistic classifications. We observe that the results are stable and robust. We compare the results against the ground truth and obtain higher than 90% accuracy and convincing visual results.
Other authors
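A rough modern equivalent of the soft-margin, probability-producing SVM setup described above can be sketched with scikit-learn; the data here is synthetic, standing in for the paper's five features.

```python
# Sketch of a soft-margin SVM with probabilistic outputs, in the
# spirit of the paper, using scikit-learn on synthetic data. The five
# feature columns stand in for height, height variation, normal
# variation, LiDAR return intensity, and image intensity.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # synthetic 5-feature samples
y = rng.integers(0, 4, size=200)     # classes: building/tree/road/grass

# C controls the soft margin (a lower C tolerates more noisy points);
# probability=True fits a calibration step so predict_proba reports a
# per-class confidence, which the paper visualizes alongside labels.
clf = SVC(kernel="rbf", C=1.0, probability=True)
clf.fit(X, y)

labels = clf.predict(X[:3])
confidence = clf.predict_proba(X[:3])  # one probability per class
print(labels, confidence.round(2))
```
-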
Avatara: OLAP for Web-scale Analytics Products
VLDB 2012 - International Conference on Very Large Databases
Projects
-
Apache Incubator Samza
Samza provides a system for processing stream data from publish-subscribe systems such as Apache Kafka. The developer writes a stream processing task, and executes it as a Samza job. Samza then routes messages between stream processing tasks and the publish-subscribe systems that the messages are addressed to.
Other creators
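Conceptually, a Samza task is a callback invoked once per message, with the framework routing output back to pub-sub topics. The Python sketch below is only an analogue of that model (Samza's actual API is Java), and every name in it is illustrative.

```python
# Rough Python analogue of the Samza processing model described above:
# a per-message task callback, with output routed back to pub-sub
# topics. Samza's real API is Java; all names here are illustrative.
from collections import Counter

class PageViewCounterTask:
    """A stream task: consumes page-view events, emits running counts."""
    def __init__(self):
        self.counts = Counter()

    def process(self, message: dict, send) -> None:
        # The framework calls process() once per input message and
        # handles partitioning, checkpointing, and topic routing.
        self.counts[message["page"]] += 1
        send("page-view-counts",
             {"page": message["page"], "count": self.counts[message["page"]]})

# A toy "job runner" standing in for Samza plus Kafka:
def run(task, input_stream, outputs):
    for msg in input_stream:
        task.process(msg, lambda topic, m: outputs.append((topic, m)))

out = []
run(PageViewCounterTask(), [{"page": "/home"}, {"page": "/home"}], out)
print(out)  # counts for "/home" rise from 1 to 2 across the two events
```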
Organizations
-
Apache Software Foundation
Member
- Present
Recommendations received
1 person has recommended Jay