Benjamin Wagner

San Francisco Bay Area
3K followers 500+ connections

Firebolt

Technical University Munich

Activity

We contributed to Apache Arrow making Parquet bloom filters smaller and more effective. Full write up by Adrian Garcia Badaracco on what it means…

Liked by Benjamin Wagner
Come find us at Iceberg Summit! Coolest looking team out here 🔥🔥 Win a PS5!!

Liked by Benjamin Wagner
🚨 #icebergSummit Speaker Spotlight!🚨 What happens when #apacheIceberg meets trillions of rows and petabytes of data? Spoiler: things start…

Liked by Benjamin Wagner

Experience

Firebolt

San Francisco Bay Area

San Francisco Bay Area

Munich, Bavaria, Germany

Munich, Bavaria, Germany

Munich, Bavaria, Germany

Munich, Bavaria, Germany

Berlin Area, Germany

Education

Technical University Munich

2020 - 2023

Master’s Thesis: “Incremental Fusion: Unifying Compiled and Vectorized Query Execution”
2017 - 2020
2017 - 2019

Publications

Assembling a Query Engine From Spare Parts

CDMS @ VLDB '22, September 9, 2022, Sydney, Australia
Building a new cloud data warehouse is a daunting challenge, requiring massive investments into both the query engine and surrounding cloud infrastructure. Given the mature space, it might seem like a Herculean task to enter the market as a small startup.
At Firebolt we assembled a working, high-performance cloud data warehouse in less than 18 months. We achieved this by building our query engine on top of existing projects and then investing heavily into differentiating features. This paper presents our decision-making and learned lessons along the way.

Incremental Fusion: Unifying Compiled and Vectorized Query Execution

ICDE'24, May 13-17, 2024, Utrecht, Netherlands
Modern high-performance analytical query engines follow one of two execution paradigms. Vectorized engines implement an interpreter for relational algebra operators that operates on batches of tuples to maximize performance. Compiling engines, on the other hand, generate optimized and specialized code for every query. This paper unifies these two approaches. We present Incremental Fusion, a novel execution paradigm for modern, high-performance query engines. An Incremental Fusion engine performs operator-fusing code generation – with a twist: The compiling engine generates its own vectorized interpreter. The engine uses a finite set of building blocks below relational algebra for code generation. It can enumerate each building block and generate a vectorized primitive for it. The vectorized interpreter becomes a free byproduct of carefully choosing the right abstraction for code generation. This allows an Incremental Fusion engine to dynamically switch between vectorized interpretation and operator-fusing code generation. We demonstrate Incremental Fusion in our open-source prototype engine InkFuse. We measure InkFuse against the state-of-the-art vectorized and compiling engines DuckDB and Umbra. InkFuse is able to achieve competitive performance both for low-latency processing, and compute-intensive long-running queries.
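
The idea in the abstract above can be illustrated with a toy sketch. It is not InkFuse's code: the building blocks in `BLOCKS`, the function names, and the use of Python source strings with `exec` are all assumptions made for illustration. The same finite set of blocks is used twice, once to generate a vectorized primitive per block (which together form an interpreter) and once to generate a single fused function for a whole pipeline.

```python
# Hypothetical building blocks: name -> expression template over a scalar `x`.
# A real engine would use far richer blocks; these are illustrative only.
BLOCKS = {
    "add_one": "({x} + 1)",
    "double": "({x} * 2)",
    "negate": "(-{x})",
}


def _compile(source, fn_name):
    """Turn generated source text into a callable (stand-in for a real compiler)."""
    namespace = {}
    exec(source, namespace)
    return namespace[fn_name]


def generate_primitive(block):
    """Generate a vectorized primitive: one block applied to a whole batch."""
    expr = BLOCKS[block].format(x="v")
    source = f"def primitive(batch):\n    return [{expr} for v in batch]\n"
    return _compile(source, "primitive")


# The interpreter falls out of enumerating every building block once.
PRIMITIVES = {name: generate_primitive(name) for name in BLOCKS}


def run_vectorized(pipeline, batch):
    """Interpret the pipeline block by block, materializing each intermediate batch."""
    for block in pipeline:
        batch = PRIMITIVES[block](batch)
    return batch


def generate_fused(pipeline):
    """Generate one function that applies all blocks per tuple, with no intermediates."""
    expr = "v"
    for block in pipeline:
        expr = BLOCKS[block].format(x=expr)
    source = f"def fused(batch):\n    return [{expr} for v in batch]\n"
    return _compile(source, "fused")


def execute(pipeline, batch, fused=None):
    """Use fused code if it has been generated, otherwise fall back to the interpreter."""
    if fused is not None:
        return fused(batch)
    return run_vectorized(pipeline, batch)
```

In this sketch, `execute(["add_one", "double"], batch)` can start immediately through the interpreter, and later calls can pass the result of `generate_fused(...)` once it is ready. Both paths compute the same result because they are derived from the same block definitions.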

Self-Tuning Query Scheduling for Analytical Workloads

SIGMOD ’21, June 20–25, 2021, Virtual Event, China
Most database systems delegate scheduling decisions to the operating system. While such an approach simplifies the overall database design, it also entails problems. Adaptive resource allocation becomes hard in the face of concurrent queries. Furthermore, incorporating domain knowledge to improve query scheduling is difficult. To mitigate these problems, many modern systems employ forms of task-based parallelism. The execution of a single query is broken up into small, independent chunks of work (tasks). Now, fine-grained scheduling decisions based on these tasks are the responsibility of the database system. Despite being commonplace, little work has focused on the opportunities arising from this execution model.

In this paper, we show how task-based scheduling in database systems opens up new areas for optimization. We present a novel lock-free, self-tuning stride scheduler that optimizes query latencies for analytical workloads. By adaptively managing query priorities and task granularity, we provide high scheduling elasticity. By incorporating domain knowledge into the scheduling decisions, our system is able to cope with workloads that other systems struggle with. Even at high load, we retain near-optimal latencies for short-running queries. Compared to traditional database systems, our design often improves tail latencies by more than 10x.
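
For readers unfamiliar with stride scheduling, the following is a minimal single-threaded sketch of the classic algorithm applied to query tasks. It is not the paper's scheduler: the lock-free and self-tuning aspects are not modeled, and the function name `stride_schedule`, the constant `STRIDE_SCALE`, and the input format are assumptions chosen for illustration. Each query gets a stride inversely proportional to its priority, and the scheduler always runs one task from the query with the smallest pass value.

```python
import heapq

# Large constant so integer strides keep enough precision (illustrative value).
STRIDE_SCALE = 1 << 20


def stride_schedule(queries):
    """Return the order in which tasks run under classic stride scheduling.

    `queries` maps a query name to (priority, num_tasks). A higher priority
    means a smaller stride and therefore a larger share of executed tasks.
    """
    heap = []
    for index, (name, (priority, num_tasks)) in enumerate(queries.items()):
        if priority <= 0:
            raise ValueError("priority must be positive")
        if num_tasks > 0:
            stride = STRIDE_SCALE // priority
            # Tuple layout: (pass value, tie-breaker, name, stride, tasks left).
            heap.append((0, index, name, stride, num_tasks))
    heapq.heapify(heap)

    executed = []
    while heap:
        pass_value, index, name, stride, remaining = heapq.heappop(heap)
        executed.append(name)  # run one task of the query with the lowest pass
        if remaining > 1:
            # Advance the query's pass by its stride and requeue it.
            heapq.heappush(
                heap, (pass_value + stride, index, name, stride, remaining - 1)
            )
    return executed
```

With two queries of 30 tasks each and priorities 3 and 1, the higher-priority query receives roughly three of every four task slots until it finishes, after which the other query runs alone.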

Languages

German: Native or bilingual proficiency
English: Professional working proficiency
French: Elementary proficiency


