Tomas Talius

Sammamish, Washington, United States
7K followers 500+ connections

About

Head of Engineering for Google BigQuery.

Experience

  • Google

    Kirkland, Washington, United States

  • -

    Redmond, Washington, United States

  • -

    Redmond, WA, USA

  • -

    Redmond, WA

  • -

    Redmond, WA

  • -

    Redmond, WA

  • -

    Redmond, WA

  • -

    Vilnius, Lithuania

Publications

  • Hyperspace: The Indexing Subsystem of Azure Synapse

    Microsoft recently introduced Azure Synapse Analytics, which offers an integrated experience across data ingestion, storage, and querying in Apache Spark and T-SQL over data in the lake, including files and warehouse tables. In this paper, we present our experiences with designing and implementing Hyperspace, the indexing subsystem underlying Synapse. Hyperspace enables users to build multiple types of secondary indexes on their data, maintain them through a multi-user concurrency model, and leverage them automatically, without any change to their application code, for query/workload acceleration. Many requirements of Hyperspace are based on feedback from several enterprise customers. We present the details of Hyperspace’s underlying design, the user-facing APIs, its concurrency control protocol for index access, its index-aware query processing techniques, and its maintenance mechanisms for handling index updates. Evaluations over standard industry benchmarks and real customer workloads show that Hyperspace can accelerate query execution by up to 10x and, in certain real-world workloads, even up to two orders of magnitude.

  • Transaction Log Based Application Error Recovery and Point In-Time Query.

    VLDB

    Database backups have traditionally been used as the primary mechanism to recover from hardware and user errors. High availability solutions maintain redundant copies of data that can be used to recover from most failures except user or application errors. Database backups are neither space nor time efficient for recovering from user errors, which typically occur in the recent past and affect a small portion of the database. Moreover, periodic full backups impact user workload and increase storage costs. In this paper we present a scheme that can be used for both user and application error recovery, starting from the current state and rewinding the database back in time using the transaction log. While we provide a consistent view of the entire database as of a point in time in the past, the actual prior versions are produced only for data that is accessed. We make the as-of data accessible to arbitrary point-in-time queries by integrating with the database snapshot feature in Microsoft SQL Server.

  • Adapting Microsoft SQL Server for cloud computing

    International Conference on Data Engineering - ICDE

    Cloud SQL Server is a relational database system designed to scale out to cloud computing workloads. It uses Microsoft SQL Server as its core. To scale out, it uses a partitioned database on a shared-nothing system architecture. Transactions are constrained to execute on one partition, to avoid the need for two-phase commit. The database is replicated for high availability using a custom primary-copy replication scheme. It currently serves as the storage engine for Microsoft's Exchange Hosted Archive and SQL Azure.

