The Dutch Seminar on Data Systems Design | 2nd Seminar

The Dutch Seminar on Data Systems Design (DSDSD) is an initiative to bring together research groups working on data systems in Dutch universities and research institutes. The seminar holds bi-weekly talks on Fridays from 4:00 PM to 5:00 PM CET, for and by researchers and practitioners designing (and implementing) data systems.

The 2nd seminar of DSDSD will feature talks by Andy Pavlo and Peter Boncz.

Talk 1

Title: OtterTune: An Automatic Database Configuration Tuning Service

Speaker: Andy Pavlo (Carnegie Mellon University)

Abstract: Database management systems (DBMSs) expose dozens of configurable knobs that control their runtime behavior. Setting these knobs correctly for an application’s workload can improve the performance and efficiency of the DBMS. But such tuning requires considerable effort from experienced administrators, which does not scale to large DBMS fleets. This problem has led to research on using machine learning (ML) to devise strategies that automatically optimize DBMS knobs for any application.

In this talk, he will present an overview of OtterTune and discuss the challenges one must overcome to deploy an ML-based service for DBMSs. He will also highlight the insights gained from real-world installations of OtterTune.

Bio: Andy Pavlo is an Associate Professor of Databaseology in the Computer Science Department at Carnegie Mellon University. He is also the co-founder of OtterTune.

Talk 2

Title: Three techniques for exploiting string compression in data systems

Speaker: Peter Boncz (CWI & VU)

Abstract: Data in real-life databases is often in the form of strings. Strings take up significantly more volume than fixed-size data, causing I/O, network traffic, memory traffic and cache traffic. Further, operations on strings tend to be significantly more expensive than operations on, e.g., integers, which CPUs support quite efficiently (let alone GPUs and TPUs, which do not even acknowledge the existence of string data). Despite this, most academic algorithms and benchmarks focus on operations on fixed-size data. As such, string processing in data systems deserves more attention and can have significant impact on practice.

In this talk, he will discuss three techniques that can significantly improve the performance of handling large volumes of string data, and explain how integrating these techniques affects data systems design.

Bio: Peter Boncz holds appointments as tenured researcher at CWI and professor at VU University Amsterdam. His academic background is in core database architecture, with MonetDB the systems outcome of his PhD; MonetDB much later won the 2016 ACM SIGMOD Systems Award. He has a track record in bridging the gap between academia and commercial application, receiving the Dutch ICT Regie Award 2006 for his role in the CWI spin-off company Data Distilleries. In 2008 he co-founded Vectorwise around the analytical database system of the same name, which pioneered vectorized query execution. He is co-recipient of the 2009 VLDB 10 Years Best Paper Award, and in 2013 received the Humboldt Research Award for his research on database architecture. He also works on graph data management, and in 2013 founded the Linked Data Benchmark Council (LDBC), a benchmarking organization for graph database systems.

Read more about the event and how to join on the DSDSD website.