We are pleased to announce this year's joint EDBT/ICDT keynote speakers: Rasmus Pagh, Graham Cormode, Christoph Koch and Wolfgang Lehner.


Large-scale similarity joins with guarantees [slides]

Rasmus Pagh
IT University of Copenhagen, Denmark

Abstract. The ability to handle noisy or imprecise data is becoming increasingly important in computing. In the database community the notion of similarity join has been studied extensively, yet existing solutions have offered weak performance guarantees. Either they are based on deterministic filtering techniques that often, but not always, succeed in reducing computational costs, or they are based on randomized techniques that have improved guarantees on computational cost but come with a probability of not returning the correct result.

The aim of this talk is to give an overview of randomized techniques for high-dimensional similarity search, and to discuss recent advances towards making these techniques more widely applicable by eliminating the probability of error and improving the locality of data access.

Bio. Rasmus Pagh graduated from Aarhus University in 2002, and is now a full professor at the IT University of Copenhagen. His work is centered around efficient algorithms for big data, with an emphasis on randomized techniques. His publications span theoretical computer science, databases, information retrieval, knowledge discovery, and parallel computing. His most well-known work is the cuckoo hashing algorithm (2001), which has led to new developments in several fields. In 2014 he received the best paper award at the WWW Conference for a paper with Pham and Mitzenmacher on similarity estimation, and started a 5-year research project funded by the European Research Council on scalable similarity search.


The confounding problem of private data release [slides]

Graham Cormode
University of Warwick, UK

Abstract. The demands to make data available are growing ever louder, including open data initiatives and "data monetization". But the problem of doing so without disclosing confidential information is a subtle and difficult one. Is "private data release" an oxymoron? This talk will delve into the motivations of data release, explore the challenges, and outline some of the current statistical approaches developed in response to this confounding problem.

Bio. Graham Cormode is a Professor in Computer Science at the University of Warwick in the UK, where he works on research topics in data management, privacy and big data analysis. Previously, he was a principal member of technical staff at AT&T Labs-Research. His work has been recognized by two best paper awards, a ten-year test-of-time award, and a Royal Society Wolfson Research Merit Award. He is an associate editor for ACM Transactions on Database Systems, and formerly for IEEE Transactions on Knowledge and Data Engineering.


Abstraction without regret in database systems [slides]

Christoph Koch
EPFL, Switzerland

Abstract. It has been said that all problems in computer science can be solved by adding another level of indirection, except for performance problems, which are solved by removing levels of indirection. Compilers are our tools for removing levels of indirection automatically. However, we do not trust them when it comes to systems building. Performance-critical systems such as database systems are built in low-level programming languages such as C. Some of the downsides of this compared to using modern high-level programming languages are very well known: utterly buggy systems, poor programmer productivity, a talent bottleneck, and cruelty to programming language researchers.

I want to add bad performance to this list, and propose the following heretical thesis. Modern compiler technology can be competitive with, and even outperform, human experts at low-level systems programming. Performance-critical software systems are a limited-enough domain for us to encode systems programming skills as compiler optimizations, allowing us to develop such systems in high-level programming languages without a performance penalty. In a large system, a human expert's occasional stroke of creativity producing an original and very specific coding trick is outweighed by a compiler's superior stamina. I will present a preliminary study of this thesis in the domain of database systems, both of the OLAP and OLTP flavors. Along the way, I will show what state-of-the-art research in domain-specific languages, optimizing compilers, and program synthesis can do for database researchers.

Bio. Christoph Koch is a professor of computer science at EPFL and a PI of NCCR MARVEL, a Swiss national research center for computational materials science. His research focus is on data management and data analysis. Until 2010, he was an associate professor of computer science at Cornell University. Prior to this, he obtained his PhD in Artificial Intelligence from TU Vienna and CERN (2001) and held faculty positions at TU Vienna and Saarland University. He has won best paper awards at PODS 2002, ICALP 2005, SIGMOD 2011, and VLDB 2014, an Outrageous Ideas and Vision Paper Award at CIDR 2013, a Google Research Award (in 2009), and an ERC Grant (in 2011). He (co-)chaired the program committees of DBPL 2005, WebDB 2008, ICDE 2011, and VLDB 2013, and was PC vice-chair of ICDE 2008 and ICDE 2009. He has served on the editorial board of ACM Transactions on Internet Technology, as Editor-in-Chief of PVLDB, and on numerous program committees.


Next-Generation Hardware for Data Management – more a Blessing than a Curse? [slides]

Wolfgang Lehner
Technische Universität Dresden, Germany

Abstract. Recent hardware developments have touched almost all components of a computing system: the existence of many and potentially heterogeneous cores, the availability of volatile and non-volatile main memories with an ever-growing capacity, and the emergence of economically affordable, high-speed/low-latency interconnects are only a few prominent examples. Every single development, as well as their combination, has a massive impact on the design of modern computing systems. However, it is still an open question whether, how, and at what level of detail a database system has to be explicitly aware of those developments and exploit them using specifically designed algorithms and data structures. In this talk I will try to give an answer to this question and argue for a clear roadmap of HW/SW-DB-CoDesign, in particular providing an outlook on upcoming technologies and a discussion of their non-functional properties such as energy efficiency and resilience behavior.

Bio. Wolfgang Lehner is a full professor and head of the database technology group at TU Dresden, Germany. His research is dedicated to database system architecture, specifically looking at crosscutting aspects from algorithms down to hardware-related aspects in main-memory centric settings. He is part of TU Dresden's excellence cluster with research topics in energy-aware scheduling, resilient data structures on unreliable hardware, and orchestration of wildly heterogeneous systems. He is also a principal investigator of Germany's national "Competence Center for Scalable Data Services and Solutions" (ScaDS) and maintains a close research relationship with the SAP HANA development team. He serves the community on many PCs, is an elected member of the VLDB Endowment, serves on the review board of the German Research Foundation (DFG), and is an appointed member of the Academy of Europe.