Winners of the ADS Thesis Awards 2022
The ADS Thesis Awards aim to promote excellence in Data Science and AI among students at the Bachelor and Master level in all Amsterdam-based knowledge institutes. The goals of the awards are:
- Reward and champion high-quality thesis work.
- Promote women and underrepresented minorities and encourage them to continue their education.
- Encourage diversity in Data Science and AI research.
- Advance Amsterdam and the ADS network as an innovation hub by showcasing excellent theses.
The Selection Committee was incredibly impressed by the number and quality of the nominations. All submissions were judged on how they advance Data Science and/or AI through:
- Innovative scientific and technical contributions;
- High societal and economic impact from the findings and results of the thesis;
- Promotion of open science via FAIR data principles, and the availability of high-quality open-source code, results, traces, frameworks, etc.
A summary of each of the winning theses can be found below along with a link to the full thesis.
BSc Thesis Awards
- Philipp Sommerhalter (student) and Alexandru Iosup (supervisor). Labels, Cards, and Simulation-Based Analysis for Energy Efficiency and Sustainability in Data Centers (VU Amsterdam)
In the current context of climate change and political turmoil, this thesis addresses the lack of proper energy efficiency metrics characterizing data centers, which account for up to 2% of global energy consumption. Philipp performed a short survey of candidate metrics and built an instrument to produce and report metrics. His key conceptual contribution is the design of a “zoomable” energy card to be used in combination with the detailed OpenDC simulator for ICT operations. The grid analysis tool delivers unique insights into data center sustainability. The reviewers were extremely impressed with the maturity and depth of the research work, especially at such an early academic stage: reviewing the literature and acquiring, processing, and interpreting information on current practices. The research method is well-structured and appropriately used. The proposed reporting method is intuitively convincing, matching the expected breadth of a BSc project. The writing goes above and beyond expectations, reaching grant-level quality. In terms of societal impact, the reviewers were particularly impressed that Philipp had already taken the initial steps toward central stakeholders (SURF, EGI). Because the method includes many stakeholder types to answer one of the research questions, the results of the work have a high potential to translate into impactful practical tools. The reviewers considered that this thesis sets an exemplary case for BSc theses in terms of promoting open science. The solution is available at https://github.com/philippsommer27/opendc-eesr. The data and results of the experiment can be found at https://github.com/philippsommer27/experiments-bsc-thesis-2022.
- Sarah Kwakkelaar (student) and Jan-Christoph Kalo (supervisor). Investigating Learned Stereotyping Biases Within Multilingual BERT (VU Amsterdam)
In her thesis, Sarah explores societal biases in language model behaviours, with a clear and novel contribution: comparing models trained on English only with models trained on multiple languages. Many studies use monolingual models, although multilingual models may manifest bias differently. In this research, Sarah compared BERT to Multilingual-BERT. The results uncover a promising direction for limiting the biases of language models. Namely, the research suggests that multilingual models are less impacted by societal biases, which is an interesting starting point for further research. The reviewers consider the thesis to be outstanding for the novelty and relevance of its contributions. It opens new directions for both research (e.g., exploring the impact of languages on bias) and practice (e.g., preferring multilingual models as the default choice). The research presented in the thesis has high societal relevance since it targets a technology that has been repeatedly identified as critical for our society. Language models are widely used (e.g., search, translation, recommendation) in applications that can have deep impacts on personal and professional lives (e.g., job or news recommendations). The thesis suggests a promising approach to limit the biases language models may introduce, while remaining relatively easy to implement in practice (e.g., prefer multilingual models when available). As with any new research, much future work is needed to validate this approach and scope its applicability, but if successful the practical impacts would be considerable. The complete materials are published in a public repository, and the instructions for reproducibility are extensive and very detailed, making this a great example of open science: https://zenodo.org/record/7108081
MSc Thesis Awards
- Julia Sudnik (student), Laura Hollink and Jacco van Ossenbruggen (supervisors). The Effect of Feedback Loops on Fairness in Rankings from Recommendation Systems (VU Amsterdam and CWI)
The thesis focuses on feedback loops in recommender systems and their detrimental effects on minority groups. With a focus on book recommendations and female writers as the minority group, Julia has designed a clever experimental set-up with several models and fairness metrics. The thesis has a clear structure and storyline, in which the student shows a firm grasp of the technical matter, the scientific literature, and the societal implications. It is innovative work with potentially high societal impact. The reviewers are impressed by the work and consider the choice of methods to be excellent. The findings provide new directions for addressing and researching how feedback loops impact bias in recommender systems. The research has high societal relevance. Recommender systems are consulted by many consumers of many different products, and will incorporate bias by their nature. Since most of those systems make use of feedback in the form of clicks and likes, an existing bias is easily amplified. This is particularly problematic when minority groups are at the wrong end of the spectrum. The thesis takes a clear approach to making such a bias visible, and provides an insightful discussion of the implications of this work using different definitions of fairness, with detailed insights into the tradeoffs and factors to consider when addressing such complex issues. The study is focused on a publicly available dataset. The software materials are published in an open repository, and FAIR data was used: https://github.com/sudnikii/master-thesis-ai
- Krijn Doekemeijer (student) and Animesh Trivedi (supervisor). TropoDB: Design, Implementation and Evaluation of an Optimised KV-Store for NVMe Zoned Namespace Devices (VU Amsterdam)
Krijn delivered an exceptionally thorough study into the performance and suitability of an emerging storage interface that leverages application-level knowledge to better employ flash storage devices. By the year 2025, we are expected to generate close to 200 Zettabytes of data yearly, enough to cover the whole Pacific Ocean 200 times over if data bytes were visualized as grains of rice. In this context, his work on efficient data storage systems becomes crucial, delivering the world's first complete ZNS-optimized data store. Starting from scratch, Krijn had to solve many unforeseeable scientific and engineering challenges associated with the development of new application-optimized data structures for ZNS devices. The reviewers were thoroughly impressed with the huge amount of work squeezed into 6 months of Git-evidenced activity. The size of this thesis (200+ pages) could discourage the faint-hearted from even attempting to browse it; however, it is simply a reflection of the sheer amount of work. The context is beautifully introduced for experts and non-experts, and there are proper background and related work sections. The research questions are clearly formulated and mapped onto the chapters. The research methods are appropriate and come with citations to related work. The thesis writing style is concise and clear. Even the Appendix contains a nugget, Krijn’s Self-reflection section. The reviewers fully appreciated the clarity and extent of this thesis’ societal impact, as it opens many avenues of future research and practice. The extensive benchmarking alone is a valuable societal contribution. The reviewers also commend Krijn on an excellent job promoting open science through extensive documentation and code artefacts fully published at https://github.com/atlarge-research/tropodb.