Critically Assessing AI Tools and Cultural Data for Digital Humanities Applications
An increasing amount of our rich cultural heritage is available in digital formats. Humanities scholars have therefore added AI and other computational techniques to their research methods. In her PhD thesis, Myriam Traub explores sources of bias in the data and tools used by humanities scholars.
Assessing whether insights gained from using AI or other computational techniques constitute a meaningful and interesting trend or merely reflect an error, limitation or bias in the tools or data used proves to be surprisingly difficult.
This difficulty involves well-known quality issues, such as errors in optical character recognition (OCR). These errors are easy for scholars to spot and are widely recognized as a problem in the community. Even for these problems, however, little is known about how they impact digital methods used “downstream”, such as named entity recognition, sentiment analysis, word embeddings and other frequently used AI methods. When such a method is given erroneous or biased data as input, it is entirely unclear how this may or may not influence the outcomes of the culturally oriented research projects in which the method is deployed.
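As a minimal sketch of this downstream effect, consider how common OCR character confusions can make a simple gazetteer-based entity matcher miss mentions entirely. All names and data below are hypothetical illustrations, not material from the thesis:

```python
# Toy illustration: OCR character confusions (e.g. 'm' read as 'rn')
# can make a naive gazetteer-based entity matcher miss mentions.
# All names and data here are hypothetical.

GAZETTEER = {"Amsterdam", "Rijksmuseum", "Rembrandt"}

def find_entities(text, gazetteer=GAZETTEER):
    """Return gazetteer entries found as exact tokens in the text."""
    tokens = text.replace(",", " ").replace(".", " ").split()
    return sorted(set(tokens) & gazetteer)

clean = "Rembrandt moved to Amsterdam in 1631."
# Typical OCR confusions: 'm' misread as 'rn'
noisy = "Rernbrandt moved to Arnsterdam in 1631."

print(find_entities(clean))  # ['Amsterdam', 'Rembrandt']
print(find_entities(noisy))  # []
```

A scholar counting entity mentions over a corpus with such noise would silently undercount, without any error being reported by the tool.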
However, there are other sources of bias of which only a few users are aware. For example, while algorithmic bias in full-text search has been studied in the information retrieval community for more than a decade, there is little awareness of this topic when it comes to using search tools in non-commercial digital libraries. Since no technology can be assumed to be neutral a priori, for these lesser-known sources of tool bias it is of key importance to measure the amount of bias and to be able to assess its impact on the research conducted using the tools.
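One way such bias is quantified in the information retrieval literature is through retrievability: a document's retrievability score counts how often it surfaces in the top results across a large sample of queries. The sketch below uses a deliberately naive term-frequency ranker and a tiny hypothetical corpus (none of it from the thesis) to show the idea:

```python
# Sketch of a retrievability measurement (toy data, hypothetical):
# r(d) counts how often document d appears in the top-c results over a
# set of queries. Documents with low r(d) are effectively invisible to
# users of the search engine, even when they match relevant queries.

from collections import Counter

docs = {
    "d1": "amsterdam canal house amsterdam",
    "d2": "canal boat tour",
    "d3": "house museum",
}

def rank(query, docs):
    """Rank documents by a naive term-frequency score (highest first)."""
    scores = {d: text.split().count(query) for d, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def retrievability(docs, queries, c=1):
    """r(d): number of queries for which d appears in the top-c results."""
    r = Counter({d: 0 for d in docs})
    for q in queries:
        for d in rank(q, docs)[:c]:
            r[d] += 1
    return dict(r)

queries = ["amsterdam", "canal", "house", "museum"]
print(retrievability(docs, queries, c=1))  # {'d1': 3, 'd2': 0, 'd3': 1}
```

Even in this toy setting, d2 is never surfaced at c=1 although it matches "canal"; computed over a realistic query sample and archive, such scores make retrievability bias visible and comparable across ranking configurations.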
Exploring sources of bias
Myriam Traub explores sources of bias in data and tools used by humanities scholars and addresses a number of these in her PhD thesis “Measuring Tool Bias & Improving Data Quality for Digital Humanities Research”, which she defended on Monday 11 May 2020 at Utrecht University. Myriam’s work was carried out at CWI in the SealincMedia research project as part of the national COMMIT/ program, with the Dutch National Library, Rijksmuseum and other partners. She interviewed humanities scholars on their use of digital methods and the role of these methods in the overall research process. Traub studied retrievability bias in the search engine of the Dutch historic newspaper archive, the impact of partially fixing OCR errors through human computation, and the potential of crowdsourcing for difficult tasks that are traditionally seen as reserved for domain experts.
Traub’s research enables a better understanding of the role AI and other computational methods play in current humanities research. In particular, she shows that, in addition to the quest for better-performing tools and higher-quality data, we also need better techniques for measuring the limitations of tools and data, and for conveying the results of these measurements to humanities scholars interested in the historical artifacts or events expressed in the data.
It is clear that there is a need for more intense, multidisciplinary collaboration between humanities scholars, data custodians and tool developers to better understand each other’s assumptions, approaches and requirements. This will help to build not only the technical research infrastructure humanities scholars need, but also the human infrastructure: scholars need to be trained in the skills necessary to routinely and critically assess the fitness of the digital data and tools available in the technical infrastructure.
SEALINCMedia Rijksmuseum Use Case video explainer