A Practitioner’s Perspective on Fairness in AI
The use of machine learning models to support decision making can have harmful consequences for individuals, particularly when such models are used in critical applications in the social domain, such as justice, health, education and resource allocation (e.g. welfare). These harms can be caused by different types of bias originating at any point in the algorithmic pipeline. To prevent discrimination and algorithmically reinforced bias, the Fair AI community has been developing best practices and new methods to ‘fix the bias’ in the data and thereby ensure fair outcomes.
Fair AI: an active community
We encounter many “AI gone wrong” narratives in critical domains, ranging from hospital triage algorithms that systematically discriminate against minorities¹ to welfare fraud detection algorithms that disproportionately fail among sub-populations². In response, fairness-driven solutions are emerging, for example by deriving new algorithms that ensure fairness, or by identifying conditions under which certain fairness constraints can be guaranteed. New frameworks are also being developed to draw attention to best practices for avoiding bias during the modelling process³, and a number of user-friendly tools⁴ allow a system to be tested for possible bias. Once bias is detected, it can be mitigated either by modifying the training data or by constraining the inner workings of machine learning algorithms to satisfy a given fairness definition, for example ensuring that two groups get the same outcome distributions⁵ (see the sketch below).
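To make the last point concrete, here is a minimal sketch of what such a group-level check can look like in code: it computes the gap in positive-outcome rates between two groups (the demographic parity difference). The variable names and toy numbers are ours, for illustration only, and are not taken from any of the tools cited above.

```python
# Minimal sketch of a group-level fairness check (demographic parity).
# Assumes binary predictions and a binary sensitive attribute available at audit time;
# the names y_pred and group are illustrative, not taken from any specific toolkit.
import numpy as np

def demographic_parity_difference(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Difference in positive-outcome rates between group 0 and group 1."""
    rate_0 = y_pred[group == 0].mean()  # P(outcome = 1 | group = 0)
    rate_1 = y_pred[group == 1].mean()  # P(outcome = 1 | group = 1)
    return rate_0 - rate_1

# Toy example: group 0 receives positive outcomes 60% of the time, group 1 only 20%.
y_pred = np.array([1, 1, 1, 0, 0,  1, 0, 0, 0, 0])
group  = np.array([0, 0, 0, 0, 0,  1, 1, 1, 1, 1])
print(demographic_parity_difference(y_pred, group))  # gap of roughly 0.4
```

A mitigation step, whether it reweights the training data or constrains the learner, would aim to shrink such a gap while preserving predictive performance.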
Ensuring Fair AI in practice is challenging
In our experience as practicing data scientists, the existing tools that deal with bias and ensuring fairness in machine learning suffer from an important limitation⁶: their validity and performance have been tested under unrealistic scenarios.
Typically, the validity and performance of these solutions are tested on publicly available datasets, such as the COMPAS or German credit datasets⁷, where the sources of bias are easy to identify because they are linked to correlations with sensitive attributes such as gender, ethnicity or age, which are available in the datasets. This means that the data can be corrected, or the outcomes of a given model can be altered, to fix the bias that originates from these correlations. In practice, sensitive attributes cannot be included directly in the modelling process, for legal or confidentiality reasons, but other features that correlate with them might be used in the model, leading to indirect discrimination. These correlations can be unknown at the time of development, especially when sensitive attributes are missing, so it is challenging and sometimes impossible to apply fairness corrections in real-world models.
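To illustrate the proxy problem, the sketch below shows the kind of audit a practitioner can run when the sensitive attribute is available for a small sample (for example, collected through a voluntary survey) even though it cannot be used in the model itself: each candidate feature is checked for correlation with the sensitive attribute. The column names and data are purely illustrative, and a simple correlation is only a first-pass screen; it will miss non-linear or combined proxies.

```python
# Minimal sketch of a proxy audit on a small sample where the sensitive attribute is known.
# Column names and values are hypothetical audit data, used here only for illustration.
import pandas as pd

def flag_proxies(features: pd.DataFrame, sensitive: pd.Series, threshold: float = 0.3) -> pd.Series:
    """Return features whose absolute correlation with the sensitive attribute exceeds the threshold."""
    corr = features.corrwith(sensitive.astype(float)).abs()
    return corr[corr > threshold].sort_values(ascending=False)

# Toy audit sample: "neighbourhood_income" behaves as a proxy, "tenure_years" does not.
audit = pd.DataFrame({
    "neighbourhood_income": [20, 22, 25, 60, 65, 70],
    "tenure_years":         [3, 7, 5, 4, 6, 5],
    "sensitive_attribute":  [1, 1, 1, 0, 0, 0],
})
flagged = flag_proxies(audit[["neighbourhood_income", "tenure_years"]], audit["sensitive_attribute"])
print(flagged)  # only neighbourhood_income is flagged as a potential proxy
```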
Another challenge practitioners need to shoulder is that testing whether an algorithm is fair requires knowledge of a desirable ground truth against which the fairness of the outcomes can be evaluated. Furthermore, as practitioners, we need to acknowledge the broader context of high-impact decision-making systems: for fairness interventions to be ultimately beneficial, we need to investigate the long-term impacts of these systems.
The way forward
So what can we do to make fair AI achievable in practice?
Mobilise the AI community to provide more realistic datasets. One approach is to facilitate (confidentiality-preserving) data exchange between industry and academia. Making relevant and realistic datasets available to the fair AI community will not only open new research avenues, it will also help create tangible solutions to fairness issues that practitioners can use, and consequently help address bias and discrimination.
Encourage tackling some of the real-world challenges. Consider bias bounties⁸, where developers are offered monetary incentives to test systems for bias, similar to the bug bounties offered for security software.
Embrace inclusive communication between stakeholders and AI developers. Interpreting the real-world impact of an AI-driven solution can be challenging and requires open collaboration between the technical and functional teams throughout the AI lifecycle.
We are undoubtedly living through crucial times in terms of developments in the fair AI field. To foster more fairness-enabling practices and to help bridge the gap between theory and practice, we invite all interested readers to share their experiences, ideas and suggestions with us on this dedicated Slack channel using the link below:
https://join.slack.com/t/fair-ai/shared_invite/zt-edj2mqip-JZzmVGgduEjFiEpU3EXlyg
We are building an inclusive community: whether you work for a company, a startup or a university, or whether you are a concerned citizen, a policy or law expert, or a student interested in Fair AI, join us!
References
1. https://theappeal.org/politicalreport/algorithms-of-inequality-covid-ration-care/
2. https://www.hrw.org/news/2019/11/08/welfare-surveillance-trial-netherlands
3. https://cloud.google.com/inclusive-ml
4. https://pair-code.github.io/what-if-tool/
5. https://aif360.mybluemix.net/
6. Note that we are not addressing here the more general issue of fixing an identified source of bias algorithmically instead of addressing the underlying societal issues, as we want to focus on the province of data scientists working in the real world.
7. Popular fairness datasets: http://www.fairness-measures.org/Pages/Datasets
8. https://venturebeat.com/2020/04/17/ai-researchers-propose-bias-bounties-to-put-ethics-principles-into-practice/