The news is flooded with announcements heralding new and innovative ways in which artificial intelligence (AI) will change our lives. In truth, AI already permeates our day-to-day lives in innumerable ways, from the content recommender systems on your favourite social media site to assistance with heart surgery.
In spite of its increasing ubiquity, a major unanswered question continues to overshadow this technology: can we create AI that is fair and unbiased? And would we even recognise it if we had it?
The public envisions AI as neutral – devoid of emotion, only capable of rational decision making. However, AI is ultimately created by human beings, and trained on data that human beings have sourced or curated, which can result in unintended biases.
Take, for example, the recent decision by the Metropolitan Police to implement widespread use of live facial recognition in London. A watchdog investigation found that while the algorithm was effective in identifying white men, it failed on numerous occasions to identify female or BAME subjects. Its inaccuracy when it comes to those groups was likely due to the limited data used to train the system. This is a textbook case of unintended biases creeping into an AI system to create unfair results.
AI perpetuates existing prejudices
This is not an isolated problem; some of the most technologically advanced companies in the world have fallen foul of the same bias. A 2015 study at Carnegie Mellon University found that Google showed ads for high-paying jobs significantly less often when the user’s gender was set to female. Once again, this wasn’t because the designers of the algorithm were sexist, but because the data used to train it – historic advertising data – was. If AI systems are trained on such biased data sets, the outcomes will most likely be unfair (unless the algorithms can be designed to take account of these biases during training).
Finding an unbiased data set is no easy challenge. Even data drawn from the justice system, which many might hope would be the most “fair” given the high stakes, has proved problematic. A controversial example is COMPAS, a risk assessment tool used in the US justice system to predict recidivism among defendants. It took swathes of parole data, fed it into an algorithm and asked it to make predictions about re-offending. The outputs were consistent with the training data, but they were biased against low-income and minority communities, presumably because of that historic data. In fact, there is very little evidence to suggest that these groups’ recidivism rates were higher. One study found that “Compas software is no more accurate than predictions made by people with little or no criminal justice expertise”. Unfortunately this isn’t just a hypothetical issue – people are in prison in the US today because of COMPAS’s outputs.
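Audits like the one that exposed COMPAS’s bias often come down to comparing error rates across groups. The sketch below uses invented data purely for illustration (real audits, such as ProPublica’s COMPAS analysis, use actual case records): it shows the core of such a check, where a system can look reasonable overall while wrongly flagging one group far more often than another.

```python
# Hypothetical bias audit: compare false positive rates across two groups.
# All data below is invented for illustration only.

def false_positive_rate(predictions, outcomes):
    """Share of people who did NOT re-offend but were flagged high-risk."""
    fp = sum(1 for p, y in zip(predictions, outcomes) if p == 1 and y == 0)
    negatives = sum(1 for y in outcomes if y == 0)
    return fp / negatives if negatives else 0.0

# 1 = flagged high risk / re-offended, 0 = flagged low risk / did not
group_a_pred = [1, 1, 0, 1, 0, 1, 0, 0]
group_a_true = [1, 0, 0, 1, 0, 0, 0, 1]
group_b_pred = [0, 1, 0, 0, 1, 0, 0, 0]
group_b_true = [0, 1, 0, 0, 1, 0, 0, 1]

fpr_a = false_positive_rate(group_a_pred, group_a_true)
fpr_b = false_positive_rate(group_b_pred, group_b_true)
print(f"FPR group A: {fpr_a:.2f}, FPR group B: {fpr_b:.2f}")
# Group A is wrongly flagged 40% of the time; group B never is.
```

In this toy data, group A’s non-re-offenders are wrongly flagged 40% of the time while group B’s never are – exactly the kind of gap that overall accuracy figures can hide.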
Solutions to fairness and bias in AI
Removing bias, either through better data sets or better algorithms, is a much more challenging problem than one might think. Nevertheless, there are some proactive approaches AI developers can take.
Firstly, technology can be developed to take account of potential types of bias. For example, Accenture has developed a tool to “beat” unfair AI, allowing users to choose the ethical bar for their AI system and to set their preferred trade-off between “fairness” and “accuracy”. Another approach is to train machine learning algorithms on synthesised data sets, rather than relying on historic, potentially “prejudiced” data. This could prevent real-world bias from infecting new AI systems – or at the very least provide a clear audit trail of how the training data was created.
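The fairness/accuracy trade-off that such tools expose can be sketched in miniature. The snippet below is an invented illustration, not Accenture’s actual tool: it searches decision thresholds to minimise a loss mixing error rate with a demographic-parity gap, and a tunable weight `lam` decides how much accuracy to give up for parity (all scores and labels are made up).

```python
# Toy fairness/accuracy trade-off: loss = (1 - accuracy) + lam * parity gap.
# Raising `lam` buys more demographic parity at a cost in accuracy.

def evaluate(threshold, scores, labels, groups):
    """Return (accuracy, gap in positive-decision rates between groups)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    def rate(g):
        members = [p for p, gr in zip(preds, groups) if gr == g]
        return sum(members) / max(1, len(members))
    return acc, abs(rate("A") - rate("B"))

def best_threshold(lam, scores, labels, groups):
    """Pick the threshold minimising the combined loss."""
    def loss(t):
        acc, gap = evaluate(t, scores, labels, groups)
        return (1 - acc) + lam * gap
    return min(sorted(set(scores)), key=loss)

# Invented data: group A is systematically scored higher than group B.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   1,   0,   1,   0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for lam in (0.0, 1.0):
    t = best_threshold(lam, scores, labels, groups)
    acc, gap = evaluate(t, scores, labels, groups)
    print(f"lam={lam}: threshold={t}, accuracy={acc:.3f}, parity gap={gap:.2f}")
```

With `lam = 0` the search simply maximises accuracy; at `lam = 1` it accepts a less accurate threshold in exchange for both groups receiving positive decisions at the same rate. That the dial exists at all is the point: “how fair, at what cost” is a choice, not a given.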
Secondly, AI bias could be tackled through freer sharing of data sets among the tech community. The more information an AI system has, the more likely it is that the inherent biases in any one data set will cancel out, and the greater the accuracy of its outputs. Unfortunately, as discussed in a previous post, there are structural barriers to widespread data sharing. While some of these barriers are probably unavoidable (e.g. the regulatory need to protect personally identifiable information), others could be overcome if the industry chose to adopt a culture of sharing for the greater good (e.g. more widespread sharing of the anonymised data sets amassed by tech giants). Freeing up data in order to tackle AI bias is another good reason to rethink data privacy laws.
Finally, there are “human” steps that could be taken to reduce the bias of AI systems. Society needs to decide what “fairness” actually means in a given situation, and therefore which definition an algorithm should use in each use case – there are currently at least 21 different mathematical definitions of fairness in use, most of which are incompatible with one another. Increased education and awareness, “best practice” methods for AI development, and investment in bias and fairness research would all help us understand and address these challenges.
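To see why these definitions clash, consider two common ones: demographic parity (both groups receive positive decisions at the same rate) and equal opportunity (both groups have the same true positive rate). The toy data below is invented; it shows a classifier that satisfies one definition while violating the other, simply because the groups’ underlying base rates differ.

```python
# Two fairness definitions applied to the same (invented) predictions.
# A single classifier can satisfy one and violate the other.

def positive_rate(preds):
    """Demographic parity compares this rate across groups."""
    return sum(preds) / len(preds)

def true_positive_rate(preds, labels):
    """Equal opportunity compares this rate across groups."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    positives = sum(labels)
    return tp / positives if positives else 0.0

# Group A has a higher base rate of true positives than group B.
preds_a, labels_a = [1, 1, 1, 0], [1, 1, 1, 0]
preds_b, labels_b = [1, 0, 0, 0], [1, 0, 0, 0]

# Equal opportunity holds: TPR is 1.0 for both groups...
print(true_positive_rate(preds_a, labels_a), true_positive_rate(preds_b, labels_b))
# ...but demographic parity fails: positive rates are 0.75 vs 0.25.
print(positive_rate(preds_a), positive_rate(preds_b))
```

Here the classifier is perfectly accurate for both groups, yet it cannot satisfy both definitions at once because the groups’ base rates differ. Picking the definition is therefore a policy decision, not a purely technical one.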
However, by holding machines to higher standards than we currently hold ourselves, there is a real opportunity for AI to transform society for the better, far more quickly and comprehensively than socially led initiatives alone could achieve.