Opening the black boxes: algorithmic bias and the need for accountability
Here on Privacy News Online we’ve written a number of stories about the privacy implications of DNA. There’s an important case going through the Californian courts at the moment that involves DNA and privacy, but whose ramifications go far beyond those issues:
“In this case, a defendant was linked to a series of rapes by a DNA matching software program called TrueAllele. The defendant wants to examine how TrueAllele takes in a DNA sample and analyzes potential matches, as part of his challenge to the prosecution’s evidence. However, prosecutors and the manufacturers of TrueAllele’s software argue that the source code is a trade secret, and therefore should not be disclosed to anyone.”
The Electronic Frontier Foundation (EFF) points out that there are two big problems here. One is the basic right of somebody accused of a crime to examine and challenge the evidence being used against them. In this case, that's not possible, because the manufacturer of TrueAllele is unwilling to release the source code that determines whether or not there is a DNA match. Particularly egregious is the fact that the company is claiming that its right to maintain a supposed trade secret outweighs the accused's right to a fair trial.
But beyond that issue, there is another that is certain to have a big impact on the world of privacy: the increasing use of algorithms to make judgements about us. An algorithm is just a fancy way of saying a set of rules, usually implemented as software encoding mathematical equations. The manufacturer's refusal to release TrueAllele's source code is therefore a refusal to permit the accused in the Californian case to examine and challenge the algorithmic rules that are being applied.
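To see what is at stake, here is a deliberately simplified sketch of what such a decision rule might look like in code. Nothing here reflects TrueAllele's actual method; the marker comparison and the 0.95 cut-off are invented for the example. The point is that the assumptions driving the output only become visible when the source can be read.

```python
# Hypothetical sketch (not TrueAllele's actual code): a toy "match" decision
# reduced to a single threshold rule. Reading the source exposes the
# assumptions baked into the outcome -- here, the arbitrary 0.95 cut-off.

def dna_match(sample_profile: dict, suspect_profile: dict,
              threshold: float = 0.95) -> bool:
    """Return True if the fraction of shared markers meets the threshold."""
    shared = sum(1 for locus, allele in sample_profile.items()
                 if suspect_profile.get(locus) == allele)
    similarity = shared / len(sample_profile)
    return similarity >= threshold
```

A defence expert who could inspect even a rule this simple might reasonably ask where the threshold came from and how sensitive the result is to it; without the source, those questions cannot even be posed.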
If this position is allowed to stand, we run the risk of turning algorithms into black boxes whose results we are forced to accept, but whose workings we may not query. In particular, we won’t know what personal information has been used in the decision-making process, and thus how our privacy is being affected.
It’s not just outright errors in rules that are a problem. As a recent article in MIT Technology Review pointed out, even more insidious, because more subtle, is the presence of algorithmic bias:
“Algorithmic bias is shaping up to be a major societal issue at a critical moment in the evolution of machine learning and AI. If the bias lurking inside the algorithms that make ever-more-important decisions goes unrecognized and unchecked, it could have serious negative consequences, especially for poorer communities and minorities. The eventual outcry might also stymie the progress of an incredibly useful technology.”
Algorithmic bias can enter systems in two main ways. One is through the algorithm's basic rules, which may contain incorrect assumptions that skew the results. The other is through biased training data. Many of the latest algorithm-based systems draw their power from being trained on large holdings of real-world data, which allows hidden patterns to be detected and applied to the analysis of new data. But if the training data contains inherent biases, those will be propagated into the algorithm's output.
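As a minimal illustration of that second route, consider the sketch below. The data, the areas and the loan-approval setting are all invented for the example: a trivial "model" trained on historically skewed decisions faithfully reproduces that skew when it judges new applicants.

```python
# Hypothetical sketch of how bias in training data propagates into an
# algorithm's output. The "model" simply learns historical approval rates
# per area -- a stand-in for any system trained on past decisions.

from collections import defaultdict

# Invented training data: past loan decisions, already skewed against area "B".
history = [
    ("A", "approved"), ("A", "approved"), ("A", "approved"), ("A", "denied"),
    ("B", "denied"),   ("B", "denied"),   ("B", "denied"),   ("B", "approved"),
]

# "Training": record the approval rate observed for each area.
counts = defaultdict(lambda: [0, 0])  # area -> [approved, total]
for area, outcome in history:
    counts[area][1] += 1
    if outcome == "approved":
        counts[area][0] += 1

def predict(area: str) -> str:
    """Approve whenever the historical approval rate for the area exceeds 50%."""
    approved, total = counts[area]
    return "approved" if approved / total > 0.5 else "denied"

# Two otherwise identical applicants get different outcomes purely because
# the past data was biased -- the model reproduces the skew it was fed.
print(predict("A"))  # approved
print(predict("B"))  # denied
```

Real systems are far more complex, but the mechanism is the same: the algorithm cannot correct for a bias it has no way of knowing is there.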
Even though algorithmic systems are being rolled out rapidly, and across a wide range of sectors, people are only beginning to grapple with the deep problems they can bring, and to try to come up with solutions. For example, AlgorithmWatch is:
“a non-profit initiative to evaluate and shed light on algorithmic decision making processes that have a social relevance, meaning they are used either to predict or prescribe human action or to make decisions automatically.”
Its algorithmic decision making (ADM) manifesto states: “The fact that most ADM procedures are black boxes to the people affected by them is not a law of nature. It must end.” Another initiative helping to open up those black boxes is AI Now, one of whose aims is tackling the problem of data bias:
“Data reflects the social and political conditions in which it is collected. AI is only able to “see” what is in the data it’s given. This, along with many other factors, can lead to biased and unfair outcomes. AI Now researches and measures the nature of such bias, how bias is defined and by whom, and the impact of such bias on diverse populations.”
There’s already a book on the issues raised by algorithmic bias, called “Weapons of Math Destruction”, whose author now heads up a new company working in this field, O’Neil Risk Consulting & Algorithmic Auditing (ORCAA). As well as auditing algorithms, ORCAA also offers risk evaluation. That’s an important point. Inscrutable algorithms are not just a problem for the people whose lives they may affect dramatically – as the case in California makes plain. They may also lead to costly legal action against companies whose algorithms turn out to contain unsuspected biases that have resulted in erroneous or unfair decisions. The sooner we come up with a legal framework allowing or even requiring the outside review of key algorithms, the better it will be for the public, for companies, and for society as a whole.
Featured image by shahzairul.