AI-based predictive policing systems widely used in US and UK, despite concerns about privacy and flaws in the approach

Posted on Mar 27, 2019 by Glyn Moody

Back in 2017, Privacy News Online wrote about a massive police system being built in China that would allow “predictive policing” – the ability to spot criminals before they even commit a crime. As we warned then, China often turns out to provide an early glimpse of what will later happen in the West, and so it is proving. Predictive policing using AI-based approaches is starting to catch on in the US and the UK.

For example, Motherboard found that numerous US police forces in cities and municipalities that are home to over 1 million people use a system from a company called PredPol. According to the home page, “PredPol uses a machine-learning algorithm to calculate predictions. 3 data points – crime type, crime location and crime date/time – are used in prediction calculation.” As a result, the company says, local police can “Predict where and when specific crimes are most likely to occur” and “Proactively patrol to help reduce crime rates and victimization”.

Over in the UK, the human rights group Liberty found that 14 police forces are using, have previously used or are planning to use algorithms which map future crime or predict who will commit or be a victim of crime, drawing on existing police data. As well as providing details of predictive policing in the UK, Liberty’s 48-page report “Policing by Machine” offers a good explanation of why the practice is dangerous for freedom and privacy:

Predictive policing programs encourage reliance on “big data” – the enormous quantities of personal information accumulated about us in the digital age. A culture of big data allows the state to monitor us even more closely and build up intrusive profiles from thousands of pieces of information. This chills our freedom of expression, making us feel we are being watched and forcing us to self-censor.

The ability to control what information is made available about us is integral to our lives. But a dangerous emerging narrative requires us to justify our desire for privacy, rather than requiring that the state – including the police – provide a sound legal basis for the interference.

Although some police forces in the UK were using systems from PredPol, it seems not to be as widely deployed there as in the US. However, PredPol has plans for expanding its operations into a completely new market: the private sector. Around the world, other companies are also starting to offer programs that claim to provide predictive capabilities in real life:

Vaak, a Japanese startup, has developed artificial intelligence software that hunts for potential shoplifters, using footage from security cameras for fidgeting, restlessness and other potentially suspicious body language.

In the UK, local government is also starting to deploy the technology widely. For example, one system analyzes data such as social support, school attendance, crime, homelessness, teenage pregnancy and mental health from 54,000 local families to predict which children could suffer domestic violence or sexual abuse, or go missing. The idea is that predictions can be used to intervene early, and thus more effectively.

That’s clearly a worthwhile goal. Similarly, it’s laudable that public bodies like the police are exploring the use of technology to improve their efficiency, which may well save lives. However, the past teaches us that more technology is not necessarily better. Key questions about predictive policing and related approaches are whether they actually work, and what the problems might be.

As a follow-up to its report on the use of PredPol in the US, Motherboard explored the algorithm that lies at the heart of the software. It turns out to be simple – extremely simple: “Basically, PredPol takes an average of where arrests have already happened, and tells police to go back there.”

There’s a serious problem with this approach. It creates a system with a positive feedback loop that sends police back to some areas more often because it sent them there more often in the past. As a result, PredPol’s predictions are likely to be self-fulfilling: once a location is “predicted” to have more crime than others, police will be sent there more often, find more crime, and make more arrests – increasing the likelihood that the area will always be marked as a crime spot.
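To make the feedback loop concrete, here is a deliberately simplified toy model in Python. It is not PredPol's actual algorithm, which is only described above in outline; the function name, the two-area setup and all the numbers are hypothetical, chosen purely for illustration. The sketch assumes that both areas have exactly the same underlying crime rate, that police patrol whichever area has the most recorded arrests so far, and that crime is only recorded where officers are present.

```python
def simulate_hotspot_patrols(true_rates, recorded, rounds):
    """Toy model (hypothetical): each round, patrol the area with the
    most recorded arrests so far, and record new arrests only there."""
    recorded = list(recorded)  # copy so the caller's history is untouched
    for _ in range(rounds):
        # "Prediction": the hotspot is wherever past arrests are highest.
        hotspot = max(range(len(recorded)), key=lambda i: recorded[i])
        # Assumption: crime is only observed where officers are sent,
        # so only the hotspot's count grows, at its true rate.
        recorded[hotspot] += true_rates[hotspot]
    return recorded


# Two areas with identical true crime rates (assumed values).
true_rates = [5, 5]
# Area 0 starts with a single extra recorded arrest.
history = [11, 10]

result = simulate_hotspot_patrols(true_rates, history, rounds=20)
share_area_0 = result[0] / sum(result)

print(result)                  # [111, 10]
print(round(share_area_0, 2))  # 0.92
```

Under these assumptions, a one-arrest head start is enough for area 0 to be patrolled in every round. After 20 rounds it accounts for roughly 92% of all recorded arrests, even though the underlying crime rate in the two areas is identical. Real deployments are more complicated than this, but the sketch shows why recorded arrests alone cannot tell such a system that its earlier predictions were skewed.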

It turns out that this is a widespread issue with predictive systems. The quality of the data that is used to drive them is crucial. If care is not taken, hidden biases already present in that input will be perpetuated. Moreover, there’s the additional problem that the use of fashionable technologies such as machine learning will lend the output a veneer of plausibility and even infallibility. Some people might think: if our advanced AI system says it, it must be true. A more rigorous discussion of these issues can be found in an academic paper “Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice”. The researchers write:

Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed, biased, and unlawful predictions which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system.

That’s an important reminder that predictive systems are not magic pixie dust that can be sprinkled on day-to-day policing to solve deep social problems. They certainly have their place, and will doubtless be steadily improved in the future as AI techniques are refined. Meanwhile, those who use them, or propose using them, must be aware of the novel problems they can give rise to. More generally, people must be sensitive to the harm predictive policing can cause to the privacy of those who are potentially affected by it.

Featured image by West Midlands Police.

About Glyn Moody

Glyn Moody is a freelance journalist who writes and speaks about privacy, surveillance, digital rights, open source, copyright, patents and general policy issues involving digital technology. He started covering the business use of the Internet in 1994, and wrote the first mainstream feature about Linux, which appeared in Wired in August 1997. His book, "Rebel Code," is the first and only detailed history of the rise of open source, while his subsequent work, "The Digital Code of Life," explores bioinformatics - the intersection of computing with genomics.
