What can we learn from the Clearview “end of privacy” story?

Posted on Jan 31, 2020 by Glyn Moody

A couple of weeks ago, a story in the New York Times put facial recognition, and the serious problems it raises, firmly into the mainstream. It concerned the start-up Clearview AI, which, as the headline breathlessly informed us, “might end privacy as we know it.” The reason for this worrying description is not any breakthrough in AI technology by Clearview, but the fact that the company claims to have created a database of more than three billion facial images collected from Facebook, YouTube, Venmo and millions of other websites, apparently by scraping them without asking anyone’s permission. A background document explains that “Clearview’s data is all gathered from publicly available sources, including news sites, social media, mugshots and more”. The company also says that more than 600 law enforcement agencies have started using Clearview, which is one reason why its database is problematic.

Reactions to the news have been swift, and have come from many quarters. For example, one source of images, Twitter, sent a cease-and-desist letter, accusing Clearview of violating Twitter’s policies. Facebook said it was “reviewing the situation” and “will take appropriate action if we find they are violating our rules.” One problem is a ruling by the Ninth Circuit Court of Appeals last year that automated scraping of publicly accessible data probably does not violate the Computer Fraud and Abuse Act. That could make it harder to stop companies like Clearview from collecting facial images from public sites without permission.

Democratic Senator Edward Markey wrote a letter to the company, which summarizes the key concerns with the product. He was worried, he said, that Clearview “could eliminate public anonymity in the United States”, and that its technology is capable of “dismantling Americans’ expectation that they can move, assemble, or simply appear in public without being identified”. In addition, a lawsuit seeking class-action status was filed in Illinois; the New York Police Department said that Clearview played no role in identifying a terrorist, despite claims to that effect by the company; and New Jersey’s Attorney General halted police use of Clearview for the moment.

Inevitably, the news about Clearview’s huge database, gathered without the permission of the people involved, has led to calls for facial recognition to be banned or at least regulated, as it already is in the EU under the GDPR:

EU data protection rules clearly cover the processing of biometric data, which includes facial images: ‘relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person’ (GDPR Art. 4(14)). The GDPR generally forbids the processing of biometric data for uniquely identifying purposes unless one can rely on one of the ten exemptions listed in Art. 9(2).

One of the world’s leading security experts, Bruce Schneier, wrote an opinion piece pointing out that banning facial recognition on its own won’t achieve much – there are plenty of other ways of identifying individuals, as this blog has noted many times over the years. Another serious problem is how that personal data is used, Schneier says:

we need better rules about when and how it is permissible for companies to discriminate. Discrimination based on protected characteristics like race and gender is already illegal, but those rules are ineffectual against the current technologies of surveillance and control. When people can be identified and their data correlated at a speed and scale previously unseen, we need new rules.

Finally, it is important to note that despite the sudden media attention, Clearview is not special. It does not appear to have developed any secret technology that makes its service particularly powerful. It simply went further than others in creating a huge facial recognition database. Major players like Google and Facebook could easily do the same, but have chosen not to, because of privacy concerns.

Moreover, on the issue of scraping sites, the original New York Times article quotes the founder of Clearview as saying “A lot of people are doing it,” and that “Facebook knows.” Indeed, in 2018 Facebook shut down 66 accounts, pages and apps that it said were linked to Russian firms that build facial recognition software for the Russian government. According to a news report in the New York Times, one company involved, Fubutech:

scraped data from the web, particularly Google search and the Russian search engine Yandex, to build a database of Russian citizens and their images that the government can use for facial recognition.

It would be naive in the extreme to think that the Russian government only scrapes images of its own citizens, and carefully avoids collecting information about anyone else. Given how cheap and easy it was for Clearview to build up a database of three billion faces, it is highly likely that Russia – and probably China and others – has facial recognition collections as large, or even larger, that include images of billions of people living all around the world. The more facial images a person has posted online, the more likely it is they are already sitting in a huge government database somewhere. It may not be the “end of privacy”, as the New York Times article would have it, but it’s clearly a serious development in terms of surveillance, and for our ability to move and act freely in the outside world. And since the intelligence services of foreign governments will certainly not be worried about breaking privacy laws in Western countries, there is really not very much anyone can do about that situation.

Featured image from pxfuel.

About Glyn Moody

Glyn Moody is a freelance journalist who writes and speaks about privacy, surveillance, digital rights, open source, copyright, patents and general policy issues involving digital technology. He started covering the business use of the Internet in 1994, and wrote the first mainstream feature about Linux, which appeared in Wired in August 1997. His book, "Rebel Code," is the first and only detailed history of the rise of open source, while his subsequent work, "The Digital Code of Life," explores bioinformatics - the intersection of computing with genomics.
