What do you get if you put DNA and facial recognition together? Today, it’s China; tomorrow, maybe everywhere else

Posted on Dec 27, 2019 by Glyn Moody

Two themes crop up again and again on this blog: facial recognition and DNA sequencing. Each technology is powerful on its own, and each is steadily becoming a greater threat to privacy. So what happens when they are put together? A story in the New York Times means we don’t have to guess, because China is already doing it:

Chinese scientists are trying to find a way to use a DNA sample to create an image of a person’s face. The technology, which is also being developed in the United States and elsewhere, is in the early stages of development and can produce rough pictures good enough only to narrow a manhunt or perhaps eliminate suspects. But given the crackdown in Xinjiang, experts on ethics in science worry that China is building a tool that could be used to justify and intensify racial profiling and other state discrimination against Uighurs.

In the long term, experts say, it may even be possible for the Communist government to feed images produced from a DNA sample into the mass surveillance and facial recognition systems that it is building, tightening its grip on society by improving its ability to track dissidents and protesters as well as criminals.

Although China seems to be the first government aiming to make this combination of facial recognition and DNA a practical tool for controlling its population, scientists in the West have been working on similar techniques for years. For example, research from 2014 looked at how a set of 20 genes affected facial features. This allowed a person’s rough facial appearance to be derived from a few genetic markers.

The use of DNA to predict physical characteristics has developed to such an extent that it now has a name: DNA phenotyping. A phenotype is the physical manifestation of a genotype, a particular DNA sequence. In April 2018, a site was launched that allows anyone to plug DNA data from special regions of the human genome into the HIrisPlex-S system developed by researchers. Based on the DNA entered, the system provides immediate predictions for eye, hair and skin colors. These are not precise estimates; rather, they give the probability that a given DNA sequence will produce certain physical traits, for example blue or brown eyes, or a particular hair color.
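To make the idea of a probability per trait concrete, here is a toy sketch in Python. It is not the HIrisPlex-S model: the marker names and weights are invented for illustration, and the sketch simply assumes that each eye color gets a score from a weighted sum of allele counts, which is then turned into probabilities.

```python
import math

# Hypothetical markers and weights, invented for illustration only.
# Each weight says how much one copy of the minor allele shifts the score
# for that eye color relative to the reference category ("brown").
WEIGHTS = {
    "blue": {"intercept": -1.0, "marker_a": 2.0, "marker_b": 0.5},
    "intermediate": {"intercept": -1.5, "marker_a": 0.8, "marker_b": 0.3},
}


def predict_eye_color(allele_counts):
    """Return a probability for each eye color category.

    allele_counts maps a marker name to 0, 1 or 2 copies of the minor allele.
    """
    scores = {"brown": 0.0}  # the reference category always has score 0
    for color, weights in WEIGHTS.items():
        score = weights["intercept"]
        for marker, count in allele_counts.items():
            score += weights.get(marker, 0.0) * count
        scores[color] = score

    # Softmax: exponentiate the scores and normalize so they sum to 1.
    total = sum(math.exp(s) for s in scores.values())
    return {color: math.exp(s) / total for color, s in scores.items()}


if __name__ == "__main__":
    print(predict_eye_color({"marker_a": 2, "marker_b": 1}))
```

The output is a set of probabilities rather than a single answer, which is why such predictions can narrow down possibilities but cannot identify a person's appearance with certainty.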

A more ambitious approach was taken by a team led by Craig Venter, famous for his work on sequencing the human genome two decades ago. His team carried out “whole-genome sequencing”, which as its name implies involves obtaining nearly every chemical letter in a person’s DNA sequence. In Venter’s 2017 work, the DNA of 1,061 people was sequenced, and their physical traits analyzed. The team then used statistics to develop a model that attempts to predict physical traits according to the DNA sequence. An article published soon afterwards criticized the study on technical grounds: “The results of the authors are unremarkable”. The author, Yaniv Erlich, also wrote:

To work in the real world, an adversary using the Venter technique would have to create population-scale database that includes height, face morphology, digital voice signatures and demographic data of every person they want to identify.

That is precisely what the Chinese government has been doing in Xinjiang with the Turkic-speaking Uyghur population. It’s probably something that other governments aspire to, which may be why they are also keen on building up comprehensive medical databases: these would contain much of the information they require. However, an interesting alternative has just appeared, one that governments can start deploying now, even without massive biometric databases at their disposal. As a recent paper in Nature Communications explains:

In contrast to DNA phenotyping, the idea is not to predict facial characteristics from DNA, but instead to predict DNA aspects from 3D facial shape using face-to-DNA classifiers; hence, all information is estimated from existing 3D facial images in a database.

That is, instead of trying to predict someone’s appearance from their DNA, the researchers did the reverse: they tried to work out what specific set of DNA sequences a person is likely to have based on their appearance. At first sight, that might not seem a useful approach. If DNA has been found at a crime scene, investigators want to know what the person responsible looks like. But the power of this approach is that it allows suspects to be eliminated using an objective test. The scientists have developed software that measures a face and then compares it with sample DNA, for example DNA found at a crime scene. If the face is incompatible with the DNA, then that suspect is unlikely to be the source of the sample. That may not be as good as starting from DNA and producing a face, but for police forces with a group of possible suspects, it could save valuable time in investigations by narrowing down the field without the need for extensive questioning.
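The elimination logic can be sketched in a few lines of Python. This is only an illustration of the idea, not the method used in the paper: the DNA aspects ("sex", "marker_x"), the probabilities, the threshold and the function names are all hypothetical, and multiplying the probabilities together assumes the aspects are independent.

```python
def match_score(face_predictions, sample_genotype):
    """Score how compatible a face is with a DNA sample.

    face_predictions maps each DNA aspect to the probabilities a
    (hypothetical) face-to-DNA classifier assigned to its possible values.
    sample_genotype maps each aspect to the value observed in the sample.
    The score is the product of the probabilities given to the observed
    values, which assumes the aspects are independent.
    """
    score = 1.0
    for aspect, observed in sample_genotype.items():
        score *= face_predictions[aspect].get(observed, 0.0)
    return score


def narrow_suspects(suspects, sample_genotype, threshold):
    """Keep only the suspects whose face is compatible with the sample."""
    return [
        name
        for name, predictions in suspects.items()
        if match_score(predictions, sample_genotype) >= threshold
    ]


if __name__ == "__main__":
    # Invented classifier outputs for two suspects' 3D facial images.
    suspects = {
        "suspect_a": {"sex": {"F": 0.9, "M": 0.1}, "marker_x": {"0": 0.7, "1": 0.3}},
        "suspect_b": {"sex": {"F": 0.2, "M": 0.8}, "marker_x": {"0": 0.5, "1": 0.5}},
    }
    crime_scene_sample = {"sex": "F", "marker_x": "0"}
    print(narrow_suspects(suspects, crime_scene_sample, threshold=0.2))
```

Note that the test can only rule people out. A suspect who passes the threshold is merely not excluded, which is a much weaker statement than a positive identification.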

Nonetheless, the potential power of being able to take DNA found at a site or on an object and to create a physical representation of the person it comes from is such that governments around the world will doubtless encourage research in this area. Initially, that would presumably be for police and intelligence services. But the fear would be that it might be deployed more generally to track innocent citizens. As the recent New York Times story indicates, China is already actively pursuing that approach, untroubled by the major privacy issues it raises.

Featured image by CNN.

About Glyn Moody

Glyn Moody is a freelance journalist who writes and speaks about privacy, surveillance, digital rights, open source, copyright, patents and general policy issues involving digital technology. He started covering the business use of the Internet in 1994, and wrote the first mainstream feature about Linux, which appeared in Wired in August 1997. His book, "Rebel Code," is the first and only detailed history of the rise of open source, while his subsequent work, "The Digital Code of Life," explores bioinformatics - the intersection of computing with genomics.
