A scientist at Israeli genealogy firm MyHeritage has published a paper revealing that public genealogy databases can identify relatives – third cousins or closer – of nearly two-thirds of people of European descent.
This means that if your second cousin sent a DNA sample to MyHeritage or a company like it, that sample can be used to identify you too, especially if triangulated with other identifying information like geographic area or approximate age.
Yaniv Erlich, chief science officer at MyHeritage, estimates that as these databases continue to grow, investigators will soon be able to identify anyone in the US of a given ethnicity from a sample of their DNA.
The phenomenon isn’t limited to databases built from voluntary submissions, such as MyHeritage’s. Erlich’s researchers were able to identify an anonymous woman whose genetic data was part of the 1000 Genomes Project, a research database designed as a detailed catalogue of human genetic variation.
Driven by curiosity, boredom, or the desire for a connection with their forebears, users of genetic testing services often think nothing of submitting a DNA sample to a private corporation for analysis. The privacy implications of Erlich’s research suggest they might want to think twice.
Earlier this year, California police charged a retired cop in a series of grisly rapes and murders that took place during the 1970s and ’80s. They credited a public genealogy database called GEDmatch with providing the break that finally cracked the case of the Golden State Killer. In just four months in 2018, detectives solved 13 cases using the same technique, which Erlich calls “long-range familial DNA searches.” While many of these were cold cases, one was very recent, indicating the technique is becoming a standard component of police procedure. One DNA sequencing company has already positioned itself as a liaison between the forensic sector and the consumer sector, uploading 100 “cold cases” to consumer genetic databases.
Just because this technique is solving crime now, however, doesn’t mean the technology is the sole province of the “good guys.” The Golden State Killer may have “left his DNA all over the place,” but so do we. Erlich has been warning the public since at least 2013 that the DNA we submit in good faith for use in genetic testing or research is not subject to the privacy protections we are accustomed to regarding other personal data.
Erlich uses the example of law enforcement tracking down a protester who unwittingly left DNA at a political demonstration. All genetic data should be encrypted, he says, and only transferable with the express consent of the owner. Unfortunately, given how many years such data has been floating around unprotected, passed between companies without a care for privacy, it is far too late to put the genie back in the bottle.