Forensic Investigative Genetic Genealogy

Thoughts on the use of genetic data from commercial genealogy databases to help solve cold cases in the Netherlands

The Dutch Public Prosecution Service (OM) and the Netherlands Forensic Institute (NFI) want to use American genetic genealogy databases for the first time in solving two Dutch cases that have gone cold. Do we not have sufficient resources and DNA-based technologies to solve our own cases in the Netherlands? And are we ready to face the possible legal consequences of such a pilot study as regards the protection of genetic data privacy of the individuals concerned?

When, on 6 March 2023, Dutch newspaper NRC Handelsblad printed on its front page that the OM wants to make use of commercial genetic genealogy databases to solve two cold cases in a pilot study, I said to myself, ‘the time has finally come’. I have been conducting research on the implications of the right to genetic data privacy. This concerns the use of these non-forensic DNA databases in the light of case law of the European Court of Human Rights. I am, therefore, excited at the prospect of hearing what the Dutch courts have to say about the issue, given that I am doing my law doctorate in the Netherlands. It will give me the local perspective on this issue, although the focus of my research is on the said supranational European court whose rulings have possible repercussions for its Member States, including the Netherlands.

In that regard, I offer some thoughts on the issue to contribute to the ongoing discussion. Since the method is relatively new, I present answers to what could be common questions raised in relation to the law enforcement use of the method in solving criminal cases.

What is it called?

There is currently no commonly accepted term for the method. One of its first names was more of a description: ‘forensic use of genetic genealogy databases’. Later, it was simply referred to as ‘forensic genealogy’ or ‘forensic investigative genealogical searching’. The US Department of Justice issued an interim policy in 2019 to regulate its use and called it ‘forensic genetic genealogical DNA analysis and searching’. Two dominant terms are currently used interchangeably to refer to it: ‘forensic genetic genealogy’ (FGG) and ‘investigative genetic genealogy’ (IGG). Just last month, these two terms were combined by the group chaired by one of the leading experts in this field, Ray Wickenheiser of the New York State Police Crime Laboratory System, to become forensic investigative genetic genealogy (FIGG). For the purposes of this blog, I will use FIGG.

What does it do?

The method first generates a DNA profile from a crime scene sample. This profile is then uploaded to a publicly-available (commercial) genetic genealogy database – the Dutch OM plans to use GEDMatch and FamilyTreeDNA, given their law enforcement-friendly policies. The uploaded profile is compared to the stored DNA profiles of its subscribers. Investigational leads can then be generated according to biological relatedness based on the amount of shared genetic data. These databases accumulate data from consumers of direct-to-consumer genetic testing (DTC-GT) companies such as MyHeritage, 23andMe, and Ancestry.com, which promise to reveal your genetic ancestry and even some genetic predispositions. To identify a criminal suspect, the method is followed up by direct DNA comparison between the DNA samples of the investigational lead(s) and those found at the crime scenes using standard DNA match profiling. The latter is called confirmatory matching.

What makes it different from DNA profiles currently used by law enforcement?

DNA profiles currently used by law enforcement come from short tandem repeats (STRs) found mainly in non-coding regions of the DNA, i.e. parts of the DNA that do not reveal much sensitive data, like a person’s predisposition to a particular disease. This genetic data is what is used in confirmatory matching as mentioned above. Law enforcement stores the data in what are called forensic DNA databases, which are used to search for suspect leads using genetic data generated from crime scene samples. What makes FIGG different is that it uses single nucleotide polymorphism (SNP) data. As the name implies, these are variations in the DNA sequence involving a single nucleotide – i.e. one of the four nucleotides that make up the DNA molecule: guanine (G), adenine (A), thymine (T) or cytosine (C). These variations are used to identify genetic relatives who share them. Such genetic data, however, can reveal more sensitive information about a person. For that reason, FIGG is currently used only as a method of last resort, i.e. for cases that have gone cold. Moreover, FIGG requires much more effort to solve each case compared to a direct STR comparison between crime scene DNA profiles and known individual DNA profiles in a law enforcement DNA database.
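To make the contrast concrete, here is a minimal toy sketch of the two kinds of comparison. Everything in it is an assumption for illustration: the function names are hypothetical, the repeat counts and genotypes are invented, the SNP labels are placeholders, and the simple allele-sharing fraction stands in for the far more elaborate shared-segment analysis that genealogy databases actually perform.

```python
# Toy illustration only: all values below are invented.
# STR profile: locus name -> pair of repeat counts (one per chromosome copy).
# SNP genotype: SNP label -> two-letter string of nucleotides (A, C, G, T).


def str_profiles_match(profile_a, profile_b):
    """Direct STR comparison: every locus typed in both profiles must carry
    the same pair of repeat counts (order within the pair does not matter)."""
    shared_loci = profile_a.keys() & profile_b.keys()
    if not shared_loci:
        return False
    return all(
        sorted(profile_a[locus]) == sorted(profile_b[locus])
        for locus in shared_loci
    )


def snp_sharing_fraction(genotype_a, genotype_b):
    """Fraction of commonly typed SNPs at which the two genotypes have at
    least one nucleotide in common. A crude stand-in (assumption) for the
    relatedness estimates used in practice."""
    shared_snps = genotype_a.keys() & genotype_b.keys()
    if not shared_snps:
        return 0.0
    hits = sum(
        1 for snp in shared_snps if set(genotype_a[snp]) & set(genotype_b[snp])
    )
    return hits / len(shared_snps)


# Hypothetical STR profiles (a handful of loci; real profiles use many more).
crime_scene_str = {"D8S1179": (13, 14), "TH01": (6, 9.3), "FGA": (21, 24)}
suspect_str = {"D8S1179": (14, 13), "TH01": (9.3, 6), "FGA": (24, 21)}

# Hypothetical SNP genotypes (real uploads contain hundreds of thousands).
crime_scene_snp = {"snp_a": "AG", "snp_b": "CC", "snp_c": "TT", "snp_d": "AG"}
relative_snp = {"snp_a": "AA", "snp_b": "CT", "snp_c": "CC", "snp_d": "GG"}

print(str_profiles_match(crime_scene_str, suspect_str))
print(snp_sharing_fraction(crime_scene_snp, relative_snp))
```

The STR check gives a yes-or-no answer about one individual, whereas the SNP comparison yields a degree of similarity, which is what allows a search to surface relatives rather than only the person who left the sample.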

What makes FIGG useful in solving cold cases?

Its more extensive reach. DNA-STR profile data can only provide connections, and therefore investigative leads, through direct matches as well as indirect matches to close relatives (like parent-child, siblings, uncle-nephew). DNA-SNP profile data, however, can go as far as distant cousins. This helped solve the Golden State Killer case, in which the perpetrator was indirectly identified using FIGG through his probable 4th cousin.
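To put rough numbers on that reach, the sketch below computes the expected share of autosomal DNA between full n-th cousins using the textbook rule (1/2)^(2n+1). This is a back-of-the-envelope illustration and not something taken from the NRC report or the databases themselves: the function name is hypothetical, the figures are population averages, and the actual amount shared by two distant cousins varies widely and can even be zero.

```python
def expected_shared_fraction(cousin_degree: int) -> float:
    """Average fraction of autosomal DNA shared by full n-th cousins.

    Textbook expectation (assumption: full cousins, no other relatedness):
    each cousin is (n + 1) generations below the two shared ancestors,
    giving 2 * (1/2) ** (2 * (n + 1)) = (1/2) ** (2 * n + 1).
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return 0.5 ** (2 * cousin_degree + 1)


for degree in range(1, 5):
    fraction = expected_shared_fraction(degree)
    print(f"cousin degree {degree}: {fraction:.3%} shared on average")
```

First cousins come out at 12.5% on average, while fourth cousins share only about 0.2%. A signal that faint cannot be picked up from a few dozen STR markers, which is why distant-relative searches rely on dense SNP data.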

Why is the Golden State Killer case very famous in relation to FIGG?

Although it was not the first case using FIGG, it was the first one that led to the arrest and eventual conviction of a man suspected of more than fifty rapes and murders, who had evaded arrest for decades. He was apparently a police officer who had been smart enough to escape inclusion in the forensic DNA database maintained by law enforcement. But he could not escape the publicly-available DNA databases – in this case, GEDMatch – where at least one of his distant genetic relatives had submitted a sample.

Does that not make my biological relatives ‘genetic informants’?

Not exactly. As Christi Guerrini et al. point out, it is not accurate to use the terms ‘genetic witness’ or ‘genetic informant’ to refer to a person’s relatives whose DNA profiles are stored in these commercial databases, as they may imply that genetic relatedness in itself is evidence of a crime committed by a relative. As mentioned above, FIGG is only used to generate investigational leads, more akin to a ‘tip’ – in this case, a ‘scientific tip’ – to help law enforcement focus on specific individuals in their investigation. Without other corroborating evidence, and a confirmed match as mentioned above, the lead generated using FIGG should be discarded – well, at least in the US, where the Department of Justice prohibits the arrest of anyone based solely on such investigational leads. Since the OM does not have similar guidelines on the use of FIGG, the answer is not that clear in the Dutch context. Based on the report published in the NRC, the parties involved in the Dutch pilot study using FIGG are not relying on a specific law or regulation on FIGG, but rather on a more general law on the use of DNA data in criminal investigations.

Isn’t genetic data a type of sensitive data protected by the GDPR?

Yes, it is. However, when it comes to police use of genetic data, the more appropriate EU legislation is the Law Enforcement Directive (LED). Article 10 LED provides that processing of genetic data:

‘shall be allowed only where strictly necessary, subject to appropriate safeguards for the rights and freedoms of the data subject, and only:

(a) where authorized by Union or Member State law;

(b) to protect the vital interests of the data subject or of another natural person; or

(c) where such processing relates to data which are manifestly made public by the data subject.’

Is the use of FIGG then strictly necessary to solve the two cases? It appears to be so, since they only want to use FIGG after they have used all other investigative means to solve these cases to no avail. Are there appropriate safeguards in place to protect the rights and freedoms of those whose data are to be used for the comparative search? One of the proponents explained that they will handle the genetic data carefully, that they will use a code, and that they will remove the data once they get the results, which means that they do not intend to store genetic data from Dutch sources in those American databases. But what technical checks do they have to employ in order to ensure that desired outcome? Does the uploading of data to an American company, albeit temporary, constitute a transfer of data to a third country? Do they have specific measures in case of a data breach at those genealogy websites while they are doing the search, like the one that previously happened at GEDMatch itself? Obviously, the NRC article is not the venue to enumerate all these safeguards and explain them, but I am particularly interested to know them, not only because they form part of my current research – albeit following the ECHR regime – but also because they can have serious repercussions for the right to genetic data privacy of those involved. The presence or absence of these safeguards could define the appropriateness of the use of FIGG in solving these cases as far as the Netherlands is concerned.

Is the consent of the subscribers of these commercial databases enough?

According to the NRC article, the answer of one of the pilot’s proponents appears to be yes, since they chose GEDMatch and FamilyTreeDNA precisely because their subscribers are given a choice to explicitly indicate their consent to law enforcement use of their genetic data. However, an intrinsic characteristic of genetic data is that it is shared – ‘dividual’, as one author puts it – which means that you may still be identified even if you do not agree with law enforcement use of your genetic data, if one of your genetic relatives consents to it. This is what happened in the Golden State Killer case. Do genetic relatives then have a say as to what I do with my DNA since they share part of it and could be identified through it? I am curious to know how the Dutch courts will resolve this difficult question and try to balance it with the prosecution’s desire to solve cold cases through FIGG.

Is there no previous experience with this issue at the EU level?

Yes, in Sweden. FIGG was used to solve a double-murder case that happened in Linköping in 2004. The Swedish police had previously investigated the case for 15 years to no avail. This included conducting more than 9,000 interrogations and mass DNA screening of 6,000 men, similar to that employed in the Nicky Verstappen and Marianne Vaatstra murder cases in the Netherlands. However, despite its success in identifying the perpetrator, the Swedish Authority for Privacy Protection prevented those involved from using FIGG to solve other cold cases until the law is changed, as the current one does not allow genetic data to be handled that way. If you read the report of the Swedish Police Authority about it, they appear to have conducted extensive legal inquiries to ensure that the use of FIGG complied with Swedish law. I am afraid that something similar will happen in the Dutch pilot study, given that it is not based on a specific law or regulation similar to the one issued by the US Department of Justice on FIGG use.

Perhaps the Dutch Government should first draw up a specific law – or at least a regulation – outlining the legal safeguards to be followed by those involved using FIGG before such a pilot study is conducted to solve cold cases in the Netherlands. Otherwise, it may be standing on a shaky foundation. That’s just my two cents.

