The largest study of primate genomes ever carried out, in which the DNA of 141,456 people has been compared with that of 233 other species, has revealed four million genetic mutations in the human genome that can be harmful to health.
Beyond its medical implications, the primate genome project clarifies how this order of mammals, of which the human species is a part, has evolved and reveals what genetic differences distinguish Homo sapiens from its closest relatives.
Specifically, eighty genes have been identified that have been modified throughout the evolution of hominids and that distinguish humans from other species. Some of these genes regulate the development and function of the brain.
“Our results show that by studying primates we can better understand humans,” says Tomàs Marquès-Bonet, ICREA researcher at the Institute of Evolutionary Biology (IBE) in Barcelona, who has co-led the international primate genome consortium.
The results of the research are presented this week in eight scientific articles in the journal Science, which dedicates a special issue to the project, and two additional articles in Science Advances. This is the first time that Science has dedicated a special issue to research led from Spain.
The project was born in 2018 as a result of a call from a vice president of Illumina, the world’s leading company in genomic sequencing, to Tomàs Marquès-Bonet, the researcher who had sequenced the most primate genomes at that time.
Illumina was using Marquès-Bonet’s publicly available data for genomic analyses that indicate a person’s risk of certain diseases. But it lacked the statistical power to distinguish irrelevant mutations from those that are potentially harmful.
For his part, the researcher at the IBE (a joint center of the Universitat Pompeu Fabra and the CSIC) was interested in sequencing more genomes to better understand the evolution of primates and to contribute to their conservation.
They agreed to collaborate on a project in which Marquès-Bonet has coordinated a global network of primatologists and Illumina has developed artificial intelligence algorithms to interpret information from genomes. “The great challenge of genomics is no longer sequencing the genomes but interpreting them,” explains Marquès-Bonet, who has led the project together with Kyle Farh from Illumina and Jeffrey Rogers from the Baylor College of Medicine in Houston (Texas, USA).
During the five years that the project has lasted, in which scientific institutions from more than twenty countries have participated, the genomes of 233 primate species have been sequenced, which represents 45% of the 521 currently existing species. These 233 species represent all of the 16 primate families (such as hominids or gibbons) and 86% of the genera (such as humans, chimpanzees, gorillas and orangutans, all of them from the hominid family).
Of the genomes analyzed in the project, 80% were sequenced at the National Center for Genomic Analysis (CNAG) in Barcelona.
By comparing the previously sequenced genomes of more than 140,000 people with those of the other 233 primate species, 4.3 million mutations have been identified in the human genome that modify the structure of proteins. If the change in structure leads to a change in the function of the proteins, these mutations could have health consequences.
Some 6% of these mutations (about 260,000) have also been found in other primate species. According to the researchers, these are mutations that have not been eliminated by natural selection and therefore should not cause disease.
The remaining 94% (over 4 million) have not been found in any other primate species. If they have appeared at some point in evolution in the last 60 million years, natural selection appears to have eliminated them. Therefore, if they have recently reappeared in the human species, they may be causing disease.
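The reasoning in the preceding paragraphs amounts to splitting the human mutations into two groups: those also seen in another primate, and those seen only in humans. The following is a minimal Python sketch of that split, not the consortium's actual method. The function name `partition_variants`, the variant labels and the toy data are hypothetical; only the 4.3 million total and the 6% share come from the article.

```python
def partition_variants(human_variants, primate_variants):
    """Split human variants by whether any other primate also carries them.

    Both arguments are sets of variant labels (a hypothetical format).
    Returns (shared, human_only).
    """
    shared = human_variants & primate_variants      # tolerated by selection, presumed benign
    human_only = human_variants - primate_variants  # candidates for causing disease
    return shared, human_only


# Toy data, invented purely for illustration.
human = {"chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A", "chr4:400:T>C"}
primates = {"chr1:100:A>G", "chr9:900:G>C"}
shared, human_only = partition_variants(human, primates)

# The article's own figures: 4.3 million protein-altering mutations, 6% shared.
total = 4_300_000
shared_count = total * 6 // 100          # 258,000, reported as "about 260,000"
human_only_count = total - shared_count  # 4,042,000, reported as "over 4 million"
```

Integer arithmetic is used for the percentages so that the counts are exact and match the rounded figures quoted in the text.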
Using these data, Illumina is developing the PrimateAI artificial intelligence system to pinpoint a person’s health risks from an analysis of their genome. Currently, “clinical sequencing tests often do not provide definitive diagnoses, a frustrating result for both patients and physicians,” the researchers write in Science.
The PrimateAI system will have to clarify, among the four million mutations identified, which ones can have serious health consequences and which ones are irrelevant.
A first analysis of PrimateAI applied to neurodevelopmental disorders has discovered 16 genes associated with intellectual disability that had not been identified until now. Future research into the proteins produced by these genes may shed light on how they affect nervous system development and potentially open the way to new treatments.
“Primates have a physiology very similar to ours,” concludes Marquès-Bonet. “This project represents an important step forward in interpreting mutations in the human genome.”