The origin and fate of new mutations within species is the fundamental process underlying evolution. However, while much attention has been focused on characterizing the presence, frequency, and phenotypic impact of genetic variation, the evolutionary histories of most variants are largely unexplored. We have developed a nonparametric approach for estimating the date of origin of genetic variants in large-scale sequencing data sets. The accuracy and robustness of the approach are demonstrated through simulation. Using data from two publicly available human genomic diversity resources, we estimated the age of more than 45 million single-nucleotide polymorphisms (SNPs) in the human genome and release the Atlas of Variant Age as a public online database. We characterize the relationship between variant age and frequency in different geographical regions and demonstrate the value of age information in interpreting variants of functional and selective importance. Finally, we use allele age estimates to power a rapid approach for inferring the ancestry shared between individual genomes and to quantify genealogical relationships at different points in the past, as well as to describe and explore the evolutionary history of modern human populations.

Original publication

DOI

10.1371/journal.pbio.3000586

Type

Journal article

Journal

PLoS Biol

Publication Date

01/2020

Volume

18

Keywords

Age Factors; Alleles; Computer Simulation; Continental Population Groups; Datasets as Topic; Evolution, Molecular; Gene Frequency; Genetic Speciation; Genetic Variation; Genetics, Population; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Pedigree; Phylogeny; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Statistics as Topic; Time Factors