Improved imputation quality of low-frequency and rare variants in European samples using the ‘Genome of The Netherlands’

Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing...

Bibliographic Details
Main Authors: Deelen, Patrick; Menelaou, Androniki; van Leeuwen, Elisabeth M; Kanterakis, Alexandros; van Dijk, Freerk; Medina-Gomez, Carolina; Francioli, Laurent C; Hottenga, Jouke Jan; Karssen, Lennart C; Estrada, Karol; Kreiner-Møller, Eskil; Rivadeneira, Fernando; van Setten, Jessica; Gutierrez-Achury, Javier; Westra, Harm-Jan; Franke, Lude; van Enckevort, David; Dijkstra, Martijn; Byelas, Heorhiy; van Duijn, Cornelia M; de Bakker, Paul I W; Wijmenga, Cisca; Swertz, Morris A
Format: Online
Language: English
Published: Nature Publishing Group 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4200431/
id pubmed-4200431
recordtype oai_dc
spelling pubmed-4200431 2014-11-01
Article
Nature Publishing Group 2014-11 2014-06-04
/pmc/articles/PMC4200431/ /pubmed/24896149 http://dx.doi.org/10.1038/ejhg.2014.19
Text en
Copyright © 2014 Macmillan Publishers Limited
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
description Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputed genotypes with ‘true’ genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05–0.5%) compared with 1000G. In Dutch samples, the mean observed squared Pearson correlation, r², increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r² improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r² increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
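The accuracy measure named in the description, a mean squared Pearson correlation between imputed and ‘true’ genotypes within a MAF range, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names are hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts, imputed values are assumed to be dosages between 0 and 2, and MAF is assumed here to be estimated from the typed genotypes themselves, which the record does not specify.

```python
def squared_pearson(true_genotypes, imputed_dosages):
    """Squared Pearson correlation for one variant; None if either side has no variance."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long lists with at least two samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return None  # correlation is undefined for a monomorphic variant
    return (cov * cov) / (var_t * var_d)


def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts (assumption: estimated from the typed samples)."""
    allele_freq = sum(genotypes) / (2 * len(genotypes))
    return min(allele_freq, 1 - allele_freq)


def mean_r2_in_maf_bin(variants, maf_low, maf_high):
    """Mean r^2 over variants whose MAF falls in [maf_low, maf_high).

    `variants` is a list of (true_genotypes, imputed_dosages) pairs.
    Returns None when no variant in the bin has a defined r^2.
    """
    values = []
    for true_genotypes, imputed_dosages in variants:
        maf = minor_allele_frequency(true_genotypes)
        if maf_low <= maf < maf_high:
            r2 = squared_pearson(true_genotypes, imputed_dosages)
            if r2 is not None:
                values.append(r2)
    return sum(values) / len(values) if values else None
```

For the rare-variant range quoted above, the call would be mean_r2_in_maf_bin(variants, 0.0005, 0.005), once per reference panel, so that the resulting means can be compared across panels.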