A blended genome and exome sequencing method captures genetic variation in an unbiased, high-quality, and cost-effective manner
Boltz, Toni A; Chu, Benjamin B; Liao, Calwing; Sealock, Julia M; Ye, Robert; Majara, Lerato; Fu, Jack M; Service, Susan; Zhan, Lingyu; Medland, Sarah E; Chapman, Sinéad B; Rubinacci, Simone; DeFelice, Matthew; Grimsby, Jonna L; Abebe, Tamrat; Alemayehu, Melkam; Ashaba, Fred K; Atkinson, Elizabeth G; Bigdeli, Tim; Bradway, Amanda B; Brand, Harrison; Chibnik, Lori B; Fekadu, Abebaw; Gatzen, Michael; Gelaye, Bizu; Gichuru, Stella; Gildea, Marissa L; Hill, Toni C; Huang, Hailiang; Hubbard, Kalyn M; Injera, Wilfred E.; James, Roxanne; Joloba, Moses; Kachulis, Christopher; Kalmbach, Phillip R; Kamulegeya, Rogers; Kigen, Gabriel; Kim, Soyeon
Date:
2024-09-09
Abstract:
We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low-pass
whole genome (1-4x mean depth) and deep whole exome (30-40x mean depth) data in a single sequencing
run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole-genome
sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the
globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the
Populations Underrepresented in Mental Illness Associations Studies (PUMAS) Project, which included
participants across African, African American, and Latin American populations. We evaluated the accuracy of
BGE imputed genotypes against raw genotype calls from the Illumina Global Screening Array. All PUMAS
cohorts had R² concordance ≥95% among SNPs with MAF≥1%, and never fell below 90% R² for SNPs with
MAF<1%. Furthermore, concordance rates among local ancestries within two recently admixed cohorts were
consistent among SNPs with MAF≥1%, with only minor deviations in SNPs with MAF<1%. We also
benchmarked the discovery capacity of BGE to access protein-coding copy number variants (CNVs) against
deep whole genome data, finding that deletions and duplications spanning at least 3 exons had a positive
predictive value of ~90%. Our results demonstrate BGE scalability and efficacy in capturing SNPs, indels, and
CNVs in the human genome at 28% of the cost of deep whole-genome sequencing. BGE is poised to enhance
access to genomic testing and empower genomic discoveries, particularly in underrepresented populations.
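
The abstract does not say how the R² concordance was computed. As an illustration only, the sketch below takes one common definition: the squared Pearson correlation between imputed dosages and array genotype calls at each SNP, averaged within the two MAF bins the abstract reports (MAF≥1% and MAF<1%). The function names, the input layout, and the choice to estimate MAF from the array genotypes are assumptions for this example, not the authors' pipeline.

```python
def squared_pearson(x, y):
    """Squared Pearson correlation of two equal-length sequences.

    Returns None when either sequence has zero variance (e.g. a
    monomorphic SNP), since the correlation is undefined there.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return (sxy * sxy) / (sxx * syy)


def minor_allele_frequency(genotypes):
    """MAF from diploid genotype calls coded as 0, 1, or 2 alternate alleles."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def mean_r2_by_maf_bin(variants, maf_threshold=0.01):
    """Average per-SNP R² within a common and a rare MAF bin.

    `variants` is an iterable of (array_genotypes, imputed_dosages) pairs,
    one pair per SNP, with samples in the same order in both sequences.
    This layout is hypothetical, chosen for the example.
    """
    bins = {"common": [], "rare": []}
    for array_genotypes, imputed_dosages in variants:
        r2 = squared_pearson(array_genotypes, imputed_dosages)
        if r2 is None:
            continue  # skip SNPs where R² is undefined
        maf = minor_allele_frequency(array_genotypes)
        bins["common" if maf >= maf_threshold else "rare"].append(r2)
    return {
        name: (sum(values) / len(values) if values else None)
        for name, values in bins.items()
    }
```

Calling `mean_r2_by_maf_bin` on a list of per-SNP pairs returns a dictionary with a mean R² for each bin, or `None` for a bin with no usable SNPs. A real analysis would read genotypes from VCF or PLINK files and might aggregate differently, for example by pooling genotypes within a bin before computing the correlation.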