Doctor of Philosophy, Johns Hopkins University (2013)
Master of Science, Johns Hopkins University (2010)
Bachelor of Arts, Cornell University (2008)
Carlos Bustamante, Postdoctoral Faculty Sponsor
Relative performance of gene- and pathway-level methods as secondary analyses for genome-wide association studies.
2015; 16: 34-?
Despite the success of genome-wide association studies (GWAS), there remains "missing heritability" for many traits. One contributing factor may be that markers are examined one at a time rather than as groups that are biologically meaningful in aggregate. To address this problem, a variety of gene- and pathway-level methods have been developed to identify putative biologically relevant associations. A simulation was conducted to systematically assess the performance of these methods. Using genetic data from 4,500 individuals in the Wellcome Trust Case Control Consortium (WTCCC), case-control status was simulated based on an additive polygenic model. We evaluated gene-level methods based on their sensitivity, specificity, and proportion of false positives. Pathway-level methods were evaluated on the relationship between proportion of causal genes within the pathway and the strength of association. The gene-level methods had low sensitivity (20-63%), high specificity (89-100%), and low proportion of false positives (0.1-6%). The gene-level program VEGAS using only the top 10% of associated single nucleotide polymorphisms (SNPs) within the gene had the highest sensitivity (28.6%) with less than 1% false positives. The performance of the pathway-level methods depended on whether they relied on asymptotic distributions or estimated significance in a competitive manner. The pathway-level programs GenGen, GSA-SNP and MAGENTA had the best performance while accounting for potential confounders. Novel genes and pathways can be identified using the gene- and pathway-level methods. These methods may provide valuable insight into the "missing heritability" of traits and provide biological interpretations to GWAS findings.
View details for DOI 10.1186/s12863-015-0191-2
View details for PubMedID 25887572
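The gene-level evaluation described in the abstract above rests on three summary measures. A minimal Python sketch of how they could be computed for one simulation replicate follows. The function name, the gene labels, and the definition of "proportion of false positives" as false positives divided by all genes tested are assumptions made for illustration; the abstract does not give the exact formula the authors used.

```python
def gene_level_metrics(called, causal, all_genes):
    """Summarize one method's gene calls against the simulated truth.

    called    -- genes the method declared significant
    causal    -- genes simulated as truly causal
    all_genes -- every gene tested (assumed to contain both sets above)
    """
    called, causal, all_genes = set(called), set(causal), set(all_genes)

    tp = len(called & causal)               # causal genes that were detected
    fp = len(called - causal)               # non-causal genes that were called
    fn = len(causal - called)               # causal genes that were missed
    tn = len(all_genes - called - causal)   # non-causal genes left uncalled

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Assumed definition: share of all tested genes that are false calls.
        "false_positive_proportion": fp / len(all_genes),
    }


# Hypothetical replicate: 10 genes tested, 4 causal, 3 called by the method.
all_genes = [f"G{i}" for i in range(1, 11)]
causal = ["G1", "G2", "G3", "G4"]
called = ["G1", "G2", "G5"]
metrics = gene_level_metrics(called, causal, all_genes)
print(metrics)
```

In a simulation study these values would be averaged over many replicates per method, which is how ranges such as those quoted in the abstract arise.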
Genome-wide association study of hepatitis C virus- and cryoglobulin-related vasculitis
GENES AND IMMUNITY
2014; 15 (7): 500-505
The host genetic basis of mixed cryoglobulin vasculitis is not well understood and has not been studied in large cohorts. A genome-wide association study was conducted among 356 hepatitis C virus (HCV) RNA-positive individuals with cryoglobulin-related vasculitis and 447 ethnically matched, HCV RNA-positive controls. All cases had both serum cryoglobulins and a vasculitis syndrome. A total of 899 641 markers from the Illumina HumanOmni1-Quad chip were analyzed using logistic regression adjusted for sex, as well as genetically determined ancestry. Replication of select single-nucleotide polymorphisms (SNPs) was conducted using 91 cases and 180 controls, adjusting for sex and country of origin. The most significant associations were identified on chromosome 6 near the NOTCH4 and MHC class II genes. A genome-wide significant association was detected on chromosome 6 at SNP rs9461776 (odds ratio=2.16, P=1.16 × 10^-7) between HLA-DRB1 and DQA1: this association was further replicated in additional independent samples (meta-analysis P=7.1 × 10^-9). A genome-wide significant association with cryoglobulin-related vasculitis was identified with SNPs near NOTCH4 and MHC Class II genes. The two regions are correlated and it is difficult to disentangle which gene is responsible for the association with mixed cryoglobulinemia vasculitis in this extended major histocompatibility complex region.
View details for DOI 10.1038/gene.2014.41
View details for Web of Science ID 000343960500009
View details for PubMedID 25030430
Admixture analysis of spontaneous hepatitis C virus clearance in individuals of African descent.
Genes and immunity
2014; 15 (4): 241-246
Hepatitis C virus (HCV) infects an estimated 3% of the global population with the majority of individuals (75-85%) failing to clear the virus without treatment, leading to chronic liver disease. Individuals of African descent have lower rates of clearance compared with individuals of European descent and this is not fully explained by social and environmental factors. This suggests that differences in genetic background may contribute to this difference in clinical outcome following HCV infection. Using 473 individuals and 792,721 single-nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS), we estimated local African ancestry across the genome. Using admixture mapping and logistic regression, we identified two regions of interest associated with spontaneous clearance of HCV (15q24, 20p12). A genome-wide significant variant was identified on chromosome 15 at the imputed SNP, rs55817928 (P=6.18 × 10^-8) between the genes SCAPER and RCN. Each additional copy of the African ancestral C allele is associated with 2.4 times the odds of spontaneous clearance. Conditional analysis using this SNP in the logistic regression model explained one-third of the local ancestry association. Additionally, signals of selection in this area suggest positive selection due to some ancestral pathogen or environmental pressure in African, but not in European populations.
View details for DOI 10.1038/gene.2014.11
View details for PubMedID 24622687
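The per-allele odds ratio of 2.4 reported in the abstract above can be read as a multiplicative effect on the odds for each copy of the allele, which is what an additive (log-odds) logistic model implies. The sketch below illustrates that arithmetic only. The function names and the baseline clearance probability of 0.20 are hypothetical and are not taken from the study.

```python
def relative_odds(odds_ratio_per_allele, copies):
    """Odds relative to zero copies under an additive log-odds model."""
    return odds_ratio_per_allele ** copies


def probability_with_copies(baseline_prob, odds_ratio_per_allele, copies):
    """Convert a baseline probability (zero copies) to the probability
    for a carrier of `copies` alleles, going through the odds scale."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    odds = baseline_odds * relative_odds(odds_ratio_per_allele, copies)
    return odds / (1.0 + odds)


# Hypothetical baseline: 20% clearance among carriers of zero copies.
for copies in (0, 1, 2):
    p = probability_with_copies(0.20, 2.4, copies)
    print(f"{copies} copies: relative odds {relative_odds(2.4, copies):.2f}, "
          f"clearance probability {p:.3f}")
```

Note that the odds ratio multiplies odds rather than probabilities, so two copies correspond to 2.4 × 2.4 = 5.76 times the baseline odds, not 5.76 times the baseline probability.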
Variants in HAVCR1 Gene Region Contribute to Hepatitis C Persistence in African Americans
JOURNAL OF INFECTIOUS DISEASES
2014; 209 (3): 355-359
To confirm previously identified polymorphisms in HAVCR1 that were associated with persistent hepatitis C virus (HCV) infection in individuals of African and of European descent, we studied 165 subjects of African descent and 635 subjects of European descent. Because the association was only confirmed in subjects of African descent (rs6880859; odds ratio, 2.42; P = .01), we then used 379 subjects of African descent (142 with spontaneous HCV clearance) to fine-map HAVCR1. rs111511318 was strongly associated with HCV persistence after adjusting for IL28B and HLA (adjusted P = 8.8 × 10^-4), as was one 81-kb haplotype (adjusted P = .0006). The HAVCR1 genomic region is an independent genetic determinant of HCV persistence in individuals of African descent.
View details for DOI 10.1093/infdis/jit444
View details for Web of Science ID 000329921700009
View details for PubMedID 23964107
Genome-Wide Association Study of Spontaneous Resolution of Hepatitis C Virus Infection: Data From Multiple Cohorts
ANNALS OF INTERNAL MEDICINE
2013; 158 (4): 235-?
Hepatitis C virus (HCV) infections occur worldwide and either spontaneously resolve or persist and markedly increase the person's lifetime risk for cirrhosis and hepatocellular carcinoma. Although HCV persistence occurs more often in persons of African ancestry and persons with genetic variants near interleukin-28B (IL-28B), the genetic basis is not well understood. To evaluate the host genetic basis for spontaneous resolution of HCV infection. 2-stage, genome-wide association study. 13 international multicenter study sites. 919 persons with serum HCV antibodies but no HCV RNA (spontaneous resolution) and 1482 persons with serum HCV antibodies and HCV RNA (persistence). Frequencies of 792 721 single nucleotide polymorphisms (SNPs). Differences in allele frequencies between persons with spontaneous resolution and persistence were identified on chromosomes 19q13.13 and 6p21.32. On chromosome 19, allele frequency differences localized near IL-28B and included rs12979860 (overall per-allele OR, 0.45; P = 2.17 × 10^-30) and 10 additional SNPs spanning 55 000 base pairs. On chromosome 6, allele frequency differences localized near genes for HLA class II and included rs4273729 (overall per-allele OR, 0.59; P = 1.71 × 10^-16) near DQB1*03:01 and an additional 116 SNPs spanning 1 090 000 base pairs. The associations in chromosomes 19 and 6 were independent and additive and explain an estimated 14.9% (95% CI, 8.5% to 22.6%) and 15.8% (CI, 4.4% to 31.0%) of the variation in HCV resolution in persons of European and African ancestry, respectively. Replication of the chromosome 6 SNP, rs4273729, in an additional 745 persons confirmed the findings (P = 0.015). Epigenetic effects were not studied. IL-28B and HLA class II are independently associated with spontaneous resolution of HCV infection, and SNPs marking IL-28B and DQB1*03:01 may explain approximately 15% of spontaneous resolution of HCV infection.
View details for Web of Science ID 000315580300014
View details for PubMedID 23420232
Polymorphisms in Toll-like receptor genes influence antibody responses to cytomegalovirus glycoprotein B vaccine.
BMC research notes
2012; 5: 140-?
Congenital cytomegalovirus (CMV) infection is an important medical problem that as yet has no solution. A clinical trial of CMV glycoprotein B (gB) vaccine in young women showed promising efficacy. Improved understanding of the basis for prevention of CMV infection is essential for developing improved vaccines. We genotyped 142 women previously vaccinated with three doses of CMV gB for single nucleotide polymorphisms (SNPs) in TLR 1-4, 6, 7, 9, and 10, and their associated intracellular signaling genes. SNPs in the platelet-derived growth factor receptor (PDGFRA) and integrins were also selected based on their role in binding gB. Specific SNPs in TLR7 and IKBKE (inhibitor of nuclear factor kappa-B kinase subunit epsilon) were associated with antibody responses to gB vaccine. Homozygous carriers of the minor allele at four SNPs in TLR7 showed higher vaccination-induced antibody responses to gB compared to heterozygotes or homozygotes for the common allele. SNP rs1953090 in IKBKE was associated with changes in antibody level from second to third dose of vaccine; homozygotes for the minor allele exhibited lower antibody responses while homozygotes for the major allele showed increased responses over time. These data contribute to our understanding of the immunogenetic mechanisms underlying variations in the immune response to CMV vaccine.
View details for DOI 10.1186/1756-0500-5-140
View details for PubMedID 22414065
Identification of functional genetic variation in exome sequence analysis.
2011; 5: S13-?
Recent technological advances have allowed us to study individual genomes at a base-pair resolution and have demonstrated that the average exome harbors more than 15,000 genetic variants. However, our ability to understand the biological significance of the identified variants and to connect these observed variants with phenotypes is limited. The first step in this process is to identify genetic variation that is likely to result in changes to protein structure and function, because detailed studies, either population based or functional, for each of the identified variants are not practicable. Therefore, algorithms that yield valid predictions of a variant's functional significance are needed. Over the past decade, several programs have been developed to predict the probability that an observed sequence variant will have a deleterious effect on protein function. These algorithms range from empirical programs that classify using known biochemical properties to statistical algorithms trained using a variety of data sources, including sequence conservation data, biochemical properties, and functional data. Using data from the pilot3 study of the 1000 Genomes Project available through Genetic Analysis Workshop 17, we compared the results of four programs (SIFT, PolyPhen, MAPP, and VarioWatch) used to predict the functional relevance of variants in 101 genes. Analysis was conducted without knowledge of the simulation model. Agreement between programs was modest, ranging from 59.4% to 71.4%; across all four programs, only 3.5% of variants were classified as deleterious and 10.9% as tolerated.
View details for DOI 10.1186/1753-6561-5-S9-S13
View details for PubMedID 22373437