Research interests: pharmacogenomics; systems pharmacology; clinical biomarker discovery; rare variant discovery and interpretation for cancer and other diseases

Education & Certifications

  • MA, University of Edinburgh, Business and Economics (2007)

All Publications

  • SNPs2ChIP: Latent Factors of ChIP-seq to infer functions of non-coding SNPs. Pacific Symposium on Biocomputing Anand, S., Kalesinskas, L., Smail, C., Tanigawa, Y. 2019; 24: 184–95


    Genetic variations of the human genome are linked to many disease phenotypes. While whole-genome sequencing and genome-wide association studies (GWAS) have uncovered a number of genotype-phenotype associations, their functional interpretation remains challenging given most single nucleotide polymorphisms (SNPs) fall into the non-coding region of the genome. Advances in chromatin immunoprecipitation sequencing (ChIP-seq) have made large-scale repositories of epigenetic data available, allowing investigation of coordinated mechanisms of epigenetic markers and transcriptional regulation and their influence on biological function. To address this, we propose SNPs2ChIP, a method to infer biological functions of non-coding variants through unsupervised statistical learning methods applied to publicly-available epigenetic datasets. We systematically characterized latent factors by applying singular value decomposition to ChIP-seq tracks of lymphoblastoid cell lines, and annotated the biological function of each latent factor using the genomic region enrichment analysis tool. Using these annotated latent factors as reference, we developed SNPs2ChIP, a pipeline that takes genomic region(s) as an input, identifies the relevant latent factors with quantitative scores, and returns them along with their inferred functions. As a case study, we focused on systemic lupus erythematosus and demonstrated our method's ability to infer relevant biological function. We systematically applied SNPs2ChIP on publicly available datasets, including known GWAS associations from the GWAS catalogue and ChIP-seq peaks from a previously published study. Our approach to leverage latent patterns across genome-wide epigenetic datasets to infer the biological function will advance understanding of the genetics of human diseases by accelerating the interpretation of non-coding genomes.

    View details for PubMedID 30864321

  • SNPs2ChIP: Latent Factors of ChIP-seq to infer functions of non-coding SNPs Anand, S., Kalesinskas, L., Smail, C., Tanigawa, Y., Altman, R. B., Dunker, A. K., Hunter, L., Ritchie, M. D., Murray, T., Klein, T. E. WORLD SCIENTIFIC PUBL CO PTE LTD. 2019: 184–95
  • Proficiency Testing of Standardized Samples Shows Very High Interlaboratory Agreement for Clinical Next-Generation Sequencing-Based Oncology Assays. Archives of pathology & laboratory medicine Merker, J. D., Devereaux, K., Iafrate, A. J., Kamel-Reid, S., Kim, A. S., Moncur, J. T., Montgomery, S. B., Nagarajan, R., Portier, B. P., Routbort, M. J., Smail, C., Surrey, L. F., Vasalos, P., Lazar, A. J., Lindeman, N. I. 2018


    CONTEXT: Next-generation sequencing-based assays are being increasingly used in the clinical setting for the detection of somatic variants in solid tumors, but limited data are available regarding the interlaboratory performance of these assays. OBJECTIVE: To examine proficiency testing data from the initial College of American Pathologists (CAP) Next-Generation Sequencing Solid Tumor survey to report on laboratory performance. DESIGN: CAP proficiency testing results from 111 laboratories were analyzed for accuracy and associated assay performance characteristics. RESULTS: The overall accuracy observed for all variants was 98.3%. Rare false-negative results could not be attributed to sequencing platform, selection method, or other assay characteristics. The median and average of the variant allele fractions reported by the laboratories were within 10% of those orthogonally determined by digital polymerase chain reaction for each variant. The median coverage reported at the variant sites ranged from 1922 to 3297. CONCLUSIONS: Laboratories demonstrated an overall accuracy of greater than 98% with high specificity when examining 10 clinically relevant somatic single-nucleotide variants with a variant allele fraction of 15% or greater. These initial data suggest excellent performance, but further ongoing studies are needed to evaluate the performance of lower variant allele fractions and additional variant types.

    View details for PubMedID 30376374