Education & Certifications
Master of Science, University of Connecticut, Computer Science and Engineering (2018)
Bachelor of Science, University of Connecticut, Computer Science and Engineering (2018)
SaGePhy: An improved phylogenetic simulation framework for gene and subgene evolution.
Bioinformatics (Oxford, England)
SaGePhy is a software package for improved phylogenetic simulation of gene and subgene evolution. SaGePhy can be used to generate species trees, gene trees, and subgene or (protein) domain trees using a probabilistic birth-death process that allows for gene and subgene duplication, horizontal gene and subgene transfer, and gene and subgene loss. SaGePhy implements a range of important features not found in other phylogenetic simulation frameworks/software. These include (i) simulation of subgene or domain level evolution inside one or more gene trees, (ii) simultaneous simulation of both additive and replacing horizontal gene/subgene transfers, and (iii) probabilistic sampling of species tree and gene tree nodes, respectively, for gene-family and domain-family birth. SaGePhy is open-source, platform independent, and written in Java and Python. Executables, source code (open-source under the revised BSD licence), and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/sagephy/.
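As a rough illustration of the kind of birth-death process the abstract refers to, the sketch below simulates gene duplication and loss along a single branch of fixed length. It is not SaGePhy's implementation: the function name, its parameters, and the restriction to duplication and loss only (no horizontal transfer, no subgene level) are simplifying assumptions made for this example.

```python
import random


def simulate_birth_death(dup_rate, loss_rate, duration, seed=None):
    """Hypothetical helper: simulate gene copies along one branch.

    Starts with a single gene copy. Each copy duplicates at rate
    `dup_rate` and is lost at rate `loss_rate`, independently.
    Returns the final copy number and the list of (time, event) pairs.
    """
    rng = random.Random(seed)
    copies, time, events = 1, 0.0, []
    while copies > 0:
        total_rate = copies * (dup_rate + loss_rate)
        if total_rate <= 0:
            break  # no events can occur when both rates are zero
        # Waiting time to the next event is exponentially distributed.
        time += rng.expovariate(total_rate)
        if time >= duration:
            break  # the branch ends before the next event
        # Choose the event type in proportion to its rate.
        if rng.random() < dup_rate / (dup_rate + loss_rate):
            copies += 1
            events.append((time, "duplication"))
        else:
            copies -= 1
            events.append((time, "loss"))
    return copies, events


copies, events = simulate_birth_death(0.3, 0.2, duration=5.0, seed=1)
```

A full simulator would run such a process along every branch of a species tree and record the resulting gene tree; this sketch only tracks the copy number.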
View details for DOI 10.1093/bioinformatics/btz081
View details for PubMedID 30715213
RANGER-DTL 2.0: rigorous reconstruction of gene-family evolution by duplication, transfer and loss
Bioinformatics. 2018; 34 (18): 3214–16
RANGER-DTL 2.0 is a software program for inferring gene family evolution using Duplication-Transfer-Loss reconciliation. This new software is highly scalable and easy to use, and offers many new features not currently available in any other reconciliation program. RANGER-DTL 2.0 has a particular focus on reconciliation accuracy and can account for many sources of reconciliation uncertainty including uncertain gene tree rooting, gene tree topological uncertainty, multiple optimal reconciliations and alternative event cost assignments. RANGER-DTL 2.0 is open-source and written in C++ and Python. Pre-compiled executables, source code (open-source under GNU GPL) and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/bty314
View details for Web of Science ID 000446433800022
View details for PubMedID 29688310
View details for PubMedCentralID PMC6137995
On the impact of uncertain gene tree rooting on duplication-transfer-loss reconciliation
BMC Bioinformatics. 2018: 290
Duplication-Transfer-Loss (DTL) reconciliation is a powerful and increasingly popular technique for studying the evolution of microbial gene families. DTL reconciliation requires the use of rooted gene trees to perform the reconciliation with the species tree, and the standard technique for rooting gene trees is to assign a root that results in the minimum reconciliation cost across all rootings of that gene tree. However, even though it is well understood that many gene trees have multiple optimal roots, only a single optimal root is randomly chosen to create the rooted gene tree and perform the reconciliation. This remains an important overlooked and unaddressed problem in DTL reconciliation, leading to incorrect evolutionary inferences. In this work, we perform an in-depth analysis of the impact of uncertain gene tree rooting on the computed DTL reconciliation and provide the first computational tools to quantify and negate the impact of gene tree rooting uncertainty on DTL reconciliation. Our analysis of a large data set of over 4500 gene families from 100 species shows that a large fraction of gene trees have multiple optimal rootings, that these multiple roots often, but not always, appear closely clustered together in the same region of the gene tree, that many aspects of the reconciliation remain conserved across the multiple rootings, that gene tree error has a profound impact on the prevalence and structure of multiple optimal rootings, and that there are specific interesting patterns in the reconciliation of those gene trees that have multiple optimal roots. Our results show that unrooted gene trees can be meaningfully reconciled and high-quality evolutionary information can be obtained from them even after accounting for multiple optimal rootings. In addition, the techniques and tools introduced in this paper make it possible to systematically avoid incorrect evolutionary inferences caused by incorrect or uncertain gene tree rooting. These tools have been implemented in the phylogenetic reconciliation software package RANGER-DTL 2.0, freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/ .
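The rooting issue described in the abstract can be shown with a small sketch: instead of picking one minimum-cost root at random, collect every rooting that ties for the minimum reconciliation cost. This is not RANGER-DTL's code or interface. The function name, the edge labels, and the cost values are hypothetical, and the reconciliation costs are assumed to have been computed already by some other procedure.

```python
def optimal_rootings(rooting_costs):
    """Hypothetical helper: find all rootings with minimum cost.

    `rooting_costs` maps a candidate root (here, an edge label of the
    unrooted gene tree) to its reconciliation cost, assumed precomputed.
    Returns the minimum cost and the sorted list of rootings achieving it.
    """
    if not rooting_costs:
        raise ValueError("no candidate rootings supplied")
    best = min(rooting_costs.values())
    # Keep every tie rather than one arbitrarily chosen optimum.
    ties = sorted(r for r, cost in rooting_costs.items() if cost == best)
    return best, ties


# Made-up costs for rooting on each edge of a small unrooted gene tree.
costs = {"e1": 7, "e2": 5, "e3": 5, "e4": 9}
best_cost, roots = optimal_rootings(costs)
```

With these made-up costs, two edges tie for the minimum, so a downstream analysis would need to check which inferences hold under both rootings.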
View details for DOI 10.1186/s12859-018-2269-0
View details for Web of Science ID 000442105800011
View details for PubMedID 30367593