Academic Appointments


  • Research Engineer, Bioengineering

All Publications


  • Clinical Data: Sources and Types, Regulatory Constraints, Applications CTS-CLINICAL AND TRANSLATIONAL SCIENCE Ahalt, S. C., Chute, C. G., Fecho, K., Glusman, G., Hadlock, J., Taylor, C., Pfaff, E. R., Robinson, P. N., Solbrig, H., Ta, C., Tatonetti, N., Weng, C., Bizon, C., Cox, S., Krishnamurthy, A., Stillwell, L., Xu, H., Champion, J., Peden, D. B., Arunachalam, S., Robinson, M., Rensi, S., Biomed Data Translator Consortium 2019; 12 (4): 329–33

    View details for DOI 10.1111/cts.12638

    View details for Web of Science ID 000477770200002

    View details for PubMedID 31074176

    View details for PubMedCentralID PMC6617834

  • Machine learning in chemoinformatics and drug discovery. Drug discovery today Lo, Y., Rensi, S. E., Torng, W., Altman, R. B. 2018

    Abstract

    Chemoinformatics is an established discipline focusing on extracting, processing and extrapolating meaningful data from chemical structures. With the rapid explosion of chemical 'big' data from HTS and combinatorial synthesis, machine learning has become an indispensable tool for drug designers to mine chemical information from large compound databases to design drugs with important biological properties. To process the chemical data, we first reviewed multiple processing layers in the chemoinformatics pipeline followed by the introduction of commonly used machine learning models in drug discovery and QSAR analysis. Here, we present basic principles and recent case studies to demonstrate the utility of machine learning techniques in chemoinformatics analyses; and we discuss limitations and future directions to guide further development in this evolving field.

    View details for PubMedID 29750902

  • Chemical reaction vector embeddings: towards predicting drug metabolism in the human gut microbiome. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Mallory, E. K., Acharya, A., Rensi, S. E., Turnbaugh, P. J., Bright, R. A., Altman, R. B. 2018; 23: 56–67

    Abstract

    Bacteria in the human gut have the ability to activate, inactivate, and reactivate drugs with both intended and unintended effects. For example, the drug digoxin is reduced to the inactive metabolite dihydrodigoxin by the gut Actinobacterium E. lenta, and patients colonized with high levels of drug-metabolizing strains may have limited response to the drug. Understanding the complete space of drugs that are metabolized by the human gut microbiome is critical for predicting bacteria-drug relationships and their effects on individual patient response. Discovery and validation of drug metabolism via bacterial enzymes has yielded >50 drugs after nearly a century of experimental research. However, there are limited computational tools for screening drugs for potential metabolism by the gut microbiome. We developed a pipeline for comparing and characterizing chemical transformations using continuous vector representations of molecular structure learned using unsupervised representation learning. We applied this pipeline to chemical reaction data from MetaCyc to characterize the utility of vector representations for chemical reaction transformations. After clustering molecular and reaction vectors, we performed enrichment analyses and queries to characterize the space. We detected enriched enzyme names, Gene Ontology terms, and Enzyme Commission (EC) classes within reaction clusters. In addition, we queried reactions against drug-metabolite transformations known to be metabolized by the human gut microbiome. The top results for these known drug transformations contained similar substructure modifications to the original drug pair. This work enables high-throughput screening of drugs and their resulting metabolites against chemical reactions common to gut bacteria.

    View details for PubMedID 29218869
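The querying step described in the abstract above can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that a transformation is represented as the difference between the product and substrate embeddings and that queries are ranked by cosine similarity. The function names, reaction labels, and three-dimensional toy vectors are hypothetical and are not taken from the paper or from MetaCyc.

```python
import math

def reaction_vector(substrate, product):
    """Assumed representation: product embedding minus substrate embedding."""
    return [p - s for s, p in zip(substrate, product)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_reactions(query, reactions):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine(query, vec)) for name, vec in reactions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: two known reaction vectors and one drug/metabolite pair.
known_reactions = {
    "reduction_like": [1.0, 0.0, 0.0],
    "hydrolysis_like": [0.0, 1.0, 0.0],
}
drug = [0.2, 0.5, 0.1]
metabolite = [1.1, 0.6, 0.1]

query = reaction_vector(drug, metabolite)
ranking = rank_reactions(query, known_reactions)
```

In this toy setup the drug-to-metabolite change lies mostly along the first axis, so the first entry of `ranking` is the reaction whose vector points in the same direction.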

  • Chemical reaction vector embeddings: towards predicting drug metabolism in the human gut microbiome Mallory, E. K., Acharya, A., Rensi, S. E., Turnbaugh, P. J., Bright, R. A., Altman, R. B., Altman, R. B., Dunker, A. K., Hunter, L., Ritchie, M. D., Murray, T., Klein, T. E. WORLD SCIENTIFIC PUBL CO PTE LTD. 2018: 56–67

  • Shallow Representation Learning via Kernel PCA Improves QSAR Modelability JOURNAL OF CHEMICAL INFORMATION AND MODELING Rensi, S. E., Altman, R. B. 2017; 57 (8): 1859–67

    Abstract

    Linear models offer a robust, flexible, and computationally efficient set of tools for modeling quantitative structure-activity relationships (QSARs) but have been eclipsed in performance by nonlinear methods. Support vector machines (SVMs) and neural networks are currently among the most popular and accurate QSAR methods because they learn new representations of the data that greatly improve modelability. In this work, we use shallow representation learning to improve the accuracy of L1 regularized logistic regression (LASSO) and meet the performance of Tanimoto SVM. We embedded chemical fingerprints in Euclidean space using Tanimoto (a.k.a. Jaccard) similarity kernel principal component analysis (KPCA) and compared the effects on LASSO and SVM model performance for predicting the binding activities of chemical compounds against 102 virtual screening targets. We observed similar performance and patterns of improvement for LASSO and SVM. We also empirically measured model training and cross-validation times to show that KPCA used in concert with LASSO classification is significantly faster than linear SVM over a wide range of training set sizes. Our work shows that powerful linear QSAR methods can match nonlinear methods and demonstrates a modular approach to nonlinear classification that greatly enhances QSAR model prototyping facility, flexibility, and transferability.

    View details for PubMedID 28727421

    View details for PubMedCentralID PMC5942586
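The embedding step named in the abstract above (Tanimoto similarity kernel PCA on chemical fingerprints) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: fingerprints are modeled as Python sets of on-bit indices, power iteration with deflation stands in for a proper eigensolver to keep the example dependency-free, and all function names are hypothetical. The downstream LASSO or SVM classifier is not shown.

```python
import math

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    return 1.0 if union == 0 else len(a & b) / union

def centered_kernel(fps):
    """Tanimoto kernel matrix, double-centered as kernel PCA requires."""
    n = len(fps)
    K = [[tanimoto(a, b) for b in fps] for a in fps]
    row_mean = [sum(row) / n for row in K]  # K is symmetric: row means = column means
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def kpca_embed(fps, k=2, iters=300):
    """Embed fingerprints in k dimensions using the top-k kernel principal components."""
    n = len(fps)
    M = centered_kernel(fps)
    coords = [[] for _ in range(n)]
    for c in range(k):
        # Deterministic, non-constant start vector (a constant vector lies in
        # the null space of a centered kernel matrix).
        v = [math.sin(i + 1 + c) for i in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to capture
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; clamp tiny negative round-off.
        lam = max(sum(v[i] * M[i][j] * v[j]
                      for i in range(n) for j in range(n)), 0.0)
        scale = math.sqrt(lam)
        for i in range(n):
            coords[i].append(scale * v[i])  # projection of training point i
        # Deflate so the next pass finds the next component.
        M = [[M[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Hypothetical toy fingerprints: two pairs of structurally similar molecules.
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 10}]
embedding = kpca_embed(fingerprints, k=2)
```

On this toy input the first coordinate separates the two pairs of similar fingerprints, and the resulting Euclidean vectors are the kind of features a linear classifier could then consume.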

  • Flexible Analog Search with Kernel PCA Embedded Molecule Vectors. Computational and structural biotechnology journal Rensi, S., Altman, R. B. 2017; 15: 320–327

    Abstract

    Studying analog series to find structural transformations that enhance the activity and ADME properties of lead compounds is an important part of drug development. Matched molecular pair (MMP) search is a powerful tool for analog analysis that imitates researchers' ability to select pairs of compounds that differ only by small well-defined transformations. Abstraction is a challenge for existing MMP search algorithms, which can result in the omission of relevant, inexact MMPs, and inclusion of irrelevant, contextually dissimilar MMPs. In this work, we present a new method for MMP search that returns approximate results and enables flexible control over abstraction of contextual information. We illustrate the concepts and mechanics of our method with a series of exemplar MMP queries, and then benchmark search accuracy using MMPs found by fragment indexing. We show that we can search for MMPs in a context dependent manner, and accurately approximate context independent fragment index based MMP search over a range of fingerprint and dataset conditions. Our method can be used to search for pairwise correspondences among analog sets and bolster MMP datasets where data is missing or incomplete.

    View details for DOI 10.1016/j.csbj.2017.03.003

    View details for PubMedID 28458783

  • A biotic game design project for integrated life science and engineering education. PLoS biology Cira, N. J., Chung, A. M., Denisin, A. K., Rensi, S., Sanchez, G. N., Quake, S. R., Riedel-Kruse, I. H. 2015; 13 (3)

    Abstract

    Engaging, hands-on design experiences are key for formal and informal Science, Technology, Engineering, and Mathematics (STEM) education. Robotic and video game design challenges have been particularly effective in stimulating student interest, but equivalent experiences for the life sciences are not as developed. Here we present the concept of a "biotic game design project" to motivate student learning at the interface of life sciences and device engineering (as part of a cornerstone bioengineering devices course). We provide all course material and also present efforts in adapting the project's complexity to serve other time frames, age groups, learning focuses, and budgets. Students self-reported that they found the biotic game project fun and motivating, resulting in increased effort. Hence this type of design project could generate excitement and educational impact similar to robotics and video games.

    View details for DOI 10.1371/journal.pbio.1002110

    View details for PubMedID 25807212

    View details for Web of Science ID 000352095700019

    View details for PubMedCentralID PMC4373802