Current Research and Scholarly Interests
Computational drug discovery
Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
ACS Central Science
2017; 3 (10): 1103–13
We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture consisting of two recurrent neural networks, a design that has previously shown great success in other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step toward solving the challenging problem of computational retrosynthetic analysis.
View details for DOI 10.1021/acscentsci.7b00303
View details for Web of Science ID 000413697100010
View details for PubMedID 29104927
View details for PubMedCentralID PMC5658761
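As an illustration of the sequence-to-sequence framing described in the abstract above, the following minimal sketch turns one reaction into a source sequence (product tokens) and a target sequence (reactant tokens). The regular-expression tokenizer, the `<sos>`/`<eos>` markers, and the function names are assumptions made for this example and are not taken from the paper. The acetanilide reaction is a textbook example, not a record from the training set.

```python
import re

# Hypothetical SMILES tokenizer (an assumption for illustration):
# bracket atoms, the halogens Br/Cl, and two-digit ring labels such as %12
# are kept whole; everything else is split into single characters.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()./\\:~@*$]")


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable characters in {smiles!r}")
    return tokens


def make_pair(product, reactants):
    """Frame retrosynthesis as translation: product tokens -> reactant tokens."""
    source = tokenize(product)
    # Start and end markers let a decoder know where the target begins and stops.
    target = ["<sos>"] + tokenize(reactants) + ["<eos>"]
    return source, target


# Acetanilide as the product; acetyl chloride and aniline as the reactants.
source, target = make_pair("CC(=O)Nc1ccccc1", "CC(=O)Cl.Nc1ccccc1")
print(source)
print(target)
```

An encoder would read `source` and a decoder would be trained to emit `target` one token at a time. Keeping `Cl` as a single token prevents the chlorine atom from being confused with a carbon followed by a stray letter.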
Is Multitask Deep Learning Practical for Pharma?
Journal of Chemical Information and Modeling
2017; 57 (8): 2068–76
Multitask deep learning has emerged as a powerful tool for computational drug discovery. However, despite a number of preliminary studies, multitask deep networks have yet to be widely deployed in the pharmaceutical and biotech industries. This lack of acceptance stems from both software difficulties and a lack of understanding of the robustness of multitask deep networks. Our work aims to resolve both of these barriers to adoption. We introduce a high-quality open-source implementation of multitask deep networks as part of the DeepChem open-source platform. Our implementation enables simple Python scripts to construct, fit, and evaluate sophisticated deep models. We use our implementation to analyze the performance of multitask deep networks and related deep models on four collections of pharmaceutical data (three of which have not previously been analyzed in the literature). We split these data sets into training, validation, and test sets using time and neighbor splits to test multitask deep learning performance under challenging conditions. Our results demonstrate that multitask deep networks are surprisingly robust and can offer strong improvement over random forests. Our analysis and open-source implementation in DeepChem provide an argument that multitask deep networks are ready for widespread use in commercial drug discovery.
View details for DOI 10.1021/acs.jcim.7b00146
View details for PubMedID 28692267
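The time split mentioned in the abstract above can be sketched in plain Python. This sketch assumes that a time split orders compounds by date and holds out the most recent ones, so that the model is evaluated on compounds measured after those it was trained on. The record layout, the `time_split` function, and the 80/10/10 fractions are hypothetical choices for this example and do not reproduce the DeepChem API or the paper's exact protocol.

```python
from datetime import date


def time_split(records, frac_train=0.8, frac_valid=0.1):
    """Split records chronologically: oldest for training, newest for testing."""
    ordered = sorted(records, key=lambda r: r["date"])
    n_train = int(len(ordered) * frac_train)
    n_valid = int(len(ordered) * frac_valid)
    train = ordered[:n_train]
    valid = ordered[n_train:n_train + n_valid]
    test = ordered[n_train + n_valid:]  # whatever remains is the newest data
    return train, valid, test


# Hypothetical records, deliberately listed newest-first to show the sort matters.
records = [
    {"compound_id": f"cmpd-{i}", "date": date(2010 + i, 1, 1)}
    for i in reversed(range(10))
]

train, valid, test = time_split(records)
print(len(train), len(valid), len(test))
```

Unlike a random split, this arrangement never lets the model see compounds newer than those it is evaluated on, which is closer to how a model would be used in an ongoing discovery program.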
Lighting up sugars: fluorescent BODIPY-gluco-furanose and -septanose conjugates linked by direct B-O-C bonds
Organic & Biomolecular Chemistry
2016; 14 (23): 5205–5209
We report the first O-BODIPY-glucose conjugates, in which the sugar is directly attached to the BODIPY boron through covalent B-O-C bonds. The reaction of Cl-BODIPY with glucose in acetonitrile produced the 1 : 1 α-glucofuranose BODIPY (1), 1 : 2 α-glucofuranose BODIPY (2) and 1 : 2 α-glucoseptanose BODIPY (3) esters. Compound 3 is a rare instance of the unnatural septanose form of glucose, and the first example of a septanose borate.
View details for DOI 10.1039/c6ob00726k
View details for Web of Science ID 000378512400002
View details for PubMedID 27205874