Clinical Associate Professor, Medicine - Biomedical Informatics Research
M.S., University of Missouri-Kansas City, Computer Science (2002)
Postdoctoral Fellow, University of Kansas Medical Center, Molecular Modeling (2001)
Ph.D., Case Western Reserve University, Biochemistry (1999)
M.D., JIPMER, Biochemistry (1993)
M.B.B.S, Jawaharlal Institute of Post-Graduate Medical Education and Research (JIPMER), Medicine (1990)
Current Research and Scholarly Interests
Method development and insightful informatics based on my training as a physician, biochemist and computer scientist: Methods for representing, capturing and integrating emerging or expert biomedical knowledge to improve computational predictions of biological and clinical relevance. Methods for evaluating predictions based on machine learning. Interventional and causal predictions. Informatics research on problems in oncology, radiology and allergy/immunology.
- Semantic Changepoint Detection for Finding Potentially Novel Research Publications Pacific Symposium on Biocomputing World Scientific Publishing Company. 2021: 107–118
- Informatics Analysis of Cross-Reactivity of Food Allergens in South Asian Cuisine MOSBY-ELSEVIER. 2019: AB238
Exposure to NO2, CO, and PM2.5 is linked to regional DNA methylation differences in asthma
2018; 10: 2
DNA methylation of CpG sites on genetic loci has been linked to increased risk of asthma in children exposed to elevated ambient air pollutants (AAPs). Further identification of specific CpG sites and the pollutants that are associated with methylation of these CpG sites in immune cells could impact our understanding of asthma pathophysiology. In this study, we sought to identify some CpG sites in specific genes that could be associated with asthma regulation (Foxp3 and IL10) and to identify the different AAPs for which exposure prior to the blood draw is linked to methylation levels at these sites. We recruited subjects from Fresno, California, an area known for high levels of AAPs. Blood samples and responses to questionnaires were obtained (n = 188), and in a subset of subjects (n = 33), repeat samples were collected 2 years later. Average measures of AAPs were obtained for 1, 15, 30, 90, 180, and 365 days prior to each blood draw to estimate the short-term vs. long-term effects of the AAP exposures. Asthma was significantly associated with higher differentially methylated regions (DMRs) of the Foxp3 promoter region (p = 0.030) and the IL10 intronic region (p = 0.026). Additionally, at the 90-day time period (90 days prior to the blood draw), Foxp3 methylation was positively associated with NO2, CO, and PM2.5 exposures (p = 0.001, p = 0.001, and p = 0.012, respectively). In the subset of subjects retested 2 years later (n = 33), a positive association between AAP exposure and methylation was sustained. There was also a negative correlation between the average Foxp3 methylation of the promoter region and activated Treg levels (p = 0.039) and a positive correlation between the average IL10 methylation of region 3 of intron 4 and IL10 cytokine expression (p = 0.030). Short-term and long-term exposures to high levels of CO, NO2, and PM2.5 were associated with alterations in differentially methylated regions of Foxp3. IL10 methylation showed a similar trend. For any given individual, these changes tend to be sustained over time. In addition, asthma was associated with higher differentially methylated regions of Foxp3 and IL10.
View details for PubMedID 29317916
- TOCSOC: A temporal ontology for comparing the survival outcomes of clinical trials in oncology International Conference on Biological Ontology CEUR Workshop Proceedings Vol. 2285. 2018
Automated Prediction of Hepatic Arterial Stenosis.
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
2017; 2017: 58–65
Several thousand life-saving liver transplants are performed each year. One of the most common causes of early transplant failure is arterial stenosis of the anastomotic junction. Early detection of transplant arterial stenosis can help prevent transplant failure and the need to re-transplant. Doppler ultrasound is the most common screening method, but it suffers from poor specificity. Positive screening cases proceed to angiography which is an invasive and expensive procedure. A more accurate test could decrease the number of normal patients who would have to undergo this invasive diagnostic procedure. We present a turnkey clinical decision support tool for automated prediction of stenosis based on Fourier spectrum analysis of Doppler sonograms to compute a Stenosis Index that has been shown to have higher accuracy than traditional measures. The results of the automated approach compare favorably with the manual approach. Software is available from the authors on request.
View details for PubMedID 28815106
View details for PubMedCentralID PMC5543337
Evaluating proteins for potential allergenicity using bioinformatic approaches.
Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology
2017; 119 (3): 197–98
View details for PubMedID 28890013
Engineered Tissue Inhibitor of Metalloproteinases-3 Variants Resistant to Endocytosis Have Prolonged Chondroprotective Activity.
JOURNAL OF BIOLOGICAL CHEMISTRY
2016; 291 (42): 22160-22172
Tissue inhibitor of metalloproteinases-3 (TIMP-3) is a central inhibitor of matrix-degrading and sheddase families of metalloproteinases. Extracellular levels of the inhibitor are regulated by the balance between its retention on the extracellular matrix and its endocytic clearance by the scavenger receptor low density lipoprotein receptor-related protein 1 (LRP1). Here, we used molecular modeling to predict TIMP-3 residues potentially involved in binding to LRP1 based on the proposed LRP1 binding motif of 2 lysine residues separated by about 21 Å and mutated the candidate lysine residues to alanine individually and in pairs. Of the 22 mutants generated, 13 displayed a reduced rate of uptake by HTB94 chondrosarcoma cells. The two mutants (TIMP-3 K26A/K45A and K42A/K110A) with lowest rates of uptake were further evaluated and found to display reduced binding to LRP1 and unaltered inhibitory activity against prototypic metalloproteinases. TIMP-3 K26A/K45A retained higher affinity for sulfated glycosaminoglycans than K42A/K110A and exhibited increased affinity for ADAMTS-5 in the presence of heparin. Both mutants inhibited metalloproteinase-mediated degradation of cartilage at lower concentrations and for longer than wild-type TIMP-3, indicating that their increased half-lives improved their ability to protect cartilage. These mutants may be useful in treating connective tissue diseases associated with increased metalloproteinase activity.
View details for PubMedID 27582494
Association of tree nut and coconut sensitizations.
Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology
2016; 117 (4): 412-416
Coconut (Cocos nucifera), despite being a drupe, was added to the US Food and Drug Administration list of tree nuts in 2006, causing potential confusion regarding the prevalence of coconut allergy among tree nut allergic patients. To determine whether sensitization to tree nuts is associated with increased odds of coconut sensitization. A single-center retrospective analysis of serum specific IgE levels to coconut, tree nuts (almond, Brazil nut, cashew, chestnut, hazelnut, macadamia, pecan, pistachio, and walnut), and controls (milk and peanut) was performed using deidentified data from January 2000 to August 2012. Spearman correlation (ρ) between coconut and each tree nut was determined, followed by hierarchical clustering. Sensitization was defined as a nut specific IgE level of 0.35 kU/L or higher. Unadjusted and adjusted associations between coconut and tree nut sensitization were tested by logistic regression. Of 298 coconut IgE values, 90 (30%) were considered positive results, with a mean (SD) of 1.70 (8.28) kU/L. Macadamia had the strongest correlation (ρ = 0.77), whereas most other tree nuts had significant (P < .05) but low correlation (ρ < 0.5) with coconut. The adjusted odds ratio between coconut and macadamia was 7.39 (95% confidence interval, 2.60-21.02; P < .001) and 5.32 (95% confidence interval, 2.18-12.95; P < .001) between coconut and almond, with other nuts not being statistically significant. Our findings suggest that although sensitization to most tree nuts appears to correlate with coconut, this is largely explained by sensitization to almond and macadamia. This finding has not previously been reported in the literature. Further study correlating these results with clinical symptoms is planned.
View details for DOI 10.1016/j.anai.2016.07.023
View details for PubMedID 27566863
Constellation: a tool for rapid, automated phenotype assignment of a highly polymorphic pharmacogene, CYP2D6, from whole-genome sequences
NPJ GENOMIC MEDICINE
2016; 1: 15007
An important component of precision medicine, the use of whole-genome sequencing (WGS) to guide lifelong healthcare, is electronic decision support to inform drug choice and dosing. To achieve this, automated identification of genetic variation in genes involved in drug absorption, distribution, metabolism, excretion and response (ADMER) is required. CYP2D6 is a major enzyme for drug bioactivation and elimination. CYP2D6 activity is predominantly governed by genetic variation; however, it is technically arduous to haplotype. Not only is the nucleotide sequence of CYP2D6 highly polymorphic, but the locus also features diverse structural variations, including gene deletion, duplication, multiplication events and rearrangements with the nonfunctional, neighbouring CYP2D7 and CYP2D8 genes. We developed Constellation, a probabilistic scoring system, enabling automated ascertainment of CYP2D6 activity scores from 2×100 paired-end WGS. The consensus reference method included TaqMan genotyping assays, quantitative copy-number variation determination and Sanger sequencing. When compared with the consensus reference, Constellation had an analytic sensitivity of 97% (59 of 61 diplotypes) and analytic specificity of 95% (116 of 122 haplotypes). All extreme phenotypes, i.e., poor and ultrarapid metabolisers, were accurately identified by Constellation. Constellation is anticipated to be extensible to functional variation in all ADMER genes, and to be performed at marginal incremental financial and computational costs in the setting of diagnostic WGS.
View details for DOI 10.1038/npjgenmed.2015.7
View details for Web of Science ID 000413227500002
View details for PubMedID 29263805
View details for PubMedCentralID PMC5685293
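The sensitivity and specificity percentages quoted in the Constellation abstract follow directly from the counts it reports. A minimal arithmetic check, where the helper name `percent` is illustrative and not taken from the paper:

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator

# Counts quoted in the abstract above.
sensitivity = percent(59, 61)    # 59 of 61 diplotypes
specificity = percent(116, 122)  # 116 of 122 haplotypes

print(f"analytic sensitivity = {sensitivity:.1f}%")
print(f"analytic specificity = {specificity:.1f}%")
```

Rounded to whole numbers, these give the 97% and 95% stated in the abstract.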
- A Semantic Framework for Intelligent Matchmaking for Clinical Trial Eligibility Criteria ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY 2013; 4 (4)
Finding disease similarity based on implicit semantic similarity
JOURNAL OF BIOMEDICAL INFORMATICS
2012; 45 (2): 363-371
Genomics has contributed to a growing collection of gene-function and gene-disease annotations that can be exploited by informatics to study similarity between diseases. This can yield insight into disease etiology, reveal common pathophysiology and/or suggest treatment that can be appropriated from one disease to another. Estimating disease similarity solely on the basis of shared genes can be misleading as variable combinations of genes may be associated with similar diseases, especially for complex diseases. This deficiency can be potentially overcome by looking for common biological processes rather than only explicit gene matches between diseases. The use of semantic similarity between biological processes to estimate disease similarity could enhance the identification and characterization of disease similarity. We present functions to measure similarity between terms in an ontology, and between entities annotated with terms drawn from the ontology, based on both co-occurrence and information content. The similarity measure is shown to outperform other measures used to detect similarity. A manually curated dataset with known disease similarities was used as a benchmark to compare the estimation of disease similarity based on gene-based and Gene Ontology (GO) process-based comparisons. The detection of disease similarity based on semantic similarity between GO Processes (Recall=55%, Precision=60%) performed better than using exact matches between GO Processes (Recall=29%, Precision=58%) or gene overlap (Recall=88% and Precision=16%). The GO-Process based disease similarity scores on an external test set show statistically significant Pearson correlation (0.73) with numeric scores provided by medical residents. GO-Processes associated with similar diseases were found to be significantly regulated in gene expression microarray datasets of related diseases.
View details for DOI 10.1016/j.jbi.2011.11.017
View details for Web of Science ID 000302208700015
View details for PubMedID 22166490
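The three precision/recall pairs reported in the disease-similarity abstract trade off in different directions, so a single combined number can make the comparison easier to read. The sketch below uses the F1 score (harmonic mean of precision and recall). F1 is an assumption introduced here for illustration; the abstract itself reports only precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs as quoted in the abstract above.
reported = {
    "GO-Process semantic similarity": (0.60, 0.55),
    "GO-Process exact match": (0.58, 0.29),
    "gene overlap": (0.16, 0.88),
}

# F1 is not reported in the abstract; it is computed here only to compare methods.
scores = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: F1 = {score:.2f}")
```

Under this measure the semantic-similarity approach ranks first, which is consistent with the abstract's statement that it performed better than exact matching or gene overlap.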
Drug repositioning using disease associated biological processes and network analysis of drug targets.
AMIA ... Annual Symposium proceedings. AMIA Symposium
2011; 2011: 305-311
The analysis of disease using protein-protein interaction networks and network pharmacology has enabled better understanding of disease etiology and drug action. New insights into disease etiology and a better understanding of biological subsystems have opened up the possibility of finding new uses for existing drugs besides their original medical indication. We present an approach which makes use of the biological processes associated with diseases along with their known drugs and drug targets to predict Biological Process-Drug relationships. Network analysis is used to further refine these associations to eventually predict new Disease-Drug relationships. The approach is validated by the observation that, out of 2078 predicted disease-drug relationships, 401 (18.1%) have been used in a clinical trial.
View details for PubMedID 22195082
Automated ontological gene annotation for computing disease similarity.
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
2010; 2010: 12-16
The annotation of gene/gene products with information on associated diseases is useful as an aid to clinical diagnosis and drug discovery. Several supervised and unsupervised methods exist that automate the association of genes with diseases, but relatively little work has been done to map protein sequence data to disease terminologies. This paper augments an existing open-disease terminology, the Disease Ontology (DO), and uses it for automated annotation of Swissprot records. In addition to the inherent benefits of mapping data to a rich ontology, we demonstrate a gain of 36.1% in gene-disease associations compared to that in DO. Further, we measure disease similarity by exploiting the co-occurrence of annotation among proteins and the hierarchical structure of DO. This makes it possible to find related diseases or signs, with the potential to find previously unknown relationships.
View details for PubMedID 21347137
MachineProse: An ontological framework for scientific assertions
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
2006; 13 (2): 220-232
The idea of testing a hypothesis is central to the practice of biomedical research. However, the results of testing a hypothesis are published mainly in the form of prose articles. Encoding the results as scientific assertions that are both human and machine readable would greatly enhance the synergistic growth and dissemination of knowledge. We have developed MachineProse (MP), an ontological framework for the concise specification of scientific assertions. MP is based on the idea of an assertion constituting a fundamental unit of knowledge. This is in contrast to current approaches that use discrete concept terms from domain ontologies for annotation and assertions are only inferred heuristically. We use illustrative examples to highlight the advantages of MP over the use of the Medical Subject Headings (MeSH) system and keywords in indexing scientific articles. We show how MP makes it possible to carry out semantic annotation of publications that is machine readable and allows for precise search capabilities. In addition, when used by itself, MP serves as a knowledge repository for emerging discoveries. A prototype for proof of concept has been developed that demonstrates the feasibility and novel benefits of MP. As part of the MP framework, we have created an ontology of relationship types with about 100 terms optimized for the representation of scientific assertions. MachineProse is a novel semantic framework that we believe may be used to summarize research findings, annotate biomedical publications, and support sophisticated searches.
View details for DOI 10.1197/jamia.M1910
View details for Web of Science ID 000236118000013
View details for PubMedID 16357355
View details for PubMedCentralID PMC1447552
Ontological modeling of transformation in heart defect diagrams.
AMIA ... Annual Symposium proceedings. AMIA Symposium
The accurate portrayal of a large volume of data on variable heart defects is crucial to providing good patient care in pediatric cardiology. Our research aims to span the universe of congenital heart defects by generating illustrative diagrams that enhance data interpretation. To accommodate the range and severity of defects to be represented, we base our diagrams on transformation models applied to a normal heart rather than a static set of defects. These models are based on a domain-specific ontology, clustering, association rule mining and the use of parametric equations specified in a mathematical programming language.
View details for PubMedID 17238451
Tandem machine learning for the identification of genes regulated by transcription factors
The identification of promoter regions that are regulated by a given transcription factor has traditionally relied upon the identification and distributions of binding sites recognized by the factor. In this study, we have developed a tandem machine learning approach for the identification of regulatory target genes based on these parameters and on the corresponding binding site information contents that measure the affinities of the factor for these cognate elements. This method has been validated using models of DNA binding sites recognized by the xenobiotic-sensitive nuclear receptor, PXR/RXRalpha, for target genes within the human genome. An information theory-based weight matrix was first derived and refined from known PXR/RXRalpha binding sites. The promoter region of candidate genes was scanned with the weight matrix. A novel information density-based clustering algorithm was then used to identify clusters of information rich sites. Finally, transformed data representing metrics of location, strength and clustering of binding sites were used for classification of promoter regions using an ensemble approach involving neural networks, decision trees and Naïve Bayesian classification. The method was evaluated on a set of 24 known target genes and 288 genes known not to be regulated by PXR/RXRalpha. We report an average accuracy (proportion of correctly classified promoter regions) of 71%, sensitivity of 73%, and specificity of 70%, based on multiple cross-validation and the leave-one-out strategy. The performance on a test set of 13 genes showed that 10 were correctly classified. We have developed a machine learning approach for the successful detection of gene targets for transcription factors with high accuracy. The method has been validated for the transcription factor PXR/RXRalpha and has the potential to be extended to other transcription factors.
View details for DOI 10.1186/1471-2105-6-204
View details for Web of Science ID 000231855800001
View details for PubMedID 16115317
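The first two steps described in the tandem machine learning abstract (deriving an information-based weight matrix from known binding sites, then scanning a promoter with it) can be illustrated with a small sketch. This is not the paper's implementation: it uses one common per-base weight, 2 + log2(frequency) bits, with a simple pseudocount, and the example sites and promoter string are made-up toy data. The function names are likewise hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def weight_matrix(sites, pseudocount=1.0):
    """Build per-position weights of 2 + log2(frequency) bits from aligned sites."""
    matrix = []
    total = len(sites) + len(BASES) * pseudocount
    for position in range(len(sites[0])):
        counts = Counter(site[position] for site in sites)
        column = {
            base: 2.0 + math.log2((counts[base] + pseudocount) / total)
            for base in BASES
        }
        matrix.append(column)
    return matrix

def scan(matrix, sequence):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(matrix)
    results = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(matrix, window))
        results.append((start, score))
    return results

# Toy aligned sites and promoter, invented purely for illustration.
toy_sites = ["AGGTCA", "AGGTCA", "AGTTCA", "AGGTCA"]
toy_promoter = "TTAGGTCATT"

matrix = weight_matrix(toy_sites)
hits = scan(matrix, toy_promoter)
best_start, best_score = max(hits, key=lambda hit: hit[1])
print(f"best window starts at {best_start} with score {best_score:.2f} bits")
```

In the study, scores of this kind were further combined with site location and clustering metrics before classification; those later stages are not sketched here.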
- ConsDiff: an algorithm for the detection of conserved differences between protein sequences ELSEVIER SCIENCE BV. 2005: 31–43
An informatics search for the low-molecular weight chromium-binding peptide.
BMC chemical biology
2004; 4 (1): 2-?
BACKGROUND: The amino acid composition of a low molecular weight chromium binding peptide (LMWCr), isolated from bovine liver, is reportedly E:G:C:D::4:2:2:2, though its sequence has not been discovered. There is some controversy surrounding the exact biochemical forms and the action of Cr(III) in biological systems; the topic has been the subject of many experimental reports and continues to be investigated. Clarification of Cr-protein interactions will further the understanding of Cr(III) biochemistry and provide a basis for novel therapies based on metallocomplexes or small molecules. RESULTS: A genomic search of the non-redundant database for all possible decapeptides of the reported composition yields three exact matches, EDGEECDCGE, DGEECDCGEE and CEGGCEEDDE. The first two sequences are found in ADAM 19 (A Disintegrin and Metalloproteinase domain 19) proteins in man and mouse; the last is found in a protein kinase in rice (Oryza sativa). A broader search for pentameric sequences (and assuming a disulfide dimer) corresponding to the stoichiometric ratio E:D:G:C::2:1:1:1, within the set of human proteins and the set of proteins in, or related to, the insulin signaling pathway, yields a match at an acidic region in the alpha-subunit of the insulin receptor (-EECGD-, residues 175-184). A synthetic peptide derived from this sequence binds chromium(III) and forms a metal-peptide complex that has properties matching those reported for isolated LMWCr and Cr(III)-containing peptide fractions. CONCLUSION: The search for an acidic decameric sequence indicates that LMWCr may not be a contiguous sequence. The identification of a distinct pentameric sequence in a significant insulin-signaling pathway protein suggests a possible identity for the LMWCr peptide. This identification clarifies directions for further investigation of LMWCr peptide fractions, chromium bio-coordination chemistry and a possible role in the insulin signaling pathway. Implications for models of chromium action in the insulin-signaling pathway are discussed.
View details for PubMedID 15603587
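The composition-based search described in the LMWCr abstract amounts to finding every window of a protein sequence whose amino acid counts equal a target composition, regardless of order. A minimal sketch of that idea follows; the function name and the short test sequence are hypothetical, and the actual database search in the paper may have been carried out differently.

```python
from collections import Counter

def composition_matches(sequence, composition):
    """Return (start, window) for each window whose residue counts equal the target."""
    target = Counter(composition)
    width = sum(target.values())
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if Counter(window) == target:
            hits.append((start, window))
    return hits

# Reported decapeptide composition E:G:C:D::4:2:2:2.
lmwcr_composition = {"E": 4, "G": 2, "C": 2, "D": 2}

# Hypothetical toy sequence embedding two of the matches named in the abstract.
toy_sequence = "AAEDGEECDCGEEKK"
decamer_hits = composition_matches(toy_sequence, lmwcr_composition)
print(decamer_hits)

# Pentamer composition E:D:G:C::2:1:1:1, checked against the reported EECGD motif.
pentamer_hits = composition_matches("EECGD", {"E": 2, "D": 1, "G": 1, "C": 1})
print(pentamer_hits)
```

Because the two ADAM 19 matches overlap by nine residues, a sliding window reports them at adjacent start positions. Recounting each window with `Counter` keeps the sketch short; a production search over a large database would update the counts incrementally.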
Collagenase unwinds triple-helical collagen prior to peptide bond hydrolysis
2004; 23 (15): 3020-3030
Breakdown of triple-helical interstitial collagens is essential in embryonic development, organ morphogenesis and tissue remodelling and repair. Aberrant collagenolysis may result in diseases such as arthritis, cancer, atherosclerosis, aneurysm and fibrosis. In vertebrates, it is initiated by collagenases belonging to the matrix metalloproteinase (MMP) family. The three-dimensional structure of a prototypic collagenase, MMP-1, indicates that the substrate-binding site of the enzyme is too narrow to accommodate triple-helical collagen. Here we report that collagenases bind and locally unwind the triple-helical structure before hydrolyzing the peptide bonds. Mutation of the catalytically essential residue Glu200 of MMP-1 to Ala resulted in a catalytically inactive enzyme, but in its presence noncollagenolytic proteinases digested collagen into typical 3/4 and 1/4 fragments, indicating that the MMP-1(E200A) mutant unwinds the triple-helical collagen. The study also shows that MMP-1 preferentially interacts with the alpha2(I) chain of type I collagen and cleaves the three alpha chains in succession. Our results throw light on the basic mechanisms that control a wide range of biological and pathological processes associated with tissue remodelling.
View details for DOI 10.1038/sj.emboj.7600318
View details for Web of Science ID 000223729400012
View details for PubMedID 15257288
Identification of the (183)RWTNNFREY(191) region as a critical segment of matrix metalloproteinase 1 for the expression of collagenolytic activity
JOURNAL OF BIOLOGICAL CHEMISTRY
2000; 275 (38): 29610-29617
Matrix metalloproteinase 1 (MMP-1) cleaves types I, II, and III collagen triple helices into (3/4) and (1/4) fragments. To understand the structural elements responsible for this activity, various lengths of MMP-1 segments have been introduced into MMP-3 (stromelysin 1) starting from the C-terminal end. MMP-3/MMP-1 chimeras and variants were overexpressed in Escherichia coli, folded from inclusion bodies, and isolated as zymogens. After activation, recombinant chimeras were tested for their ability to digest triple helical type I collagen at 25 degrees C. The results indicate that the nine residues (183)RWTNNFREY(191) located between the fifth beta-strand and the second alpha-helix in the catalytic domain of MMP-1 are critical for the expression of collagenolytic activity. Mutation of Tyr(191) of MMP-1 to Thr, the corresponding residue in MMP-3, reduced collagenolytic activity about 5-fold. Replacement of the nine residues with those of the MMP-3 sequence further decreased the activity 2-fold. Those variants exhibited significant changes in substrate specificity and activity against gelatin and synthetic substrates, further supporting the notion that this region plays a critical role in the expression of collagenolytic activity. However, introduction of this sequence into MMP-3 or a chimera consisting of the catalytic domain of MMP-3 with the hinge region and the C-terminal hemopexin domain of MMP-1 did not express any collagenolytic activity. It is therefore concluded that RWTNNFREY, together with the C-terminal hemopexin domain, is essential for collagenolytic activity but that additional structural elements in the catalytic domain are also required. These elements probably act in a concerted manner to cleave the collagen triple helix.
View details for Web of Science ID 000089439800058
View details for PubMedID 10871619
Tissue inhibitors of metalloproteinases: evolution, structure and function
BIOCHIMICA ET BIOPHYSICA ACTA-PROTEIN STRUCTURE AND MOLECULAR ENZYMOLOGY
2000; 1477 (1-2): 267-283
The matrix metalloproteinases (MMPs) play a key role in the normal physiology of connective tissue during development, morphogenesis and wound healing, but their unregulated activity has been implicated in numerous disease processes including arthritis, tumor cell metastasis and atherosclerosis. An important mechanism for the regulation of the activity of MMPs is via binding to a family of homologous proteins referred to as the tissue inhibitors of metalloproteinases (TIMP-1 to TIMP-4). The two-domain TIMPs are of relatively small size, yet have been found to exhibit several biochemical and physiological/biological functions, including inhibition of active MMPs, proMMP activation, cell growth promotion, matrix binding, inhibition of angiogenesis and the induction of apoptosis. Mutations in TIMP-3 are the cause of Sorsby's fundus dystrophy in humans, a disease that results in early onset macular degeneration. This review highlights the evolution of TIMPs, the recently elucidated high-resolution structures of TIMPs and their complexes with metalloproteinases, and the results of mutational and other studies of structure-function relationships that have enhanced our understanding of the mechanism and specificity of the inhibition of MMPs by TIMPs. Several intriguing questions, such as the basis of the multiple biological functions of TIMPs, the kinetics of TIMP-MMP interactions and the differences in binding in some TIMP-metalloproteinase pairs are discussed which, though not fully resolved, serve to illustrate the kind of issues that are important for a full understanding of the interactions between families of molecules.
View details for Web of Science ID 000085998500021
View details for PubMedID 10708863
- Variable-Temperature Mount for a Microliter-Raman Cell Applied Spectroscopy 2000; 54 (1): 153-154
Electric fields in active sites: substrate switching from null to strong fields in thiol- and selenol-subtilisins.
1999; 38 (20): 6659–67
Although known to be important factors in promoting catalysis, electric field effects in enzyme active sites are difficult to characterize from an experimental standpoint. Among optical probes of electric fields, Raman spectroscopy has the advantage of being able to distinguish electronic ground-state and excited-state effects. Earlier Raman studies on acyl derivatives of cysteine proteases [Doran, J. D., and Carey, P. R. (1996) Biochemistry 35, 12495-502], where the acyl group has extensive pi-electron conjugation, showed that electric field effects in the active site manifest themselves by polarizing the pi-electrons of the acyl group. Polarization gives rise to large shifts in certain Raman bands, e.g., the C=C stretching band of the alpha,beta-unsaturated acyl group, and a large red shift in the absorption maximum. It was postulated that a major source of polarization is the alpha-helix dipole that originates from the alpha-helix terminating at the active-site cysteine of the cysteine protease family. In contrast, using the acyl group 5-methylthiophene acryloyl (5-MTA) as an active-site Raman probe, acyl enzymes of thiol- or selenol-subtilisin exhibit no polarization even though the acylating amino acid is at the terminus of an alpha-helix. Quantum mechanical calculations on 5-MTA ethyl thiol and selenol ethyl esters allowed us to identify the conformational states of these molecules along with their corresponding vibrational signatures. The Raman spectra of 5-MTA thiol and selenol subtilisins both showed that the acyl group binds in a single conformation in the active site that is s-trans about the =C-C=O single bond. Moreover, the positions of the C=C stretching bands show that the acyl group is not experiencing polarization. However, the release of steric constraints in the active site by mutagenesis, by creating the N155G form of selenol-subtilisin and the P225A form of thiol-subtilisin, results in the appearance of a second conformer in the active sites that is s-cis about the =C-C=O bond. The Raman signature of this second conformer indicates that it is strongly polarized with a permanent dipole being set up through the acyl group's pi-electron chain. Molecular modeling for 5-MTA in the active sites of selenol-subtilisin and N155G selenol-subtilisin confirms the findings from Raman spectroscopic studies and identifies the active-site features that give rise to polarization. The determinants of polarization appear to be strong electron pull at the acyl carbonyl group by a combination of hydrogen bonds and the field at the N-terminus of the alpha-helix and electron push from a negatively charged group placed at the opposite end of the chromophore.
View details for DOI 10.1021/bi9902541
View details for PubMedID 10350485
Molecular structure of 5-methyl thiophene acryloyl ethyl thiolester: a vibrational spectroscopic and density functional theory study.
1999; 5 (4): 201–18
Enzyme-substrate intermediates involving the acyl group 5-methyl thiophene acryloyl (5-MTA) bound to the active site of an enzyme via a sulfur or selenium atom have been characterized by Raman spectroscopy (e.g., J. D. Doran and P. R. Carey, Biochemistry 1996, 35, 12495-12502, and M. J. O'Connor et al., J Amer Chem Soc 1996, 118, 239-240). Raman difference spectra reveal the Raman spectrum of the acyl group in the active site and, in turn, these can be used to probe acyl group conformation and active site forces and interactions. In order to improve the understanding of the relationship between conformational states and vibrational spectra of 5-MTA thiolesters, calculations based on a density functional theory analysis are undertaken for 5-methyl thiophene acryloyl ethyl ester. The calculations provide the precise geometries and energies of rotamers of 5-MTA ethyl thiolester involving rotational isomerism about the C--C single bonds flanking the ethylenic linkage and the S--C bond linking the ethyl group to the sulfur atom. The calculations also provide the vibrational spectrum for each conformer and these predictions are compared with the experimental Raman and IR data for the thiolester in carbon tetrachloride. Modes are identified that can act as conformational markers for isomerism about the C--C and S--C2H5 single bonds. These findings are used to identify the two conformational states giving rise to the Raman spectrum of the 5-MTA-S-enzyme formed by the viral cysteine protease HAV-3C.
View details for DOI 10.1002/(SICI)1520-6343(1999)5:4<201::AID-BSPY1>3.0.CO;2-1
View details for PubMedID 10478951
- Extending the Raman analysis of biological samples to the 100 micromolar concentration range Applied Spectroscopy 1998; 52 (8): 1117-1121
Active site properties of the 3C proteinase from hepatitis A virus (a hybrid cysteine/serine protease) probed by Raman spectroscopy.
1997; 36 (16): 4943–48
Although the HAV 3C proteinase is a cysteine protease, it displays an active site configuration which resembles mammalian serine proteases and is structurally distinct from the papain superfamily of thiol proteases. Given the interesting serine/cysteine protease hybrid nature of HAV 3C, we have probed its active site properties via the Raman spectra of the acyl enzyme, 5-methylthiophene acryloyl HAV 3C, using the C24S variant of the enzyme to obtain stoichiometric acylation. The Raman difference spectral data show that the major population of the acyl groups in the active site experiences electron polarization intermediate between that in the papain superfamily and that in a nonpolarizing site. This is evidenced by the values of the acyl group ethylenic stretching frequency which occur near 1602 cm(-1) in a nonpolarizing environment, at 1588 cm(-1) when bound to HAV 3C (C24S), and at 1579 cm(-1) in acyl papains. The value of the electronic absorption maximum for the HAV 3C (C24S) acyl enzyme and the deacylation rate constant fit the correlation developed for the papain superfamily, suggesting that for HAV 3C too, polarizing forces in the active site can contribute to rate acceleration via transition state stabilization. The major population in the active site is s-cis about the acyl group's C1-C2 bond, but there is a second population that is s-trans, and this secondary population is not polarized. The two populations are evidenced by the presence of two sets of marker bands for s-cis and s-trans in the Raman spectra, which occur principally in the C=C stretching region near 1600 cm(-1), in the C-C stretching region near 1100 cm(-1), and near 560 cm(-1). The positions of the acyl carbonyl features in the Raman spectra point to hydrogen-bonding strengths of 20-25 kJ mol(-1) between the C=O and H-bonding donors in the active site. The 5-methylthiophene acryloyl HAV 3C (C24S) is a relatively unreactive acyl enzyme, deacylating with a pKa of 7.1 and a rate constant of 0.00031 s(-1) at pH 9. Unlike most other cysteine or serine protease acyl enzymes characterized by Raman spectroscopy, no changes in the Raman spectrum could be detected with changes in pH.
View details for DOI 10.1021/bi963148x
View details for PubMedID 9125516