Honors & Awards
Gates Cambridge Scholar, Gates Cambridge Trust (2005-06)
NSF Graduate Research Fellow, National Science Foundation (2005-10)
Siebel Scholar, Siebel Foundation (2012)
Postdoctoral Fellow, American Cancer Society (2013-16)
K99/R00 Pathway to Independence Award, National Institutes of Health - NCI (2015)
Professional Education
Doctor of Philosophy, Massachusetts Institute of Technology (2012)
Master of Philosophy, Cambridge University, Computational Biology (2006)
Bachelor of Science, Stanford University, BIOL-MIN (2005)
Bachelor of Science, Stanford University, CHEM-BSH (2005)
Or Gozani, Postdoctoral Faculty Sponsor
Publications
A Proteomic Strategy Identifies Lysine Methylation of Splicing Factor snRNP70 by the SETMAR Enzyme
JOURNAL OF BIOLOGICAL CHEMISTRY
2015; 290 (19): 12040-12047
The lysine methyltransferase (KMT) SETMAR is implicated in the response to and repair of DNA damage, but its molecular function is not clear. SETMAR has been associated with dimethylation of histone H3 lysine 36 (H3K36) at sites of DNA damage. However, SETMAR does not methylate H3K36 in vitro. This and the observation that SETMAR is not active on nucleosomes suggest that H3K36 methylation is not a physiologically relevant activity. To identify potential non-histone substrates, we utilized a strategy on the basis of quantitative proteomic analysis of methylated lysine. Our approach identified lysine 130 of the mRNA splicing factor snRNP70 as a SETMAR substrate in vitro, and we show that the enzyme primarily generates monomethylation at this position. Furthermore, we show that SETMAR methylates snRNP70 Lys-130 in cells. Because snRNP70 is a key early regulator of 5' splice site selection, our results suggest a model in which methylation of snRNP70 by SETMAR regulates constitutive and/or alternative splicing. In addition, the proteomic strategy described here is broadly applicable and is a promising route for large-scale mapping of KMT substrates.
View details for DOI 10.1074/jbc.M115.641530
View details for Web of Science ID 000354388600019
Emerging Technologies to Map the Protein Methylome
JOURNAL OF MOLECULAR BIOLOGY
2014; 426 (20): 3350-3362
Proteome-wide enrichment of proteins modified by lysine methylation.
2014; 9 (1): 37-50
We present a protocol for using the triple malignant brain tumor domains of L3MBTL1 (3xMBT), which bind to mono- and di-methylated lysine with minimal sequence specificity, in order to enrich for such methylated lysine from cell lysates. Cells in culture are grown with amino acids containing light or heavy stable isotopic labels. Methylated proteins are enriched by incubating cell lysates with 3xMBT, or with the binding-null D355N mutant as a negative control. Quantitative liquid chromatography and tandem mass spectrometry (LC-MS/MS) are then used to identify proteins that are specifically enriched by 3xMBT pull-down. The addition of a third isotopic label allows the comparison of protein lysine methylation between different biological conditions. Unlike most approaches, our strategy does not require a prior hypothesis of candidate methylated proteins, and it recognizes a wider range of methylated proteins than any available method using antibodies. Cells are prepared by growing in isotopic labeling medium for about 7 d; the process of enriching methylated proteins takes 3 d and analysis by LC-MS/MS takes another 1-2 d.
View details for DOI 10.1038/nprot.2013.164
View details for PubMedID 24309976
A General Molecular Affinity Strategy for Global Detection and Proteomic Analysis of Lysine Methylation
2013; 50 (3): 444-456
Lysine methylation of histone proteins regulates chromatin dynamics and plays important roles in diverse physiological and pathological processes. However, beyond histone proteins, the proteome-wide extent of lysine methylation remains largely unknown. We have engineered the naturally occurring MBT domain repeats of L3MBTL1 to serve as a universal affinity reagent for detecting, enriching, and identifying proteins carrying a mono- or dimethylated lysine. The domain is broadly specific for methylated lysine ("pan-specific") and can be applied to any biological system. We have used our approach to demonstrate that SIRT1 is a substrate of the methyltransferase G9a both in vitro and in cells, to perform proteome-wide detection and enrichment of methylated proteins, and to identify candidate in-cell substrates of G9a and the related methyltransferase GLP. Together, our results demonstrate a powerful new approach for global and quantitative analysis of methylated lysine, and they represent the first systems biology understanding of lysine methylation.
View details for DOI 10.1016/j.molcel.2013.03.005
View details for Web of Science ID 000319183500015
Systems-pharmacology dissection of a drug synergy in imatinib-resistant CML
NATURE CHEMICAL BIOLOGY
2012; 8 (11): 905-912
Occurrence of the BCR-ABL(T315I) gatekeeper mutation is among the most pressing challenges in the therapy of chronic myeloid leukemia (CML). Several BCR-ABL inhibitors have multiple targets and pleiotropic effects that could be exploited for their synergistic potential. Testing combinations of such kinase inhibitors identified a strong synergy between danusertib and bosutinib that exclusively affected CML cells harboring BCR-ABL(T315I). To elucidate the underlying mechanisms, we applied a systems-level approach comprising phosphoproteomics, transcriptomics and chemical proteomics. Data integration revealed that both compounds targeted Mapk pathways downstream of BCR-ABL, resulting in impaired activity of c-Myc. Using pharmacological validation, we assessed that the relative contributions of danusertib and bosutinib could be mimicked individually by Mapk inhibitors and collectively by downregulation of c-Myc through Brd4 inhibition. Thus, integration of genome- and proteome-wide technologies enabled the elucidation of the mechanism by which a new drug synergy targets the dependency of BCR-ABL(T315I) CML cells on c-Myc through nonobvious off targets.
View details for DOI 10.1038/NCHEMBIO.1085
View details for Web of Science ID 000310052100009
View details for PubMedID 23023260
Labeling and Identification of Direct Kinase Substrates
2012; 5 (227)
Identifying kinase substrates is an important step in mapping signal transduction pathways, but it remains a difficult and time-consuming process. Analog-sensitive (AS) kinases have been used to selectively tag and identify direct kinase substrates in lysates from whole cells. In this approach, a γ-thiol adenosine triphosphate analog and an AS kinase are used to selectively thiophosphorylate target proteins. Thiophosphate is used as a chemical handle to purify peptides from a tryptic digest, and target proteins are identified by liquid chromatography and tandem mass spectrometry (LC-MS/MS). Here, we describe an updated strategy for labeling AS kinase substrates, solid-phase capture of thiophosphorylated peptides, incorporation of stable isotope labeling in cell culture for filtering nonspecific background peptides, enrichment of phosphorylated target peptides to identify low-abundance targets, and analysis by LC-MS/MS.
View details for DOI 10.1126/scisignal.2002568
View details for Web of Science ID 000305007500004
View details for PubMedID 22669844
Expanding applications of chemical genetics in signal transduction
2012; 11 (10): 1903-1909
Chemical genetics represents an expanding collection of techniques applied to a variety of signaling processes. These techniques use a combination of chemical reporters and protein engineering to identify targets of a signaling enzyme in a global and non-directed manner without resorting to hypothesis-driven candidate approaches. In the last year, chemical genetics has been applied to a variety of kinases, revealing a much broader spectrum of substrates than had been appreciated. Here, we discuss recent developments in chemical genetics, including insights from our own proteomic screen for substrates of the kinase ERK2. These studies have revealed that many kinases have overlapping substrate specificity, and they often target several proteins in any particular downstream pathway. It remains to be determined whether this configuration exists to provide redundant control, or whether each target contributes a fraction of the total regulatory effect. From a general perspective, chemical genetics is applicable in principle to a broad range of posttranslational modifications (PTMs), most notably methylation and acetylation, although many challenges remain in implementing this approach. Recent developments in chemical reporters and protein engineering suggest that chemical genetics will soon be a powerful tool for mapping signal transduction through these and other PTMs.
View details for DOI 10.4161/cc.19956
View details for Web of Science ID 000304039900015
View details for PubMedID 22544320
Large-Scale Discovery of ERK2 Substrates Identifies ERK-Mediated Transcriptional Regulation by ETV3
2011; 4 (196)
The mitogen-activated protein kinase (MAPK) extracellular signal-regulated kinase 2 (ERK2) is ubiquitously expressed in mammalian tissues and is involved in a wide range of biological processes. Although MAPKs have been intensely studied, identification of their substrates remains challenging. We have optimized a chemical genetic system using analog-sensitive ERK2, a form of ERK2 engineered to use an analog of adenosine 5'-triphosphate (ATP), to tag and isolate ERK2 substrates in vitro. This approach identified 80 proteins phosphorylated by ERK2, 13 of which are known ERK2 substrates. The 80 substrates are associated with diverse cellular processes, including regulation of transcription and translation, mRNA processing, and regulation of the activity of the Rho family guanosine triphosphatases. We found that one of the newly identified substrates, ETV3 (a member of the E twenty-six family of transcriptional regulators), was extensively phosphorylated on sites within canonical and noncanonical ERK motifs. Phosphorylation of ETV3 regulated transcription by preventing its binding to DNA at promoters for several thousand genes, including some involved in negative feedback regulation of itself and of upstream signals.
View details for DOI 10.1126/scisignal.2002010
View details for Web of Science ID 000296560500006
View details for PubMedID 22028470
Using Small Molecules and Chemical Genetics To Interrogate Signaling Networks
ACS CHEMICAL BIOLOGY
2011; 6 (1): 75-85
The limited clinical success of therapeutics targeting cellular signaling processes is due to multiple factors, including off-target effects and complex feedback regulation encoded within the signaling network. To understand these effects, chemical proteomics and chemical genetics tools have been developed to map the direct targets of kinase inhibitors, determine the network-level response to inhibitor treatment, and infer network topology. Here we provide an overview of chemical phosphoproteomic and chemical genetic methods, including specific examples where these methods have been applied to yield biological insight regarding network structure and the system-wide effects of targeted therapeutics. The challenges and caveats associated with each method are described, along with approaches being used to resolve some of these issues. With the broad array of available techniques, the next decade should see a rapid improvement in our understanding of signaling network regulation and response to targeted perturbations, leading to more efficacious therapeutic strategies.
View details for DOI 10.1021/cb1002834
View details for Web of Science ID 000286306000008
View details for PubMedID 21077690
Integrated data management and validation platform for phosphorylated tandem mass spectrometry data
2010; 10 (19): 3515-3524
MS/MS is a widely used method for proteome-wide analysis of protein expression and PTMs. The thousands of MS/MS spectra produced from a single experiment pose a major challenge for downstream analysis. Standard programs, such as MASCOT, provide peptide assignments for many of the spectra, including identification of PTM sites, but these results are plagued by false-positive identifications. In phosphoproteomic experiments, only a single peptide assignment is typically available to support identification of each phosphorylation site, and hence minimizing false positives is critical. Thus, tedious manual validation is often required to increase confidence in the spectral assignments. We have developed phoMSVal, an open-source platform for managing MS/MS data and automatically validating identified phosphopeptides. We tested five classification algorithms with 17 extracted features to separate correct peptide assignments from incorrect ones using over 2600 manually curated spectra. The naïve Bayes algorithm was among the best classifiers with an AUC value of 97% and PPV of 97% for phosphotyrosine data. This classifier required only three features to achieve a 76% decrease in false positives as compared with MASCOT while retaining 97% of true positives. This algorithm was able to classify an independent phosphoserine/threonine data set with AUC value of 93% and PPV of 91%, demonstrating the applicability of this method for all types of phospho-MS/MS data. PhoMSVal is available at http://csbi.ltdk.helsinki.fi/phomsval.
View details for DOI 10.1002/pmic.200900727
View details for Web of Science ID 000283202400010
View details for PubMedID 20827731
Mcl-1 Integrates the Opposing Actions of Signaling Pathways That Mediate Survival and Apoptosis
MOLECULAR AND CELLULAR BIOLOGY
2009; 29 (14): 3845-3852
Mcl-1 is a member of the Bcl2-related protein family that is a critical mediator of cell survival. Exposure of cells to stress causes inhibition of Mcl-1 mRNA translation and rapid destruction of Mcl-1 protein by proteasomal degradation mediated by a phosphodegron created by glycogen synthase kinase 3 (GSK3) phosphorylation of Mcl-1. Here we demonstrate that prior phosphorylation of Mcl-1 by the c-Jun N-terminal protein kinase (JNK) is essential for Mcl-1 phosphorylation by GSK3. Stress-induced Mcl-1 degradation therefore requires the coordinated activity of JNK and GSK3. Together, these data establish that Mcl-1 functions as a site of signal integration between the proapoptotic activity of JNK and the prosurvival activity of the AKT pathway that inhibits GSK3.
View details for DOI 10.1128/MCB.00279-09
View details for Web of Science ID 000267939300003
View details for PubMedID 19433446
Biomarker clustering to address correlations in proteomic data
2007; 7 (7): 1037-1046
Correlated variables have been shown to confound statistical analyses in microarray experiments. The same effect applies to an even greater degree in proteomics, especially with the use of MS for parallel measurements. Biological effects such as PTM, fragmentation, and multimer formation can produce strongly correlated variables. The problem is compounded in some types of MS by technical effects such as incomplete chromatographic separation, binding to multiple surfaces, or multiple ionizations. Existing methods for dimension reduction, notably principal components analysis and related techniques, are not always satisfactory because they produce data that often lack clear biological interpretation. We propose a preprocessing algorithm that clusters highly correlated features, using the Bayes information criterion to select an optimal number of clusters. Statistical analysis of clusters, instead of individual features, benefits from lower noise, and reduces the difficulties associated with strongly correlated data. This preprocessing increases the statistical power of analyses using false discovery rate on simulated data. Strong correlations are often present in real data, and we find that clustering improves biomarker discovery in clinical SELDI-TOF-MS datasets of plasma from patients with Kawasaki disease, and bone-marrow cell extracts from patients with acute myeloid or acute lymphoblastic leukemia.
View details for DOI 10.1002/pmic.200600514
View details for Web of Science ID 000245739100003
View details for PubMedID 17390293
A potential biomarker in the cord blood of preterm infants who develop retinopathy of prematurity
2007; 61 (2): 215-221
Preterm infants are at risk of developing sepsis, necrotizing enterocolitis (NEC), chronic lung disease (CLD), and retinopathy of prematurity (ROP). We used high-throughput mass spectrometry to investigate whether cord blood proteins can be used to predict development of these morbidities. Cord blood plasma from 44 infants with a birth weight of <1500 g was analyzed by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF). Six infants developed ROP ≥ stage II, 10 CLD, three sepsis, and one NEC. We detected 814 protein signals representing 330 distinct protein species. Nineteen biomarkers were associated with development of ≥ stage II ROP [false-discovery rate (FDR) <5%] and none with CLD. Several proteins with molecular weight (Mr) 15-16 kD and pI 4-5 were detected with increased abundance in infants with ROP, while similar Mr proteins with pI 7-9 were less abundant in these patients. Sodium dodecylsulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and sequence analysis identified these proteins as alpha-, beta-, and gamma-globin chains. Partial deamidation of Asn139 in beta-globin chains was observed only in the pI 4-5 proteins. We conclude that there are several promising biomarkers for the risk of ROP. Deamidation of globin chains is especially promising and may indicate underlying prenatal pathologic mechanisms in ROP. Validation studies will be undertaken to determine their clinical utility.
View details for DOI 10.1203/pdr.0b013e31802d776d
View details for Web of Science ID 000243714700016
View details for PubMedID 17237725
Improving feature detection and analysis of surface-enhanced laser desorption/ionization-time of flight mass spectra
2005; 5 (11): 2778-2788
Discovering valid biological information from surface-enhanced laser desorption/ionization-time of flight mass spectrometry (SELDI-TOF MS) depends on clear experimental design, meticulous sample handling, and sophisticated data processing. Most published literature deals with the biological aspects of these experiments, or with computer-learning algorithms to locate sets of classifying biomarkers. The process of locating and measuring proteins across spectra has received less attention. This process should be tunable between sensitivity and false-discovery, and should guarantee that features are biologically meaningful in that they represent chemical species that can be identified and investigated. Existing feature detection in SELDI-TOF MS is not optimal for acquiring biologically relevant data. Most methods have so many user-defined settings that reproducibility and comparability among studies suffer considerably. To address these issues, we have developed an approach, called simultaneous spectrum analysis (SSA), which (i) locates proteins across spectra, (ii) measures their abundance, (iii) subtracts baseline, (iv) excludes irreproducible measurements, and (v) computes normalization factors for comparing spectra. SSA uses only two key parameters for feature detection and one parameter each for quality thresholds on spectra and peaks. The effectiveness of SSA is demonstrated by identifying proteins differentially expressed in SELDI-TOF spectra from plasma of wild-type and knockout mice for plasma glutathione peroxidase. Comparing analyses by SSA and CiphergenExpress Data Manager 2.1 finds similar results for large signal peaks, but SSA improves the number and quality of differences between groups among lower signal peaks. SSA is also less likely to introduce systematic bias when normalizing spectra.
View details for DOI 10.1002/pmic.200401184
View details for Web of Science ID 000231036100008
View details for PubMedID 15986333
Genetic structure of the purebred domestic dog
2004; 304 (5674): 1160-1164
We used molecular markers to study genetic relationships in a diverse collection of 85 domestic dog breeds. Differences among breeds accounted for approximately 30% of genetic variation. Microsatellite genotypes were used to correctly assign 99% of individual dogs to breeds. Phylogenetic analysis separated several breeds with ancient origins from the remaining breeds with modern European origins. We identified four genetic clusters, which predominantly contained breeds with similar geographic origin, morphology, or role in human activities. These results provide a genetic classification of dog breeds and will aid studies of the genetics of phenotypic breed differences.
View details for Web of Science ID 000221524500044
View details for PubMedID 15155949
A joint analysis of asthma affection status and IgE levels in multiple data sets collected for asthma
WILEY-BLACKWELL. 2001: S148-S153
We present a joint linkage analysis of eight data sets collected for asthma. Three of the data sets are full genome scans, while the remaining five concentrate on a 40-cM region on chromosome 5. We perform the analysis using one qualitative and one quantitative phenotype: asthma status and IgE level. Considering all data sets simultaneously, we do not find evidence for linkage to asthma affection status beyond the level expected to occur by chance twice per genome scan. In contrast, we observe significant linkage to IgE level on chromosome 6.
View details for Web of Science ID 000171462700029
View details for PubMedID 11793658