I am interested in studying complex RNA processes such as pre-mRNA splicing, using 2D and 3D computational structural modeling together with experimental structural probing techniques.
Rhiju Das, Doctoral Dissertation Advisor (AC)
De novo 3D models of SARS-CoV-2 RNA elements from consensus experimental secondary structures.
Nucleic acids research
The rapid spread of COVID-19 is motivating development of antivirals targeting conserved SARS-CoV-2 molecular machinery. The SARS-CoV-2 genome includes conserved RNA elements that offer potential small-molecule drug targets, but most of their 3D structures have not been experimentally characterized. Here, we provide a compilation of chemical mapping data from our and other labs, secondary structure models, and 3D model ensembles based on Rosetta's FARFAR2 algorithm for SARS-CoV-2 RNA regions including the individual stems SL1-8 in the extended 5' UTR; the reverse complement of the 5' UTR SL1-4; the frameshift stimulating element (FSE); and the extended pseudoknot, hypervariable region, and s2m of the 3' UTR. For eleven of these elements (the stems in SL1-8, reverse complement of SL1-4, FSE, s2m and 3' UTR pseudoknot), modeling convergence supports the accuracy of predicted low energy states; subsequent cryo-EM characterization of the FSE confirms modeling accuracy. To aid efforts to discover small molecule RNA binders guided by computational models, we provide a second set of similarly prepared models for RNA riboswitches that bind small molecules. Both datasets ('FARFAR2-SARS-CoV-2', at https://github.com/DasLab/FARFAR2-SARS-CoV-2; and 'FARFAR2-Apo-Riboswitch', at https://github.com/DasLab/FARFAR2-Apo-Riboswitch) include up to 400 models for each RNA element, which may facilitate drug discovery approaches targeting dynamic ensembles of RNA molecules.
View details for DOI 10.1093/nar/gkab119
View details for PubMedID 33693814
Anomalous Reverse Transcription through Chemical Modifications in Polyadenosine Stretches.
Thermostable reverse transcriptases are workhorse enzymes underlying nearly all modern techniques for RNA structure mapping and for the transcriptome-wide discovery of RNA chemical modifications. Despite their wide use, these enzymes' behaviors at chemically modified nucleotides remain poorly understood. Wellington-Oguri et al. recently reported an apparent loss of chemical modification within putatively unstructured polyadenosine stretches modified by dimethyl sulfate or 2' hydroxyl acylation, as probed by reverse transcription. Here, reanalysis of these and other publicly available data, capillary electrophoresis experiments on chemically modified RNAs, and nuclear magnetic resonance spectroscopy on (A)12 and variants show that this effect is unlikely to arise from an unusual structure of polyadenosine. Instead, tests of different reverse transcriptases on chemically modified RNAs and molecules synthesized with single 1-methyladenosines implicate a previously uncharacterized reverse transcriptase behavior: near-quantitative bypass through chemical modifications within polyadenosine stretches. All tested natural and engineered reverse transcriptases (MMLV; SuperScript II, III, and IV; TGIRT-III; and MarathonRT) exhibit this anomalous bypass behavior. Accurate DMS-guided structure modeling of the polyadenylated HIV-1 3' untranslated region requires taking into account this anomaly. Our results suggest that poly(rA-dT) hybrid duplexes can trigger an unexpectedly effective reverse transcriptase bypass and that chemical modifications in mRNA poly(A) tails may be generally undercounted.
View details for DOI 10.1021/acs.biochem.0c00020
View details for PubMedID 32407625
RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look.
RNA (New York, N.Y.)
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Here we present a first look at RNA sequence conservation and structural propensities in the SARS-CoV-2 genome. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences subsequently reported from the COVID-19 outbreak, and we present a curated list of 30 'SARS-related-conserved' regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 'SARS-CoV-2-conserved-structured' regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the extended 5' UTR, frame-shifting element, and 3' UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 'SARS-CoV-2-conserved-unstructured' genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
View details for DOI 10.1261/rna.076141.120
View details for PubMedID 32398273
Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures.
2020; 17 (7): 699–707
The discovery and design of biologically important RNA molecules is outpacing three-dimensional structural characterization. Here, we demonstrate that cryo-electron microscopy can routinely resolve maps of RNA-only systems and that these maps enable subnanometer-resolution coordinate estimation when complemented with multidimensional chemical mapping and Rosetta DRRAFTER computational modeling. This hybrid 'Ribosolve' pipeline detects and falsifies homologies and conformational rearrangements in 11 previously unknown 119- to 338-nucleotide protein-free RNA structures: full-length Tetrahymena ribozyme, hc16 ligase with and without substrate, full-length Vibrio cholerae and Fusobacterium nucleatum glycine riboswitch aptamers with and without glycine, Mycobacterium SAM-IV riboswitch with and without S-adenosylmethionine, and the computer-designed ATP-TTR-3 aptamer with and without AMP. Simulation benchmarks, blind challenges, compensatory mutagenesis, cross-RNA homologies and internal controls demonstrate that Ribosolve can accurately resolve the global architectures of RNA molecules but does not resolve atomic details. These tests offer guidelines for making inferences in future RNA structural studies with similarly accelerated throughput.
View details for DOI 10.1038/s41592-020-0878-9
View details for PubMedID 32616928
FARFAR2: Improved De Novo Rosetta Prediction of Complex Global RNA Folds.
Structure (London, England : 1993)
Predicting RNA three-dimensional structures from sequence could accelerate understanding of the growing number of RNA molecules being discovered across biology. Rosetta's Fragment Assembly of RNA with Full-Atom Refinement (FARFAR) has shown promise in community-wide blind RNA-Puzzle trials, but lack of a systematic and automated benchmark has left unclear what limits FARFAR performance. Here, we benchmark FARFAR2, an algorithm integrating RNA-Puzzle-inspired innovations with updated fragment libraries and helix modeling. In 16 of 21 RNA-Puzzles revisited without experimental data or expert intervention, FARFAR2 recovers native-like structures more accurate than models submitted during the RNA-Puzzles trials. Remaining bottlenecks include conformational sampling for >80-nucleotide problems and scoring function limitations more generally. Supporting these conclusions, preregistered blind models for adenovirus VA-I RNA and five riboswitch complexes predicted native-like folds with 3- to 14 Å root-mean-square deviation accuracies. We present a FARFAR2 webserver and three large model archives (FARFAR2-Classics, FARFAR2-Motifs, and FARFAR2-Puzzles) to guide future applications and advances.
View details for DOI 10.1016/j.str.2020.05.011
View details for PubMedID 32531203
Using Rosetta for RNA homology modeling.
Methods in enzymology
2019; 623: 177–207
The three-dimensional structures of RNA molecules provide rich and often critical information for understanding their functions, including how they recognize small molecule and protein partners. Computational modeling of RNA 3D structure is becoming increasingly accurate, particularly with the availability of growing numbers of template structures already solved experimentally and the development of sequence alignment and 3D modeling tools to take advantage of this database. For several recent "RNA puzzle" blind modeling challenges, we have successfully identified useful template structures and achieved accurate structure predictions through homology modeling tools developed in the Rosetta software suite. We describe our semi-automated methodology here and walk through two illustrative examples: an adenine riboswitch aptamer, modeled from a template guanine riboswitch structure, and a SAM I/IV riboswitch aptamer, modeled from a template SAM I riboswitch structure.
View details for DOI 10.1016/bs.mie.2019.05.026
View details for PubMedID 31239046
Determination of Structural Ensembles of Proteins: Restraining vs Reweighting
Journal of Chemical Theory and Computation
2018; 14 (12): 6632–41
The conformational fluctuations of proteins can be described using structural ensembles. To address the challenge of determining these ensembles accurately, a wide range of strategies have recently been proposed to combine molecular dynamics simulations with experimental data. Quite generally, there are two ways of implementing this type of approach, either by applying structural restraints during a simulation, or by reweighting a posteriori the conformations from an a priori ensemble. It is not yet clear, however, whether these two approaches can offer ensembles of equivalent quality. The advantages of the reweighting method are that it can involve any type of starting simulation and that it enables the integration of experimental data after the simulations are run. A disadvantage, however, is that this procedure may be inaccurate when the a priori ensemble is of poor quality. Here, our goal is to systematically compare the restraining and reweighting approaches and to explore the conditions required for the reweighted ensembles to be accurate. Our results indicate that the reweighting approach is computationally efficient and can perform as well as the restraining approach when the a priori sampling is already relatively accurate. More generally, to enable an effective use of the reweighting approach by avoiding the pitfalls of poor sampling, we suggest metrics for the quality control of the reweighted ensembles.
View details for DOI 10.1021/acs.jctc.8b00738
View details for Web of Science ID 000453489100044
View details for PubMedID 30428663