Hybrids of RNA viruses and viroid-like elements replicate in fungi.
2023; 14 (1): 2591
Earth's life may have originated as self-replicating RNA, and it has been argued that RNA viruses and viroid-like elements are remnants of such a pre-cellular RNA world. RNA viruses are defined by linear RNA genomes encoding an RNA-dependent RNA polymerase (RdRp), whereas viroid-like elements consist of small, single-stranded, circular RNA genomes that, in some cases, encode paired self-cleaving ribozymes. Here we show that the number of candidate viroid-like elements occurring in geographically and ecologically diverse niches is much higher than previously thought. We report that, amongst these circular genomes, fungal ambiviruses are viroid-like elements that undergo rolling circle replication and encode their own viral RdRp. Thus, ambiviruses are distinct infectious RNAs showing hybrid features of viroid-like RNAs and viruses. We also detected similar circular RNAs, containing active ribozymes and encoding RdRps, related to mitochondrial-like fungal viruses, highlighting fungi as an evolutionary hub for RNA viruses and viroid-like elements. Our findings point to a deep co-evolutionary history between RNA viruses and subviral elements and offer new perspectives on the origin and evolution of primordial infectious agents and RNA life.
View details for DOI 10.1038/s41467-023-38301-2
View details for PubMedID 37147358
Restriction Endonuclease-Based Modification-Dependent Enrichment (REMoDE) of DNA for Metagenomic Sequencing.
Applied and environmental microbiology
Metagenomic sequencing is a swift and powerful tool to ascertain the presence of an organism of interest in a sample. However, sequencing coverage of the organism of interest can be insufficient due to an inundation of reads from irrelevant organisms in the sample. Here, we report a nuclease-based approach to rapidly enrich for DNA from certain organisms, including enterobacteria, based on their differential endogenous modification patterns. We exploit the ability of taxon-specific methylated motifs to resist the action of cognate methylation-sensitive restriction endonucleases that thereby digest unwanted, unmethylated DNA. Subsequently, we use a distributive exonuclease or electrophoretic separation to deplete or exclude the digested fragments, thus enriching for undigested DNA from the organism of interest. As a proof of concept, we apply this method to enrich for the enterobacteria Escherichia coli and Salmonella enterica by 11- to 142-fold from mock metagenomic samples and validate this approach as a versatile means to enrich for genomes of interest in metagenomic samples. IMPORTANCE Pathogens that contaminate the food supply or spread through other means can cause outbreaks that bring devastating repercussions to the health of a populace. Investigations to trace the source of these outbreaks are initiated rapidly but can be drawn out due to the labored methods of pathogen isolation. Metagenomic sequencing can alleviate this hurdle but is often insufficiently sensitive. The approach and implementations detailed here provide a rapid means to enrich for many pathogens involved in foodborne outbreaks, thereby improving the utility of metagenomic sequencing as a tool in outbreak investigations. Additionally, this approach provides a means to broadly enrich for otherwise minute levels of modified DNA, which may escape unnoticed in metagenomic samples.
View details for DOI 10.1128/aem.01670-22
View details for PubMedID 36519847
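The enrichment logic described in the REMoDE abstract can be illustrated with a toy in silico digest. This is a minimal sketch, not the authors' protocol: the motif, the sequences, the `methylated` flag, and the use of a simple length cutoff as a stand-in for the exonuclease or electrophoretic depletion step are all assumptions made for illustration.

```python
def digest(seq, motif, methylated):
    """Cut seq before every motif occurrence; methylated DNA resists cutting."""
    if methylated:
        return [seq]
    fragments, start = [], 0
    pos = seq.find(motif)
    while pos != -1:
        fragments.append(seq[start:pos])
        start = pos
        pos = seq.find(motif, pos + 1)
    fragments.append(seq[start:])
    return [f for f in fragments if f]


def enrich(molecules, motif, min_len):
    """molecules: list of (organism, sequence, methylated) tuples.

    Fragments shorter than min_len are discarded; this length cutoff is an
    assumed stand-in for the depletion step, not the published procedure.
    """
    kept = []
    for organism, seq, methylated in molecules:
        for frag in digest(seq, motif, methylated):
            if len(frag) >= min_len:
                kept.append((organism, frag))
    return kept


def base_fraction(items, organism):
    """Fraction of all bases that belong to the given organism."""
    total = sum(len(item[1]) for item in items)
    own = sum(len(item[1]) for item in items if item[0] == organism)
    return own / total if total else 0.0


def fold_enrichment(molecules, motif, min_len, organism):
    """Ratio of the organism's base fraction after vs. before enrichment."""
    before = base_fraction(molecules, organism)
    after = base_fraction(enrich(molecules, motif, min_len), organism)
    return after / before if before else float("nan")
```

Calling `fold_enrichment` on a mixture in which only the organism of interest carries the protective modification gives the ratio of its base fraction after and before the simulated digest, which is the same kind of quantity the abstract reports as fold enrichment.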
A statistical, reference-free algorithm subsumes myriad problems in genome science and enables novel discovery.
bioRxiv : the preprint server for biology
We present a unifying statistical formulation for many fundamental problems in genome science and develop a reference-free, highly efficient algorithm that solves it. Sequence diversification - nucleic acid mutation, rearrangement, and reassortment - is necessary for the differentiation and adaptation of all replicating organisms. Identifying sample-dependent sequence diversification, e.g. adaptation or regulated isoform expression, is fundamental to many biological studies, and is achieved today with next-generation sequencing. Paradoxically, current analyses begin with attempts to align to or assemble necessarily incomplete reference genomes, a step that is at odds with detecting the most important examples of sequence diversification. In addition to being computationally expensive, reference-first approaches suffer from diminished discovery power: they are blind to unaligned or mis-aligned sequences. We provide a unifying formulation for detecting sample-dependent sequence diversification that subsumes core problems faced in diverse biological fields. This formulation allows us to construct an algorithm that performs inference on raw reads, avoiding references completely. We illustrate the power of our approach for new data-driven biological discovery with examples of novel single-cell resolved, cell-type-specific isoform expression, including expression in the major histocompatibility complex, and de novo prediction of viral protein adaptation including in SARS-CoV-2.
View details for DOI 10.1101/2022.06.24.497555
View details for PubMedID 35794890
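The abstract above does not give algorithmic details, so the following is only one simple way to picture inference on raw reads without a reference. It assumes a hypothetical layout in which each k-mer (an "anchor") is paired with the k-mer that follows it (a "target"), and it uses a plain Pearson chi-square statistic to ask whether the targets following an anchor depend on the sample. The function names, `k`, `gap`, and the choice of statistic are illustrative assumptions, not the published method.

```python
from collections import Counter, defaultdict


def anchor_target_counts(samples, k=4, gap=0):
    """samples: dict mapping sample name -> list of read strings.

    Returns a nested table: anchor -> sample -> Counter of target k-mers.
    """
    table = defaultdict(lambda: defaultdict(Counter))
    for name, reads in samples.items():
        for read in reads:
            for i in range(len(read) - 2 * k - gap + 1):
                anchor = read[i:i + k]
                target = read[i + k + gap:i + 2 * k + gap]
                table[anchor][name][target] += 1
    return table


def chi_square(per_sample):
    """Pearson chi-square statistic for a samples x targets count table."""
    names = list(per_sample)
    targets = sorted({t for counts in per_sample.values() for t in counts})
    row = {s: sum(per_sample[s].values()) for s in names}
    col = {t: sum(per_sample[s][t] for s in names) for t in targets}
    total = sum(row.values())
    stat = 0.0
    for s in names:
        for t in targets:
            expected = row[s] * col[t] / total
            if expected > 0:
                stat += (per_sample[s][t] - expected) ** 2 / expected
    return stat
```

An anchor whose statistic is large relative to the table's degrees of freedom is a candidate for sample-dependent sequence diversification; no alignment or assembly is involved at any step.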
Cryo-EM and antisense targeting of the 28-kDa frameshift stimulation element from the SARS-CoV-2 RNA genome.
Nature structural & molecular biology
Drug discovery campaigns against COVID-19 are beginning to target the SARS-CoV-2 RNA genome. The highly conserved frameshift stimulation element (FSE), required for balanced expression of viral proteins, is a particularly attractive SARS-CoV-2 RNA target. Here we present a 6.9 Å resolution cryo-EM structure of the FSE (88 nucleotides, ~28 kDa), validated through an RNA nanostructure tagging method. The tertiary structure presents a topologically complex fold in which the 5' end is threaded through a ring formed inside a three-stem pseudoknot. Guided by this structure, we develop antisense oligonucleotides that impair FSE function in frameshifting assays and knock down SARS-CoV-2 virus replication in A549-ACE2 cells at 100 nM concentration.
View details for DOI 10.1038/s41594-021-00653-y
View details for PubMedID 34426697
De novo 3D models of SARS-CoV-2 RNA elements from consensus experimental secondary structures.
Nucleic acids research
The rapid spread of COVID-19 is motivating development of antivirals targeting conserved SARS-CoV-2 molecular machinery. The SARS-CoV-2 genome includes conserved RNA elements that offer potential small-molecule drug targets, but most of their 3D structures have not been experimentally characterized. Here, we provide a compilation of chemical mapping data from our and other labs, secondary structure models, and 3D model ensembles based on Rosetta's FARFAR2 algorithm for SARS-CoV-2 RNA regions including the individual stems SL1-8 in the extended 5' UTR; the reverse complement of the 5' UTR SL1-4; the frameshift stimulating element (FSE); and the extended pseudoknot, hypervariable region, and s2m of the 3' UTR. For eleven of these elements (the stems in SL1-8, reverse complement of SL1-4, FSE, s2m and 3' UTR pseudoknot), modeling convergence supports the accuracy of predicted low energy states; subsequent cryo-EM characterization of the FSE confirms modeling accuracy. To aid efforts to discover small molecule RNA binders guided by computational models, we provide a second set of similarly prepared models for RNA riboswitches that bind small molecules. Both datasets ('FARFAR2-SARS-CoV-2', at https://github.com/DasLab/FARFAR2-SARS-CoV-2; and 'FARFAR2-Apo-Riboswitch', at https://github.com/DasLab/FARFAR2-Apo-Riboswitch) include up to 400 models for each RNA element, which may facilitate drug discovery approaches targeting dynamic ensembles of RNA molecules.
View details for DOI 10.1093/nar/gkab119
View details for PubMedID 33693814
RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look.
RNA (New York, N.Y.)
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Here we present a first look at RNA sequence conservation and structural propensities in the SARS-CoV-2 genome. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences subsequently reported from the COVID-19 outbreak, and we present a curated list of 30 'SARS-related-conserved' regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 'SARS-CoV-2-conserved-structured' regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the extended 5' UTR, frame-shifting element, and 3' UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 'SARS-CoV-2-conserved-unstructured' genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
View details for DOI 10.1261/rna.076141.120
View details for PubMedID 32398273
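The idea of identifying regions of at least 15 nucleotides that are exactly conserved across aligned genomes can be sketched as a scan over alignment columns. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input is assumed to be a list of equal-length, pre-aligned sequences with `-` as the gap character, and the function name and return format are hypothetical.

```python
def conserved_runs(alignment, min_len=15):
    """Find maximal runs of exactly conserved alignment columns.

    alignment: list of equal-length aligned sequences ('-' marks a gap).
    Returns (start, end, sequence) tuples with 0-based, end-exclusive
    column coordinates, keeping only runs of at least min_len columns.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None
    # Iterate one column past the end so a run touching the last column closes.
    for i in range(length + 1):
        conserved = (
            i < length
            and alignment[0][i] != "-"
            and all(seq[i] == alignment[0][i] for seq in alignment)
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i, alignment[0][start:i]))
            start = None
    return runs
```

Coordinates refer to alignment columns, so mapping a run back to a position in any one genome would require subtracting the gaps that precede it in that sequence.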
Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures.
2020; 17 (7): 699–707
The discovery and design of biologically important RNA molecules is outpacing three-dimensional structural characterization. Here, we demonstrate that cryo-electron microscopy can routinely resolve maps of RNA-only systems and that these maps enable subnanometer-resolution coordinate estimation when complemented with multidimensional chemical mapping and Rosetta DRRAFTER computational modeling. This hybrid 'Ribosolve' pipeline detects and falsifies homologies and conformational rearrangements in 11 previously unknown 119- to 338-nucleotide protein-free RNA structures: full-length Tetrahymena ribozyme, hc16 ligase with and without substrate, full-length Vibrio cholerae and Fusobacterium nucleatum glycine riboswitch aptamers with and without glycine, Mycobacterium SAM-IV riboswitch with and without S-adenosylmethionine, and the computer-designed ATP-TTR-3 aptamer with and without AMP. Simulation benchmarks, blind challenges, compensatory mutagenesis, cross-RNA homologies and internal controls demonstrate that Ribosolve can accurately resolve the global architectures of RNA molecules but does not resolve atomic details. These tests offer guidelines for making inferences in future RNA structural studies with similarly accelerated throughput.
View details for DOI 10.1038/s41592-020-0878-9
View details for PubMedID 32616928
RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses.
bioRxiv : the preprint server for biology
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences reported to date from the current COVID-19 outbreak, and we present a curated list of 30 'SARS-related-conserved' regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 'SARS-CoV-2-conserved-structured' regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the 5' UTR, frame-shifting element, and 3' UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 'SARS-CoV-2-conserved-unstructured' genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
View details for DOI 10.1101/2020.03.27.012906
View details for PubMedID 32511306
View details for PubMedCentralID PMC7217285