- Approximate Profile Maximum Likelihood Journal of Machine Learning Research (to appear in 2019) 2019; 20
- Quantum Channel Capacities per Unit Cost IEEE TRANSACTIONS ON INFORMATION THEORY 2019; 65 (1): 418–35
Compressing Tabular Data via Pairwise Dependencies
IEEE COMPUTER SOC. 2017: 455
View details for PubMedID 29046897
Kinetic and thermodynamic framework for P4-P6 RNA reveals tertiary motif modularity and modulation of the folding preferred pathway.
Proceedings of the National Academy of Sciences of the United States of America
2016; 113 (34): E4956-65
The past decade has seen a wealth of 3D structural information about complex structured RNAs and identification of functional intermediates. Nevertheless, developing a complete and predictive understanding of the folding and function of these RNAs in biology will require connection of individual rate and equilibrium constants to structural changes that occur in individual folding steps and further relating these steps to the properties and behavior of isolated, simplified systems. To accomplish these goals we used the considerable structural knowledge of the folded, unfolded, and intermediate states of P4-P6 RNA. We enumerated structural states and possible folding transitions and determined rate and equilibrium constants for the transitions between these states using single-molecule FRET with a series of mutant P4-P6 variants. Comparisons with simplified constructs containing an isolated tertiary contact suggest that a given tertiary interaction has a stereotyped rate for breaking that may help identify structural transitions within complex RNAs and simplify the prediction of folding kinetics and thermodynamics for structured RNAs from their parts. The preferred folding pathway involves initial formation of the proximal tertiary contact. However, this preference was only ∼10 fold and could be reversed by a single point mutation, indicating that a model akin to a protein-folding contact order model will not suffice to describe RNA folding. Instead, our results suggest a strong analogy with a modified RNA diffusion-collision model in which tertiary elements within preformed secondary structures collide, with the success of these collisions dependent on whether the tertiary elements are in their rare binding-competent conformations.
View details for DOI 10.1073/pnas.1525082113
View details for PubMedID 27493222
View details for PubMedCentralID PMC5003260
Chained Kullback-Leibler Divergences
IEEE. 2016: 580–84
We define and characterize the "chained" Kullback-Leibler divergence $\min_w D(p\|w) + D(w\|q)$, minimized over all intermediate distributions $w$, and the analogous k-fold chained K-L divergence $\min D(p\|w_{k-1}) + \dots + D(w_2\|w_1) + D(w_1\|q)$, minimized over the entire path $(w_1, \dots, w_{k-1})$. This quantity arises in a large deviations analysis of a Markov chain on the set of types - the Wright-Fisher model of neutral genetic drift: a population with allele distribution q produces offspring with allele distribution w, which then produce offspring with allele distribution p, and so on. The chained divergences enjoy some of the same properties as the K-L divergence (like joint convexity in the arguments) and appear in k-step versions of some of the same settings as the K-L divergence (like information projections and a conditional limit theorem). We further characterize the optimal k-step "path" of distributions appearing in the definition and apply our findings in a large deviations analysis of the Wright-Fisher process. We make a connection to information geometry via the previously studied continuum limit, where the number of steps tends to infinity, and the limiting path is a geodesic in the Fisher information metric. Finally, we offer a thermodynamic interpretation of the chained divergence (as the rate of operation of an appropriately defined Maxwell's demon) and we state some natural extensions and applications (a k-step mutual information and k-step maximum likelihood inference). We release code for computing the objects we study.
View details for PubMedID 29130024
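As a rough illustration of the one-step chained divergence defined in the abstract above, the sketch below minimizes D(p‖w) + D(w‖q) numerically for two-outcome distributions. It is not the authors' released code. The function names (`kl`, `chained_kl_binary`) and the choice of ternary search are assumptions made for this example; ternary search is valid here only because both terms are convex in w, and q is assumed strictly positive.

```python
import math


def kl(a, b):
    """K-L divergence D(a||b) in nats; terms with a_i == 0 contribute zero."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)


def chained_kl_binary(p, q, iters=200):
    """Minimize D(p||w) + D(w||q) over w = (t, 1 - t) by ternary search.

    Hypothetical helper for illustration: restricted to two-outcome
    distributions so that the search is one-dimensional.
    """
    def objective(t):
        w = (t, 1.0 - t)
        return kl(p, w) + kl(w, q)

    lo, hi = 1e-12, 1.0 - 1e-12  # keep w strictly inside the simplex
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    t = (lo + hi) / 2
    return objective(t), (t, 1.0 - t)


# Example distributions chosen arbitrarily for the illustration.
p = (0.9, 0.1)
q = (0.5, 0.5)
value, w_opt = chained_kl_binary(p, q)
```

Setting w = p or w = q reduces the objective to D(p‖q), so the chained value can never exceed the ordinary divergence; comparing `value` with `kl(p, q)` is a quick sanity check.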
Single-molecule dataset (SMD): a generalized storage format for raw and processed single-molecule data.
2015; 16: 3-?
Single-molecule techniques have emerged as incisive approaches for addressing a wide range of questions arising in contemporary biological research [Trends Biochem Sci 38:30-37, 2013; Nat Rev Genet 14:9-22, 2013; Curr Opin Struct Biol 2014, 28C:112-121; Annu Rev Biophys 43:19-39, 2014]. The analysis and interpretation of raw single-molecule data benefits greatly from the ongoing development of sophisticated statistical analysis tools that enable accurate inference at the low signal-to-noise ratios frequently associated with these measurements. While a number of groups have released analysis toolkits as open source software [J Phys Chem B 114:5386-5403, 2010; Biophys J 79:1915-1927, 2000; Biophys J 91:1941-1951, 2006; Biophys J 79:1928-1944, 2000; Biophys J 86:4015-4029, 2004; Biophys J 97:3196-3205, 2009; PLoS One 7:e30024, 2012; BMC Bioinformatics 288 11(8):S2, 2010; Biophys J 106:1327-1337, 2014; Proc Int Conf Mach Learn 28:361-369, 2013], it remains difficult to compare analysis for experiments performed in different labs due to a lack of standardization. Here we propose a standardized single-molecule dataset (SMD) file format. SMD is designed to accommodate a wide variety of computer programming languages, single-molecule techniques, and analysis strategies. To facilitate adoption of this format we have made two existing data analysis packages that are used for single-molecule analysis compatible with this format. Adoption of a common, standard data file format for sharing raw single-molecule data and analysis outcomes is a critical step for the emerging and powerful single-molecule field, which will benefit both sophisticated users and non-specialists by allowing standardized, transparent, and reproducible analysis practices.
View details for DOI 10.1186/s12859-014-0429-4
View details for PubMedID 25591752
View details for PubMedCentralID PMC4384321
- Photonic circuits for iterative decoding of a class of low-density parity-check codes NEW JOURNAL OF PHYSICS 2014; 16
Roles of Long-Range Tertiary Interactions in Limiting Dynamics of the Tetrahymena Group I Ribozyme
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY
2014; 136 (18): 6643-6648
We determined the effects of mutating the long-range tertiary contacts of the Tetrahymena group I ribozyme on the dynamics of its substrate helix (referred to as P1) and on catalytic activity. Dynamics were assayed by fluorescence anisotropy of the fluorescent base analogue, 6-methyl isoxanthopterin, incorporated into the P1 helix, and fluorescence anisotropy and catalytic activity were measured for wild type and mutant ribozymes over a range of conditions. Remarkably, catalytic activity correlated with P1 anisotropy over 5 orders of magnitude of activity, with a correlation coefficient of 0.94. The functional and dynamic effects from simultaneous mutation of the two long-range contacts that weaken P1 docking are cumulative and, based on this RNA's topology, suggest distinct underlying origins for the mutant effects. Tests of mechanistic predictions via single molecule FRET measurements of rate constants for P1 docking and undocking suggest that ablation of the P14 tertiary interaction frees P2 and thereby enhances the conformational space explored by the undocked attached P1 helix. In contrast, mutation of the metal core tertiary interaction disrupts the conserved core into which the P1 helix docks. Thus, despite following a single correlation, the two long-range tertiary contacts facilitate P1 helix docking by distinct mechanisms. These results also demonstrate that a fluorescence anisotropy probe incorporated into a specific helix within a larger RNA can report on changes in local helical motions as well as differences in more global dynamics. This ability will help uncover the physical properties and behaviors that underlie the function of RNAs and RNA/protein complexes.
View details for DOI 10.1021/ja413033d
View details for Web of Science ID 000335720200024
View details for PubMedID 24738560
View details for PubMedCentralID PMC4021564
- Optical modular arithmetic Conference on Micro- and Nanotechnology Sensors, Systems, and Applications VI SPIE-INT SOC OPTICAL ENGINEERING. 2014
The human genome contracts again
2013; 29 (17): 2199-2202
The number of human genomes that have been sequenced completely for different individuals has increased rapidly in recent years. Storing and transferring complete genomes between computers for the purpose of applying various applications and analysis tools will soon become a major hurdle, hindering the analysis phase. Therefore, there is a growing need to compress these data efficiently. Here, we describe a technique to compress human genomes based on entropy coding, using a reference genome and known Single Nucleotide Polymorphisms (SNPs). Furthermore, we explore several intrinsic features of genomes and information in other genomic databases to further improve the compression attained. Using these methods, we compress James Watson's genome to 2.5 megabytes (MB), improving on recent work by 37%. Similar compression is obtained for most genomes available from the 1000 Genomes Project. Our biologically inspired techniques promise even greater gains for genomes of lower organisms and for human genomes as more genomic data become available. Code is available at sourceforge.net/projects/genomezip/
View details for DOI 10.1093/bioinformatics/btt362
View details for Web of Science ID 000323344800018
View details for PubMedID 23793748
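The reference-based idea in the abstract above can be sketched in a few lines: store only the positions where a genome differs from a reference, then estimate how many bits an ideal entropy coder would need for those differences. This is a toy simplification and not the published method. It assumes equal-length sequences with substitutions only, it does not use a database of known SNPs, and all function names are hypothetical.

```python
import math
from collections import Counter


def diff_against_reference(reference, genome):
    """Return (gap, base) pairs, one per substitution relative to the reference.

    `gap` is the distance from the previous variant position, which tends to
    be more compressible than absolute positions.
    """
    variants = []
    last = -1
    for i, (ref_base, base) in enumerate(zip(reference, genome)):
        if ref_base != base:
            variants.append((i - last, base))
            last = i
    return variants


def apply_variants(reference, variants):
    """Rebuild the genome from the reference plus the (gap, base) list."""
    out = list(reference)
    pos = -1
    for gap, base in variants:
        pos += gap
        out[pos] = base
    return "".join(out)


def ideal_code_length_bits(symbols):
    """Total bits for an ideal entropy coder using empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(c * math.log2(n / c) for c in counts.values())


# Toy sequences invented for the illustration.
reference = "ACGTACGTAC"
genome = "ACGTTCGTAA"
variants = diff_against_reference(reference, genome)
restored = apply_variants(reference, variants)
gap_bits = ideal_code_length_bits([gap for gap, _ in variants])
```

The round trip through `apply_variants` shows that the variant list plus the shared reference is enough to recover the genome, which is why only the variants need to be entropy coded.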
- Transformation of Quantum Photonic Circuit Models by Term Rewriting IEEE PHOTONICS JOURNAL 2013; 5 (1)
Specification of photonic circuits using quantum hardware description language
Theo Murphy Discussion Meeting on Principles and Applications of Quantum Control Engineering
ROYAL SOC. 2012: 5270–90
Following the simple observation that the interconnection of a set of quantum optical input-output devices can be specified using structural mode VHSIC hardware description language, we demonstrate a computer-aided schematic capture workflow for modelling and simulating multi-component photonic circuits. We describe an algorithm for parsing circuit descriptions to derive quantum equations of motion, illustrate our approach using simple examples based on linear and cavity-nonlinear optical components, and demonstrate a computational approach to hierarchical model reduction.
View details for DOI 10.1098/rsta.2011.0526
View details for Web of Science ID 000310365700004
View details for PubMedID 23091208
View details for PubMedCentralID PMC3479715
Single Molecule Analysis Research Tool (SMART): An Integrated Approach for Analyzing Single Molecule Data
2012; 7 (2)
Single molecule studies have expanded rapidly over the past decade and have the ability to provide an unprecedented level of understanding of biological systems. A common challenge upon introduction of novel, data-rich approaches is the management, processing, and analysis of the complex data sets that are generated. We provide a standardized approach for analyzing these data in the freely available software package SMART: Single Molecule Analysis Research Tool. SMART provides a format for organizing and easily accessing single molecule data, a general hidden Markov modeling algorithm for fitting an array of possible models specified by the user, a standardized data structure and graphical user interfaces to streamline the analysis and visualization of data. This approach guides experimental design, facilitating acquisition of the maximal information from single molecule experiments. SMART also provides a standardized format to allow dissemination of single molecule data and transparency in the analysis of reported data.
View details for DOI 10.1371/journal.pone.0030024
View details for Web of Science ID 000302871500004
View details for PubMedID 22363412
View details for PubMedCentralID PMC3282690
- Design of nanophotonic circuits for autonomous subsystem quantum error correction NEW JOURNAL OF PHYSICS 2011; 13
The dressed atom as binary phase modulator: towards attojoule/edge optical phase-shift keying
2011; 19 (7): 6486-6494
View details for Web of Science ID 000288852700082
The dressed atom as binary phase modulator: towards attojoule/edge optical phase-shift keying.
2011; 19 (7): 6478-6486
We use a single 133Cs atom strongly coupled to an optical resonator to induce random binary phase modulation of a near infra-red, ∼ 500 pW laser beam, with each modulation edge caused by the dissipation of a single photon (≈ 0.23 aJ) by the atom. While our ability to deterministically induce phase edges with an additional optical control beam is limited thus far, theoretical analysis of an analogous, solid-state system indicates that efficient external control should be achievable in demonstrated nanophotonic systems.
View details for DOI 10.1364/OE.19.006478
View details for PubMedID 21451676
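The single-photon energy quoted in the abstract above (≈ 0.23 aJ) can be checked with E = hc/λ. The abstract says only "near infra-red", so the wavelength used below (852 nm, the cesium D2 line) is an assumption made for this check and is not stated in the source. The function and variable names are likewise illustrative.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant (exact SI value), J*s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light (exact SI value), m/s

# Assumption: probe near the cesium D2 line; the abstract does not give
# the wavelength.
ASSUMED_WAVELENGTH_M = 852e-9


def photon_energy_aj(wavelength_m):
    """Energy of one photon in attojoules (1 aJ = 1e-18 J)."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m * 1e18


def photon_flux_per_s(power_w, wavelength_m):
    """Mean number of photons per second carried by a beam of given power."""
    return power_w / (photon_energy_aj(wavelength_m) * 1e-18)


energy_aj = photon_energy_aj(ASSUMED_WAVELENGTH_M)
flux = photon_flux_per_s(500e-12, ASSUMED_WAVELENGTH_M)  # 500 pW beam
```

Under this assumption the energy comes out at roughly 0.23 aJ per photon, consistent with the abstract, and the 500 pW beam carries on the order of 10^9 photons per second.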
Optical 'bistability' with single atom absorbers
Conference on Lasers and Electro-Optics (CLEO)
View details for Web of Science ID 000295612403360
Designing Quantum Memories with Embedded Control: Photonic Circuits for Autonomous Quantum Error Correction
PHYSICAL REVIEW LETTERS
2010; 105 (4)
We propose an approach to quantum error correction based on coding and continuous syndrome readout via scattering of coherent probe fields, in which the usual steps of measurement and discrete restoration are replaced by direct physical processing of the probe beams and coherent feedback to the register qubits. Our approach is well matched to physical implementations that feature solid-state qubits embedded in planar electromagnetic circuits, providing an autonomous and "on-chip" quantum memory design requiring no external clocking or control logic.
View details for DOI 10.1103/PhysRevLett.105.040502
View details for Web of Science ID 000280237700001
View details for PubMedID 20867826
Coherent-feedback formulation of continuous quantum error correction protocols
Conference on Lasers and Electro-Optics (CLEO)/Quantum Electronics and Laser Science Conference (QELS)
View details for Web of Science ID 000290513602219