All Publications


  • Robust Digital Molecular Design of Binarized Neural Networks. Leibniz International Proceedings in Informatics. Linder, J., Chen, Y., Wong, D., Seelig, G., Ceze, L., Strauss, K. 2021; 205: 1-20

    DOI: 10.4230/LIPIcs.DNA.27.1

  • A Generative Neural Network for Maximizing Fitness and Diversity of Synthetic DNA and Protein Sequences. Cell Systems. Linder, J., Bogard, N., Rosenberg, A. B., Seelig, G. 2020; 11 (1): 49-+

    Abstract

    Engineering gene and protein sequences with defined functional properties is a major goal of synthetic biology. Deep neural network models, together with gradient ascent-style optimization, show promise for sequence design. The generated sequences can, however, get stuck in local minima and often have low diversity. Here, we develop deep exploration networks (DENs), a class of activation-maximizing generative models, which minimize the cost of a neural network fitness predictor by gradient descent. By penalizing any two generated patterns on the basis of a similarity metric, DENs explicitly maximize sequence diversity. To avoid drifting into low-confidence regions of the predictor, we incorporate variational autoencoders to maintain the likelihood ratio of generated sequences. Using DENs, we engineered polyadenylation signals with more than 10-fold higher selection odds than the best gradient ascent-generated patterns, identified splice regulatory sequences predicted to result in highly differential splicing between cell lines, and improved on state-of-the-art results for protein design tasks.

    DOI: 10.1016/j.cels.2020.05.007

    Web of Science ID: 000552577800008

    PubMed ID: 32711843
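
    A minimal, hypothetical sketch (in Python/PyTorch) of the kind of objective the abstract above describes: a generator trained by gradient descent against a frozen fitness predictor, with an explicit pairwise-similarity penalty for diversity and a likelihood-style regularizer standing in for the paper's variational autoencoder term. All layer sizes, loss weights, and module names below are illustrative assumptions, not the published DEN architecture.

```python
# Toy deep-exploration-network-style training loop (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

SEQ_LEN, ALPHABET = 50, 4  # toy sequence length and DNA alphabet size (assumptions)

class Generator(nn.Module):
    """Maps random seeds to relaxed (softmax) one-hot DNA sequences."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, SEQ_LEN * ALPHABET))

    def forward(self, z):
        logits = self.net(z).view(-1, SEQ_LEN, ALPHABET)
        return F.softmax(logits, dim=-1)  # (batch, len, 4) relaxed one-hots

# Stand-ins for a pretrained fitness predictor and sequence autoencoder;
# only the generator is trained, so these are frozen.
fitness_predictor = nn.Sequential(nn.Flatten(), nn.Linear(SEQ_LEN * ALPHABET, 1))
decoder = nn.Sequential(nn.Flatten(), nn.Linear(SEQ_LEN * ALPHABET, SEQ_LEN * ALPHABET))
for p in list(fitness_predictor.parameters()) + list(decoder.parameters()):
    p.requires_grad_(False)

gen = Generator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

for step in range(200):
    z = torch.randn(16, 32)
    x = gen(z)                                  # generated sequences
    fitness = fitness_predictor(x).mean()       # predicted fitness (to maximize)

    # Diversity: penalize cosine similarity between pairs of generated samples.
    flat = F.normalize(x.flatten(1), dim=-1)
    sim = flat @ flat.t()
    diversity_penalty = (sim - torch.eye(len(x))).clamp(min=0).mean()

    # Likelihood-style term: reconstruction error under the frozen autoencoder,
    # a crude stand-in for the VAE regularizer described in the abstract.
    recon = decoder(x).view_as(x)
    likelihood_penalty = F.mse_loss(recon, x)

    loss = -fitness + 1.0 * diversity_penalty + 0.1 * likelihood_penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
```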

  • Efficient inference of nonlinear feature attributions with Scrambling Neural Networks. Machine Learning in Computational Biology. Linder, J., La Fleur, A., Chen, Z., Ljubetic, A., Baker, D., Kannan, S., Seelig, G. 2020

  • A Deep Neural Network for Predicting and Engineering Alternative Polyadenylation. Cell. Bogard, N., Linder, J., Rosenberg, A. B., Seelig, G. 2019; 178 (1): 91-+

    Abstract

    Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over 3 million APA reporters. APARENT's predictions are highly accurate when tasked with inferring APA in synthetic and human 3'UTRs. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of 3' end processing, and integrates these features into a comprehensive, interpretable, cis-regulatory code. We apply APARENT to forward engineer functional polyadenylation signals with precisely defined cleavage position and isoform usage and validate predictions experimentally. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.

    DOI: 10.1016/j.cell.2019.04.046

    Web of Science ID: 000473002700010

    PubMed ID: 31178116

    PubMed Central ID: PMC6599575
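
    A hypothetical, minimal sketch (in Python/PyTorch) of the general setup the abstract above describes: a convolutional network reads a one-hot-encoded reporter sequence and predicts the fraction of transcripts using the proximal polyadenylation site. The layer sizes, the 186-nt sequence length, and the random placeholder data are assumptions for illustration, not the published APARENT model or its training data.

```python
# Toy sequence-to-isoform-usage model (illustrative only).
import torch
import torch.nn as nn

SEQ_LEN = 186  # assumed fixed reporter length

class ToyAPAModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(4, 32, kernel_size=8, padding=4), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=6, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.head = nn.Linear(32, 1)

    def forward(self, x):  # x: (batch, 4, SEQ_LEN) one-hot DNA
        h = self.conv(x).squeeze(-1)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # proximal isoform usage in [0, 1]

model = ToyAPAModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()  # isoform usage is a proportion, so a cross-entropy-style loss fits

# Random placeholder data standing in for one-hot reporter sequences and
# measured proximal isoform usage fractions.
x = torch.eye(4)[torch.randint(0, 4, (64, SEQ_LEN))].transpose(1, 2)  # (64, 4, SEQ_LEN)
y = torch.rand(64)

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```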

  • Deep exploration networks for rapid engineering of functional DNA sequences. Machine Learning in Computational Biology. Linder, J., Bogard, N., Rosenberg, A. B., Seelig, G. 2019