Steven Dillmann
Ph.D. Student in Computational and Mathematical Engineering, admitted Autumn 2024
Education & Certifications
- MPhil, University of Cambridge, Data Intensive Science (2024)
- MEng, Imperial College London, Aeronautics with Spacecraft Engineering (2023)
All Publications
- Genome modelling and design across all domains of life with Evo 2.
Nature
2026
Abstract
All of life encodes information with DNA. Although tools for genome sequencing, synthesis and editing have transformed biological research, we still lack sufficient understanding of the immense complexity encoded by genomes to predict the effects of many classes of genomic changes or to intelligently compose new biological systems. Artificial intelligence models that learn information from genomic sequences across diverse organisms have increasingly advanced prediction and design capabilities [1,2]. Here we introduce Evo 2, a biological foundation model trained on 9 trillion DNA base pairs from a highly curated genomic atlas spanning all domains of life to have a 1 million token context window with single-nucleotide resolution. Evo 2 learns to accurately predict the functional impacts of genetic variation, from noncoding pathogenic mutations to clinically significant BRCA1 variants, without task-specific fine-tuning. Mechanistic interpretability analyses reveal that Evo 2 learns representations associated with biological features, including exon-intron boundaries, transcription factor binding sites, protein structural elements and prophage genomic regions. The generative abilities of Evo 2 produce mitochondrial, prokaryotic and eukaryotic sequences at genome scale with greater naturalness and coherence than previous methods. Evo 2 also generates experimentally validated chromatin accessibility patterns when guided by predictive models [3,4] and inference-time search. We have made Evo 2 fully open, including model parameters, training code [5], inference code and the OpenGenome2 dataset, to accelerate the exploration and design of biological complexity.
DOI: 10.1038/s41586-026-10176-5
PubMed ID: 41781614
PubMed Central ID: 12057570
- A Poisson Process AutoDecoder for X-Ray Sources
ASTROPHYSICAL JOURNAL
2025; 988 (1)
DOI: 10.3847/1538-4357/add72e
Web of Science ID: 001531170300001
- Hyperluminous Supersoft X-Ray Sources in the Chandra Catalog
ASTROPHYSICAL JOURNAL
2025; 983 (2)
DOI: 10.3847/1538-4357/adc256
Web of Science ID: 001467545700001
- Representation learning for time-domain high-energy astrophysics: Discovery of extragalactic fast X-ray transient XRT 200515
MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY
2025; 537 (2): 931-955
DOI: 10.1093/mnras/stae2808
Web of Science ID: 001410119500001
- The Cloudspotting on Mars citizen science project: Seasonal and spatial cloud distributions observed by the Mars Climate Sounder
ICARUS
2024; 419
DOI: 10.1016/j.icarus.2023.115777
Web of Science ID: 001273651700001
- The impact of satellite trails on Hubble Space Telescope observations
NATURE ASTRONOMY
2023; 7 (3): 262-268
DOI: 10.1038/s41550-023-01903-3
Web of Science ID: 000943005500002
https://orcid.org/0000-0002-4773-1463