Benjamin Doughty's Profile | Stanford Profiles

Contact

Academic
bgrd@stanford.edu

University - Student Department: Genetics Position: Graduate

Additional Info

Mail Code: 5120
ORCID:
https://orcid.org/0000-0003-0447-4468

All Publications

Single-molecule states link transcription factor binding to gene expression. Nature Doughty, B. R., Hinks, M. M., Schaepe, J. M., Marinov, G. K., Thurm, A. R., Rios-Martinez, C., Parks, B. E., Tan, Y., Marklund, E., Dubocanin, D., Bintu, L., Greenleaf, W. J. 2024

Abstract

The binding of multiple transcription factors (TFs) to genomic enhancers drives gene expression in mammalian cells1. However, the molecular details that link enhancer sequence to TF binding, promoter state and transcription levels remain unclear. Here we applied single-molecule footprinting2,3 to measure the simultaneous occupancy of TFs, nucleosomes and other regulatory proteins on engineered enhancer-promoter constructs with variable numbers of TF binding sites for both a synthetic TF and an endogenous TF involved in the type I interferon response. Although TF binding events on nucleosome-free DNA are independent, activation domains recruit cofactors that destabilize nucleosomes, driving observed TF binding cooperativity. Average TF occupancy linearly determines promoter activity, and we decompose TF strength into separable binding and activation terms. Finally, we develop thermodynamic and kinetic models that quantitatively predict both the enhancer binding microstates and gene expression dynamics. This work provides a template for the quantitative dissection of distinct contributors to gene expression, including TF activation domains, concentration, binding affinity, binding site configuration and recruitment of chromatin regulators.

View details for DOI 10.1038/s41586-024-08219-w

View details for PubMedID 39567683

View details for PubMedCentralID 3514679
Genome-wide enhancer maps link risk variants to disease genes. Nature Nasser, J., Bergman, D. T., Fulco, C. P., Guckelberger, P., Doughty, B. R., Patwardhan, T. A., Jones, T. R., Nguyen, T. H., Ulirsch, J. C., Lekschas, F., Mualim, K., Natri, H. M., Weeks, E. M., Munson, G., Kane, M., Kang, H. Y., Cui, A., Ray, J. P., Eisenhaure, T. M., Collins, R. L., Dey, K., Pfister, H., Price, A. L., Epstein, C. B., Kundaje, A., Xavier, R. J., Daly, M. J., Huang, H., Finucane, H. K., Hacohen, N., Lander, E. S., Engreitz, J. M. 2021

Abstract

Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease1. Many of the underlying causal variants may affect enhancers2,3, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types4. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.

View details for DOI 10.1038/s41586-021-03446-x

View details for PubMedID 33828297
HyPR-seq: Single-cell quantification of chosen RNAs via hybridization and sequencing of DNA probes. Proceedings of the National Academy of Sciences of the United States of America Marshall, J. L., Doughty, B. R., Subramanian, V., Guckelberger, P., Wang, Q., Chen, L. M., Rodriques, S. G., Zhang, K., Fulco, C. P., Nasser, J., Grinkevich, E. J., Noel, T., Mangiameli, S., Bergman, D. T., Greka, A., Lander, E. S., Chen, F., Engreitz, J. M. 2020; 117 (52): 33404–13

Abstract

Single-cell quantification of RNAs is important for understanding cellular heterogeneity and gene regulation, yet current approaches suffer from low sensitivity for individual transcripts, limiting their utility for many applications. Here we present Hybridization of Probes to RNA for sequencing (HyPR-seq), a method to sensitively quantify the expression of hundreds of chosen genes in single cells. HyPR-seq involves hybridizing DNA probes to RNA, distributing cells into nanoliter droplets, amplifying the probes with PCR, and sequencing the amplicons to quantify the expression of chosen genes. HyPR-seq achieves high sensitivity for individual transcripts, detects nonpolyadenylated and low-abundance transcripts, and can profile more than 100,000 single cells. We demonstrate how HyPR-seq can profile the effects of CRISPR perturbations in pooled screens, detect time-resolved changes in gene expression via measurements of gene introns, and detect rare transcripts and quantify cell-type frequencies in tissue using low-abundance marker genes. By directing sequencing power to genes of interest and sensitively quantifying individual transcripts, HyPR-seq reduces costs by up to 100-fold compared to whole-transcriptome single-cell RNA-sequencing, making HyPR-seq a powerful method for targeted RNA profiling in single cells.

View details for DOI 10.1073/pnas.2010738117

View details for PubMedID 33376219
Rewriting regulatory DNA to dissect and reprogram gene expression. Cell Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., Bisht, D., Xia, F., Cai, X. S., Chen, Z., Cochran, K., Lawrence, K. A., Munson, G., Pampari, A., Fulco, C. P., Sahni, N., Kelley, D. R., Lander, E. S., Kundaje, A., Engreitz, J. M. 2025

Abstract

Regulatory DNA provides a platform for transcription factor binding to encode cell-type-specific patterns of gene expression. However, the effects and programmability of regulatory DNA sequences remain difficult to map or predict. Here, we develop variant effects from flow-sorting experiments with CRISPR targeting screens (Variant-EFFECTS) to introduce hundreds of designed edits to endogenous regulatory DNA and quantify their effects on gene expression. We systematically dissect and reprogram 3 regulatory elements for 2 genes in 2 cell types. These data reveal endogenous binding sites with effects specific to genomic context, transcription factor motifs with cell-type-specific activities, and limitations of computational models for predicting the effect sizes of variants. We identify small edits that can tune gene expression over a large dynamic range, suggesting new possibilities for prime-editing-based therapeutics targeting regulatory DNA. Variant-EFFECTS provides a generalizable tool to dissect regulatory DNA and to identify genome editing reagents that tune gene expression in an endogenous context.

View details for DOI 10.1016/j.cell.2025.03.034

View details for PubMedID 40245860
Thermodynamic principles link in vitro transcription factor affinities to single-molecule chromatin states in cells. bioRxiv : the preprint server for biology Schaepe, J. M., Fries, T., Doughty, B. R., Crocker, O. J., Hinks, M. M., Marklund, E., Greenleaf, W. J. 2025

Abstract

The molecular details governing transcription factor (TF) binding and the formation of accessible chromatin are not yet quantitatively understood - including how sequence context modulates affinity, how TFs search DNA, the kinetics of TF occupancy, and how motif grammars coordinate binding. To resolve these questions for a human TF, erythroid Krüppel-like factor (eKLF/KLF1), we quantitatively compare, in high throughput, in vitro TF binding rates and affinities with in vivo single molecule TF and nucleosome occupancies across engineered DNA sequences. We find that 40-fold flanking sequence effects on affinity are consistent with distal flanks tuning TF search parameters and captured by a linear energy model. Motif recognition probability, rather than time in the bound state, drives affinity changes, and in vitro and in nuclei measurements exhibit consistent, minutes-long TF residence times. Finally, pairing in vitro biophysical parameters with thermodynamic models accurately predicts in vivo single-molecule chromatin states for unseen motif grammars.

View details for DOI 10.1101/2025.01.27.635162

View details for PubMedID 39975040

View details for PubMedCentralID PMC11838358
The chromatin landscape of the histone-possessing Bacteriovorax bacteria. Genome research Marinov, G. K., Doughty, B., Kundaje, A., Greenleaf, W. J. 2024

Abstract

Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. However, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a Bacteriovorax strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, Bacteriovorax chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in Bacteriovorax positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, Bacteriovorax promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the Bacteriovorax genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.

View details for DOI 10.1101/gr.279418.124

View details for PubMedID 39572228
Deciphering the impact of genomic variation on function. Nature 2024; 633 (8028): 47-57

Abstract

Our genomes influence nearly every aspect of human biology, from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.

View details for DOI 10.1038/s41586-024-07510-0

View details for PubMedID 39232149

View details for PubMedCentralID 7405896
Multicenter integrated analysis of noncoding CRISPRi screens. Nature methods Yao, D., Tycko, J., Oh, J. W., Bounds, L. R., Gosai, S. J., Lataniotis, L., Mackay-Smith, A., Doughty, B. R., Gabdank, I., Schmidt, H., Guerrero-Altamirano, T., Siklenka, K., Guo, K., White, A. D., Youngworth, I., Andreeva, K., Ren, X., Barrera, A., Luo, Y., Yardımcı, G. G., Tewhey, R., Kundaje, A., Greenleaf, W. J., Sabeti, P. C., Leslie, C., Pritykin, Y., Moore, J. E., Beer, M. A., Gersbach, C. A., Reddy, T. E., Shen, Y., Engreitz, J. M., Bassik, M. C., Reilly, S. K. 2024

Abstract

The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.

View details for DOI 10.1038/s41592-024-02216-7

View details for PubMedID 38504114

View details for PubMedCentralID 3771521
Single-molecule chromatin configurations link transcription factor binding to expression in human cells. bioRxiv : the preprint server for biology Doughty, B. R., Hinks, M. M., Schaepe, J. M., Marinov, G. K., Thurm, A. R., Rios-Martinez, C., Parks, B. E., Tan, Y., Marklund, E., Dubocanin, D., Bintu, L., Greenleaf, W. J. 2024

Abstract

The binding of multiple transcription factors (TFs) to genomic enhancers activates gene expression in mammalian cells. However, the molecular details that link enhancer sequence to TF binding, promoter state, and gene expression levels remain opaque. We applied single-molecule footprinting (SMF) to measure the simultaneous occupancy of TFs, nucleosomes, and components of the transcription machinery on engineered enhancer/promoter constructs with variable numbers of TF binding sites for both a synthetic and an endogenous TF. We find that activation domains enhance a TF's capacity to compete with nucleosomes for binding to DNA in a BAF-dependent manner, TF binding on nucleosome-free DNA is consistent with independent binding between TFs, and average TF occupancy linearly contributes to promoter activation rates. We also decompose TF strength into separable binding and activation terms, which can be tuned and perturbed independently. Finally, we develop thermodynamic and kinetic models that quantitatively predict both the binding microstates observed at the enhancer and subsequent time-dependent gene expression. This work provides a template for quantitative dissection of distinct contributors to gene activation, including the activity of chromatin remodelers, TF activation domains, chromatin acetylation, TF concentration, TF binding affinity, and TF binding site configuration.

View details for DOI 10.1101/2024.02.02.578660

View details for PubMedID 38352517
Rewriting regulatory DNA to dissect and reprogram gene expression. bioRxiv : the preprint server for biology Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., Chen, Z., Cochran, K., Lawrence, K. A., Munson, G., Pampari, A., Fulco, C. P., Kelley, D. R., Lander, E. S., Kundaje, A., Engreitz, J. M. 2023

Abstract

Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.

View details for DOI 10.1101/2023.12.20.572268

View details for PubMedID 38187584

View details for PubMedCentralID PMC10769263
An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv : the preprint server for biology Gschwind, A. R., Mualim, K. S., Karbalayghareh, A., Sheth, M. U., Dey, K. K., Jagoda, E., Nurtdinov, R. N., Xi, W., Tan, A. S., Jones, H., Ma, X. R., Yao, D., Nasser, J., Avsec, Ž., James, B. T., Shamim, M. S., Durand, N. C., Rao, S. S., Mahajan, R., Doughty, B. R., Andreeva, K., Ulirsch, J. C., Fan, K., Perez, E. M., Nguyen, T. C., Kelley, D. R., Finucane, H. K., Moore, J. E., Weng, Z., Kellis, M., Bassik, M. C., Price, A. L., Beer, M. A., Guigó, R., Stamatoyannopoulos, J. A., Lieberman Aiden, E., Greenleaf, W. J., Leslie, C. S., Steinmetz, L. M., Kundaje, A., Engreitz, J. M. 2023

Abstract

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease1-6. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium7. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

View details for DOI 10.1101/2023.11.09.563812

View details for PubMedID 38014075

View details for PubMedCentralID PMC10680627
The landscape of the histone-organized chromatin of Bdellovibrionota bacteria. bioRxiv : the preprint server for biology Marinov, G. K., Doughty, B., Kundaje, A., Greenleaf, W. J. 2023

Abstract

Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. However, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a Bacteriovorax strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, Bacteriovorax chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in Bacteriovorax positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, Bacteriovorax promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the Bacteriovorax genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.

View details for DOI 10.1101/2023.10.30.564843

View details for PubMedID 37961278

View details for PubMedCentralID PMC10634947
Single-cell chromatin state transitions during epigenetic memory formation. bioRxiv : the preprint server for biology Fujimori, T., Rios-Martinez, C., Thurm, A. R., Hinks, M. M., Doughty, B. R., Sinha, J., Le, D., Hafner, A., Greenleaf, W. J., Boettiger, A. N., Bintu, L. 2023

Abstract

Repressive chromatin modifications are thought to compact chromatin to silence transcription. However, it is unclear how chromatin structure changes during silencing and epigenetic memory formation. We measured gene expression and chromatin structure in single cells after recruitment and release of repressors at a reporter gene. Chromatin structure is heterogeneous, with open and compact conformations present in both active and silent states. Recruitment of repressors associated with epigenetic memory produces chromatin compaction across 10-20 kilobases, while reversible silencing does not cause compaction at this scale. Chromatin compaction is inherited, but changes molecularly over time from histone methylation (H3K9me3) to DNA methylation. The level of compaction at the end of silencing quantitatively predicts epigenetic memory weeks later. Similarly, chromatin compaction at the Nanog locus predicts the degree of stem-cell fate commitment. These findings suggest that the chromatin state across tens of kilobases, beyond the gene itself, is important for epigenetic memory formation.

View details for DOI 10.1101/2023.10.03.560616

View details for PubMedID 37873344

View details for PubMedCentralID PMC10592931
Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nature genetics Fulco, C. P., Nasser, J., Jones, T. R., Munson, G., Bergman, D. T., Subramanian, V., Grossman, S. R., Anyoha, R., Doughty, B. R., Patwardhan, T. A., Nguyen, T. H., Kane, M., Perez, E. M., Durand, N. C., Lareau, C. A., Stamenova, E. K., Aiden, E., Lander, E. S., Engreitz, J. M. 2019; 51 (12): 1664–1669

Abstract

Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases1-4. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types5,6. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.

View details for DOI 10.1038/s41588-019-0538-0

View details for Web of Science ID 000499696700003

View details for PubMedID 31784727

View details for PubMedCentralID PMC6886585

Benjamin Doughty

Ph.D. Student in Genetics, admitted Autumn 2020
