Nicolas Altemose's Profile | Stanford Profiles

Bio

Nicolas Altemose is an Assistant Professor of Genetics and a Chan Zuckerberg Biohub Investigator. The Altemose Lab develops new experimental and analytical tools to study how chromatin proteins organize and regulate complex regions of the human genome. For more information, see altemoselab.stanford.edu.

Academic Appointments

Assistant Professor, Genetics
Member, Bio-X

Honors & Awards

Pew Biomedical Scholar Award, Pew Trusts (2024-2028)
CZ Biohub Investigator Award, Chan Zuckerberg Biohub (2023-2028)
HHMI Hanna H. Gray Fellowship, Howard Hughes Medical Institute (2020-2027)
Siebel Scholarship, Siebel Scholars Foundation (2020)
HHMI Gilliam Fellowship, Howard Hughes Medical Institute (2013-2018)
Marshall Scholarship, UK Marshall Aid Commemoration Commission (2011-2013)
Angier B. Duke Scholarship, Duke University (2007-2011)

Professional Education

Postdoctoral Fellow, UC Berkeley, Molecular & Cell Biology
PhD, UC Berkeley and UCSF, Bioengineering (2021)
DPhil, University of Oxford, Statistics (2016)
BS, Duke University, Biology (2011)

Contact

Academic
altemose@stanford.edu

University - Faculty
Department: Genetics
Position: Assistant Professor

Administrative Contact
Chris Barone
Administrative Associate Level 3
cjbarone@stanford.edu

Additional Info

Profile:
Assistant Professor of Genetics
Other Names:
Nick Altemose
ORCID:
https://orcid.org/0000-0002-7231-6026

Current Research and Scholarly Interests

The Altemose Lab develops new experimental and analytical tools to study how chromatin proteins organize and regulate complex regions of the human genome.

2025-26 Courses

Genomics
GENE 211 (Win)
Independent Studies (4)
- Graduate Research
  GENE 399 (Aut, Win, Spr, Sum)
- Out-of-Department Graduate Research
  BIO 300X (Aut, Spr, Sum)
- Supervised Study
  GENE 260 (Aut, Win, Spr, Sum)
- Undergraduate Research
  GENE 199 (Spr)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Lynette Chan, Jane Cook, Ezekiel Delgado, Angela Hickey, Geo Janer Carattini, Carolina Rios-Martinez
Postdoctoral Faculty Sponsor
Nathan Gamarra, Anthony Harris
Doctoral Dissertation Advisor (AC)
Maria Del Rio Pisula, Danilo Dubocanin, Rosa Lee, Hugo Mendez, Diego Pomales-Matos
Doctoral Dissertation Co-Advisor (AC)
Ashlie Barillas, Robert Hall

Graduate and Fellowship Programs

Genetics (PhD Program)

All Publications

DNA methylation influences human centromere positioning and function. Nature genetics Salinas-Luypaert, C., Dubocanin, D., Lee, R. J., Andrade Ruiz, L., Gamba, R., Grison, M., Velikovsky, L., Angrisani, A., Scelfo, A., Xu, Y., Dumont, M., Barra, V., Wilhelm, T., Velasco, G., Losito, M., Wardenaar, R., Francastel, C., Foijer, F., Kops, G. J., Miga, K. H., Altemose, N., Fachinetti, D. 2025

Abstract

Maintaining the epigenetic identity of centromeres is essential to prevent genome instability. Centromeres are epigenetically defined by the histone H3 variant CENP-A. Prior work in human centromeres has shown that CENP-A is associated with regions of hypomethylated DNA located within large arrays of hypermethylated repeats, but the functional importance of these DNA methylation (DNAme) patterns remains poorly understood. To address this, we developed tools to perturb centromeric DNAme, revealing that it causally influences CENP-A positioning. We show that rapid loss of methylation results in increased binding of centromeric proteins and alterations in centromere architecture, leading to aneuploidy and reduced cell viability. We also demonstrate that gradual centromeric DNA demethylation prompts a process of cellular adaptation. Altogether, we find that DNAme causally influences CENP-A localization and centromere function, offering mechanistic insights into pathological alterations of centromeric DNAme.

View details for DOI 10.1038/s41588-025-02324-w

View details for PubMedID 40908343

View details for PubMedCentralID 4160344

Human Satellite 3 DNA encodes megabase-scale transcription factor binding platforms. bioRxiv : the preprint server for biology Franklin, J. M., Dubocanin, D., Chittenden, C., Barillas, A., Lee, R. J., Ghosh, R. P., Gerton, J. L., Guan, K. L., Altemose, N. 2025

Abstract

Eukaryotic genomes frequently contain large arrays of tandem repeats, called satellite DNA. While some satellite DNAs participate in centromere function, others do not. For example, Human Satellite 3 (HSat3) forms the largest satellite DNA arrays in the human genome, but these multi-megabase regions were almost fully excluded from genome assemblies until recently, and their potential functions remain understudied and largely unknown. To address this, we performed a systematic screen for HSat3 binding proteins. Our work revealed that HSat3 contains millions of copies of transcription factor (TF) motifs bound by over a dozen TFs from various signaling pathways, including the growth-regulating transcription effector family TEAD1-4 from the Hippo pathway. Imaging experiments show that TEAD recruits the co-activator YAP to HSat3 regions in a cell-state specific manner. Using synthetic reporter assays, targeted repression of HSat3, inducible degradation of YAP, and super-resolution microscopy, we show that HSat3 arrays can localize YAP/TEAD inside the nucleolus, enhancing RNA Polymerase I activity. Beyond discovering a direct relationship between the Hippo pathway and ribosomal DNA regulation, this work demonstrates that satellite DNA can encode multiple transcription factor binding motifs, defining an important functional role for these enormous genomic elements.

View details for DOI 10.1101/2024.10.22.616524

View details for PubMedID 39484556

View details for PubMedCentralID PMC11526998

Integrating Single-Molecule Sequencing and Deep Learning to Predict Haplotype-Specific 3D Chromatin Organization in a Mendelian Condition. bioRxiv : the preprint server for biology Dubocanin, D., Kalygina, A., Franklin, J. M., Chittenden, C., Vollger, M. R., Neph, S., Stergachis, A. B., Altemose, N. 2025

Abstract

The three-dimensional (3D) architecture of the genome plays a crucial role in gene regulation and various human diseases. Short-read sequencing methods for measuring 3D genome organization are powerful, but they lack the ability to resolve individual human haplotypes or structurally complex regions. To address this, we present FiberFold, a deep learning model that combines convolutional neural networks and transformer architectures to accurately predict cell-type-specific and haplotype-specific 3D genome organization using multi-omic data from a single, long-read sequencing assay, Fiber-seq. By applying FiberFold to a cell line with allelic X-inactivation, we show that Topologically Associated Domains (TADs) are attenuated on the inactive chrX. Furthermore, FiberFold predicts significant changes to TADs surrounding a 13;X balanced translocation in a patient with a rare Mendelian disease. FiberFold showcases the power of integrating long-read epigenomic sequencing with deep learning tools to investigate fundamental chromatin biology as well as the molecular basis of human disease.

View details for DOI 10.1101/2025.02.26.640261

View details for PubMedID 40166185

View details for PubMedCentralID PMC11957061

DiMeLo-cito: a one-tube protocol for mapping protein-DNA interactions reveals CTCF bookmarking in mitosis. bioRxiv : the preprint server for biology Gamarra, N., Chittenden, C., Sundararajan, K., Schwartz, J. P., Lundqvist, S., Robles, D., Dixon-Luinenburg, O., Marcus, J., Maslan, A., Franklin, J. M., Streets, A., Straight, A. F., Altemose, N. 2025

Abstract

Genome regulation relies on complex and dynamic interactions between DNA and proteins. Recently, powerful methods have emerged that leverage third-generation sequencing to map protein-DNA interactions genome-wide. For example, Directed Methylation with Long-read sequencing (DiMeLo-seq) enables mapping of protein-DNA interactions along long, single chromatin fibers, including in highly repetitive genomic regions. However, DiMeLo-seq involves lossy centrifugation-based wash steps that limit its applicability to many sample types. To address this, we developed DiMeLo-cito, a single-tube, wash-free protocol that maximizes the yield and quality of genomic DNA obtained for long-read sequencing. This protocol enables the interrogation of genome-wide protein binding with as few as 100,000 cells and without the requirement of a nuclear envelope, enabling confident measurement of protein-DNA interactions during mitosis. Using this protocol, we detected strong binding of CTCF to mitotic chromosomes in diploid human cells, in contrast with earlier studies in karyotypically unstable cancer cell lines, suggesting that CTCF "bookmarks" specific sites critical for maintaining genome architecture across cell divisions. By expanding the capabilities of DiMeLo-seq to a broader range of sample types, DiMeLo-cito can provide new insights into genome regulation and organization.

View details for DOI 10.1101/2025.03.11.642717

View details for PubMedID 40161611

View details for PubMedCentralID PMC11952428

A classical revival: Human satellite DNAs enter the genomics era SEMINARS IN CELL & DEVELOPMENTAL BIOLOGY Altemose, N. 2022; 128: 2-14

Abstract

The classical human satellite DNAs, also referred to as human satellites 1, 2 and 3 (HSat1, HSat2, HSat3, or collectively HSat1-3), occur on most human chromosomes as large, pericentromeric tandem repeat arrays, which together constitute roughly 3% of the human genome (100 megabases, on average). Even though HSat1-3 were among the first human DNA sequences to be isolated and characterized at the dawn of molecular biology, they have remained almost entirely missing from the human genome reference assembly for 20 years, hindering studies of their sequence, regulation, and potential structural roles in the nucleus. Recently, the Telomere-to-Telomere Consortium produced the first truly complete assembly of a human genome, paving the way for new studies of HSat1-3 with modern genomic tools. This review provides an account of the history and current understanding of HSat1-3, with a view towards future studies of their evolution and roles in health and disease.

View details for DOI 10.1016/j.semcdb.2022.04.012

View details for Web of Science ID 000816909200002

View details for PubMedID 35487859

DiMeLo-seq: a long-read, single-molecule method for mapping protein-DNA interactions genome wide. Nature methods Altemose, N., Maslan, A., Smith, O. K., Sundararajan, K., Brown, R. R., Mishra, R., Detweiler, A. M., Neff, N., Miga, K. H., Straight, A. F., Streets, A. 2022

Abstract

Studies of genome regulation routinely use high-throughput DNA sequencing approaches to determine where specific proteins interact with DNA, and they rely on DNA amplification and short-read sequencing, limiting their quantitative application in complex genomic regions. To address these limitations, we developed directed methylation with long-read sequencing (DiMeLo-seq), which uses antibody-tethered enzymes to methylate DNA near a target protein's binding sites in situ. These exogenous methylation marks are then detected simultaneously with endogenous CpG methylation on unamplified DNA using long-read, single-molecule sequencing technologies. We optimized and benchmarked DiMeLo-seq by mapping chromatin-binding proteins and histone modifications across the human genome. Furthermore, we identified where centromere protein A localizes within highly repetitive regions that were unmappable with short sequencing reads, and we estimated the density of centromere protein A molecules along single chromatin fibers. DiMeLo-seq is a versatile method that provides multimodal, genome-wide information for investigating protein-DNA interactions.

View details for DOI 10.1038/s41592-022-01475-6

View details for PubMedID 35396487

Complete genomic and epigenetic maps of human centromeres. Science (New York, N.Y.) Altemose, N., Logsdon, G. A., Bzikadze, A. V., Sidhwani, P., Langley, S. A., Caldas, G. V., Hoyt, S. J., Uralsky, L., Ryabov, F. D., Shew, C. J., Sauria, M. E., Borchers, M., Gershman, A., Mikheenko, A., Shepelev, V. A., Dvorkina, T., Kunyavskaya, O., Vollger, M. R., Rhie, A., McCartney, A. M., Asri, M., Lorig-Roach, R., Shafin, K., Lucas, J. K., Aganezov, S., Olson, D., de Lima, L. G., Potapova, T., Hartley, G. A., Haukness, M., Kerpedjiev, P., Gusev, F., Tigyi, K., Brooks, S., Young, A., Nurk, S., Koren, S., Salama, S. R., Paten, B., Rogaev, E. I., Streets, A., Karpen, G. H., Dernburg, A. F., Sullivan, B. A., Straight, A. F., Wheeler, T. J., Gerton, J. L., Eichler, E. E., Phillippy, A. M., Timp, W., Dennis, M. Y., O'Neill, R. J., Zook, J. M., Schatz, M. C., Pevzner, P. A., Diekhans, M., Langley, C. H., Alexandrov, I. A., Miga, K. H. 2022; 376 (6588): eabl4178

Abstract

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.

View details for DOI 10.1126/science.abl4178

View details for PubMedID 35357911

Genomic Characterization of Large Heterochromatic Gaps in the Human Genome Assembly PLOS COMPUTATIONAL BIOLOGY Altemose, N., Miga, K. H., Maggioni, M., Willard, H. F. 2014; 10 (5): e1003628

Abstract

The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.

View details for DOI 10.1371/journal.pcbi.1003628

View details for Web of Science ID 000337288000037

View details for PubMedID 24831296

View details for PubMedCentralID PMC4022460

Heterochromatin boundaries maintain centromere position, size and number. Nature structural & molecular biology Carty, B. L., Dubocanin, D., Murillo-Pineda, M., Dumont, M., Volpe, E., Mikulski, P., Humes, J., Whittingham, O., Fachinetti, D., Giunta, S., Altemose, N., Jansen, L. E. 2025

Abstract

Centromeres are defined by a unique single chromatin domain featuring the histone H3 variant, centromere protein A (CENP-A), and ensure proper chromosome segregation. Centromeric chromatin typically occupies a small subregion of low DNA methylation within multimegabase arrays of hypermethylated alpha-satellite repeats and constitutive pericentric heterochromatin. Here, we define the molecular basis of how heterochromatin serves as a primary driver of centromere and neocentromere position, size and number. Using single-molecule epigenomics, we uncover roles for H3K9me3 methyltransferases SUV39H1/H2 and SETDB1, in addition to noncanonical roles for SUZ12, in maintaining H3K9me3 boundaries at centromeres. Loss of these heterochromatin boundaries leads to the progressive expansion and/or repositioning of the primary CENP-A domain, erosion of surrounding DNA methylation and nucleation of additional functional CENP-A domains across the same alpha-satellite sequences. Our study identifies the functional importance and specialization of different H3K9 methyltransferases across centromeric and pericentric domains, crucial for maintaining centromere domain size and suppressing ectopic centromere nucleation events.

View details for DOI 10.1038/s41594-025-01706-2

View details for PubMedID 41291334

Integrated analysis of multimodal long-read epigenetic assays. bioRxiv : the preprint server for biology Marcus, J., Dixon-Luinenburg, O., Gamarra, N., Schwartz, J. P., Rozenwald, M., Maslan, A., Urnov, F. D., Straight, A. F., Altemose, N., Ioannidis, N. M., Streets, A. 2025

Abstract

Long-read sequencing assays that detect base modifications are becoming increasingly important research tools for the study of epigenetic regulation, especially with the development of DiMeLo-seq and similar methods that deposit non-native base modifications to mark a range of epigenetic features such as protein-DNA interactions and chromatin accessibility. A main benefit of these methods is their inherent capacity for multimodality, enabling the encoding of multiple genomic signals onto single nucleic acid molecules. However, there are limited tools available for visualization and statistical analysis of this type of multimodal data. Here we introduce dimelo-toolkit, a python package built to enable flexible visualizations and easy integration into custom data processing workflows. We demonstrate the utility of dimelo-toolkit's preset visualizations of multiple base modifications in long-read single-molecule sequencing data with a novel extension of the DiMeLo-seq protocol that can capture three separate aspects of chromatin state on the same single reads: target protein binding, CpG methylation, and chromatin accessibility. We apply this multimodal method to simultaneously map chromatin accessibility, CpG methylation, and LMNB1 and CTCF binding patterns, respectively, in GM12878 cells. Additionally, we use dimelo-toolkit to investigate technical biases that arise when working with this type of multimodal data. This software tool will pave the way for developing well-optimized protocols and help unlock previously inaccessible biological insights.

View details for DOI 10.1101/2025.11.09.687458

View details for PubMedID 41279073

View details for PubMedCentralID PMC12637566

A telomere-to-telomere map of somatic mutation burden and functional impact in cancer. bioRxiv : the preprint server for biology Sohn, M. H., Dubocanin, D., Vollger, M. R., Kwon, Y., Minkina, A., Munson, K. M., Hart, S. F., Ranchalis, J. E., Parmalee, N. L., Sedeño-Cortés, A. E., Ou, J., Au, N. Y., Bohaczuk, S., Carroll, B., Frazar, C. D., Harvey, W. T., Hoekzema, K., Huang, M. F., Jacques, C. N., Jensen, D. M., Kolar, J. T., Lee, R., Lin, J., Loy, K., Mack, T., Mao, Y., Pham, M. M., Ryke, E., Smith, J. D., Sutherlin, L., Swanson, E. G., Weiss, J. M., Wg, S. A., Carvalho, C., Coorens, T. H., Harris, K., Wei, C. L., Eichler, E. E., Altemose, N., Bennett, J. T., Stergachis, A. B. 2025

Abstract

Oncogenesis involves widespread genetic and epigenetic alterations, yet the full spectrum of somatic variation genome-wide remains unresolved. We generated a near-telomere-to-telomere (T2T) diploid assembly of a donor paired with deep short- and long-read sequencing of their melanoma. This revealed that 16% of somatic variants occur in sequences absent from GRCh38, with satellite repeats acting as hotspots for UV-induced damage due to sequence-intrinsic mutability and inefficient repair. Centromere kinetochore domains emerged as focal sites of structural, genetic, and epigenetic variation, leading to remodeling of centromere kinetochore binding domains during tumor evolution. Single-molecule telomere reconstructions uncovered cycles of attrition, deletion, and telomerase-mediated extension that shape cancer telomeres. Finally, diploid chromatin maps exposed that copy number alterations and epimutations, rather than point mutations, predominate in rewiring cancer regulatory programs. These findings define the full landscape of a cancer's somatic variation and their functional impact, establishing a blueprint for T2T studies of mosaicism.

View details for DOI 10.1101/2025.10.10.681725

View details for PubMedID 41279560

View details for PubMedCentralID PMC12632929

A complete diploid human genome benchmark for personalized genomics. bioRxiv : the preprint server for biology Hansen, N. F., Dwarshuis, N., Ji, H. J., Rhie, A., Loucks, H., Logsdon, G. A., Vollger, M. R., Storer, J. M., Kim, J., Adam, E., Altemose, N., Antipov, D., Asri, M., Barreira, S., Bohaczuk, S. C., Bzikadze, A. V., Carioscia, S. A., Carroll, A., Chao, K. H., Chu, Y., Das, A., Ebert, P., English, A., Fleharty, M., Fleming, L. E., Formenti, G., Guarracino, A., Hartley, G. A., Jenike, K., Kalleberg, J., Kang, Y., King, R., Lipovac, J., Mastoras, M., Mitchell, M. W., Negi, S., Olson, N. D., Oshima, K. K., Paulin, L. F., Pickett, B. D., Porubsky, D., Ranchalis, J., Ranjan, D., Rautiainen, M., Riethman, H., Schnabel, R. D., Sedlazeck, F. J., Shafin, K., Sikic, M., Solar, S. J., Sweeten, A. P., Timp, W., Wagner, J., Yoo, D., Zhou, Y., Garrison, E., Eichler, E. E., Schatz, M. C., Stergachis, A. B., O'Neill, R. J., Miga, K. H., Salzberg, S. L., Koren, S., Zook, J. M., Phillippy, A. M. 2025

Abstract

Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and structurally polymorphic regions of the genome unmapped. Consequently, existing variant benchmarks, generated by the same methods, fail to assess these complex regions. To address this limitation, we present a telomere-to-telomere genome benchmark that achieves near-perfect accuracy (i.e. no detectable errors) across 99.4% of the complete, diploid HG002 genome. This benchmark adds 701.4 Mb of autosomal sequence and both sex chromosomes (216.8 Mb), totaling 15.3% of the genome that was absent from prior benchmarks. We also provide a diploid annotation of genes, transposable elements, segmental duplications, and satellite repeats, including 39,144 protein-coding genes across both haplotypes. To facilitate application of the benchmark, we developed tools for measuring the accuracy of sequencing reads, phased variant call sets, and genome assemblies against a diploid reference. Genome-wide analyses show that state-of-the-art de novo assembly methods resolve 2-7% more sequence and outperform variant calling accuracy by an order of magnitude, yielding just one error per 100 kb across 99.9% of the benchmark regions. Adoption of genome-based benchmarking is expected to accelerate the development of cost-effective methods for complete genome sequencing, expanding the reach of genomic medicine to the entire genome and enabling a new era of personalized genomics.

View details for DOI 10.1101/2025.09.21.677443

View details for PubMedID 41000953

View details for PubMedCentralID PMC12458380

NKX2-1 drives neuroendocrine transdifferentiation of prostate cancer via epigenetic and 3D chromatin remodeling. Nature genetics Lu, X., Keo, V., Cheng, I., Xie, W., Gritsina, G., Wang, J., Lu, L., Shiau, C. K., He, Y., Jin, Q., Jin, P., Sanda, M. G., Corces, V. G., Altemose, N., Gao, R., Zhao, J. C., Yu, J. 2025

Abstract

A substantial amount of castration-resistant prostate cancer (CRPC) progresses into a neuroendocrine (NE) subtype, known as NEPC, which is associated with poor clinical outcomes. Here we report distinct three-dimensional chromatin architectures between NEPC and CRPC tumors, which were recapitulated by isogenic cell lines undergoing NE transformation (NET). Mechanistically, pioneer factors such as FOXA2 initiate binding at NE enhancers to mediate regional DNA demethylation and induce neural transcription factor (TF) NKX2-1 expression. NKX2-1 preferentially binds gene promoters and interacts with enhancer-bound FOXA2 through chromatin looping. NKX2-1 is highly expressed in NEPC and indispensable for NET of prostate cancer. NKX2-1/FOXA2 further recruits p300/CBP to activate NE enhancers, and pharmacological inhibition of p300/CBP effectively blunts NE gene expression and abolishes NEPC tumor growth. Taken together, our study reports a hierarchical network of TFs governed by NKX2-1 in critically regulating chromatin remodeling and driving luminal-to-NE transformation and suggests promising therapeutic approaches to mitigate NEPC.

View details for DOI 10.1038/s41588-025-02265-4

View details for PubMedID 40691407

View details for PubMedCentralID 6561293

Enhancing transcription-replication conflict targets ecDNA-positive cancers. Nature Tang, J., Weiser, N. E., Wang, G., Chowdhry, S., Curtis, E. J., Zhao, Y., Wong, I. T., Marinov, G. K., Li, R., Hanoian, P., Tse, E., Mojica, S. G., Hansen, R., Plum, J., Steffy, A., Milutinovic, S., Meyer, S. T., Luebeck, J., Wang, Y., Zhang, S., Altemose, N., Curtis, C., Greenleaf, W. J., Bafna, V., Benkovic, S. J., Pinkerton, A. B., Kasibhatla, S., Hassig, C. A., Mischel, P. S., Chang, H. Y. 2024; 635 (8037): 210-218

Abstract

Extrachromosomal DNA (ecDNA) presents a major challenge for cancer patients. ecDNA renders tumours treatment resistant by facilitating massive oncogene transcription and rapid genome evolution, contributing to poor patient survival [1-7]. At present, there are no ecDNA-specific treatments. Here we show that enhancing transcription-replication conflict enables targeted elimination of ecDNA-containing cancers. Stepwise analyses of ecDNA transcription reveal pervasive RNA transcription and associated single-stranded DNA, leading to excessive transcription-replication conflicts and replication stress compared with chromosomal loci. Nucleotide incorporation on ecDNA is markedly slower, and replication stress is significantly higher in ecDNA-containing tumours regardless of cancer type or oncogene cargo. pRPA2-S33, a mediator of DNA damage repair that binds single-stranded DNA, shows elevated localization on ecDNA in a transcription-dependent manner, along with increased DNA double strand breaks, and activation of the S-phase checkpoint kinase, CHK1. Genetic or pharmacological CHK1 inhibition causes extensive and preferential tumour cell death in ecDNA-containing tumours. We advance a highly selective, potent and bioavailable oral CHK1 inhibitor, BBI-2779, that preferentially kills ecDNA-containing tumour cells. In a gastric cancer model containing FGFR2 amplified on ecDNA, BBI-2779 suppresses tumour growth and prevents ecDNA-mediated acquired resistance to the pan-FGFR inhibitor infigratinib, resulting in potent and sustained tumour regression in mice. Transcription-replication conflict emerges as a target for ecDNA-directed therapy, exploiting a synthetic lethality of excess to treat cancer.

View details for DOI 10.1038/s41586-024-07802-5

View details for PubMedID 39506153

Mapping protein-DNA interactions with DiMeLo-seq. Nature protocols Maslan, A., Altemose, N., Marcus, J., Mishra, R., Brennan, L. D., Sundararajan, K., Karpen, G., Straight, A. F., Streets, A. 2024

Abstract

We recently developed directed methylation with long-read sequencing (DiMeLo-seq) to map protein-DNA interactions genome wide. DiMeLo-seq is capable of mapping multiple interaction sites on single DNA molecules, profiling protein binding in the context of endogenous DNA methylation, identifying haplotype-specific protein-DNA interactions and mapping protein-DNA interactions in repetitive regions of the genome that are difficult to study with short-read methods. With DiMeLo-seq, adenines in the vicinity of a protein of interest are methylated in situ by tethering the Hia5 methyltransferase to an antibody using protein A. Protein-DNA interactions are then detected by direct readout of adenine methylation with long-read, single-molecule DNA sequencing platforms such as Nanopore sequencing. Here we present a detailed protocol and practical guidance for performing DiMeLo-seq. This protocol can be run on nuclei from fresh, lightly fixed or frozen cells. The protocol requires 1-2 d for performing in situ targeted methylation, 1-5 d for library preparation depending on desired fragment length and 1-3 d for Nanopore sequencing depending on desired sequencing depth. The protocol requires basic molecular biology skills and equipment, as well as access to a Nanopore sequencer. We also provide a Python package, dimelo, for analysis of DiMeLo-seq data.

View details for DOI 10.1038/s41596-024-01032-9

View details for PubMedID 39237830

View details for PubMedCentralID 2921165

The complete sequence of a human Y chromosome NATURE Rhie, A., Nurk, S., Cechova, M., Hoyt, S. J., Taylor, D. J., Altemose, N., Hook, P. W., Koren, S., Rautiainen, M., Alexandrov, I. A., Allen, J., Asri, M., Bzikadze, A. V., Chen, N., Chin, C., Diekhans, M., Flicek, P., Formenti, G., Fungtammasan, A., Garcia Giron, C., Garrison, E., Gershman, A., Gerton, J. L., Grady, P. G. S., Guarracino, A., Haggerty, L., Halabian, R., Hansen, N. F., Harris, R., Hartley, G. A., Harvey, W. T., Haukness, M., Heinz, J., Hourlier, T., Hubley, R. M., Hunt, S. E., Hwang, S., Jain, M., Kesharwani, R. K., Lewis, A. P., Li, H., Logsdon, G. A., Lucas, J. K., Makalowski, W., Markovic, C., Martin, F. J., Cartney, A., Mccoy, R. C., Mcdaniel, J., Mcnulty, B. M., Medvedev, P., Mikheenko, A., Munson, K. M., Murphy, T. D., Olsen, H. E., Olson, N. D., Paulin, L. F., Porubsky, D., Potapova, T., Ryabov, F., Salzberg, S. L., Sauria, M. E. G., Sedlazeck, F. J., Shafin, K., Shepelev, V. A., Shumate, A., Storer, J. M., Surapaneni, L., Taravella Oill, A. M., Thibaud-Nissen, F., Timp, W., Tomaszkiewicz, M., Vollger, M. R., Walenz, B. P., Watwood, A. C., Weissensteiner, M. H., Wenger, A. M., Wilson, M. A., Zarate, S., Zhu, Y., Zook, J. M., Eichler, E. E., O'Neill, R. J., Schatz, M. C., Miga, K. H., Makova, K. D., Phillippy, A. M. 2023; 621 (7978): 344-354

Abstract

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.

View details for DOI 10.1038/s41586-023-06457-y

View details for Web of Science ID 001082304000001

View details for PubMedID 37612512

View details for PubMedCentralID PMC10752217
The complete sequence of a human genome SCIENCE Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A., Mikheenko, A., Vollger, M. R., Altemose, N., Uralsky, L., Gershman, A., Aganezov, S., Hoyt, S. J., Diekhans, M., Logsdon, G. A., Alonge, M., Antonarakis, S. E., Borchers, M., Bouffard, G. G., Brooks, S. Y., Caldas, G., Chen, N., Cheng, H., Chin, C., Chow, W., de Lima, L. G., Dishuck, P. C., Durbin, R., Dvorkina, T., Fiddes, I. T., Formenti, G., Fulton, R. S., Fungtammasan, A., Garrison, E., Grady, P. G. S., Graves-Lindsay, T. A., Hall, I. M., Hansen, N. F., Hartley, G. A., Haukness, M., Howe, K., Hunkapiller, M. W., Jain, C., Jain, M., Jarvis, E. D., Kerpedjiev, P., Kirsche, M., Kolmogorov, M., Korlach, J., Kremitzki, M., Li, H., Maduro, V. V., Marschall, T., McCartney, A. M., McDaniel, J., Miller, D. E., Mullikin, J. C., Myers, E. W., Olson, N. D., Paten, B., Peluso, P., Pevzner, P. A., Porubsky, D., Potapova, T., Rogaev, E., Rosenfeld, J. A., Salzberg, S. L., Schneider, V. A., Sedlazeck, F. J., Shafin, K., Shew, C. J., Shumate, A., Sims, Y., Smit, A. F. A., Soto, D. C., Sovic, I., Storer, J. M., Streets, A., Sullivan, B. A., Thibaud-Nissen, F., Torrance, J., Wagner, J., Walenz, B. P., Wenger, A., Wood, J. M. D., Xiao, C., Yan, S. M., Young, A. C., Zarate, S., Surti, U., McCoy, R. C., Dennis, M. Y., Alexandrov, I. A., Gerton, J. L., O'Neill, R. J., Timp, W., Zook, J. M., Schatz, M. C., Eichler, E. E., Miga, K. H., Phillippy, A. M. 2022; 376 (6588): 44-+

View details for DOI 10.1126/science.abj6987

View details for Web of Science ID 000780195200021
From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science (New York, N.Y.) Hoyt, S. J., Storer, J. M., Hartley, G. A., Grady, P. G., Gershman, A., de Lima, L. G., Limouse, C., Halabian, R., Wojenski, L., Rodriguez, M., Altemose, N., Rhie, A., Core, L. J., Gerton, J. L., Makalowski, W., Olson, D., Rosen, J., Smit, A. F., Straight, A. F., Vollger, M. R., Wheeler, T. J., Schatz, M. C., Eichler, E. E., Phillippy, A. M., Timp, W., Miga, K. H., O'Neill, R. J. 2022; 376 (6588): eabk3112

Abstract

Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.

View details for DOI 10.1126/science.abk3112

View details for PubMedID 35357925
Epigenetic patterns in a complete human genome SCIENCE Gershman, A., Sauria, M. E. G., Guitart, X., Vollger, M. R., Hook, P. W., Hoyt, S. J., Jain, M., Shumate, A., Razaghi, R., Koren, S., Altemose, N., Caldas, G., Logsdon, G. A., Rhie, A., Eichler, E. E., Schatz, M. C., O'Neill, R. J., Phillippy, A. M., Miga, K. H., Timp, W. 2022; 376 (6588): 58-+

Abstract

The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.28 million CpGs), DNA accessibility, and short-read datasets (166,058 previously unresolved chromatin immunoprecipitation sequencing peaks) to provide evidence of activity across previously unidentified or corrected genes and reveals clinically relevant paralog-specific regulation. Probing CpG methylation across human centromeres from six diverse individuals generated an estimate of variability in kinetochore localization. This analysis provides a framework with which to investigate the most elusive regions of the human genome, granting insights into epigenetic regulation.

View details for DOI 10.1126/science.abj5089

View details for Web of Science ID 000780195200026

View details for PubMedID 35357915

View details for PubMedCentralID PMC9170183
Characterization of transcript enrichment and detection bias in single-nucleus RNA-seq for mapping of distinct human adipocyte lineages GENOME RESEARCH Gupta, A., Shamsi, F., Altemose, N., Dorlhiac, G. F., Cypess, A. M., White, A. P., Yosef, N., Patti, M., Tseng, Y., Streets, A. 2022; 32 (2): 242-257

Abstract

Single-cell RNA sequencing (scRNA-seq) enables molecular characterization of complex biological tissues at high resolution. The requirement of single-cell extraction, however, makes it challenging for profiling tissues such as adipose tissue, for which collection of intact single adipocytes is complicated by their fragile nature. For such tissues, single-nucleus extraction is often much more efficient and therefore single-nucleus RNA sequencing (snRNA-seq) presents an alternative to scRNA-seq. However, nuclear transcripts represent only a fraction of the transcriptome in a single cell, with snRNA-seq marked with inherent transcript enrichment and detection biases. Therefore, snRNA-seq may be inadequate for mapping important transcriptional signatures in adipose tissue. In this study, we compare the transcriptomic landscape of single nuclei isolated from preadipocytes and mature adipocytes across human white and brown adipocyte lineages, with whole-cell transcriptome. We show that snRNA-seq is capable of identifying the broad cell types present in scRNA-seq at all states of adipogenesis. However, we also explore how and why the nuclear transcriptome is biased and limited, as well as how it can be advantageous. We robustly characterize the enrichment of nuclear-localized transcripts and adipogenic regulatory lncRNAs in snRNA-seq, while also providing a detailed understanding for the preferential detection of long genes upon using this technique. To remove such technical detection biases, we propose a normalization strategy for a more accurate comparison of nuclear and cellular data. Finally, we show successful integration of scRNA-seq and snRNA-seq data sets with existing bioinformatic tools. Overall, our results illustrate the applicability of snRNA-seq for the characterization of cellular diversity in the adipose tissue.

View details for DOI 10.1101/gr.275509.121

View details for Web of Science ID 000749564500004

View details for PubMedID 35042723

View details for PubMedCentralID PMC8805720
μDamID: A Microfluidic Approach for Joint Imaging and Sequencing of Protein-DNA Interactions in Single Cells CELL SYSTEMS Altemose, N., Maslan, A., Rios-Martinez, C., Lai, A., White, J. A., Streets, A. 2020; 11 (4): 354-+

Abstract

DNA adenine methyltransferase identification (DamID) measures a protein's DNA-binding history by methylating adenine bases near each protein-DNA interaction site and then selectively amplifying and sequencing these methylated regions. Additionally, these interactions can be visualized using m6A-Tracer, a fluorescent protein that binds to methyladenines. Here, we combine these imaging and sequencing technologies in an integrated microfluidic platform (μDamID) that enables single-cell isolation, imaging, and sorting, followed by DamID. We use μDamID and an improved m6A-Tracer protein to generate paired imaging and sequencing data from individual human cells. We validate interactions between Lamin-B1 protein and lamina-associated domains (LADs), observe variable 3D chromatin organization and broad gene regulation patterns, and jointly measure single-cell heterogeneity in Dam expression and background methylation. μDamID provides the unique ability to compare paired imaging and sequencing data for each cell and between cells, enabling the joint analysis of the nuclear localization, sequence identity, and variability of protein-DNA interactions. A record of this paper's transparent peer review process is included in the Supplemental Information.

View details for DOI 10.1016/j.cels.2020.08.015

View details for Web of Science ID 000582118000004

View details for PubMedID 33099405

View details for PubMedCentralID PMC7588622
Two genetic variants explain the association of European ancestry with multiple sclerosis risk in African-Americans SCIENTIFIC REPORTS Nakatsuka, N., Patterson, N., Patsopoulos, N. A., Altemose, N., Tandon, A., Beecham, A. H., McCauley, J. L., Isobe, N., Hauser, S., De Jager, P. L., Hafler, D. A., Oksenberg, J. R., Reich, D. 2020; 10 (1): 16902

Abstract

Epidemiological studies have suggested differences in the rate of multiple sclerosis (MS) in individuals of European ancestry compared to African ancestry, motivating genetic scans to identify variants that could contribute to such patterns. In a whole-genome scan in 899 African-American cases and 1155 African-American controls, we confirm that African-Americans who inherit segments of the genome of European ancestry at a chromosome 1 locus are at increased risk for MS [logarithm of odds (LOD) = 9.8], although the signal weakens when adding an additional 406 cases, reflecting heterogeneity in the two sets of cases [logarithm of odds (LOD) = 2.7]. The association in the 899 individuals can be fully explained by two variants previously associated with MS in European ancestry individuals. These variants tag a MS susceptibility haplotype associated with decreased CD58 gene expression (odds ratio of 1.37; frequency of 84% in Europeans and 22% in West Africans for the tagging variant) as well as another haplotype near the FCRL3 gene (odds ratio of 1.07; frequency of 49% in Europeans and 8% in West Africans). Controlling for all other genetic and environmental factors, the two variants predict a 1.44-fold higher rate of MS in European-Americans compared to African-Americans.

View details for DOI 10.1038/s41598-020-74035-7

View details for Web of Science ID 000615373100014

View details for PubMedID 33037294

View details for PubMedCentralID PMC7547691
On-ratio PDMS bonding for multilayer microfluidic device fabrication JOURNAL OF MICROMECHANICS AND MICROENGINEERING Lai, A., Altemose, N., White, J. A., Streets, A. M. 2019; 29 (10)

View details for DOI 10.1088/1361-6439/ab341e

View details for Web of Science ID 000480294500001
A high-resolution map of non-crossover events reveals impacts of genetic diversity on mammalian meiotic recombination NATURE COMMUNICATIONS Li, R., Bitoun, E., Altemose, N., Davies, R. W., Davies, B., Myers, S. R. 2019; 10: 3900

Abstract

During meiotic recombination, homologue-templated repair of programmed DNA double-strand breaks (DSBs) produces relatively few crossovers and many difficult-to-detect non-crossovers. By intercrossing two diverged mouse subspecies over five generations and deep-sequencing 119 offspring, we detect thousands of crossover and non-crossover events genome-wide with unprecedented power and spatial resolution. We find that both crossovers and non-crossovers are strongly depleted at DSB hotspots where the DSB-positioning protein PRDM9 fails to bind to the unbroken homologous chromosome, revealing that PRDM9 also functions to promote homologue-templated repair. Our results show that complex non-crossovers are much rarer in mice than humans, consistent with complex events arising from accumulated non-programmed DNA damage. Unexpectedly, we also find that GC-biased gene conversion is restricted to non-crossover tracts containing only one mismatch. These results demonstrate that local genetic diversity profoundly alters meiotic repair pathway decisions via at least two distinct mechanisms, impacting genome evolution and Prdm9-related hybrid infertility.

View details for DOI 10.1038/s41467-019-11675-y

View details for Web of Science ID 000483017900010

View details for PubMedID 31467277

View details for PubMedCentralID PMC6715734
A map of human PRDM9 binding provides evidence for novel behaviors of PRDM9 and other zinc-finger proteins in meiosis ELIFE Altemose, N., Noor, N., Bitoun, E., Tumian, A., Imbeault, M., Chapman, J., Aricescu, A., Myers, S. R. 2017; 6

Abstract

PRDM9 binding localizes almost all meiotic recombination sites in humans and mice. However, most PRDM9-bound loci do not become recombination hotspots. To explore factors that affect binding and subsequent recombination outcomes, we mapped human PRDM9 binding sites in a transfected human cell line and measured PRDM9-induced histone modifications. These data reveal varied DNA-binding modalities of PRDM9. We also find that human PRDM9 frequently binds promoters, despite their low recombination rates, and it can activate expression of a small number of genes including CTCFL and VCX. Furthermore, we identify specific sequence motifs that predict consistent, localized meiotic recombination suppression around a subset of PRDM9 binding sites. These motifs strongly associate with KRAB-ZNF protein binding, TRIM28 recruitment, and specific histone modifications. Finally, we demonstrate that, in addition to binding DNA, PRDM9's zinc fingers also mediate its multimerization, and we show that a pair of highly diverged alleles preferentially form homo-multimers.

View details for DOI 10.7554/eLife.28383

View details for Web of Science ID 000416379900001

View details for PubMedID 29072575

View details for PubMedCentralID PMC5705219
Re-engineering the zinc fingers of PRDM9 reverses hybrid sterility in mice NATURE Davies, B., Hatton, E., Altemose, N., Hussin, J. G., Pratto, F., Zhang, G., Hinch, A., Moralli, D., Biggs, D., Diaz, R., Preece, C., Li, R., Bitoun, E., Brick, K., Green, C. M., Camerini-Otero, R. D., Myers, S. R., Donnelly, P. 2016; 530 (7589): 171-+

Abstract

The DNA-binding protein PRDM9 directs positioning of the double-strand breaks (DSBs) that initiate meiotic recombination in mice and humans. Prdm9 is the only mammalian speciation gene yet identified and is responsible for sterility phenotypes in male hybrids of certain mouse subspecies. To investigate PRDM9 binding and its role in fertility and meiotic recombination, we humanized the DNA-binding domain of PRDM9 in C57BL/6 mice. This change repositions DSB hotspots and completely restores fertility in male hybrids. Here we show that alteration of one Prdm9 allele impacts the behaviour of DSBs controlled by the other allele at chromosome-wide scales. These effects correlate strongly with the degree to which each PRDM9 variant binds both homologues at the DSB sites it controls. Furthermore, higher genome-wide levels of such 'symmetric' PRDM9 binding associate with increasing fertility measures, and comparisons of individual hotspots suggest binding symmetry plays a downstream role in the recombination process. These findings reveal that subspecies-specific degradation of PRDM9 binding sites by meiotic drive, which steadily increases asymmetric PRDM9 binding, has impacts beyond simply changing hotspot positions, and strongly support a direct involvement in hybrid infertility. Because such meiotic drive occurs across mammals, PRDM9 may play a wider, yet transient, role in the early stages of speciation.

View details for DOI 10.1038/nature16931

View details for Web of Science ID 000369916700029

View details for PubMedID 26840484

View details for PubMedCentralID PMC4756437
Non-crossover gene conversions show strong GC bias and unexpected clustering in humans ELIFE Williams, A. L., Genovese, G., Dyer, T., Altemose, N., Truax, K., Jun, G., Patterson, N., Myers, S. R., Curran, J. E., Duggirala, R., Blangero, J., Reich, D., Przeworski, M., T2D-GENES Consortium 2015; 4

Abstract

Although the past decade has seen tremendous progress in our understanding of fine-scale recombination, little is known about non-crossover (NCO) gene conversion. We report the first genome-wide study of NCO events in humans. Using SNP array data from 98 meioses, we identified 103 sites affected by NCO, of which 50/52 were confirmed in sequence data. Overlap with double strand break (DSB) hotspots indicates that most of the events are likely of meiotic origin. We estimate that a site is involved in a NCO at a rate of 5.9 × 10⁻⁶/bp/generation, consistent with sperm-typing studies, and infer that tract lengths span at least an order of magnitude. Observed NCO events show strong allelic bias at heterozygous AT/GC SNPs, with 68% (58-78%) transmitting GC alleles (p = 5 × 10⁻⁴). Strikingly, in 4 of 15 regions with resequencing data, multiple disjoint NCO tracts cluster in close proximity (∼20-30 kb), a phenomenon not previously seen in mammals.

View details for DOI 10.7554/eLife.04637

View details for Web of Science ID 000351867100004

View details for PubMedID 25806687

View details for PubMedCentralID PMC4404656
Recombination in the Human Pseudoautosomal Region PAR1 PLOS GENETICS Hinch, A. G., Altemose, N., Noor, N., Donnelly, P., Myers, S. R. 2014; 10 (7): e1004503

Abstract

The pseudoautosomal region (PAR) is a short region of homology between the mammalian X and Y chromosomes, which has undergone rapid evolution. A crossover in the PAR is essential for the proper disjunction of X and Y chromosomes in male meiosis, and PAR deletion results in male sterility. This leads the human PAR with the obligatory crossover, PAR1, to having an exceptionally high male crossover rate, which is 17-fold higher than the genome-wide average. However, the mechanism by which this obligatory crossover occurs remains unknown, as does the fine-scale positioning of crossovers across this region. Recent research in mice has suggested that crossovers in PAR may be mediated independently of the protein PRDM9, which localises virtually all crossovers in the autosomes. To investigate recombination in this region, we construct the most fine-scale genetic map containing directly observed crossovers to date using African-American pedigrees. We leverage recombination rates inferred from the breakdown of linkage disequilibrium in human populations and investigate the signatures of DNA evolution due to recombination. Further, we identify direct PRDM9 binding sites using ChIP-seq in human cells. Using these independent lines of evidence, we show that, in contrast with mouse, PRDM9 does localise peaks of recombination in the human PAR1. We find that recombination is a far more rapid and intense driver of sequence evolution in PAR1 than it is on the autosomes. We also show that PAR1 hotspot activities differ significantly among human populations. Finally, we find evidence that PAR1 hotspot positions have changed between human and chimpanzee, with no evidence of sharing among the hottest hotspots. We anticipate that the genetic maps built and validated in this work will aid research on this vital and fascinating region of the genome.

View details for DOI 10.1371/journal.pgen.1004503

View details for Web of Science ID 000339902600048

View details for PubMedID 25033397

View details for PubMedCentralID PMC4102438
Centromere reference models for human chromosomes X and Y satellite arrays GENOME RESEARCH Miga, K. H., Newton, Y., Jain, M., Altemose, N., Willard, H. F., Kent, W. 2014; 24 (4): 697-707

Abstract

The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes.

View details for DOI 10.1101/gr.159624.113

View details for Web of Science ID 000334055600015

View details for PubMedID 24501022

View details for PubMedCentralID PMC3975068
Using population admixture to help complete maps of the human genome NATURE GENETICS Genovese, G., Handsaker, R. E., Li, H., Altemose, N., Lindgren, A. M., Chambert, K., Pasaniuc, B., Price, A. L., Reich, D., Morton, C. C., Pollak, M. R., Wilson, J. G., McCarroll, S. A. 2013; 45 (4): 406-414

Abstract

Tens of millions of base pairs of euchromatic human genome sequence, including many protein-coding genes, have no known location in the human genome. We describe an approach for localizing the human genome's missing pieces using the patterns of genome sequence variation created by population admixture. We mapped the locations of 70 scaffolds spanning 4 million base pairs of the human genome's unplaced euchromatic sequence, including more than a dozen protein-coding genes, and identified 8 new large interchromosomal segmental duplications. We find that most of these sequences are hidden in the genome's heterochromatin, particularly its pericentromeric regions. Many cryptic, pericentromeric genes are expressed at the RNA level and have been maintained intact for millions of years while their expression patterns diverged from those of paralogous genes elsewhere in the genome. We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies.

View details for DOI 10.1038/ng.2565

View details for Web of Science ID 000316840600011

View details for PubMedID 23435088

View details for PubMedCentralID PMC3683849


