Nature Genetics

2024
Watson EV, Lee JJ-K, Gulhan DC, Melloni GEM, Venev SV, Magesh RY, Frederick A, Chiba K, Wooten EC, Naxerova K, Dekker J, Park PJ, Elledge SJ. Chromosome evolution screens recapitulate tissue-specific tumor aneuploidy patterns. Nature Genetics 2024.

Whole chromosome and arm-level copy number alterations occur at high frequencies in tumors, but their selective advantages, if any, are poorly understood. Here, utilizing unbiased whole chromosome genetic screens combined with in vitro evolution to generate arm- and subarm-level events, we iteratively selected the fittest karyotypes from aneuploidized human renal and mammary epithelial cells. Proliferation-based karyotype selection in these epithelial lines modeled tissue-specific tumor aneuploidy patterns in patient cohorts in the absence of driver mutations. Hi-C-based translocation mapping revealed that arm-level events usually emerged in multiples of two via centromeric translocations and occurred more frequently in tetraploids than diploids, contributing to the increased diversity in evolving tetraploid populations. Isogenic clonal lineages enabled elucidation of pro-tumorigenic mechanisms associated with common copy number alterations, revealing Notch signaling potentiation as a driver of 1q gain in breast cancer. We propose that intrinsic, tissue-specific proliferative effects underlie tumor copy number patterns in cancer.

Jin H, Gulhan DC, Geiger B, Ben-Isvy D, Geng D, Ljungström V, Park PJ. Accurate and sensitive mutational signature analysis with MuSiCal. Nature Genetics 2024.
Mutational signature analysis is a recent computational approach for interpreting somatic mutations in the genome. Its application to cancer data has enhanced our understanding of mutational forces driving tumorigenesis and demonstrated its potential to inform prognosis and treatment decisions. However, methodological challenges remain for discovering new signatures and assigning proper weights to existing signatures, thereby hindering broader clinical applications. Here we present Mutational Signature Calculator (MuSiCal), a rigorous analytical framework with algorithms that solve major problems in the standard workflow. Our simulation studies demonstrate that MuSiCal outperforms state-of-the-art algorithms for both signature discovery and assignment. By reanalyzing more than 2,700 cancer genomes, we provide an improved catalog of signatures and their assignments, discover nine indel signatures absent in the current catalog, resolve long-standing issues with the ambiguous ‘flat’ signatures and give insights into signatures with unknown etiologies. We expect MuSiCal and the improved catalog to be a step towards establishing best practices for mutational signature analysis.
2023
Gao T, Kastriti ME, Ljungström V, Heinzel A, Tischler AS, Oberbauer R, Loh P-R, Adameyko I, Park PJ**, Kharchenko P**. A pan-tissue survey of mosaic chromosomal alterations in 948 individuals. Nature Genetics 2023.
Genetic mutations accumulate in an organism’s body throughout its lifetime. While somatic single-nucleotide variants have been well characterized in the human body, the patterns and consequences of large chromosomal alterations in normal tissues remain largely unknown. Here, we present a pan-tissue survey of mosaic chromosomal alterations (mCAs) in 948 healthy individuals from the Genotype-Tissue Expression project, augmenting RNA-based allelic imbalance estimation with haplotype phasing. We found that approximately a quarter of the individuals carry a clonally-expanded mCA in at least one tissue, with incidence strongly correlated with age. The prevalence and genome-wide patterns of mCAs vary considerably across tissue types, suggesting tissue-specific mutagenic exposure and selection pressures. The mCA landscapes in normal adrenal and pituitary glands resemble those in tumors arising from these tissues, whereas the same is not true for the esophagus and skin. Together, our findings show a widespread age-dependent emergence of mCAs across normal human tissues with intricate connections to tumorigenesis.
Chung C, Yang X, Bae T, Vong KI, Mittal S, Donkels C, Phillips WH, Li Z, Marsh AP, Breuss MW, Ball LL, Garcia CAB, George RD, Gu J, Xu M, Barrows C, James KN, Stanley V, Nidhiry AS, Khoury S, Howe G, Riley E, Xu X, Copeland B, Wang Y, Kim SH, Kang H-C, Schulze-Bonhage A, Haas CA, Urbach H, Prinz M, Limbrick Jr DD, Gurnett CA, Smyth MD, Sattar S, Nespeca M, Gonda DD, Imai K, Takahashi Y, Chen H-H, Tsai J-W, Conti V, Guerrini R, Devinsky O, Silva Jr WA, Machado HR, Mathern GW, Abyzov A, Baldassari S, Baulac S, Consortium FCDN, Brain Somatic Mosaicism Network BSM, Gleeson JG. Comprehensive multi-omic profiling of somatic mutations in malformations of cortical development. Nature Genetics 2023;55:209-220.
Malformations of cortical development (MCD) are neurological conditions involving focal disruptions of cortical architecture and cellular organization that arise during embryogenesis, largely from somatic mosaic mutations, and cause intractable epilepsy. Identifying the genetic causes of MCD has been a challenge, as mutations remain at low allelic fractions in brain tissue resected to treat condition-related epilepsy. Here we report a genetic landscape from 283 brain resections, identifying 69 mutated genes through intensive profiling of somatic mutations, combining whole-exome and targeted-amplicon sequencing with functional validation including in utero electroporation of mice and single-nucleus RNA sequencing. Genotype–phenotype correlation analysis elucidated specific MCD gene sets associated with distinct pathophysiological and clinical phenotypes. The unique single-cell level spatiotemporal expression patterns of mutated genes in control and patient brains indicate critical roles in excitatory neurogenic pools during brain development and in promoting neuronal hyperexcitability after birth.
2022
Luquette LJ, Miller MB, Zhou Z, Bohrson CL, Zhao Y, Jin H, Gulhan D, Ganz J, Bizzotto S, Kirkham S, Hochepied T, Libert C, Galor A, Kim J, Lodato MA, Garaycoechea JI, Gawad C, West J, Walsh CA, Park PJ. Single-cell genome sequencing of human neurons identifies somatic point mutation and indel enrichment in regulatory elements. Nature Genetics 2022;54:1564-1571.
Accurate somatic mutation detection from single-cell DNA sequencing is challenging due to amplification-related artifacts. To reduce this artifact burden, an improved amplification technique, primary template-directed amplification (PTA), was recently introduced. We analyzed whole-genome sequencing data from 52 PTA-amplified single neurons using SCAN2, a new genotyper we developed to leverage mutation signatures and allele balance in identifying somatic single-nucleotide variants (SNVs) and small insertions and deletions (indels) in PTA data. Our analysis confirms an increase in nonclonal somatic mutation in single neurons with age, but revises the estimated rate of this accumulation to 16 SNVs per year. We also identify artifacts in other amplification methods. Most importantly, we show that somatic indels increase by at least three per year per neuron and are enriched in functional regions of the genome such as enhancers and promoters. Our data suggest that indels in gene-regulatory elements have a considerable effect on genome integrity in human neurons.
2020
Cortés-Ciriano I, Lee JJK, Xi R, Jain D, Jung YL, Yang L, Gordenin D, Klimczak LJ, Zhang CZ, Pellman DS, Group PCAWGSVW, Park PJ, Consortium PCAWG. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nature Genetics 2020;52(3):331-341.
Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.
Rodriguez-Martin B, Alvarez EG, Baez-Ortega A, Zamora J, Supek F, Demeulemeester J, Santamarina M, Ju YS, Temes J, Garcia-Souto D, Detering H, Li Y, Rodriguez-Castro J, Dueso-Barroso A, Bruzos AL, Dentro SC, Blanco MG, Contino G, Ardeljan D, Tojo M, Roberts ND, Zumalave S, Edwards PAW, Weischenfeldt J, Puiggròs M, Chong Z, Chen K, Lee EA, Wala JA, Raine K, Butler A, Waszak SM, Navarro FCP, Schumacher SE, Monlong J, Maura F, Bolli N, Bourque G, Gerstein M, Park PJ, Wedge DC, Beroukhim R, Torrents D, Korbel JO, Martincorena I, Fitzgerald RC, Van Loo P, Kazazian HH, Burns KH, Group PCAWGSVW, Campbell PJ, Tubio JMC, Consortium PCAWG. Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition. Nature Genetics 2020;52(3):306-319.
About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types. Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage-fusion-bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors.
2019
Gulhan DC, Lee JJ-K, Melloni GEM, Cortés-Ciriano I, Park PJ. Detecting the mutational signature of homologous recombination deficiency in clinical samples. Nature Genetics 2019;51(5):912-919.
Mutations in BRCA1 and/or BRCA2 (BRCA1/2) are the most common indication of deficiency in the homologous recombination (HR) DNA repair pathway. However, recent genome-wide analyses have shown that the same pattern of mutations found in BRCA1/2-mutant tumors is also present in several other tumors. Here, we present a new computational tool called Signature Multivariate Analysis (SigMA), which can be used to accurately detect the mutational signature associated with HR deficiency from targeted gene panels. Whereas previous methods require whole-genome or whole-exome data, our method detects the HR-deficiency signature even from low mutation counts, by using a likelihood-based measure combined with machine-learning techniques. Cell lines that we identify as HR deficient show a significant response to poly(ADP-ribose) polymerase (PARP) inhibitors; patients with ovarian cancer whom we found to be HR deficient show a significantly longer overall survival with platinum regimens. By enabling panel-based identification of mutational signatures, our method substantially increases the number of patients that may be considered for treatments targeting HR deficiency.
Bohrson CL, Barton AR, Lodato MA, Rodin RE, Luquette LJ, Viswanadham VV, Gulhan DC, Cortés-Ciriano I, Sherman MA, Kwon M, Coulter ME, Galor A, Walsh CA, Park PJ. Linked-read analysis identifies mutations in single-cell DNA-sequencing data. Nature Genetics 2019;51:749-754.
Whole-genome sequencing of DNA from single cells has the potential to reshape our understanding of mutational heterogeneity in normal and diseased tissues. However, a major difficulty is distinguishing amplification artifacts from biologically derived somatic mutations. Here, we describe linked-read analysis (LiRA), a method that accurately identifies somatic single-nucleotide variants (sSNVs) by using read-level phasing with nearby germline heterozygous polymorphisms, thereby enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells.
2017
Wang X*, Lee RS*, Alver BH*, Haswell JR, Wang S, Mieczkowski J, Drier Y, Gillespie SM, Archer TC, Wu JN, Tzvetkov EP, Troisi EC, Pomeroy SL, Biegel JA, Tolstorukov MY, Bernstein BE**, Park PJ**, Roberts CWM**. SMARCB1-mediated SWI/SNF complex function is essential for enhancer regulation. Nat Genet 2017;49(2):289-295.

SMARCB1 (also known as SNF5, INI1, and BAF47), a core subunit of the SWI/SNF (BAF) chromatin-remodeling complex, is inactivated in nearly all pediatric rhabdoid tumors. These aggressive cancers are among the most genomically stable, suggesting an epigenetic mechanism by which SMARCB1 loss drives transformation. Here we show that, despite having indistinguishable mutational landscapes, human rhabdoid tumors exhibit distinct enhancer H3K27ac signatures, which identify remnants of differentiation programs. We show that SMARCB1 is required for the integrity of SWI/SNF complexes and that its loss alters enhancer targeting, markedly impairing SWI/SNF binding to typical enhancers, particularly those required for differentiation, while maintaining SWI/SNF binding at super-enhancers. We show that these retained super-enhancers are essential for rhabdoid tumor survival, including some that are shared by all subtypes, such as SPRY1, and other lineage-specific super-enhancers, such as SOX2 in brain-derived rhabdoid tumors. Taken together, our findings identify a new chromatin-based epigenetic mechanism underlying the tumor-suppressive activity of SMARCB1.

Mathur R, Alver BH, San Roman AK, Wilson BG, Wang X, Agoston AT, Park PJ, Shivdasani RA, Roberts CWM. ARID1A loss impairs enhancer-mediated gene regulation and drives colon cancer in mice. Nat Genet 2017;49(2):296-302.

Genes encoding subunits of SWI/SNF (BAF) chromatin-remodeling complexes are collectively mutated in ∼20% of all human cancers. Although ARID1A is the most frequent target of mutations, the mechanism by which its inactivation promotes tumorigenesis is unclear. Here we demonstrate that Arid1a functions as a tumor suppressor in the mouse colon, but not the small intestine, and that invasive ARID1A-deficient adenocarcinomas resemble human colorectal cancer (CRC). These tumors lack deregulation of APC/β-catenin signaling components, which are crucial gatekeepers in common forms of intestinal cancer. We find that ARID1A normally targets SWI/SNF complexes to enhancers, where they function in coordination with transcription factors to facilitate gene activation. ARID1B preserves SWI/SNF function in ARID1A-deficient cells, but defects in SWI/SNF targeting and control of enhancer activity cause extensive dysregulation of gene expression. These findings represent an advance in colon cancer modeling and implicate enhancer-mediated gene regulation as a principal tumor-suppressor function of ARID1A.

2015
Jung H, Lee D, Lee J, Park D, Kim YJ, Park W-Y, Hong D**, Park PJ**, Lee E**. Intron retention is a widespread mechanism of tumor-suppressor inactivation. Nat Genet 2015;47(11):1242-8.

A substantial fraction of disease-causing mutations are pathogenic through aberrant splicing. Although genome profiling studies have identified somatic single-nucleotide variants (SNVs) in cancer, the extent to which these variants trigger abnormal splicing has not been systematically examined. Here we analyzed RNA sequencing and exome data from 1,812 patients with cancer and identified ∼900 somatic exonic SNVs that disrupt splicing. At least 163 SNVs, including 31 synonymous ones, were shown to cause intron retention or exon skipping in an allele-specific manner, with ∼70% of the SNVs occurring on the last base of exons. Notably, SNVs causing intron retention were enriched in tumor suppressors, and 97% of these SNVs generated a premature termination codon, leading to loss of function through nonsense-mediated decay or truncated protein. We also characterized the genomic features predictive of such splicing defects. Overall, this work demonstrates that intron retention is a common mechanism of tumor-suppressor inactivation.

2013
Cancer Genome Atlas Research Network TCGA, Weinstein JN, Collisson EA, Mills GB, Shaw KMR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45(10):1113-20.

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

2012
Stadtfeld M, Apostolou E, Ferrari F, Choi J, Walsh RM, Chen T, Ooi SSK, Kim SY, Bestor TH, Shioda T, Park PJ, Hochedlinger K. Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. Nat Genet 2012;44(4):398-405, S1-2.

The generation of induced pluripotent stem cells (iPSCs) often results in aberrant epigenetic silencing of the imprinted Dlk1-Dio3 gene cluster, compromising the ability to generate entirely iPSC-derived adult mice ('all-iPSC mice'). Here, we show that reprogramming in the presence of ascorbic acid attenuates hypermethylation of Dlk1-Dio3 by enabling a chromatin configuration that interferes with binding of the de novo DNA methyltransferase Dnmt3a. This approach allowed us to generate all-iPSC mice from mature B cells, which have until now failed to support the development of exclusively iPSC-derived postnatal animals. Our data show that transcription factor-mediated reprogramming can endow a defined, terminally differentiated cell type with a developmental potential equivalent to that of embryonic stem cells. More generally, these findings indicate that culture conditions during cellular reprogramming can strongly influence the epigenetic and biological properties of the resultant iPSCs.

2011
Kharchenko PV, Xi R, Park PJ. Evidence for dosage compensation between the X chromosome and autosomes in mammals. Nat Genet 2011;43(12):1167-9; author reply 1171-2.
2008
Mueller JL, Mahadevaiah SK, Park PJ, Warburton PE, Page DC, Turner JMA. The mouse X chromosome is enriched for multicopy testis genes showing postmeiotic expression. Nat Genet 2008;40(6):794-9.

According to the prevailing view, mammalian X chromosomes are enriched in spermatogenesis genes expressed before meiosis and deficient in spermatogenesis genes expressed after meiosis. The paucity of postmeiotic genes on the X chromosome has been interpreted as a consequence of meiotic sex chromosome inactivation (MSCI), the complete silencing of genes on the XY bivalent at meiotic prophase. Recent studies have concluded that MSCI-initiated silencing persists beyond meiosis and that most genes on the X chromosome remain repressed in round spermatids. Here, we report that 33 multicopy gene families, representing approximately 273 mouse X-linked genes, are expressed in the testis and that this expression is predominantly in postmeiotic cells. RNA FISH and microarray analysis show that the maintenance of X chromosome postmeiotic repression is incomplete. Furthermore, X-linked multicopy genes exhibit a similar degree of expression as autosomal genes. Thus, not only is the mouse X chromosome enriched for spermatogenesis genes functioning before meiosis, but in addition, approximately 18% of mouse X-linked genes are expressed in postmeiotic cells.
