Publications

2014
Ferrari F, Alekseyenko AA, Park PJ, Kuroda MI. Transcriptional control of a whole chromosome: emerging models for dosage compensation. Nat Struct Mol Biol 2014;21(2):118-25.

Males and females of many animal species differ in their sex-chromosome karyotype, and this creates imbalances between X-chromosome and autosomal gene products that require compensation. Although distinct molecular mechanisms have evolved in three highly studied systems, they all achieve coordinate regulation of an entire chromosome by differential RNA-polymerase occupancy at X-linked genes. High-throughput genome-wide methods have been pivotal in driving the latest progress in the field. Here we review the emerging models for dosage compensation in mammals, flies and nematodes, with a focus on mechanisms affecting RNA polymerase II activity on the X chromosome.

2013
Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 2013;499(7456):43-9.

Genetic changes underlying clear cell renal cell carcinoma (ccRCC) include alterations in genes controlling cellular oxygen sensing (for example, VHL) and the maintenance of chromatin states (for example, PBRM1). We surveyed more than 400 tumours using different genomic platforms and identified 19 significantly mutated genes. The PI(3)K/AKT pathway was recurrently mutated, suggesting this pathway as a potential therapeutic target. Widespread DNA hypomethylation was associated with mutation of the H3K36 methyltransferase SETD2, and integrative analysis suggested that mutations involving the SWI/SNF chromatin remodelling complex (PBRM1, ARID1A, SMARCA4) could have far-reaching effects on other pathways. Aggressive cancers demonstrated evidence of a metabolic shift, involving downregulation of genes involved in the TCA cycle, decreased AMPK and PTEN protein levels, upregulation of the pentose phosphate pathway and the glutamine transporter genes, increased acetyl-CoA carboxylase protein, and altered promoter methylation of miR-21 (also known as MIR21) and GRB10. Remodelling cellular metabolism thus constitutes a recurrent pattern in ccRCC that correlates with tumour stage and severity and offers new views on the opportunities for disease treatment.

Kandoth C, Schultz N, Cherniack AD, Akbani R, Liu Y, Shen H, Robertson GA, Pashtan I, Shen R, Benz CC, Yau C, Laird PW, Ding L, Zhang W, Mills GB, Kucherlapati R, Mardis ER, Levine DA. Integrated genomic characterization of endometrial carcinoma. Nature 2013;497(7447):67-73.

We performed an integrated genomic, transcriptomic and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumours and ∼25% of high-grade endometrioid tumours had extensive copy number alterations, few DNA methylation changes, low oestrogen receptor/progesterone receptor levels, and frequent TP53 mutations. Most endometrioid tumours had few copy number alterations or TP53 mutations, but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A and KRAS and novel mutations in the SWI/SNF chromatin remodelling complex gene ARID5B. A subset of endometrioid tumours that we identified had a markedly increased transversion mutation frequency and newly identified hotspot mutations in POLE. Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy-number low, and copy-number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may affect post-surgical adjuvant treatment for women with aggressive tumours.

Brennan CW, Verhaak RGW, McKenna A, Campos B, Noushmehr H, Salama SR, Zheng S, Chakravarty D, Sanborn ZJ, Berman SH, Beroukhim R, Bernard B, Wu C-J, Genovese G, Shmulevich I, Barnholtz-Sloan J, Zou L, Vegesna R, Shukla SA, Ciriello G, Yung WK, Zhang W, Sougnez C, Mikkelsen T, Aldape K, Bigner DD, Van Meir EG, Prados M, Sloan A, Black KL, Eschbacher J, Finocchiaro G, Friedman W, Andrews DW, Guha A, Iacocca M, O'Neill BP, Foltz G, Myers J, Weisenberger DJ, Penny R, Kucherlapati R, Perou CM, Hayes ND, Gibbs R, Marra M, Mills GB, Lander E, Spellman P, Wilson R, Sander C, Weinstein J, Meyerson M, Gabriel S, Laird PW, Haussler D, Getz G, Chin L. The somatic genomic landscape of glioblastoma. Cell 2013;155(2):462-77.

We describe the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer.

Cancer Genome Atlas Research Network TCGA, Weinstein JN, Collisson EA, Mills GB, Shaw KMR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45(10):1113-20.

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

Soruco MML*, Chery J*, Bishop EP*, Siggers T, Tolstorukov MY, Leydon AR, Sugden AU, Goebel K, Feng J, Xia P, Vedenko A, Bulyk ML, Park PJ, Larschan E. The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes Dev 2013;27(14):1551-6.

The Drosophila male-specific lethal (MSL) dosage compensation complex increases transcript levels on the single male X chromosome to equal the transcript levels in XX females. However, it is not known how the MSL complex is linked to its DNA recognition elements, the critical first step in dosage compensation. Here, we demonstrate that a previously uncharacterized zinc finger protein, CLAMP (chromatin-linked adaptor for MSL proteins), functions as the first link between the MSL complex and the X chromosome. CLAMP directly binds to the MSL complex DNA recognition elements and is required for the recruitment of the MSL complex. The discovery of CLAMP identifies a key factor required for the chromosome-specific targeting of dosage compensation, providing new insights into how subnuclear domains of coordinate gene regulation are formed within metazoan genomes.

Alekseyenko AA, Ellison CE, Gorchakov AA, Zhou Q, Kaiser VB, Toda N, Walton Z, Peng S, Park PJ, Bachtrog D, Kuroda MI. Conservation and de novo acquisition of dosage compensation on newly evolved sex chromosomes in Drosophila. Genes Dev 2013;27(8):853-8.

Dosage compensation has arisen in response to the evolution of distinct male (XY) and female (XX) karyotypes. In Drosophila melanogaster, the MSL complex increases male X transcription approximately twofold. X-specific targeting is thought to occur through sequence-dependent binding to chromatin entry sites (CESs), followed by spreading in cis to active genes. We tested this model by asking how newly evolving sex chromosome arms in Drosophila miranda acquired dosage compensation. We found evidence for the creation of new CESs, with the analogous sequence and spacing as in D. melanogaster, providing strong support for the spreading model in the establishment of dosage compensation.

Zhang B*, Day DS*, Ho JW, Song L, Cao J, Christodoulou D, Seidman JG, Crawford GE, Park PJ, Pu WT. A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity. Genome Res 2013;23(6):917-27.

Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth factor A (VEGFA), a major regulator of angiogenesis, triggers changes in transcriptional activity of human umbilical vein endothelial cells (HUVECs). Here, we used chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) to measure genome-wide changes in histone H3 acetylation at lysine 27 (H3K27ac), a marker of active enhancers, in unstimulated HUVECs and HUVECs stimulated with VEGFA for 1, 4, and 12 h. We show that sites with the greatest H3K27ac change upon stimulation were associated tightly with EP300, a histone acetyltransferase. Using the variation of H3K27ac as a novel epigenetic signature, we identified transcriptional regulatory elements that are functionally linked to angiogenesis, participate in rapid VEGFA-stimulated changes in chromatin conformation, and mediate VEGFA-induced transcriptional responses. Dynamic H3K27ac deposition and associated changes in chromatin conformation required EP300 activity instead of altered nucleosome occupancy or changes in DNase I hypersensitivity. EP300 activity was also required for a subset of dynamic H3K27ac sites to loop into proximity of promoters. Our study identified thousands of endothelial, VEGFA-responsive enhancers, demonstrating that an epigenetic signature based on the variation of a chromatin feature is a productive approach to define signal-responsive genomic elements. Further, our study implicates global epigenetic modifications in rapid, signal-responsive transcriptional regulation.

Apostolou E*, Ferrari F*, Walsh RM, Bar-Nur O, Stadtfeld M, Cheloufi S, Stuart HT, Polo JM, Ohsumi TK, Borowsky ML, Kharchenko PV, Park PJ**, Hochedlinger K**. Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. Cell Stem Cell 2013;12(6):699-712.

The chromatin state of pluripotency genes has been studied extensively in embryonic stem cells (ESCs) and differentiated cells, but their potential interactions with other parts of the genome remain largely unexplored. Here, we identified a genome-wide, pluripotency-specific interaction network around the Nanog promoter by adapting circular chromosome conformation capture sequencing. This network was rearranged during differentiation and restored in induced pluripotent stem cells. A large fraction of Nanog-interacting loci were bound by Mediator or cohesin in pluripotent cells. Depletion of these proteins from ESCs resulted in a disruption of contacts and the acquisition of a differentiation-specific interaction pattern prior to obvious transcriptional and phenotypic changes. Similarly, the establishment of Nanog interactions during reprogramming often preceded transcriptional upregulation of associated genes, suggesting a causative link. Our results document a complex, pluripotency-specific chromatin "interactome" for Nanog and suggest a functional role for long-range genomic interactions in the maintenance and induction of pluripotency.

Majumdar S, Gong EM, Di Vizio D, Dreyfuss JM, Degraff DJ, Hager MH, Park PJ, Bellmunt J, Matusik RJ, Rosenberg JE, Adam RM. Loss of Sh3gl2/endophilin A1 is a common event in urothelial carcinoma that promotes malignant behavior. Neoplasia 2013;15(7):749-60.

Urothelial carcinoma (UC) causes substantial morbidity and mortality worldwide. However, the molecular mechanisms underlying urothelial cancer development and tumor progression are still largely unknown. Using informatics analysis, we identified Sh3gl2 (endophilin A1) as a bladder urothelium-enriched transcript. The gene encoding Sh3gl2 is located on chromosome 9p, a region frequently altered in UC. Sh3gl2 is known to regulate endocytosis of receptor tyrosine kinases implicated in oncogenesis, such as the epidermal growth factor receptor (EGFR) and c-Met. However, its role in UC pathogenesis is unknown. Informatics analysis of expression profiles as well as immunohistochemical staining of tissue microarrays revealed Sh3gl2 expression to be decreased in UC specimens compared to nontumor tissues. Loss of Sh3gl2 was associated with increasing tumor grade and with muscle invasion, which is a reliable predictor of metastatic disease and cancer-derived mortality. Sh3gl2 expression was undetectable in 19 of 20 human UC cell lines but preserved in the low-grade cell line RT4. Stable silencing of Sh3gl2 in RT4 cells by RNA interference 1) enhanced proliferation and colony formation in vitro, 2) inhibited EGF-induced EGFR internalization and increased EGFR activation, 3) stimulated phosphorylation of Src family kinases and STAT3, and 4) promoted growth of RT4 xenografts in subrenal capsule tissue recombination experiments. Conversely, forced re-expression of Sh3gl2 in T24 cells and silenced RT4 clones attenuated oncogenic behaviors, including growth and migration. Together, these findings identify loss of Sh3gl2 as a frequent event in UC development that promotes disease progression.

Kim Y-J, Lee H-J, Kim T-M, Eisinger-Mathason KTS, Zhang AY, Schmidt B, Karl DL, Nakazawa MS, Park PJ, Simon CM, Yoon SS. Overcoming evasive resistance from vascular endothelial growth factor A inhibition in sarcomas by genetic or pharmacologic targeting of hypoxia-inducible factor 1α. Int J Cancer 2013;132(1):29-41.

Increased levels of hypoxia and hypoxia-inducible factor 1α (HIF-1α) in human sarcomas correlate with tumor progression and radiation resistance. Prolonged antiangiogenic therapy of tumors not only delays tumor growth but may also increase hypoxia and HIF-1α activity. In our recent clinical trial, treatment with the vascular endothelial growth factor A (VEGF-A) antibody, bevacizumab, followed by a combination of bevacizumab and radiation led to near complete necrosis in nearly half of sarcomas. Gene Set Enrichment Analysis of microarrays from pretreatment biopsies found that the Gene Ontology category "Response to hypoxia" was upregulated in poor responders and that the hierarchical clustering based on 140 hypoxia-responsive genes reliably separated poor responders from good responders. The most commonly used chemotherapeutic drug for sarcomas, doxorubicin (Dox), was recently found to block HIF-1α binding to DNA at low metronomic doses. In four sarcoma cell lines, HIF-1α shRNA or Dox at low concentrations blocked HIF-1α induction of VEGF-A by 84-97% and carbonic anhydrase 9 by 83-93%. HT1080 sarcoma xenografts had increased hypoxia and/or HIF-1α activity with increasing tumor size and with anti-VEGF receptor antibody (DC101) treatment. Combining DC101 with HIF-1α shRNA or metronomic Dox had a synergistic effect in suppressing growth of HT1080 xenografts, at least in part via induction of tumor endothelial cell apoptosis. In conclusion, sarcomas respond to increased hypoxia by expressing HIF-1α target genes that may promote resistance to antiangiogenic and other therapies. HIF-1α inhibition blocks this evasive resistance and augments destruction of the tumor vasculature.

Gokcumen O, Tischler V, Tica J, Zhu Q, Iskow RC, Lee E, Fritz MH-Y, Langdon A, Stütz AM, Pavlidis P, Benes V, Mills RE, Park PJ, Lee C, Korbel JO. Primate genome architecture influences structural variation mechanisms and functional consequences. Proc Natl Acad Sci U S A 2013;110(39):15764-9.

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.

Woo CJ, Kharchenko PV, Daheron L, Park PJ, Kingston RE. Variable requirements for DNA-binding proteins at polycomb-dependent repressive regions in human HOX clusters. Mol Cell Biol 2013;33(16):3274-85.

Polycomb group (PcG)-mediated repression is an evolutionarily conserved process critical for cell fate determination and maintenance of gene expression during embryonic development. However, the mechanisms underlying PcG recruitment in mammals remain unclear since few regulatory sites have been identified. We report two novel prospective PcG-dependent regulatory elements within the human HOXB and HOXC clusters and compare their repressive activities to a previously identified element in the HOXD cluster. These regions recruited the PcG proteins BMI1 and SUZ12 to a reporter construct in mesenchymal stem cells and conferred repression that was dependent upon PcG expression. Furthermore, we examined the potential of two DNA-binding proteins, JARID2 and YY1, to regulate PcG activity at these three elements. JARID2 has differential requirements, whereas YY1 appears to be required for repressive activity at all 3 sites. We conclude that distinct elements of the mammalian HOX clusters can recruit components of the PcG complexes and confer repression, similar to what has been seen in Drosophila. These elements, however, have diverse requirements for binding factors, which, combined with previous data on other loci, speaks to the complexity of PcG targeting in mammals.

Ferrari F*, Jung YL*, Kharchenko PV, Plachetka A, Alekseyenko AA, Kuroda MI, Park PJ. Comment on "Drosophila dosage compensation involves enhanced Pol II recruitment to male X-linked promoters". Science 2013;340(6130):273.

Conrad et al. (Reports, 10 August 2012, p. 742) reported a doubling of RNA polymerase II (Pol II) occupancy at X-linked promoters to support 5' recruitment as the key mechanism for dosage compensation in Drosophila. However, they employed an erroneous data-processing step, overestimating Pol II differences. Reanalysis of the data fails to support the authors' model for dosage compensation.

Davoli T, Xu AW, Mengwasser KE, Sack LM, Yoon JC, Park PJ, Elledge SJ. Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell 2013;155(4):948-62.

Aneuploidy has been recognized as a hallmark of cancer for more than 100 years, yet no general theory to explain the recurring patterns of aneuploidy in cancer has emerged. Here, we develop Tumor Suppressor and Oncogene (TUSON) Explorer, a computational method that analyzes the patterns of mutational signatures in tumors and predicts the likelihood that any individual gene functions as a tumor suppressor (TSG) or oncogene (OG). By analyzing >8,200 tumor-normal pairs, we provide statistical evidence suggesting that many more genes possess cancer driver properties than anticipated, forming a continuum of oncogenic potential. Integrating our driver predictions with information on somatic copy number alterations, we find that the distribution and potency of TSGs (STOP genes), OGs, and essential genes (GO genes) on chromosomes can predict the complex patterns of aneuploidy and copy number variation characteristic of cancer genomes. We propose that the cancer genome is shaped through a process of cumulative haploinsufficiency and triplosensitivity.

Yang L, Luquette LJ, Gehlenborg N, Xi R, Haseley PS, Hsieh C-H, Zhang C, Ren X, Protopopov A, Chin L, Kucherlapati R, Lee C, Park PJ. Diverse mechanisms of somatic structural variations in human cancer genomes. Cell 2013;153(4):919-29.

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.

Kim T-M, Xi R, Luquette LJ, Park RW, Johnson MD, Park PJ. Functional genomic analysis of chromosomal aberrations in a compendium of 8000 cancer genomes. Genome Res 2013;23(2):217-27.

A large database of copy number profiles from cancer genomes can facilitate the identification of recurrent chromosomal alterations that often contain key cancer-related genes. It can also be used to explore low-prevalence genomic events such as chromothripsis. In this study, we report an analysis of 8227 human cancer copy number profiles obtained from 107 array comparative genomic hybridization (CGH) studies. Our analysis reveals similarity of chromosomal arm-level alterations among developmentally related tumor types as well as a number of co-occurring pairs of arm-level alterations. Recurrent ("pan-lineage") focal alterations identified across diverse tumor types show an enrichment of known cancer-related genes and genes with relevant functions in cancer-associated phenotypes (e.g., kinase and cell cycle). Tumor type-specific ("lineage-restricted") alterations and their enriched functional categories were also identified. Furthermore, we developed an algorithm for detecting regions in which the copy number oscillates rapidly between fixed levels, indicative of chromothripsis. We observed these massive genomic rearrangements in 1%-2% of the samples with variable tumor type-specific incidence rates. Taken together, our comprehensive view of copy number alterations provides a framework for understanding the functional significance of various genomic alterations in cancer genomes.

Ferrari F, Plachetka A, Alekseyenko AA, Jung YL, Ozsolak F, Kharchenko PV, Park PJ, Kuroda MI. "Jump start and gain" model for dosage compensation in Drosophila based on direct sequencing of nascent transcripts. Cell Rep 2013;5(3):629-36.

Dosage compensation in Drosophila is mediated by the MSL complex, which increases male X-linked gene expression approximately 2-fold. The MSL complex preferentially binds the bodies of active genes on the male X, depositing H4K16ac with a 3' bias. Two models have been proposed for the influence of the MSL complex on transcription: one based on promoter recruitment of RNA polymerase II (Pol II), and a second featuring enhanced transcriptional elongation. Here, we utilize nascent RNA sequencing to document dosage compensation during transcriptional elongation. We also compare X and autosomes from published data on paused and elongating polymerase in order to assess the role of Pol II recruitment. Our results support a model for differentially regulated elongation, starting with release from 5' pausing and increasing through X-linked gene bodies. Our results highlight facilitated transcriptional elongation as a key mechanism for the coordinated regulation of a diverse set of genes.

Tzatsos A, Paskaleva P*, Ferrari F*, Deshpande V, Stoykova S, Contino G, Wong K-K, Lan F, Trojer P, Park PJ, Bardeesy N. KDM2B promotes pancreatic cancer via Polycomb-dependent and -independent transcriptional programs. J Clin Invest 2013;123(2):727-39.

Epigenetic mechanisms mediate heritable control of cell identity in normal cells and cancer. We sought to identify epigenetic regulators driving the pathogenesis of pancreatic ductal adenocarcinoma (PDAC), one of the most lethal human cancers. We found that KDM2B (also known as Ndy1, FBXL10, and JHDM1B), an H3K36 histone demethylase implicated in bypass of cellular senescence and somatic cell reprogramming, is markedly overexpressed in human PDAC, with levels increasing with disease grade and stage, and highest expression in metastases. KDM2B silencing abrogated tumorigenicity of PDAC cell lines exhibiting loss of epithelial differentiation, whereas KDM2B overexpression cooperated with KrasG12D to promote PDAC formation in mouse models. Gain- and loss-of-function experiments coupled to genome-wide gene expression and ChIP studies revealed that KDM2B drives tumorigenicity through 2 different transcriptional mechanisms. KDM2B repressed developmental genes through cobinding with Polycomb group (PcG) proteins at transcriptional start sites, whereas it activated a module of metabolic genes, including mediators of protein synthesis and mitochondrial function, cobound by the MYC oncogene and the histone demethylase KDM5A. These results defined epigenetic programs through which KDM2B subverts cellular differentiation and drives the pathogenesis of an aggressive subset of PDAC.

Kim T-M, Laird PW, Park PJ. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell 2013;155(4):858-68.

Microsatellites, simple tandem repeats present at millions of sites in the human genome, can shorten or lengthen due to a defect in DNA mismatch repair. We present here a comprehensive genome-wide analysis of the prevalence, mutational spectrum, and functional consequences of microsatellite instability (MSI) in cancer genomes. We analyzed MSI in 277 colorectal and endometrial cancer genomes (including 57 microsatellite-unstable ones) using exome and whole-genome sequencing data. Recurrent MSI events in coding sequences showed tumor type specificity, elevated frameshift-to-inframe ratios, and lower transcript levels than wild-type alleles. Moreover, genome-wide analysis revealed differences in the distribution of MSI versus point mutations, including overrepresentation of MSI in euchromatic and intronic regions compared to heterochromatic and intergenic regions, respectively, and depletion of MSI at nucleosome-occupied sequences. Our results provide a panoramic view of MSI in cancer genomes, highlighting their tumor type specificity, impact on gene expression, and the role of chromatin organization.
