Publications

2017
Alver BH*, Kim KH*, Lu P, Wang X, Manchester HE, Wang W, Haswell JR, Park PJ**, Roberts CWM**. The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. Nat Commun 2017;8:14648.

Genes encoding subunits of SWI/SNF (BAF) chromatin remodelling complexes are collectively altered in over 20% of human malignancies, but the mechanisms by which these complexes alter chromatin to modulate transcription and cell fate are poorly understood. Utilizing mouse embryonic fibroblast and cancer cell line models, here we show via ChIP-seq and biochemical assays that SWI/SNF complexes are preferentially targeted to distal lineage specific enhancers and interact with p300 to modulate histone H3 lysine 27 acetylation. We identify a greater requirement for SWI/SNF at typical enhancers than at most super-enhancers and at enhancers in untranscribed regions than in transcribed regions. Our data further demonstrate that SWI/SNF-dependent distal enhancers are essential for controlling expression of genes linked to developmental processes. Our findings thus establish SWI/SNF complexes as regulators of the enhancer landscape and provide insight into the roles of SWI/SNF in cellular fate control.

Wang X*, Lee RS*, Alver BH*, Haswell JR, Wang S, Mieczkowski J, Drier Y, Gillespie SM, Archer TC, Wu JN, Tzvetkov EP, Troisi EC, Pomeroy SL, Biegel JA, Tolstorukov MY, Bernstein BE**, Park PJ**, Roberts CWM**. SMARCB1-mediated SWI/SNF complex function is essential for enhancer regulation. Nat Genet 2017;49(2):289-295.

SMARCB1 (also known as SNF5, INI1, and BAF47), a core subunit of the SWI/SNF (BAF) chromatin-remodeling complex, is inactivated in nearly all pediatric rhabdoid tumors. These aggressive cancers are among the most genomically stable, suggesting an epigenetic mechanism by which SMARCB1 loss drives transformation. Here we show that, despite having indistinguishable mutational landscapes, human rhabdoid tumors exhibit distinct enhancer H3K27ac signatures, which identify remnants of differentiation programs. We show that SMARCB1 is required for the integrity of SWI/SNF complexes and that its loss alters enhancer targeting, markedly impairing SWI/SNF binding to typical enhancers, particularly those required for differentiation, while maintaining SWI/SNF binding at super-enhancers. We show that these retained super-enhancers are essential for rhabdoid tumor survival, including some that are shared by all subtypes, such as SPRY1, and other lineage-specific super-enhancers, such as SOX2 in brain-derived rhabdoid tumors. Taken together, our findings identify a new chromatin-based epigenetic mechanism underlying the tumor-suppressive activity of SMARCB1.

Mathur R, Alver BH, San Roman AK, Wilson BG, Wang X, Agoston AT, Park PJ, Shivdasani RA, Roberts CWM. ARID1A loss impairs enhancer-mediated gene regulation and drives colon cancer in mice. Nat Genet 2017;49(2):296-302.

Genes encoding subunits of SWI/SNF (BAF) chromatin-remodeling complexes are collectively mutated in ∼20% of all human cancers. Although ARID1A is the most frequent target of mutations, the mechanism by which its inactivation promotes tumorigenesis is unclear. Here we demonstrate that Arid1a functions as a tumor suppressor in the mouse colon, but not the small intestine, and that invasive ARID1A-deficient adenocarcinomas resemble human colorectal cancer (CRC). These tumors lack deregulation of APC/β-catenin signaling components, which are crucial gatekeepers in common forms of intestinal cancer. We find that ARID1A normally targets SWI/SNF complexes to enhancers, where they function in coordination with transcription factors to facilitate gene activation. ARID1B preserves SWI/SNF function in ARID1A-deficient cells, but defects in SWI/SNF targeting and control of enhancer activity cause extensive dysregulation of gene expression. These findings represent an advance in colon cancer modeling and implicate enhancer-mediated gene regulation as a principal tumor-suppressor function of ARID1A.

McElroy KA, Jung YL, Zee BM, Wang CI, Park PJ, Kuroda MI. upSET, the Drosophila homologue of SET3, Is Required for Viability and the Proper Balance of Active and Repressive Chromatin Marks. G3 (Bethesda) 2017;7(2):625-635.

Chromatin plays a critical role in faithful implementation of gene expression programs. Different post-translational modifications (PTMs) of histone proteins reflect the underlying state of gene activity, and many chromatin proteins write, erase, bind, or are repelled by, these histone marks. One such protein is UpSET, the Drosophila homolog of yeast Set3 and mammalian KMT2E (MLL5). Here, we show that UpSET is necessary for the proper balance between active and repressed states. Using CRISPR/Cas-9 editing, we generated S2 cells that are mutant for upSET. We found that loss of UpSET is tolerated in S2 cells, but that heterochromatin is misregulated, as evidenced by a strong decrease in H3K9me2 levels assessed by bulk histone PTM quantification. To test whether this finding was consistent in the whole organism, we deleted the upSET coding sequence using CRISPR/Cas-9, which we found to be lethal in both sexes in flies. We were able to rescue this lethality using a tagged upSET transgene, and found that UpSET protein localizes to transcriptional start sites (TSS) of active genes throughout the genome. Misregulated heterochromatin is apparent by suppressed position effect variegation of the w(m4) allele in heterozygous upSET-deleted flies. Using nascent-RNA sequencing in the upSET-mutant S2 lines, we show that this result applies to heterochromatin genes generally. Our findings support a critical role for UpSET in maintaining heterochromatin, perhaps by delimiting the active chromatin environment.

Cancer Genome Atlas Research Network TCGA. Integrated genomic and molecular characterization of cervical cancer. Nature 2017.

Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, the largest comprehensive genomic study of cervical cancer to date. We observed striking APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A, and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered novel amplifications in immune targets CD274/PD-L1 and PDCD1LG2/PD-L2, and the BCAR4 lncRNA that has been associated with response to lapatinib. HPV integration was observed in all HPV18-related cases and 76% of HPV16-related cases, and was associated with structural aberrations and increased target gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumors with high frequencies of KRAS, ARID1A, and PTEN mutations. Integrative clustering of 178 samples identified Keratin-low Squamous, Keratin-high Squamous, and Adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.

Cancer Genome Atlas Research Network TCGA. Integrated genomic characterization of oesophageal carcinoma. Nature 2017;541(7636):169-175.

Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

2016
Saini N, Roberts SA, Klimczak LJ, Chan K, Grimm SA, Dai S, Fargo DC, Boyer JC, Kaufmann WK, Taylor JA, Lee E, Cortes-Ciriano I, Park PJ, Schurman SH, Malc EP, Mieczkowski PA, Gordenin DA. The impact of environmental and endogenous damage on somatic mutation load in human skin fibroblasts. PLoS Genet 2016;12(10):e1006385.
Day DS*, Zhang B*, Stevens SM, Ferrari F, Larschan EN, Park PJ**, Pu WT**. Comprehensive analysis of promoter-proximal RNA polymerase II pausing across mammalian cell types. Genome Biol 2016;17(1):120.

BACKGROUND: For many genes, RNA polymerase II stably pauses before transitioning to productive elongation. Although polymerase II pausing has been shown to be a mechanism for regulating transcriptional activation, the extent to which it is involved in control of mammalian gene expression and its relationship to chromatin structure remain poorly understood. RESULTS: Here, we analyze 85 RNA polymerase II chromatin immunoprecipitation (ChIP)-sequencing experiments from 35 different murine and human samples, as well as related genome-wide datasets, to gain new insights into the relationship between polymerase II pausing and gene regulation. Across cell and tissue types, paused genes (pausing index > 2) comprise approximately 60% of expressed genes and are repeatedly associated with specific biological functions. Paused genes also have lower cell-to-cell expression variability. Increased pausing has a non-linear effect on gene expression levels, with moderately paused genes being expressed more highly than other paused genes. The highest gene expression levels are often achieved through a novel pause-release mechanism driven by high polymerase II initiation. In three datasets examining the impact of extracellular signals, genes responsive to stimulus have slightly lower pausing index on average than non-responsive genes, and rapid gene activation is linked to conditional pause-release. Both chromatin structure and local sequence composition near the transcription start site influence pausing, with divergent features between mammals and Drosophila. Most notably, in mammals pausing is positively correlated with histone H2A.Z occupancy at promoters. CONCLUSIONS: Our results provide new insights into the contribution of RNA polymerase II pausing in mammalian gene regulation and chromatin structure.

Xi R, Lee S, Xia Y, Kim T-M, Park PJ. Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants. Nucleic Acids Res 2016.

Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline CNVs accurately. Analysis of simulation data showed that this method outperforms existing methods. We apply this algorithm to low coverage whole-genome sequencing data from peripheral blood of nearly a thousand patients across eleven cancer types in The Cancer Genome Atlas (TCGA) to identify cancer-predisposing CNV regions. We confirm known regions and discover new ones including those covering KMT2C, GOLPH3, ERBB2 and PLAG1. Analysis of colorectal cancer genomes in particular reveals novel recurrent CNVs including deletions at two chromatin-remodeling genes, RERE and NPM2. This method will be useful to many researchers interested in profiling CNVs from whole-genome sequencing data.

Yang L*, Lee M-S*, Lu H*, Oh D-Y, Kim YJ, Park D, Park G, Ren X, Bristow CA, Haseley PS, Lee S, Pantazi A, Kucherlapati R, Park W-Y, Scott KL**, Choi Y-L**, Park PJ**. Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing. Am J Hum Genet 2016;98(5):843-56.

Although exome sequencing data are generated primarily to detect single-nucleotide variants and indels, they can also be used to identify a subset of genomic rearrangements whose breakpoints are located in or near exons. Using >4,600 tumor and normal pairs across 15 cancer types, we identified over 9,000 high confidence somatic rearrangements, including a large number of gene fusions. We find that the 5' fusion partners of functional fusions are often housekeeping genes, whereas the 3' fusion partners are enriched in tyrosine kinases. We establish the oncogenic potential of ROR1-DNAJC6 and CEP85L-ROS1 fusions by showing that they can promote cell proliferation in vitro and tumor formation in vivo. Furthermore, we found that ∼4% of the samples have massively rearranged chromosomes, many of which are associated with upregulation of oncogenes such as ERBB2 and TERT. Although the sensitivity of detecting structural alterations from exomes is considerably lower than that from whole genomes, this approach will be fruitful for the multitude of exomes that have been and will be generated, both in cancer and in other diseases.

Mieczkowski J, Cook A, Bowman SK, Mueller B, Alver BH, Kundu S, Deaton AM, Urban JA, Larschan E, Park PJ, Kingston RE, Tolstorukov MY. MNase titration reveals differences between nucleosome occupancy and chromatin accessibility. Nat Commun 2016;7:11485.

Chromatin accessibility plays a fundamental role in gene regulation. Nucleosome placement, usually measured by quantifying protection of DNA from enzymatic digestion, can regulate accessibility. We introduce a metric that uses micrococcal nuclease (MNase) digestion in a novel manner to measure chromatin accessibility by combining information from several digests of increasing depths. This metric, MACC (MNase accessibility), quantifies the inherent heterogeneity of nucleosome accessibility in which some nucleosomes are seen preferentially at high MNase and some at low MNase. MACC interrogates each genomic locus, measuring both nucleosome location and accessibility in the same assay. MACC can be performed either with or without a histone immunoprecipitation step, and thereby compares histone and non-histone protection. We find that changes in accessibility at enhancers, promoters and other regulatory regions do not correlate with changes in nucleosome occupancy. Moreover, high nucleosome occupancy does not necessarily preclude high accessibility, which reveals novel principles of chromatin regulation.

Tica J*, Lee E*, Untergasser A, Meiers S, Garfield DA, Gokcumen O, Furlong EEM, Park PJ, Stütz AM**, Korbel JO**. Next-generation sequencing-based detection of germline L1-mediated transductions. BMC Genomics 2016;17(1):342.

BACKGROUND: While active LINE-1 (L1) elements possess the ability to mobilize flanking sequences to different genomic loci through a process termed transduction influencing genomic content and structure, an approach for detecting polymorphic germline non-reference transductions in massively-parallel sequencing data has been lacking. RESULTS: Here we present the computational approach TIGER (Transduction Inference in GERmline genomes), enabling the discovery of non-reference L1-mediated transductions by combining L1 discovery with detection of unique insertion sequences and detailed characterization of insertion sites. We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species. CONCLUSIONS: By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.

De Los Angeles A, Ferrari F, Fujiwara Y, Mathieu R, Lee S, Lee S, Tu H-C, Ross S, Chou S, Nguyen M, Wu Z, Theunissen TW, Powell BE, Imsoonthornruksa S, Chen J, Borkent M, Krupalnik V, Lujan E, Wernig M, Hanna JH, Hochedlinger K, Pei D, Jaenisch R, Deng H, Orkin SH, Park PJ, Daley GQ. Corrigendum: Failure to replicate the STAP cell phenomenon. Nature 2016;531(7594):400.
De Los Angeles A, Ferrari F, Xi R, Fujiwara Y, Benvenisty N, Deng H, Hochedlinger K, Jaenisch R, Lee S, Leitch HG, Lensch WM, Lujan E, Pei D, Rossant J, Wernig M, Park PJ, Daley GQ. Corrigendum: Hallmarks of pluripotency. Nature 2016;531(7594):400.
Lee J-K, Choi Y-L, Kwon M, Park PJ. Mechanisms and Consequences of Cancer Genome Instability: Lessons from Genome Sequencing Studies. Annu Rev Pathol 2016.

During tumor evolution, cancer cells can accumulate numerous genetic alterations, ranging from single nucleotide mutations to whole-chromosomal changes. Although a great deal of progress has been made in the past decades in characterizing genomic alterations, recent cancer genome sequencing studies have provided a wealth of information on the detailed molecular profiles of such alterations in various types of cancers. Here, we review our current understanding of the mechanisms and consequences of cancer genome instability, focusing on the findings uncovered through analysis of exome and whole-genome sequencing data. These analyses have shown that most cancers have evidence of genome instability, and the degree of instability is variable within and between cancer types. Importantly, we describe some recent evidence supporting the idea that chromosomal instability could be a major driving force in tumorigenesis and cancer evolution, actively shaping the genomes of cancer cells to maximize their survival advantage. Expected final online publication date for the Annual Review of Pathology: Mechanisms of Disease Volume 11 is May 23, 2016. Please see http://www.annualreviews.org/catalog/pubdates.aspx for revised estimates.

Evrony GD*, Lee E*, Park PJ**, Walsh CA**. Resolving rates of mutation in the brain using single-neuron genomics. Elife 2016;5.

Whether somatic mutations contribute functional diversity to brain cells is a long-standing question. Single-neuron genomics enables direct measurement of somatic mutation rates in human brain and promises to answer this question. A recent study (Upton et al., 2015) reported high rates of somatic LINE-1 element (L1) retrotransposition in the hippocampus and cerebral cortex that would have major implications for normal brain function, and further claimed these mutation events preferentially impact genes important for neuronal function. We identify errors in the single-cell sequencing approach, bioinformatic analysis, and validation methods that led to thousands of false-positive artifacts being mistakenly interpreted as somatic mutation events. Our reanalysis of the data supports a corrected mutation frequency (0.2 per cell) more than fifty-fold lower than reported, inconsistent with the authors' conclusion of 'ubiquitous' L1 mosaicism, but consistent with L1 elements mobilizing occasionally. Through consideration of the challenges and pitfalls identified, we provide a foundation and framework for designing single-cell genomics studies.

Nam J-Y, Kim NKD, Kim SC, Joung J-G, Xi R, Lee S, Park PJ**, Park W-Y**. Evaluation of somatic copy number estimation tools for whole-exome sequencing data. Brief Bioinform 2016;17(2):185-92.

Whole-exome sequencing (WES) has become a standard method for detecting genetic variants in human diseases. Although the primary use of WES data has been the identification of single nucleotide variations and indels, these data also offer a possibility of detecting copy number variations (CNVs) at high resolution. However, WES data have uneven read coverage along the genome owing to the target capture step, and the development of a robust WES-based CNV tool is challenging. Here, we evaluate six WES somatic CNV detection tools: ADTEx, CONTRA, Control-FREEC, EXCAVATOR, ExomeCNV and Varscan2. Using WES data from 50 kidney chromophobe, 50 bladder urothelial carcinoma, and 50 stomach adenocarcinoma patients from The Cancer Genome Atlas, we compared the CNV calls from the six tools with a reference CNV set that was identified by both single nucleotide polymorphism array 6.0 and whole-genome sequencing data. We found that these algorithms gave highly variable results: visual inspection reveals significant differences between the WES-based segmentation profiles and the reference profile, as well as among the WES-based profiles. Using a 50% overlap criterion, 13-77% of WES CNV calls were covered by CNVs from the reference set, up to 21% of the copy gains were called as losses or vice versa, and dramatic differences in CNV sizes and CNV numbers were observed. Overall, ADTEx and EXCAVATOR had the best performance with relatively high precision and sensitivity. We suggest that the current algorithms for somatic CNV detection from WES data are limited in their performance and that more robust algorithms are needed.

Zheng S, Cherniack AD, Dewal N, Moffitt RA, Danilova L, Murray BA, Lerario AM, Else T, Knijnenburg TA, Ciriello G, Kim S, Assie G, Morozova O, Akbani R, Shih J, Hoadley KA, Choueiri TK, Waldmann J, Mete O, Robertson GA, Wu H-T, Raphael BJ, Shao L, Meyerson M, Demeure MJ, Beuschlein F, Gill AJ, Sidhu SB, Almeida MQ, Fragoso MCBV, Cope LM, Kebebew E, Habra MA, Whitsett TG, Bussey KJ, Rainey WE, Asa SL, Bertherat J, Fassnacht M, Wheeler DA, Hammer GD, Giordano TJ, Verhaak RGW. Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma. Cancer Cell 2016;30(2):363.
Jung YL*, Kang H*, Park PJ, Kuroda MI. Correspondence of Drosophila Polycomb Group proteins with broad H3K27me3 silent domains. Fly 2016.

The Polycomb group (PcG) proteins are key conserved regulators of development, initially discovered in Drosophila and now strongly implicated in human disease. Nevertheless, differing silencing properties between the Drosophila and mammalian PcG systems have been observed. While specific DNA targeting sites for PcG proteins called Polycomb response elements (PREs) have been identified only in Drosophila, involvement of non-coding RNAs for PcG targeting has been favored in mammals. Another difference lies in the distribution patterns of PcG proteins. In mouse and human cells, PcG proteins show broad distributions, significantly overlapping with H3K27me3 domains. In contrast, only sharp peaks on PRE regions are observed for most PcG proteins in Drosophila, raising the question of how large domains of H3K27me3, up to many tens of kilobases, are formed and maintained in Drosophila. In this Extra View, we provide evidence that PcG distributions on silent chromatin in Drosophila are considerably broader than previously detected. Using BioTAP-XL, a chromatin crosslinking and tandem affinity purification approach, we find a broad, rather than PRE-limited overlap of PcG proteins with H3K27me3, suggesting a conserved spreading mechanism for PcG in flies and mammals.

Ordulu Z, Nucci MR, Dal Cin P, Hollowell ML, Otis CN, Hornick JL, Park PJ, Kim T-M, Quade BJ, Morton CC. Intravenous leiomyomatosis: an unusual intermediate between benign and malignant uterine smooth muscle tumors. Mod Pathol 2016.

Intravenous leiomyomatosis is an unusual smooth muscle neoplasm with quasi-malignant intravascular growth but a histologically banal appearance. Herein, we report expression and molecular cytogenetic analyses of a series of 12 intravenous leiomyomatosis cases to better understand the pathogenesis of intravenous leiomyomatosis. All cases were analyzed for the expression of HMGA2, MDM2, and CDK4 proteins by immunohistochemistry based on our previous finding of der(14)t(12;14)(q14.3;q24) in intravenous leiomyomatosis. Seven of 12 (58%) intravenous leiomyomatosis cases expressed HMGA2, and none expressed MDM2 or CDK4. Colocalization of hybridization signals for probes from the HMGA2 locus (12q14.3) and from 14q24 by interphase fluorescence in situ hybridization (FISH) was detected in a mean of 89.2% of nuclei in HMGA2-positive cases by immunohistochemistry, but in only 12.4% of nuclei in negative cases, indicating an association of HMGA2 expression and this chromosomal rearrangement (P = 8.24 × 10^-10). Four HMGA2-positive cases had greater than two HMGA2 hybridization signals per cell. No cases showed loss of a hybridization signal by interphase FISH for the frequently deleted region of 7q22 in uterine leiomyomata. One intravenous leiomyomatosis case analyzed by array comparative genomic hybridization revealed complex copy number variations. Finally, expression profiling was performed on three intravenous leiomyomatosis cases. Interestingly, hierarchical cluster analysis of the expression profiles revealed segregation of the intravenous leiomyomatosis cases with leiomyosarcoma rather than with myometrium, uterine leiomyoma of the usual histological type, or plexiform leiomyoma. These findings suggest that intravenous leiomyomatosis cases share some molecular cytogenetic characteristics with uterine leiomyoma, and expression profiles similar to that of leiomyosarcoma cases, further supporting their intermediate, quasi-malignant behavior. Modern Pathology advance online publication, 19 February 2016; doi:10.1038/modpathol.2016.36.
