Marinov GK, Kundaje A, Park PJ, Wold BJ. Large-scale quality analysis of published ChIP-seq data. G3 2014;4(2):209-23.

ChIP-seq has become the primary method for identifying in vivo protein-DNA interactions on a genome-wide scale, with nearly 800 publications involving the technique appearing in PubMed as of December 2012. Individually and in aggregate, these data are an important and information-rich resource. However, uncertainties about data quality confound their use by the wider research community. Recently, the Encyclopedia of DNA Elements (ENCODE) project developed and applied metrics to objectively measure ChIP-seq data quality. The ENCODE quality analysis was useful for flagging datasets for closer inspection, eliminating or replacing poor data, and for driving changes in experimental pipelines. There had been no similarly systematic quality analysis of the large and disparate body of published ChIP-seq profiles. Here, we report a uniform analysis of vertebrate transcription factor ChIP-seq datasets in the Gene Expression Omnibus (GEO) repository as of April 1, 2012. The majority (55%) of datasets scored as being highly successful, but a substantial minority (20%) were of apparently poor quality, and another ∼25% were of intermediate quality. We discuss how different uses of ChIP-seq data are affected by specific aspects of data quality, and we highlight exceptional instances for which the metric values should not be taken at face value. Unexpectedly, we discovered that a significant subset of control datasets (i.e., no immunoprecipitation and mock immunoprecipitation samples) display an enrichment structure similar to successful ChIP-seq data. This can, in turn, affect peak calling and data interpretation. Published datasets identified here as high-quality comprise a large group that users can draw on for large-scale integrated analysis. In the future, ChIP-seq quality assessment similar to that used here could guide experimentalists at early stages in a study, provide useful input in the publication process, and be used to stratify ChIP-seq data for different community-wide uses.

Ho JWK*, Jung YL*, Liu T*, Alver BH, Lee S, Ikegami K, Sohn K-A, Minoda A, Tolstorukov MY, Appert A, Parker SCJ, Gu T, Kundaje A, Riddle NC, Bishop EP, Egelhofer TA, Hu S'en S, Alekseyenko AA, Rechtsteiner A, Asker D, Belsky JA, Bowman SK, Chen BQ, Chen RA-J, Day DS, Dong Y, Dose AC, Duan X, Epstein CB, Ercan S, Feingold EA, Ferrari F, Garrigues JM, Gehlenborg N, Good PJ, Haseley P, He D, Herrmann M, Hoffman MM, Jeffers TE, Kharchenko PV, Kolasinska-Zwierz P, Kotwaliwale CV, Kumar N, Langley SA, Larschan EN, Latorre I, Libbrecht MW, Lin X, Park R, Pazin MJ, Pham HN, Plachetka A, Qin B, Schwartz YB, Shoresh N, Stempor P, Vielle A, Wang C, Whittle CM, Xue H, Kingston RE, Kim JH, Bernstein BE, Dernburg AF, Pirrotta V, Kuroda MI, Noble WS, Tullius TD, Kellis M, MacAlpine DM**, Strome S**, Elgin SCR**, Liu XS**, Lieb JD**, Ahringer J**, Karpen GH**, Park PJ**. Comparative analysis of metazoan chromatin organization. Nature 2014;512(7515):449-52.

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.

West JA*, Cook A*, Alver BH, Stadtfeld M, Deaton AM, Hochedlinger K, Park PJ**, Tolstorukov MY**, Kingston RE**. Nucleosomal occupancy changes locally over key regulatory regions during cell differentiation and reprogramming. Nat Commun 2014;5:4719.

Chromatin structure determines DNA accessibility. We compare nucleosome occupancy in mouse and human embryonic stem cells (ESCs), induced-pluripotent stem cells (iPSCs) and differentiated cell types using MNase-seq. To address variability inherent in this technique, we developed a bioinformatic approach to identify regions of difference (RoD) in nucleosome occupancy between pluripotent and somatic cells. Surprisingly, most chromatin remains unchanged; a majority of rearrangements appear to affect a single nucleosome. RoDs are enriched at genes and regulatory elements, including enhancers associated with pluripotency and differentiation. RoDs co-localize with binding sites of key developmental regulators, including the reprogramming factors Klf4, Oct4/Sox2 and c-Myc. Nucleosomal landscapes in ESC enhancers are extensively altered, exhibiting lower nucleosome occupancy in pluripotent cells than in somatic cells. Most changes are reset during reprogramming. We conclude that changes in nucleosome occupancy are a hallmark of cell differentiation and reprogramming and likely identify regulatory regions essential for these processes.

Merlo P, Frost B, Peng S, Yang YJ, Park PJ, Feany M. p53 prevents neurodegeneration by regulating synaptic genes. Proc Natl Acad Sci U S A 2014;111(50):18055-60.

DNA damage has been implicated in neurodegenerative disorders, including Alzheimer's disease and other tauopathies, but the consequences of genotoxic stress to postmitotic neurons are poorly understood. Here we demonstrate that p53, a key mediator of the DNA damage response, plays a neuroprotective role in a Drosophila model of tauopathy. Further, through a whole-genome ChIP-chip analysis, we identify genes controlled by p53 in postmitotic neurons. We genetically validate a specific pathway, synaptic function, in p53-mediated neuroprotection. We then demonstrate that the control of synaptic genes by p53 is conserved in mammals. Collectively, our results implicate synaptic function as a central target in p53-dependent protection from neurodegeneration.

Ferrari F*, Apostolou E*, Park PJ**, Hochedlinger K**. Rearranging the chromatin for pluripotency. Cell Cycle 2014;13(2):167-8.

Soruco MML*, Chery J*, Bishop EP*, Siggers T, Tolstorukov MY, Leydon AR, Sugden AU, Goebel K, Feng J, Xia P, Vedenko A, Bulyk ML, Park PJ, Larschan E. The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes Dev 2013;27(14):1551-6.

The Drosophila male-specific lethal (MSL) dosage compensation complex increases transcript levels on the single male X chromosome to equal the transcript levels in XX females. However, it is not known how the MSL complex is linked to its DNA recognition elements, the critical first step in dosage compensation. Here, we demonstrate that a previously uncharacterized zinc finger protein, CLAMP (chromatin-linked adaptor for MSL proteins), functions as the first link between the MSL complex and the X chromosome. CLAMP directly binds to the MSL complex DNA recognition elements and is required for the recruitment of the MSL complex. The discovery of CLAMP identifies a key factor required for the chromosome-specific targeting of dosage compensation, providing new insights into how subnuclear domains of coordinate gene regulation are formed within metazoan genomes.

Alekseyenko AA, Ellison CE, Gorchakov AA, Zhou Q, Kaiser VB, Toda N, Walton Z, Peng S, Park PJ, Bachtrog D, Kuroda MI. Conservation and de novo acquisition of dosage compensation on newly evolved sex chromosomes in Drosophila. Genes Dev 2013;27(8):853-8.

Dosage compensation has arisen in response to the evolution of distinct male (XY) and female (XX) karyotypes. In Drosophila melanogaster, the MSL complex increases male X transcription approximately twofold. X-specific targeting is thought to occur through sequence-dependent binding to chromatin entry sites (CESs), followed by spreading in cis to active genes. We tested this model by asking how newly evolving sex chromosome arms in Drosophila miranda acquired dosage compensation. We found evidence for the creation of new CESs, with the analogous sequence and spacing as in D. melanogaster, providing strong support for the spreading model in the establishment of dosage compensation.

Zhang B*, Day DS*, Ho JW, Song L, Cao J, Christodoulou D, Seidman JG, Crawford GE, Park PJ, Pu WT. A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity. Genome Res 2013;23(6):917-27.

Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth factor A (VEGFA), a major regulator of angiogenesis, triggers changes in transcriptional activity of human umbilical vein endothelial cells (HUVECs). Here, we used chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) to measure genome-wide changes in histone H3 acetylation at lysine 27 (H3K27ac), a marker of active enhancers, in unstimulated HUVECs and HUVECs stimulated with VEGFA for 1, 4, and 12 h. We show that sites with the greatest H3K27ac change upon stimulation were associated tightly with EP300, a histone acetyltransferase. Using the variation of H3K27ac as a novel epigenetic signature, we identified transcriptional regulatory elements that are functionally linked to angiogenesis, participate in rapid VEGFA-stimulated changes in chromatin conformation, and mediate VEGFA-induced transcriptional responses. Dynamic H3K27ac deposition and associated changes in chromatin conformation required EP300 activity instead of altered nucleosome occupancy or changes in DNase I hypersensitivity. EP300 activity was also required for a subset of dynamic H3K27ac sites to loop into proximity of promoters. Our study identified thousands of endothelial, VEGFA-responsive enhancers, demonstrating that an epigenetic signature based on the variation of a chromatin feature is a productive approach to define signal-responsive genomic elements. Further, our study implicates global epigenetic modifications in rapid, signal-responsive transcriptional regulation.

Apostolou E*, Ferrari F*, Walsh RM, Bar-Nur O, Stadtfeld M, Cheloufi S, Stuart HT, Polo JM, Ohsumi TK, Borowsky ML, Kharchenko PV, Park PJ**, Hochedlinger K**. Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. Cell Stem Cell 2013;12(6):699-712.

The chromatin state of pluripotency genes has been studied extensively in embryonic stem cells (ESCs) and differentiated cells, but their potential interactions with other parts of the genome remain largely unexplored. Here, we identified a genome-wide, pluripotency-specific interaction network around the Nanog promoter by adapting circular chromosome conformation capture sequencing. This network was rearranged during differentiation and restored in induced pluripotent stem cells. A large fraction of Nanog-interacting loci were bound by Mediator or cohesin in pluripotent cells. Depletion of these proteins from ESCs resulted in a disruption of contacts and the acquisition of a differentiation-specific interaction pattern prior to obvious transcriptional and phenotypic changes. Similarly, the establishment of Nanog interactions during reprogramming often preceded transcriptional upregulation of associated genes, suggesting a causative link. Our results document a complex, pluripotency-specific chromatin "interactome" for Nanog and suggest a functional role for long-range genomic interactions in the maintenance and induction of pluripotency.

Woo CJ, Kharchenko PV, Daheron L, Park PJ, Kingston RE. Variable requirements for DNA-binding proteins at polycomb-dependent repressive regions in human HOX clusters. Mol Cell Biol 2013;33(16):3274-85.

Polycomb group (PcG)-mediated repression is an evolutionarily conserved process critical for cell fate determination and maintenance of gene expression during embryonic development. However, the mechanisms underlying PcG recruitment in mammals remain unclear since few regulatory sites have been identified. We report two novel prospective PcG-dependent regulatory elements within the human HOXB and HOXC clusters and compare their repressive activities to a previously identified element in the HOXD cluster. These regions recruited the PcG proteins BMI1 and SUZ12 to a reporter construct in mesenchymal stem cells and conferred repression that was dependent upon PcG expression. Furthermore, we examined the potential of two DNA-binding proteins, JARID2 and YY1, to regulate PcG activity at these three elements. JARID2 has differential requirements, whereas YY1 appears to be required for repressive activity at all 3 sites. We conclude that distinct elements of the mammalian HOX clusters can recruit components of the PcG complexes and confer repression, similar to what has been seen in Drosophila. These elements, however, have diverse requirements for binding factors, which, combined with previous data on other loci, speaks to the complexity of PcG targeting in mammals.

Tzatsos A, Paskaleva P*, Ferrari F*, Deshpande V, Stoykova S, Contino G, Wong K-K, Lan F, Trojer P, Park PJ, Bardeesy N. KDM2B promotes pancreatic cancer via Polycomb-dependent and -independent transcriptional programs. J Clin Invest 2013;123(2):727-39.

Epigenetic mechanisms mediate heritable control of cell identity in normal cells and cancer. We sought to identify epigenetic regulators driving the pathogenesis of pancreatic ductal adenocarcinoma (PDAC), one of the most lethal human cancers. We found that KDM2B (also known as Ndy1, FBXL10, and JHDM1B), an H3K36 histone demethylase implicated in bypass of cellular senescence and somatic cell reprogramming, is markedly overexpressed in human PDAC, with levels increasing with disease grade and stage, and highest expression in metastases. KDM2B silencing abrogated tumorigenicity of PDAC cell lines exhibiting loss of epithelial differentiation, whereas KDM2B overexpression cooperated with KrasG12D to promote PDAC formation in mouse models. Gain- and loss-of-function experiments coupled to genome-wide gene expression and ChIP studies revealed that KDM2B drives tumorigenicity through 2 different transcriptional mechanisms. KDM2B repressed developmental genes through cobinding with Polycomb group (PcG) proteins at transcriptional start sites, whereas it activated a module of metabolic genes, including mediators of protein synthesis and mitochondrial function, cobound by the MYC oncogene and the histone demethylase KDM5A. These results defined epigenetic programs through which KDM2B subverts cellular differentiation and drives the pathogenesis of an aggressive subset of PDAC.

DeGennaro CM, Alver BH, Marguerat S, Stepanova E, Davis CP, Bähler J, Park PJ, Winston F. Spt6 regulates intragenic and antisense transcription, nucleosome positioning, and histone modifications genome-wide in fission yeast. Mol Cell Biol 2013;33(24):4779-92.

Spt6 is a highly conserved histone chaperone that interacts directly with both RNA polymerase II and histones to regulate gene expression. To gain a comprehensive understanding of the roles of Spt6, we performed genome-wide analyses of transcription, chromatin structure, and histone modifications in a Schizosaccharomyces pombe spt6 mutant. Our results demonstrate dramatic changes to transcription and chromatin structure in the mutant, including elevated antisense transcripts at >70% of all genes and general loss of the +1 nucleosome. Furthermore, Spt6 is required for marks associated with active transcription, including trimethylation of histone H3 on lysine 4, previously observed in humans but not Saccharomyces cerevisiae, and lysine 36. Taken together, our results indicate that Spt6 is critical for the accuracy of transcription and the integrity of chromatin, likely via its direct interactions with RNA polymerase II and histones.

Tolstorukov MY*, Sansam CG*, Lu P*, Koellhoffer EC, Helming KC, Alver BH, Tillman EJ, Evans JA, Wilson BG, Park PJ**, Roberts CWM**. Swi/Snf chromatin remodeling/tumor suppressor complex establishes nucleosome occupancy at target promoters. Proc Natl Acad Sci U S A 2013;110(25):10165-70.

Precise nucleosome-positioning patterns at promoters are thought to be crucial for faithful transcriptional regulation. However, the mechanisms by which these patterns are established, are dynamically maintained, and subsequently contribute to transcriptional control are poorly understood. The switch/sucrose non-fermentable chromatin remodeling complex, also known as the Brg1 associated factors complex, is a master developmental regulator and tumor suppressor capable of mobilizing nucleosomes in biochemical assays. However, its role in establishing the nucleosome landscape in vivo is unclear. Here we have inactivated Snf5 and Brg1, core subunits of the mammalian Swi/Snf complex, to evaluate their effects on chromatin structure and transcription levels genomewide. We find that inactivation of either subunit leads to disruptions of specific nucleosome patterning combined with a loss of overall nucleosome occupancy at a large number of promoters, regardless of their association with CpG islands. These rearrangements are accompanied by gene expression changes that promote cell proliferation. Collectively, these findings define a direct relationship between chromatin-remodeling complexes, chromatin structure, and transcriptional regulation.

Ho JWK, Alekseyenko AA, Kuroda MI, Park PJ. Genome-wide mapping of protein-DNA interactions by ChIP-seq [Internet]. In: Harbers M, Kahl G, editors. Tag-Based Next Generation Sequencing. Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA; 2012.

ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489(7414):57-74.

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, Chen Y, DeSalvo G, Epstein C, Fisher-Aylor KI, Euskirchen G, Gerstein M, Gertz J, Hartemink AJ, Hoffman MM, Iyer VR, Jung YL, Karmakar S, Kellis M, Kharchenko PV, Li Q, Liu T, Liu SX, Ma L, Milosavljevic A, Myers RM, Park PJ, Pazin MJ, Perry MD, Raha D, Reddy TE, Rozowsky J, Shoresh N, Sidow A, Slattery M, Stamatoyannopoulos JA, Tolstorukov MY, White KP, Xi S, Farnham PJ, Lieb JD, Wold BJ, Snyder M. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res 2012;22(9):1813-31.

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

Tolstorukov MY*, Goldman JA*, Gilbert C, Ogryzko V, Kingston RE**, Park PJ**. Histone variant H2A.Bbd is associated with active transcription and mRNA processing in human cells. Mol Cell 2012;47(4):596-607.

Variation in chromatin composition and organization often reflects differences in genome function. Histone variants, for example, replace canonical histones to contribute to regulation of numerous nuclear processes including transcription, DNA repair, and chromosome segregation. Here we focus on H2A.Bbd, a rapidly evolving variant found in mammals but not in invertebrates. We report that in human cells, nucleosomes bearing H2A.Bbd form unconventional chromatin structures enriched within actively transcribed genes and characterized by shorter DNA protection and nucleosome spacing. Analysis of transcriptional profiles from cells depleted for H2A.Bbd demonstrated widespread changes in gene expression with a net downregulation of transcription and disruption of normal mRNA splicing patterns. In particular, we observed changes in exon inclusion rates and increased presence of intronic sequences in mRNA products upon H2A.Bbd depletion. Taken together, our results indicate that H2A.Bbd is involved in formation of a specific chromatin structure that facilitates both transcription and initial mRNA processing.

Alekseyenko AA*, Ho JWK*, Peng S*, Gelbart M, Tolstorukov MY, Plachetka A, Kharchenko PV, Jung YL, Gorchakov AA, Larschan E, Gu T, Minoda A, Riddle NC, Schwartz YB, Elgin SCR, Karpen GH, Pirrotta V, Kuroda MI**, Park PJ**. Sequence-specific targeting of dosage compensation in Drosophila favors an active chromatin context. PLoS Genet 2012;8(4):e1002646.

The Drosophila MSL complex mediates dosage compensation by increasing transcription of the single X chromosome in males approximately two-fold. This is accomplished through recognition of the X chromosome and subsequent acetylation of histone H4K16 on X-linked genes. Initial binding to the X is thought to occur at "entry sites" that contain a consensus sequence motif ("MSL recognition element" or MRE). However, this motif is only ∼2 fold enriched on X, and only a fraction of the motifs on X are initially targeted. Here we ask whether chromatin context could distinguish between utilized and non-utilized copies of the motif, by comparing their relative enrichment for histone modifications and chromosomal proteins mapped in the modENCODE project. Through a comparative analysis of the chromatin features in male S2 cells (which contain MSL complex) and female Kc cells (which lack the complex), we find that the presence of active chromatin modifications, together with an elevated local GC content in the surrounding sequences, has strong predictive value for functional MSL entry sites, independent of MSL binding. We tested these sites for function in Kc cells by RNAi knockdown of Sxl, resulting in induction of MSL complex. We show that ectopic MSL expression in Kc cells leads to H4K16 acetylation around these sites and a relative increase in X chromosome transcription. Collectively, our results support a model in which a pre-existing active chromatin environment, coincident with H3K36me3, contributes to MSL entry site selection. The consequences of MSL targeting of the male X chromosome include increase in nucleosome lability, enrichment for H4K16 acetylation and JIL-1 kinase, and depletion of linker histone H1 on active X-linked genes. Our analysis can serve as a model for identifying chromatin and local sequence features that may contribute to selection of functional protein binding sites in the genome.

Egelhofer TA*, Minoda A*, Klugman S*, Lee K, Kolasinska-Zwierz P, Alekseyenko AA, Cheung M-S, Day DS, Gadel S, Gorchakov AA, Gu T, Kharchenko PV, Kuan S, Latorre I, Linder-Basso D, Luu Y, Ngo Q, Perry M, Rechtsteiner A, Riddle NC, Schwartz YB, Shanower GA, Vielle A, Ahringer J, Elgin SCR, Kuroda MI, Pirrotta V, Ren B, Strome S, Park PJ**, Karpen GH**, Hawkins RD**, Lieb JD**. An assessment of histone-modification antibody quality. Nat Struct Mol Biol 2011;18(1):91-3.

We have tested the specificity and utility of more than 200 antibodies raised against 57 different histone modifications in Drosophila melanogaster, Caenorhabditis elegans and human cells. Although most antibodies performed well, more than 25% failed specificity tests by dot blot or western blot. Among specific antibodies, more than 20% failed in chromatin immunoprecipitation experiments. We advise rigorous testing of histone-modification antibodies before use, and we provide a website for posting new test results (http://compbio.med.harvard.edu/antibodies/).

Kharchenko PV, Xi R, Park PJ. Evidence for dosage compensation between the X chromosome and autosomes in mammals. Nat Genet 2011;43(12):1167-9; author reply 1171-2.