Epigenetics

Jung YL*, Kang H*, Park PJ, Kuroda MI. Correspondence of Drosophila Polycomb Group proteins with broad H3K27me3 silent domains. Fly 2016.

The Polycomb group (PcG) proteins are key conserved regulators of development, initially discovered in Drosophila and now strongly implicated in human disease. Nevertheless, differing silencing properties between the Drosophila and mammalian PcG systems have been observed. While specific DNA targeting sites for PcG proteins called Polycomb response elements (PREs) have been identified only in Drosophila, involvement of non-coding RNAs for PcG targeting has been favored in mammals. Another difference lies in the distribution patterns of PcG proteins. In mouse and human cells, PcG proteins show broad distributions, significantly overlapping with H3K27me3 domains. In contrast, only sharp peaks on PRE regions are observed for most PcG proteins in Drosophila, raising the question of how large domains of H3K27me3, up to many tens of kilobases, are formed and maintained in Drosophila. In this Extra View, we provide evidence that PcG distributions on silent chromatin in Drosophila are considerably broader than previously detected. Using BioTAP-XL, a chromatin crosslinking and tandem affinity purification approach, we find a broad, rather than PRE-limited overlap of PcG proteins with H3K27me3, suggesting a conserved spreading mechanism for PcG in flies and mammals.

ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489(7414):57-74.

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Soruco MML*, Chery J*, Bishop EP*, Siggers T, Tolstorukov MY, Leydon AR, Sugden AU, Goebel K, Feng J, Xia P, Vedenko A, Bulyk ML, Park PJ, Larschan E. The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes Dev 2013;27(14):1551-6.

The Drosophila male-specific lethal (MSL) dosage compensation complex increases transcript levels on the single male X chromosome to equal the transcript levels in XX females. However, it is not known how the MSL complex is linked to its DNA recognition elements, the critical first step in dosage compensation. Here, we demonstrate that a previously uncharacterized zinc finger protein, CLAMP (chromatin-linked adaptor for MSL proteins), functions as the first link between the MSL complex and the X chromosome. CLAMP directly binds to the MSL complex DNA recognition elements and is required for the recruitment of the MSL complex. The discovery of CLAMP identifies a key factor required for the chromosome-specific targeting of dosage compensation, providing new insights into how subnuclear domains of coordinate gene regulation are formed within metazoan genomes.

Gelbart ME, Larschan E, Peng S, Park PJ, Kuroda MI. Drosophila MSL complex globally acetylates H4K16 on the male X chromosome for dosage compensation. Nat Struct Mol Biol 2009;16(8):825-32.

The Drosophila melanogaster male-specific lethal (MSL) complex binds the single male X chromosome to upregulate gene expression to equal that from the two female X chromosomes. However, it has been puzzling that approximately 25% of transcribed genes on the X chromosome do not stably recruit MSL complex. Here we find that almost all active genes on the X chromosome are associated with robust H4 Lys16 acetylation (H4K16ac), the histone modification catalyzed by the MSL complex. The distribution of H4K16ac is much broader than that of the MSL complex, and our results favor the idea that chromosome-wide H4K16ac reflects transient association of the MSL complex, occurring through spreading or chromosomal looping. Our results parallel those of localized Polycomb repressive complex and its more broadly distributed chromatin mark, trimethylated histone H3 Lys27 (H3K27me3), suggesting a common principle for the establishment of active and silenced chromatin domains.

modENCODE Consortium*, Roy S*, Ernst J*, Kharchenko PV*, Kheradpour P*, Negre N*, Eaton ML*, Landolin JM*, Bristow CA*, Ma L*, Lin MF*, Washietl S*, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, Brown CD, Candeias R, Carlson JW, Carr A, Jungreis I, Marbach D, Sealfon R, Tolstorukov MY, Will S, Alekseyenko AA, Artieri C, Booth BW, Brooks AN, Dai Q, Davis CA, Duff MO, Feng X, Gorchakov AA, Gu T, Henikoff JG, Kapranov P, Li R, MacAlpine HK, Malone J, Minoda A, Nordman J, Okamura K, Perry M, Powell SK, Riddle NC, Sakai A, Samsonova A, Sandler JE, Schwartz YB, Sher N, Spokony R, Sturgill D, van Baren M, Wan KH, Yang L, Yu C, Feingold E, Good P, Guyer M, Lowdon R, Ahmad K, Andrews J, Berger B, Brenner SE, Brent MR, Cherbas L, Elgin SCR, Gingeras TR, Grossman R, Hoskins RA, Kaufman TC, Kent W, Kuroda MI, Orr-Weaver T, Perrimon N, Pirrotta V, Posakony JW, Ren B, Russell S, Cherbas P, Graveley BR, Lewis S, Micklem G, Oliver B, Park PJ, Celniker SE**, Henikoff S**, Karpen GH**, Lai EC**, MacAlpine DM**, Stein LD**, White KP**, Kellis M**. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010;330(6012):1787-97.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Apostolou E*, Ferrari F*, Walsh RM, Bar-Nur O, Stadtfeld M, Cheloufi S, Stuart HT, Polo JM, Ohsumi TK, Borowsky ML, Kharchenko PV, Park PJ**, Hochedlinger K**. Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. Cell Stem Cell 2013;12(6):699-712.

The chromatin state of pluripotency genes has been studied extensively in embryonic stem cells (ESCs) and differentiated cells, but their potential interactions with other parts of the genome remain largely unexplored. Here, we identified a genome-wide, pluripotency-specific interaction network around the Nanog promoter by adapting circular chromosome conformation capture sequencing. This network was rearranged during differentiation and restored in induced pluripotent stem cells. A large fraction of Nanog-interacting loci were bound by Mediator or cohesin in pluripotent cells. Depletion of these proteins from ESCs resulted in a disruption of contacts and the acquisition of a differentiation-specific interaction pattern prior to obvious transcriptional and phenotypic changes. Similarly, the establishment of Nanog interactions during reprogramming often preceded transcriptional upregulation of associated genes, suggesting a causative link. Our results document a complex, pluripotency-specific chromatin "interactome" for Nanog and suggest a functional role for long-range genomic interactions in the maintenance and induction of pluripotency.

Alekseyenko AA, Larschan E, Lai WR, Park PJ**, Kuroda MI**. High-resolution ChIP-chip analysis reveals that the Drosophila MSL complex selectively identifies active genes on the male X chromosome. Genes Dev 2006;20(7):848-57.

X-chromosome dosage compensation in Drosophila requires the male-specific lethal (MSL) complex, which up-regulates gene expression from the single male X chromosome. Here, we define X-chromosome-specific MSL binding at high resolution in two male cell lines and in late-stage embryos. We find that the MSL complex is highly enriched over most expressed genes, with binding biased toward the 3' end of transcription units. The binding patterns are largely similar in the distinct cell types, with approximately 600 genes clearly bound in all three cases. Genes identified as clearly bound in one cell type and not in another indicate that attraction of MSL complex correlates with expression state. Thus, sequence alone is not sufficient to explain MSL targeting. We propose that the MSL complex recognizes most X-linked genes, but only in the context of chromatin factors or modifications indicative of active transcription. Distinguishing expressed genes from the bulk of the genome is likely to be an important function common to many chromatin organizing and modifying activities.

Egelhofer TA*, Minoda A*, Klugman S*, Lee K, Kolasinska-Zwierz P, Alekseyenko AA, Cheung M-S, Day DS, Gadel S, Gorchakov AA, Gu T, Kharchenko PV, Kuan S, Latorre I, Linder-Basso D, Luu Y, Ngo Q, Perry M, Rechtsteiner A, Riddle NC, Schwartz YB, Shanower GA, Vielle A, Ahringer J, Elgin SCR, Kuroda MI, Pirrotta V, Ren B, Strome S, Park PJ**, Karpen GH**, Hawkins RD**, Lieb JD**. An assessment of histone-modification antibody quality. Nat Struct Mol Biol 2011;18(1):91-3.

We have tested the specificity and utility of more than 200 antibodies raised against 57 different histone modifications in Drosophila melanogaster, Caenorhabditis elegans and human cells. Although most antibodies performed well, more than 25% failed specificity tests by dot blot or western blot. Among specific antibodies, more than 20% failed in chromatin immunoprecipitation experiments. We advise rigorous testing of histone-modification antibodies before use, and we provide a website for posting new test results (http://compbio.med.harvard.edu/antibodies/).

Zhang B*, Day DS*, Ho JW, Song L, Cao J, Christodoulou D, Seidman JG, Crawford GE, Park PJ, Pu WT. A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity. Genome Res 2013;23(6):917-27.

Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth factor A (VEGFA), a major regulator of angiogenesis, triggers changes in transcriptional activity of human umbilical vein endothelial cells (HUVECs). Here, we used chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) to measure genome-wide changes in histone H3 acetylation at lysine 27 (H3K27ac), a marker of active enhancers, in unstimulated HUVECs and HUVECs stimulated with VEGFA for 1, 4, and 12 h. We show that sites with the greatest H3K27ac change upon stimulation were associated tightly with EP300, a histone acetyltransferase. Using the variation of H3K27ac as a novel epigenetic signature, we identified transcriptional regulatory elements that are functionally linked to angiogenesis, participate in rapid VEGFA-stimulated changes in chromatin conformation, and mediate VEGFA-induced transcriptional responses. Dynamic H3K27ac deposition and associated changes in chromatin conformation required EP300 activity instead of altered nucleosome occupancy or changes in DNase I hypersensitivity. EP300 activity was also required for a subset of dynamic H3K27ac sites to loop into proximity of promoters. Our study identified thousands of endothelial, VEGFA-responsive enhancers, demonstrating that an epigenetic signature based on the variation of a chromatin feature is a productive approach to define signal-responsive genomic elements. Further, our study implicates global epigenetic modifications in rapid, signal-responsive transcriptional regulation.

Blackledge NP, Zhou JC, Tolstorukov MY, Farcas AM, Park PJ, Klose RJ. CpG islands recruit a histone H3 lysine 36 demethylase. Molecular Cell 2010;38(2):179-90.

In higher eukaryotes, up to 70% of genes have high levels of nonmethylated cytosine/guanine base pairs (CpGs) surrounding promoters and gene regulatory units. These features, called CpG islands, were identified over 20 years ago, but there remains little mechanistic evidence to suggest how these enigmatic elements contribute to promoter function, except that they are refractory to epigenetic silencing by DNA methylation. Here we show that CpG islands directly recruit the H3K36-specific lysine demethylase enzyme KDM2A. Nucleation of KDM2A at these elements results in removal of H3K36 methylation, creating CpG island chromatin that is uniquely depleted of this modification. KDM2A utilizes a zinc finger CxxC (ZF-CxxC) domain that preferentially recognizes nonmethylated CpG DNA, and binding is blocked when the CpG DNA is methylated, thus constraining KDM2A to nonmethylated CpG islands. These data expose a straightforward mechanism through which KDM2A delineates a unique architecture that differentiates CpG island chromatin from bulk chromatin.

Marinov GK, Kundaje A, Park PJ, Wold BJ. Large-scale quality analysis of published ChIP-seq data. G3 2014;4(2):209-23.

ChIP-seq has become the primary method for identifying in vivo protein-DNA interactions on a genome-wide scale, with nearly 800 publications involving the technique appearing in PubMed as of December 2012. Individually and in aggregate, these data are an important and information-rich resource. However, uncertainties about data quality confound their use by the wider research community. Recently, the Encyclopedia of DNA Elements (ENCODE) project developed and applied metrics to objectively measure ChIP-seq data quality. The ENCODE quality analysis was useful for flagging datasets for closer inspection, eliminating or replacing poor data, and for driving changes in experimental pipelines. There had been no similarly systematic quality analysis of the large and disparate body of published ChIP-seq profiles. Here, we report a uniform analysis of vertebrate transcription factor ChIP-seq datasets in the Gene Expression Omnibus (GEO) repository as of April 1, 2012. The majority (55%) of datasets scored as being highly successful, but a substantial minority (20%) were of apparently poor quality, and another ∼25% were of intermediate quality. We discuss how different uses of ChIP-seq data are affected by specific aspects of data quality, and we highlight exceptional instances for which the metric values should not be taken at face value. Unexpectedly, we discovered that a significant subset of control datasets (i.e., no immunoprecipitation and mock immunoprecipitation samples) display an enrichment structure similar to successful ChIP-seq data. This can, in turn, affect peak calling and data interpretation. Published datasets identified here as high-quality comprise a large group that users can draw on for large-scale integrated analysis. In the future, ChIP-seq quality assessment similar to that used here could guide experimentalists at early stages in a study, provide useful input in the publication process, and be used to stratify ChIP-seq data for different community-wide uses.

Gorchakov AA, Alekseyenko AA, Kharchenko P, Park PJ, Kuroda MI. Long-range spreading of dosage compensation in Drosophila captures transcribed autosomal genes inserted on X. Genes Dev 2009;23(19):2266-71.

Dosage compensation in Drosophila melanogaster males is achieved via targeting of male-specific lethal (MSL) complex to X-linked genes. This is proposed to involve sequence-specific recognition of the X at approximately 150-300 chromatin entry sites, and subsequent spreading to active genes. Here we ask whether the spreading step requires transcription and is sequence-independent. We find that MSL complex binds, acetylates, and up-regulates autosomal genes inserted on X, but only if transcriptionally active. We conclude that a long-sought specific DNA sequence within X-linked genes is not obligatory for MSL binding. Instead, linkage and transcription play the pivotal roles in MSL targeting irrespective of gene origin and DNA sequence.

Larschan E*, Bishop EP*, Kharchenko PV, Core LJ, Lis JT, Park PJ**, Kuroda MI**. X chromosome dosage compensation via enhanced transcriptional elongation in Drosophila. Nature 2011;471(7336):115-8.

The evolution of sex chromosomes has resulted in numerous species in which females inherit two X chromosomes but males have a single X, thus requiring dosage compensation. MSL (Male-specific lethal) complex increases transcription on the single X chromosome of Drosophila males to equalize expression of X-linked genes between the sexes. The biochemical mechanisms used for dosage compensation must function over a wide dynamic range of transcription levels and differential expression patterns. It has been proposed that the MSL complex regulates transcriptional elongation to control dosage compensation, a model subsequently supported by mapping of the MSL complex and MSL-dependent histone 4 lysine 16 acetylation to the bodies of X-linked genes in males, with a bias towards 3' ends. However, experimental analysis of MSL function at the mechanistic level has been challenging owing to the small magnitude of the chromosome-wide effect and the lack of an in vitro system for biochemical analysis. Here we use global run-on sequencing (GRO-seq) to examine the specific effect of the MSL complex on RNA Polymerase II (RNAP II) on a genome-wide level. Results indicate that the MSL complex enhances transcription by facilitating the progression of RNAP II across the bodies of active X-linked genes. Improving transcriptional output downstream of typical gene-specific controls may explain how dosage compensation can be imposed on the diverse set of genes along an entire chromosome.
