Publications by Year: 2010

Gorchakov AA, Alekseenko AA, Kharchenko PV, Park P, Kuroda M. [Dosage compensation in Drosophila: sequence-specific initiation and sequence-independent spreading of MSL complex to the active genes on the male X chromosome]. Genetika 2010;46(10):1430-4.

For dosage compensation to occur, genes on the single male X chromosome in Drosophila must be selectively bound and acetylated by the ribonucleoprotein complex called the MSL complex. It remained unknown how such exquisite specificity is achieved and whether specific DNA sequences are involved. In the present work we demonstrate that it is transcription of the gene on the X chromosome that is important for MSL targeting, irrespective of gene origin and DNA sequence.

Blackledge NP, Zhou JC, Tolstorukov MY, Farcas AM, Park PJ, Klose RJ. CpG islands recruit a histone H3 lysine 36 demethylase. Molecular Cell 2010;38(2):179-90.

In higher eukaryotes, up to 70% of genes have high levels of nonmethylated cytosine/guanine base pairs (CpGs) surrounding promoters and gene regulatory units. These features, called CpG islands, were identified over 20 years ago, but there remains little mechanistic evidence to suggest how these enigmatic elements contribute to promoter function, except that they are refractory to epigenetic silencing by DNA methylation. Here we show that CpG islands directly recruit the H3K36-specific lysine demethylase enzyme KDM2A. Nucleation of KDM2A at these elements results in removal of H3K36 methylation, creating CpG island chromatin that is uniquely depleted of this modification. KDM2A utilizes a zinc finger CxxC (ZF-CxxC) domain that preferentially recognizes nonmethylated CpG DNA, and binding is blocked when the CpG DNA is methylated, thus constraining KDM2A to nonmethylated CpG islands. These data expose a straightforward mechanism through which KDM2A delineates a unique architecture that differentiates CpG island chromatin from bulk chromatin.

Goutagny S, Yang HW, Zucman-Rossi J, Chan J, Dreyfuss JM, Park PJ, Black PM, Giovannini M, Carroll RS, Kalamarides M. Genomic profiling reveals alternative genetic pathways of meningioma malignant progression dependent on the underlying NF2 status. Clin Cancer Res 2010;16(16):4155-64.

PURPOSE: Meningiomas are the most common central nervous system tumors in the population of age 35 and older. WHO defines three grades predictive of the risk of recurrence. Clinical data supporting histologic malignant progression of meningiomas are sparse and underlying molecular mechanisms are not clearly depicted. EXPERIMENTAL DESIGN: We identified genetic alterations associated with histologic progression of 36 paired meningioma samples in 18 patients using 500K SNP genotyping arrays and NF2 gene sequencing. RESULTS: The most frequent chromosome alterations observed in progressing meningioma samples are early alterations (i.e., present both in lower- and higher-grade samples of a single patient). In our series, NF2 gene inactivation was an early and frequent event in progressing meningioma samples (73%). Chromosome alterations acquired during progression from grade I to grade II meningioma were not recurrent. Progression to grade III was characterized by recurrent genomic alterations, the most frequent being CDKN2A/CDKN2B locus loss on 9p. CONCLUSION: Meningiomas displayed different patterns of genetic alterations during progression according to their NF2 status: NF2-mutated meningiomas showed higher chromosome instability during progression than NF2-nonmutated meningiomas, which had very few imbalanced chromosome segments. This pattern of alterations could thus be used as markers in clinical practice to identify tumors prone to progress among grade I meningiomas.

modENCODE Consortium *, Roy S*, Ernst J*, Kharchenko PV*, Kheradpour P*, Negre N*, Eaton ML*, Landolin JM*, Bristow CA*, Ma L*, Lin MF*, Washietl S*, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, Brown CD, Candeias R, Carlson JW, Carr A, Jungreis I, Marbach D, Sealfon R, Tolstorukov MY, Will S, Alekseyenko AA, Artieri C, Booth BW, Brooks AN, Dai Q, Davis CA, Duff MO, Feng X, Gorchakov AA, Gu T, Henikoff JG, Kapranov P, Li R, MacAlpine HK, Malone J, Minoda A, Nordman J, Okamura K, Perry M, Powell SK, Riddle NC, Sakai A, Samsonova A, Sandler JE, Schwartz YB, Sher N, Spokony R, Sturgill D, van Baren M, Wan KH, Yang L, Yu C, Feingold E, Good P, Guyer M, Lowdon R, Ahmad K, Andrews J, Berger B, Brenner SE, Brent MR, Cherbas L, Elgin SCR, Gingeras TR, Grossman R, Hoskins RA, Kaufman TC, Kent W, Kuroda MI, Orr-Weaver T, Perrimon N, Pirrotta V, Posakony JW, Ren B, Russell S, Cherbas P, Graveley BR, Lewis S, Micklem G, Oliver B, Park PJ, Celniker SE**, Henikoff S**, Karpen GH**, Lai EC**, MacAlpine DM**, Stein LD**, White KP**, Kellis M**. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010;330(6012):1787-97.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Jagani Z, Mora-Blanco LE, Sansam CG, McKenna ES, Wilson B, Chen D, Klekota J, Tamayo P, Nguyen PTL, Tolstorukov M, Park PJ, Cho Y-J, Hsiao K, Buonamici S, Pomeroy SL, Mesirov JP, Ruffner H, Bouwmeester T, Luchansky SJ, Murtie J, Kelleher JF, Warmuth M, Sellers WR, Roberts CWM, Dorsch M. Loss of the tumor suppressor Snf5 leads to aberrant activation of the Hedgehog-Gli pathway. Nat Med 2010;16(12):1429-33.

Aberrant activation of the Hedgehog (Hh) pathway can drive tumorigenesis. To investigate the mechanism by which glioma-associated oncogene family zinc finger-1 (GLI1), a crucial effector of Hh signaling, regulates Hh pathway activation, we searched for GLI1-interacting proteins. We report that the chromatin remodeling protein SNF5 (encoded by SMARCB1, hereafter called SNF5), which is inactivated in human malignant rhabdoid tumors (MRTs), interacts with GLI1. We show that Snf5 localizes to Gli1-regulated promoters and that loss of Snf5 leads to activation of the Hh-Gli pathway. Conversely, re-expression of SNF5 in MRT cells represses GLI1. Consistent with this, we show the presence of a Hh-Gli-activated gene expression profile in primary MRTs and show that GLI1 drives the growth of SNF5-deficient MRT cells in vitro and in vivo. Therefore, our studies reveal that SNF5 is a key mediator of Hh signaling and that aberrant activation of GLI1 is a previously undescribed targetable mechanism contributing to the growth of MRT cells.

Balakrishnan A, Stearns AT, Park PJ, Dreyfuss JM, Ashley SW, Rhoads DB, Tavakkolizadeh A. MicroRNA mir-16 is anti-proliferative in enterocytes and exhibits diurnal rhythmicity in intestinal crypts. Exp Cell Res 2010;316(20):3512-21.

BACKGROUND AND AIMS: The intestine exhibits profound diurnal rhythms in function and morphology, in part due to changes in enterocyte proliferation. The regulatory mechanisms behind these rhythms remain largely unknown. We hypothesized that microRNAs are involved in mediating these rhythms, and studied the role of microRNAs specifically in modulating intestinal proliferation. METHODS: Diurnal rhythmicity of microRNAs in rat jejunum was analyzed by microarrays and validated by qPCR. Temporal expression of diurnally rhythmic mir-16 was further quantified in intestinal crypts, villi, and smooth muscle using laser capture microdissection and qPCR. Morphological changes in rat jejunum were assessed by histology and proliferation by immunostaining for bromodeoxyuridine. In IEC-6 cells stably overexpressing mir-16, proliferation was assessed by cell counting and MTS assay, cell cycle progression and apoptosis by flow cytometry, and cell cycle gene expression by qPCR and immunoblotting. RESULTS: mir-16 peaked 6 hours after light onset (HALO 6) with diurnal changes restricted to crypts. Crypt depth and villus height peaked at HALO 13-14 in antiphase to mir-16. Overexpression of mir-16 in IEC-6 cells suppressed specific G1/S regulators (cyclins D1-3, cyclin E1 and cyclin-dependent kinase 6) and produced G1 arrest. Protein expression of these genes exhibited diurnal rhythmicity in rat jejunum, peaking between HALO 11 and 17 in antiphase to mir-16. CONCLUSIONS: This is the first report of circadian rhythmicity of specific microRNAs in rat jejunum. Our data provide a link between anti-proliferative mir-16 and the intestinal proliferation rhythm and point to mir-16 as an important regulator of proliferation in jejunal crypts. This function may be essential to match proliferation and absorptive capacity with nutrient availability.

Peng S, Kuroda MI, Park PJ. Quantized correlation coefficient for measuring reproducibility of ChIP-chip data. BMC Bioinformatics 2010;11:399.

BACKGROUND: Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome scale. To ensure data quality, these experiments are usually performed in replicates, and a correlation coefficient between replicates is often used to assess reproducibility. However, the correlation coefficient can be misleading because it is affected not only by the reproducibility of the signal but also by the amount of binding signal present in the data. RESULTS: We develop the Quantized correlation coefficient (QCC) that is much less dependent on the amount of signal. This involves discretization of data into a set of quantiles (quantization), a merging procedure to group the background probes, and recalculation of the Pearson correlation coefficient. This procedure reduces the influence of the background noise on the statistic, which then properly focuses more on the reproducibility of the signal. The performance of this procedure is tested in both simulated and real ChIP-chip data. For replicates with different levels of enrichment over background and coverage, we find that QCC reflects reproducibility more accurately and is more robust than the standard Pearson or Spearman correlation coefficients. The quantization and the merging procedure can also suggest a proper quantile threshold for separating signal from background for further analysis. CONCLUSIONS: To measure reproducibility of ChIP-chip data correctly, a correlation coefficient that is robust to the amount of signal present should be used. QCC is one such measure. The QCC statistic can also be applied in a variety of other contexts for measuring reproducibility, including analysis of array CGH data for DNA copy number and gene expression data.
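
As a rough illustration of the three steps named in the abstract above (quantization, merging of background probes, recalculation of the Pearson coefficient), the Python sketch below may help. It is not the authors' implementation: the function names, the rank-based binning, and the choice to merge a fixed number of the lowest bins into one background level are all assumptions made for illustration, and the published merging rule may differ.

```python
from math import sqrt


def pearson(a, b):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def quantize(values, n_bins):
    """Replace each value by the index (0..n_bins-1) of its rank-based quantile bin."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def merge_background(bins, n_background):
    """Collapse the lowest n_background bins into level 0 and shift the rest down.

    Assumption: the background is taken to be a fixed number of low bins
    (n_background >= 1); this is a simplification for the sketch.
    """
    return [max(0, b - (n_background - 1)) for b in bins]


def quantized_correlation(x, y, n_bins=10, n_background=7):
    """Pearson correlation computed on quantized, background-merged labels."""
    qx = merge_background(quantize(x, n_bins), n_background)
    qy = merge_background(quantize(y, n_bins), n_background)
    return pearson(qx, qy)
```

Because the low bins are collapsed to a single level, fluctuations among background probes no longer contribute to the statistic, which is the intuition the abstract gives for why such a measure depends less on the amount of signal.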

Woo CJ, Kharchenko PV, Daheron L, Park PJ, Kingston RE. A region of the human HOXD cluster that confers polycomb-group responsiveness. Cell 2010;140(1):99-110.

Polycomb group (PcG) proteins are essential for accurate axial body patterning during embryonic development. PcG-mediated repression is conserved in metazoans and is targeted in Drosophila by Polycomb response elements (PREs). However, targeting sequences in humans have not been described. While analyzing chromatin architecture in the context of human embryonic stem cell (hESC) differentiation, we discovered a 1.8kb region between HOXD11 and HOXD12 (D11.12) that is associated with PcG proteins, becomes nuclease hypersensitive, and then shows alteration in nuclease sensitivity as hESCs differentiate. The D11.12 element repressed luciferase expression from a reporter construct and full repression required a highly conserved region and YY1 binding sites. Furthermore, repression was dependent on the PcG proteins BMI1 and EED and a YY1-interacting partner, RYBP. We conclude that D11.12 is a Polycomb-dependent regulatory region with similarities to Drosophila PREs, indicating conservation in the mechanisms that target PcG function in mammals and flies.

Tolstorukov MY, Kharchenko PV, Park PJ. Analysis of primary structure of chromatin with next-generation sequencing. Epigenomics 2010;2(2):187-197.

The recent development of next-generation sequencing technology has enabled significant progress in chromatin structure analysis. Here, we review the experimental and bioinformatic approaches to studying nucleosome positioning and histone modification profiles on a genome scale using this technology. These studies advanced our knowledge of the nucleosome positioning patterns of both epigenetically modified and bulk nucleosomes and elucidated the role of such patterns in regulation of gene expression. The identification and analysis of large sets of nucleosome-bound DNA sequences allowed better understanding of the rules that govern nucleosome positioning in organisms of various complexity. We also discuss the existing challenges and prospects of using next-generation sequencing for nucleosome positioning analysis and outline the importance of such studies for the entire chromatin structure field.

Xi R, Kim T-M, Park PJ. Detecting structural variations in the human genome using next generation sequencing. Brief Funct Genomics 2010;9(5-6):405-15.

Structural variations are widespread in the human genome and can serve as genetic markers in clinical and evolutionary studies. With advances in next-generation sequencing technology, recent methods allow for identification of structural variations with unprecedented resolution and accuracy. They also provide opportunities to discover variants that could not be detected on conventional microarray-based platforms, such as dosage-invariant chromosomal translocations and inversions. In this review, we describe some of the sequencing-based algorithms for detection of structural variations and discuss the key issues in future development.

Day DS, Luquette LJ, Park PJ, Kharchenko PV. Estimating enrichment of repetitive elements from high-throughput sequence data. Genome Biol 2010;11(6):R69.

We describe computational methods for analysis of repetitive elements from short-read sequencing data, and apply them to study histone modifications associated with the repetitive elements in human and mouse cells. Our results demonstrate that while accurate enrichment estimates can be obtained for individual repeat types and small sets of repeat instances, there are distinct combinatorial patterns of chromatin marks associated with major annotated repeat families, including H3K27me3/H3K9me3 differences among the endogenous retroviral element classes.

Kim H, Huang W, Jiang X, Pennicooke B, Park PJ, Johnson MD. Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship. Proc Natl Acad Sci U S A 2010;107(5):2183-8.

Using a multidimensional genomic data set on glioblastoma from The Cancer Genome Atlas, we identified hsa-miR-26a as a cooperating component of a frequently occurring amplicon that also contains CDK4 and CENTG1, two oncogenes that regulate the RB1 and PI3 kinase/AKT pathways, respectively. By integrating DNA copy number, mRNA, microRNA, and DNA methylation data, we identified functionally relevant targets of miR-26a in glioblastoma, including PTEN, RB1, and MAP3K2/MEKK2. We demonstrate that miR-26a alone can transform cells and it promotes glioblastoma cell growth in vitro and in the mouse brain by decreasing PTEN, RB1, and MAP3K2/MEKK2 protein expression, thereby increasing AKT activation, promoting proliferation, and decreasing c-JUN N-terminal kinase-dependent apoptosis. Overexpression of miR-26a in PTEN-competent and PTEN-deficient glioblastoma cells promoted tumor growth in vivo, and it further increased growth in cells overexpressing CDK4 or CENTG1. Importantly, glioblastoma patients harboring this amplification displayed markedly decreased survival. Thus, hsa-miR-26a, CDK4, and CENTG1 comprise a functionally integrated oncomir/oncogene DNA cluster that promotes aggressiveness in human cancers by cooperatively targeting the RB1, PI3K/AKT, and JNK pathways.

Gurumurthy S, Xie SZ, Alagesan B, Kim J, Yusuf RZ, Saez B, Tzatsos A, Ozsolak F, Milos P, Ferrari F, Park PJ, Shirihai OS, Scadden DT, Bardeesy N. The Lkb1 metabolic sensor maintains haematopoietic stem cell survival. Nature 2010;468(7324):659-63.

Haematopoietic stem cells (HSCs) can convert between growth states that have marked differences in bioenergetic needs. Although often quiescent in adults, these cells become proliferative upon physiological demand. Balancing HSC energetics in response to nutrient availability and growth state is poorly understood, yet essential for the dynamism of the haematopoietic system. Here we show that the Lkb1 tumour suppressor is critical for the maintenance of energy homeostasis in haematopoietic cells. Lkb1 inactivation in adult mice causes loss of HSC quiescence followed by rapid depletion of all haematopoietic subpopulations. Lkb1-deficient bone marrow cells exhibit mitochondrial defects, alterations in lipid and nucleotide metabolism, and depletion of cellular ATP. The haematopoietic effects are largely independent of Lkb1 regulation of AMP-activated protein kinase (AMPK) and mammalian target of rapamycin (mTOR) signalling. Instead, these data define a central role for Lkb1 in restricting HSC entry into cell cycle and in broadly maintaining energy homeostasis in haematopoietic cells through a novel metabolic checkpoint.

Kim T-M, Luquette LJ, Xi R, Park PJ. rSW-seq: algorithm for detection of copy number alterations in deep sequencing data. BMC Bioinformatics 2010;11:432.

BACKGROUND: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy. RESULTS: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results. CONCLUSION: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
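
The abstract above gives only the outline of the method, so the following Python sketch should be read as one plausible reading rather than the published algorithm. It assumes that tumor and matched-normal reads are pooled and sorted by position, that tumor reads score a positive weight and normal reads a negative one, and that a Smith-Waterman-style running sum (reset whenever it would drop to zero or below) picks out the segment with the strongest tumor excess, that is, a candidate gain. The function names, labels, and weights are hypothetical. A small helper also reproduces the abstract's arithmetic relating read count, read length, and coverage to region size.

```python
def best_gain_segment(reads, w_tumor=1.0, w_normal=1.0):
    """Find the highest-scoring local segment of tumor-read excess.

    reads: iterable of (position, label) pairs, label 'T' (tumor) or 'N' (normal).
    Returns (score, start_position, end_position); (0.0, None, None) if no gain.
    """
    best = (0.0, None, None)
    score, start = 0.0, None
    for pos, label in sorted(reads):
        step = w_tumor if label == "T" else -w_normal
        if score + step <= 0:
            # Local-alignment style reset: discard a segment that no longer pays off.
            score, start = 0.0, None
            continue
        if start is None:
            start = pos
        score += step
        if score > best[0]:
            best = (score, start, pos)
    return best


def expected_region_size(n_reads, read_length_bp, coverage):
    """Genomic span (bp) expected to contain n_reads reads at the given coverage."""
    return n_reads * read_length_bp / coverage


reads = [(1, "N"), (2, "T"), (3, "N"), (4, "T"), (5, "T"),
         (6, "T"), (7, "N"), (8, "T"), (9, "N"), (10, "N")]
print(best_gain_segment(reads))             # (3.0, 4, 6)
print(expected_region_size(500, 100, 1.0))  # 50000.0, i.e. about 50 kb
```

Swapping the signs of the two weights would search for losses instead of gains under the same assumptions. The recursive refinement described in the abstract is not sketched here.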