Publications by Year: 2008

2008
Baskerville KA, Kent C, Personett D, Lai WR, Park PJ, Coleman P, McKinney M. Aging elevates metabolic gene expression in brain cholinergic neurons. Neurobiol Aging 2008;29(12):1874-93.

The basal forebrain (BF) cholinergic system is selectively vulnerable in human brain diseases, while the cholinergic groups in the upper pons of the brainstem (BS) resist neurodegeneration. Cholinergic neurons (200 per region per animal) were laser-microdissected from five young (8 months) and five aged (24 months) F344 rats from the BF and the BS pontine lateral dorsal tegmental/pedunculopontine nuclei (LDTN/PPN) and their expression profiles were obtained. The bioinformatics program SigPathway was used to identify gene groups and pathways that were selectively affected by aging. In the BF cholinergic system, aging most significantly altered genes involved with a variety of metabolic functions. In contrast, BS cholinergic neuronal age effects included gene groupings related to neuronal plasticity and a broad range of normal cellular functions. Transcription factor GA-binding protein alpha (GABPalpha), which controls expression of nuclear genes encoding mitochondrial proteins, was more strongly upregulated in the BF cholinergic neurons (+107%) than in the BS cholinergic population (+40%). The results suggest that aging elicits elevated metabolic activity in cholinergic populations and that this occurs to a much greater degree in the BF group than in the BS group.

Cancer Genome Atlas Research Network TCGA. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008;455(7216):1061-8.

Human cancer cells typically harbour multiple chromosomal aberrations, nucleotide substitutions and epigenetic modifications that drive malignant transformation. The Cancer Genome Atlas (TCGA) pilot project aims to assess the value of large-scale multi-dimensional analysis of these molecular characteristics in human cancer and to provide the data rapidly to the research community. Here we report the interim integrative analysis of DNA copy number, gene expression and DNA methylation aberrations in 206 glioblastomas--the most common type of adult brain cancer--and nucleotide sequence aberrations in 91 of the 206 glioblastomas. This analysis provides new insights into the roles of ERBB2, NF1 and TP53, uncovers frequent mutations of the phosphatidylinositol-3-OH kinase regulatory subunit gene PIK3R1, and provides a network view of the pathways altered in the development of glioblastoma. Furthermore, integration of mutation, DNA methylation and clinical treatment data reveals a link between MGMT promoter methylation and a hypermutator phenotype consequent to mismatch repair deficiency in treated glioblastomas, an observation with potential clinical implications. Together, these findings establish the feasibility and power of TCGA, demonstrating that it can rapidly expand knowledge of the molecular basis of cancer.

Mueller JL, Mahadevaiah SK, Park PJ, Warburton PE, Page DC, Turner JMA. The mouse X chromosome is enriched for multicopy testis genes showing postmeiotic expression. Nat Genet 2008;40(6):794-9.

According to the prevailing view, mammalian X chromosomes are enriched in spermatogenesis genes expressed before meiosis and deficient in spermatogenesis genes expressed after meiosis. The paucity of postmeiotic genes on the X chromosome has been interpreted as a consequence of meiotic sex chromosome inactivation (MSCI)--the complete silencing of genes on the XY bivalent at meiotic prophase. Recent studies have concluded that MSCI-initiated silencing persists beyond meiosis and that most genes on the X chromosome remain repressed in round spermatids. Here, we report that 33 multicopy gene families, representing approximately 273 mouse X-linked genes, are expressed in the testis and that this expression is predominantly in postmeiotic cells. RNA FISH and microarray analysis show that the maintenance of X chromosome postmeiotic repression is incomplete. Furthermore, X-linked multicopy genes exhibit a similar degree of expression as autosomal genes. Thus, not only is the mouse X chromosome enriched for spermatogenesis genes functioning before meiosis, but in addition, approximately 18% of mouse X-linked genes are expressed in postmeiotic cells.

Tun HW, Personett D, Baskerville KA, Menke DM, Jaeckle KA, Kreinest P, Edenfield B, Zubair AC, O'Neill BP, Lai WR, Park PJ, McKinney M. Pathway analysis of primary central nervous system lymphoma. Blood 2008;111(6):3200-10.

Primary central nervous system (CNS) lymphoma (PCNSL) is a diffuse large B-cell lymphoma (DLBCL) confined to the CNS. A genome-wide gene expression comparison between PCNSL and non-CNS DLBCL was performed, the latter consisting of both nodal and extranodal DLBCL (nDLBCL and enDLBCL), to identify a "CNS signature." Pathway analysis with the program SigPathway revealed that PCNSL is characterized notably by significant differential expression of multiple extracellular matrix (ECM) and adhesion-related pathways. The most significantly up-regulated gene is the ECM-related osteopontin (SPP1). Expression at the protein level of ECM-related SPP1 and CHI3L1 in PCNSL cells was demonstrated by immunohistochemistry. The alterations in gene expression can be interpreted within several biologic contexts with implications for PCNSL, including CNS tropism (ECM and adhesion-related pathways, SPP1, DDR1), B-cell migration (CXCL13, SPP1), activated B-cell subtype (MUM1), lymphoproliferation (SPP1, TCL1A, CHI3L1), aggressive clinical behavior (SPP1, CHI3L1, MUM1), and aggressive metastatic cancer phenotype (SPP1, CHI3L1). The gene expression signature discovered in our study may represent a true "CNS signature" because we contrasted PCNSL with wide-spectrum non-CNS DLBCL on a genomic scale and performed an in-depth bioinformatic analysis.

Claus EB, Park PJ, Carroll R, Chan J, Black PM. Specific genes expressed in association with progesterone receptors in meningioma. Cancer Res 2008;68(1):314-22.

An association between hormones and meningioma has been postulated. No data exist that examine gene expression in meningioma by hormone receptor status. The data are surgical specimens from 31 meningioma patients undergoing neurosurgical resection at Brigham and Women's Hospital from March 15, 2004 to May 10, 2005. Progesterone and estrogen hormone receptors (PR and ER, respectively) were measured via immunohistochemistry and compared with gene expression profiling results. The sample is 77% female with a mean age of 55.7 years. Eighty percent were grade 1 and the mean MIB was 6.2, whereas 33% and 84% were ER+ and PR+, respectively. Gene expression seemed more strongly associated with PR status than with ER status. Genes on the long arm of chromosome 22 and near the neurofibromatosis type 2 (NF2) gene (22q12) were most frequently noted to have expression variation, with significant up-regulation in PR+ versus PR- lesions, suggesting a higher rate of 22q loss in PR- lesions. Pathway analyses indicated that genes in collagen and extracellular matrix pathways were most likely to be differentially expressed by PR status. These data, although preliminary, are the first to examine gene expression for meningioma cases by hormone receptor status and indicate a stronger association with PR than with ER status. PR status is related to the expression of genes near the NF2 gene, mutations in which have been identified as the initial event in many meningiomas. These findings suggest that PR status may be a clinical marker for genetic subgroups of meningioma and warrant further examination in a larger data set.

Lai W, Choudhary V, Park PJ. CGHweb: a tool for comparing DNA copy number segmentations from multiple algorithms. Bioinformatics 2008;24(7):1014-5.

Accurate estimation of DNA copy numbers from array comparative genomic hybridization (CGH) data is important for characterizing the cancer genome. An important part of this process is the segmentation of the log-ratios between the sample and control DNA along the chromosome into regions of different copy numbers. However, multiple algorithms are available in the literature for this procedure and the results can vary substantially among these. Thus, a visualization tool that can display the segmented profiles from a number of methods can be helpful to the biologist or the clinician to ascertain that a feature of interest did not arise as an artifact of the algorithm. Such a tool also allows the methodologist to easily contrast his method against others. We developed a web-based tool that applies a number of popular algorithms to a single array CGH profile entered by the user. It generates a heatmap panel of the segmented profiles for each method as well as a consensus profile. The clickable heatmap can be moved along the chromosome and zoomed in or out. It also displays the time that each algorithm took and provides numerical values of the segmented profiles for download. The web interface calls algorithms written in the statistical language R. We encourage developers of new algorithms to submit their routines to be incorporated into the website. AVAILABILITY: http://compbio.med.harvard.edu/CGHweb.

Kharchenko PV, Tolstorukov MY, Park PJ. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol 2008;26(12):1351-9.

Recent progress in massively parallel sequencing platforms has enabled genome-wide characterization of DNA-associated proteins using the combination of chromatin immunoprecipitation and sequencing (ChIP-seq). Although a variety of methods exist for analysis of the established alternative ChIP microarray (ChIP-chip), few approaches have been described for processing ChIP-seq data. To fill this gap, we propose an analysis pipeline specifically designed to detect protein-binding positions with high accuracy. Using previously reported data sets for three transcription factors, we illustrate methods for improving tag alignment and correcting for background signals. We compare the sensitivity and spatial precision of three peak detection algorithms with published methods, demonstrating gains in spatial precision when an asymmetric distribution of tags on positive and negative strands is considered. We also analyze the relationship between the depth of sequencing and characteristics of the detected binding positions, and provide a method for estimating the sequencing depth necessary for a desired coverage of protein binding sites.
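
As a rough illustration of why the strand asymmetry of tags matters, the sketch below estimates the offset between positive-strand and negative-strand tag positions by trying each candidate shift and counting how many tags line up. This is a simplified stand-in written for this list, not the pipeline described in the paper; the function name `strand_shift`, the `max_shift` parameter, and the toy tag positions are all assumptions made for the example.

```python
from collections import Counter

def strand_shift(plus_starts, minus_starts, max_shift=300):
    """Return the shift (in bp) that best aligns plus-strand tags onto minus-strand tags.

    Hypothetical helper: scores each shift by the number of plus/minus tag
    pairs that coincide after moving the plus-strand positions downstream.
    """
    plus = Counter(plus_starts)
    minus = Counter(minus_starts)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Count coinciding tag pairs for this candidate shift.
        score = sum(n * minus.get(pos + shift, 0) for pos, n in plus.items())
        if score > best_score:  # strict '>' keeps the smallest shift on ties
            best_shift, best_score = shift, score
    return best_shift

# Toy tag positions, invented for illustration only.
plus_tags = [100, 100, 101, 500]
minus_tags = [250, 250, 251, 650]
shift = strand_shift(plus_tags, minus_tags)

# Under this simple model a binding position lies halfway between the
# plus-strand and minus-strand tag clusters.
binding_estimate = plus_tags[0] + shift // 2
```

With a shift estimated this way, tags from both strands can be moved toward a common center before peak calling, which is one simple way to use the asymmetry the abstract mentions.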

Orford K*, Kharchenko P*, Lai W, Dao MC, Worhunsky DJ, Ferro A, Janzen V, Park PJ**, Scadden DT**. Differential H3K4 methylation identifies developmentally poised hematopoietic genes. Dev Cell 2008;14(5):798-809.

Throughout development, cell fate decisions are converted into epigenetic information that determines cellular identity. Covalent histone modifications are heritable epigenetic marks and are hypothesized to play a central role in this process. In this report, we assess the concordance of histone H3 lysine 4 dimethylation (H3K4me2) and trimethylation (H3K4me3) on a genome-wide scale in erythroid development by analyzing pluripotent, multipotent, and unipotent cell types. Although H3K4me2 and H3K4me3 are concordant at most genes, multipotential hematopoietic cells have a subset of genes that are differentially methylated (H3K4me2+/me3-). These genes are transcriptionally silent, highly enriched in lineage-specific hematopoietic genes, and uniquely susceptible to differentiation-induced H3K4 demethylation. Self-renewing embryonic stem cells, which restrict H3K4 methylation to genes that contain CpG islands (CGIs), lack H3K4me2+/me3- genes. These data reveal distinct epigenetic regulation of CGI and non-CGI genes during development and indicate an interactive relationship between DNA sequence and differential H3K4 methylation in lineage-specific differentiation.

Park PJ. Epigenetics meets next-generation sequencing. Epigenetics 2008;3(6):318-21.

Next-generation sequencing is poised to unleash dramatic changes in every area of molecular biology. In the past few years, chromatin immunoprecipitation (ChIP) on tiled microarrays (ChIP-chip) has been an important tool for genome-wide mapping of DNA-binding proteins or histone modifications. Now, ChIP followed by direct sequencing of DNA fragments (ChIP-seq) offers superior data with less noise and higher resolution and is likely to replace ChIP-chip in the near future. We will describe advantages of this new technology and outline some of the issues in dealing with the data. ChIP-seq generates considerably larger quantities of data and the most challenging aspect for investigators will be computational and statistical analysis necessary to uncover biological insights hidden in the data.

Park PJ. Experimental design and data analysis for array comparative genomic hybridization. Cancer Invest 2008;26(9):923-8.

Array comparative genomic hybridization (aCGH) is a technique for measuring chromosomal aberrations in genomic DNA. With the availability of high-resolution microarrays, detailed characterization of the cancer genome has become possible. In this review, we discuss several issues in the generation and interpretation of aCGH data, including array platforms, experimental design, and data analysis. Due to the complexity of the data, application of appropriate statistical methods is crucial for avoiding false positive findings. We also describe integration of copy number data with other types of data to identify functional significance of observed aberrations.

Lee H, Kong SW, Park PJ. Integrative analysis reveals the direct and indirect interactions between DNA copy number aberrations and gene expression changes. Bioinformatics 2008;24(7):889-96.

MOTIVATION: DNA copy number aberrations (CNAs) and gene expression (GE) changes provide valuable information for studying chromosomal instability and its consequences in cancer. While it is clear that the structural aberrations and the transcript levels are intertwined, their relationship is more complex and subtle than initially suspected. Most studies so far have focused on how a CNA affects the expression levels of those genes contained within that CNA. RESULTS: To better understand the impact of CNAs on expression, we investigated the correlation of each CNA to all other genes in the genome. The correlations are computed over multiple patients that have both expression and copy number measurements in brain, bladder and breast cancer data sets. We find that a CNA has a direct impact on the gene amplified or deleted, but it also has a broad, indirect impact elsewhere. To identify a set of CNAs that is coordinately associated with the expression changes of a set of genes, we used a biclustering algorithm on the correlation matrix. For each of the three cancer types examined, the aberrations in several loci are associated with cancer-type specific biological pathways that have been described in the literature: CNAs of chromosome (chr) 7p13 were significantly correlated with epidermal growth factor receptor signaling pathway in glioblastoma multiforme, chr 13q with NF-kappaB cascades in bladder cancer, and chr 11p with Reck pathway in breast cancer. In all three data sets, gene sets related to cell cycle/division such as M phase, DNA replication and cell division were also associated with CNAs. Our results suggest that CNAs are both directly and indirectly correlated with changes in expression and that it is beneficial to examine the indirect effects of CNAs. AVAILABILITY: The code is available upon request.
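
The core computation described above, correlating each CNA with every gene across patients, can be sketched as follows. This is a minimal illustration that assumes Pearson correlation and small in-memory dictionaries. The abstract does not state which correlation measure was used, and the names `pearson`, `cna_expression_correlations`, and the toy values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def cna_expression_correlations(copy_number, expression):
    """Return {locus: {gene: r}}, computed across patients listed in the same order."""
    return {
        locus: {gene: pearson(cn, ge) for gene, ge in expression.items()}
        for locus, cn in copy_number.items()
    }

# Toy data for four patients; values are invented for illustration only.
copy_number = {"locus_A": [0.0, 0.5, 1.0, 1.5]}
expression = {
    "gene_in_locus": [1.0, 2.0, 3.0, 4.0],   # rises with copy number (direct effect)
    "gene_elsewhere": [4.0, 3.0, 2.0, 1.0],  # falls with copy number (indirect effect)
}
corr = cna_expression_correlations(copy_number, expression)
```

The resulting locus-by-gene matrix is the kind of object a biclustering step would then operate on; that step is not shown here.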

Sural TH, Peng S, Li B, Workman JL, Park PJ, Kuroda MI. The MSL3 chromodomain directs a key targeting step for dosage compensation of the Drosophila melanogaster X chromosome. Nat Struct Mol Biol 2008;15(12):1318-25.

The male-specific lethal (MSL) complex upregulates the single male X chromosome to achieve dosage compensation in Drosophila melanogaster. We have proposed that MSL recognition of specific entry sites on the X is followed by local targeting of active genes marked by histone H3 lysine 36 trimethylation (H3K36me3). Here we analyze the role of the MSL3 chromodomain in the second targeting step. Using ChIP-chip analysis, we find that MSL3 chromodomain mutants retain binding to chromatin entry sites but show a clear disruption in the full pattern of MSL targeting in vivo, consistent with a loss of spreading. Furthermore, when compared to wild type, chromodomain mutants lack preferential affinity for nucleosomes containing H3K36me3 in vitro. Our results support a model in which activating complexes, similarly to their silencing counterparts, use the nucleosomal binding specificity of their respective chromodomains to spread from initiation sites to flanking chromatin.

Kharchenko PV*, Woo CJ*, Tolstorukov MY, Kingston RE**, Park PJ**. Nucleosome positioning in human HOX gene clusters. Genome Res 2008;18(10):1554-61.

The distribution of nucleosomes along the genome is a significant aspect of chromatin structure and is thought to influence gene regulation through modulation of DNA accessibility. However, properties of nucleosome organization remain poorly understood, particularly in mammalian genomes. Toward this goal we used tiled microarrays to identify stable nucleosome positions along the HOX gene clusters in human cell lines. We show that nucleosome positions exhibit sequence properties and long-range organization that are different from those characterized in other organisms. Despite overall variability of internucleosome distances, specific loci contain regular nucleosomal arrays with 195-bp periodicity. Moreover, such arrays tend to occur preferentially toward the 3' ends of genes. Through comparison of different cell lines, we find that active transcription is correlated with increased positioning of nucleosomes, suggesting an unexpected role for transcription in the establishment of well-positioned nucleosomes.

Tolstorukov MY**, Choudhary V, Olson WK, Zhurkin VB, Park PJ**. nuScore: a web-interface for nucleosome positioning predictions. Bioinformatics 2008;24(12):1456-8.

SUMMARY: Sequence-directed mapping of nucleosome positions is of major biological interest. Here, we present a web-interface for estimation of the affinity of the histone core to DNA and prediction of nucleosome arrangement on a given sequence. Our approach is based on assessment of the energy cost of imposing the deformations required to wrap DNA around the histone surface. The interface allows the user to specify a number of options such as selecting from several structural templates for threading calculations and adding random sequences to the analysis. AVAILABILITY: The nuScore interface is freely available for use at http://compbio.med.harvard.edu/nuScore. CONTACT: peter_park@harvard.edu; tolstorukov@gmail.com SUPPLEMENTARY INFORMATION: The site contains user manual, description of the methodology and examples.

Alekseyenko AA, Peng S, Larschan E, Gorchakov AA, Lee O-K, Kharchenko P, McGrath SD, Wang CI, Mardis ER, Park PJ, Kuroda MI. A sequence motif within chromatin entry sites directs MSL establishment on the Drosophila X chromosome. Cell 2008;134(4):599-609.

The Drosophila MSL complex associates with active genes specifically on the male X chromosome to acetylate histone H4 at lysine 16 and increase expression approximately 2-fold. To date, no DNA sequence has been discovered to explain the specificity of MSL binding. We hypothesized that sequence-specific targeting occurs at "chromatin entry sites," but the majority of sites are sequence independent. Here we characterize 150 potential entry sites by ChIP-chip and ChIP-seq and discover a GA-rich MSL recognition element (MRE). The motif is only slightly enriched on the X chromosome (approximately 2-fold), but this is doubled when considering its preferential location within or 3' to active genes (>4-fold enrichment). When inserted on an autosome, a newly identified site can direct local MSL spreading to flanking active genes. These results provide strong evidence for both sequence-dependent and -independent steps in MSL targeting of dosage compensation to the male X chromosome.

Dermody JL, Dreyfuss JM, Villén J, Ogundipe B, Gygi SP, Park PJ, Ponticelli AS, Moore CL, Buratowski S, Bucheli ME. Unphosphorylated SR-like protein Npl3 stimulates RNA polymerase II elongation. PLoS One 2008;3(9):e3273.

The production of a functional mRNA is regulated at every step of transcription. An area not well-understood is the transition of RNA polymerase II from elongation to termination. The S. cerevisiae SR-like protein Npl3 functions to negatively regulate transcription termination by antagonizing the binding of polyA/termination proteins to the mRNA. In this study, Npl3 is shown to interact with the CTD and have a direct stimulatory effect on the elongation activity of the polymerase. The interaction is inhibited by phosphorylation of Npl3. In addition, Casein Kinase 2 was found to be required for the phosphorylation of Npl3 and affect its ability to compete against Rna15 (Cleavage Factor I) for binding to polyA signals. Our results suggest that phosphorylation of Npl3 promotes its dissociation from the mRNA/RNAP II, and contributes to the association of the polyA/termination factor Rna15. This work defines a novel role for Npl3 in elongation and its regulation by phosphorylation.