%0 Journal Article %J Cell %D 2024 %T Contrasting somatic mutation patterns in aging human neurons and oligodendrocytes %A Ganz, Javier %A Luquette, L J %A Bizzotto, S %A Miller, M B %A Zhou, Zinan %A Bohrson, C L %A Jin, Hu %A Tran, A V %A Viswanadham, Vinayak V %A McDonough, G %A Brown, K %A Chahine, Yasmine %A Chhouk, B %A Galor, Alon %A Park, Peter J %A Walsh, C A %X Characterizing somatic mutations in the brain is important for disentangling the complex mechanisms of aging, yet little is known about mutational patterns in different brain cell types. Here, we performed whole-genome sequencing (WGS) of 86 single oligodendrocytes, 20 mixed glia, and 56 single neurons from neurotypical individuals spanning 0.4–104 years of age and identified >92,000 somatic single-nucleotide variants (sSNVs) and small insertions/deletions (indels). Although both cell types accumulate somatic mutations linearly with age, oligodendrocytes accumulated sSNVs 81% faster than neurons and indels 28% slower than neurons. Correlation of mutations with single-nucleus RNA profiles and chromatin accessibility from the same brains revealed that oligodendrocyte mutations are enriched in inactive genomic regions and are distributed across the genome similarly to mutations in brain cancers. In contrast, neuronal mutations are enriched in open, transcriptionally active chromatin. These stark differences suggest an assortment of active mutagenic processes in oligodendrocytes and neurons. %B Cell %G eng %0 Journal Article %J Nature Genetics %D 2024 %T Chromosome evolution screens recapitulate tissue-specific tumor aneuploidy patterns %A Watson, Emma V %A Lee, Jake June-Koo %A Gulhan, Doga C %A Melloni, G E M %A Venev, S V %A Magesh, R Y %A Frederick, A %A Chiba, K %A Wooten, Eric C %A Naxerova, K %A Dekker, J %A Park, Peter J %A Elledge, Stephen J %X

Whole chromosome and arm-level copy number alterations occur at high frequencies in tumors, but their selective advantages, if any, are poorly understood. Here, utilizing unbiased whole chromosome genetic screens combined with in vitro evolution to generate arm- and subarm-level events, we iteratively selected the fittest karyotypes from aneuploidized human renal and mammary epithelial cells. Proliferation-based karyotype selection in these epithelial lines modeled tissue-specific tumor aneuploidy patterns in patient cohorts in the absence of driver mutations. Hi-C-based translocation mapping revealed that arm-level events usually emerged in multiples of two via centromeric translocations and occurred more frequently in tetraploids than diploids, contributing to the increased diversity in evolving tetraploid populations. Isogenic clonal lineages enabled elucidation of pro-tumorigenic mechanisms associated with common copy number alterations, revealing Notch signaling potentiation as a driver of 1q gain in breast cancer. We propose that intrinsic, tissue-specific proliferative effects underlie tumor copy number patterns in cancer.

%B Nature Genetics %G eng %0 Journal Article %J Nature Genetics %D 2024 %T Accurate and sensitive mutational signature analysis with MuSiCal %A Jin, Hu %A Gulhan, Doga C %A Geiger, Benedikt %A Ben-Isvy, Daniel %A Geng, David %A Ljungström, V %A Park, Peter J %X Mutational signature analysis is a recent computational approach for interpreting somatic mutations in the genome. Its application to cancer data has enhanced our understanding of mutational forces driving tumorigenesis and demonstrated its potential to inform prognosis and treatment decisions. However, methodological challenges remain for discovering new signatures and assigning proper weights to existing signatures, thereby hindering broader clinical applications. Here we present Mutational Signature Calculator (MuSiCal), a rigorous analytical framework with algorithms that solve major problems in the standard workflow. Our simulation studies demonstrate that MuSiCal outperforms state-of-the-art algorithms for both signature discovery and assignment. By reanalyzing more than 2,700 cancer genomes, we provide an improved catalog of signatures and their assignments, discover nine indel signatures absent in the current catalog, resolve long-standing issues with the ambiguous ‘flat’ signatures and give insights into signatures with unknown etiologies. We expect MuSiCal and the improved catalog to be a step towards establishing best practices for mutational signature analysis. 
%B Nature Genetics %G eng %0 Journal Article %J Clinical Genitourinary Cancer %D 2024 %T A Panel-Based Mutational Signature of Mismatch Repair Deficiency is Associated With Durable Response to Pembrolizumab in Metastatic Castration-Resistant Prostate Cancer %A Boiarsky, Daniel %A Gulhan, Doga C %A Savignano, H %A Lakshminarayanan, G %A McClure, H M %A Silver, R %A Hirsch, M S %A Sholl, L M %A Choudhury, A D %A Ananda, G %A Park, P J %A Tewari, A K %A Berchuck, Alok K %X Immune checkpoint inhibitors (ICIs) have limited efficacy in prostate cancer (PCa) and better biomarkers are needed to predict responses to ICIs. In this study, we found that SigMA detects additional cases of mismatch repair deficiency as compared to microsatellite testing in PCa and identifies patients likely to experience durable response to pembrolizumab. %B Clinical Genitourinary Cancer %V 01 %G eng %N 011 %0 Journal Article %J PNAS %D 2023 %T A role for mutations in AK9 and other genes affecting ependymal cells in idiopathic normal pressure hydrocephalus %A Yang, H W %A Lee, S %A Berry, B C %A Yang, D %A Zheng, S %A Carroll, R S %A Park, J P %A Johnson, M D %X Idiopathic normal pressure hydrocephalus (iNPH) is an enigmatic neurological disorder that develops after age 60 and is characterized by gait difficulty, dementia, and incontinence. Recently, we reported that heterozygous CWH43 deletions may cause iNPH. Here, we identify mutations affecting nine additional genes (AK9, RXFP2, PRKD1, HAVCR1, OTOG, MYO7A, NOTCH1, SPG11, and MYH13) that are statistically enriched among iNPH patients. The encoded proteins are all highly expressed in choroid plexus and ependymal cells, and most have been associated with cilia. Damaging mutations in AK9, which encodes an adenylate kinase, were detected in 9.6% of iNPH patients.
Mice homozygous for an iNPH-associated AK9 mutation displayed normal cilia structure and number, but decreased cilia motility and beat frequency, communicating hydrocephalus, and balance impairment. AK9+/− mice displayed normal brain development and behavior until early adulthood, but subsequently developed communicating hydrocephalus. Together, our findings suggest that heterozygous mutations that impair ventricular epithelial function may contribute to iNPH. %B PNAS %V 120 %G eng %N 51 %0 Journal Article %J Scientific Data %D 2023 %T Genomic data resources of the Brain Somatic Mosaicism Network for neuropsychiatric diseases %A Garrison, M A %A Jang, Yeongjun %A Bae, T %A Cherskov, A %A Emery, S B %A Fasching, L %A Jones, A %A Moldovan, John B %A Molitor, Cindy %A Pochareddy, S %A Peters, M A %A Shin, J. H. %A Wang, Yifan %A Yang, X %A Akbarian, S %A Chess, A %A Gage, F H %A Gleeson, J G %A Kidd, J M %A McConnell, M %A Mills, Ryan E %A Moran, J V %A Park, Peter J %A Sestan, N %A Urban, A E %A Vaccarino, F M %A Walsh, C A %A Weinberger, D R %A Wheelan, S J %A Abyzov, A %A BSMN Consortium %X

Somatic mosaicism is defined as an occurrence of two or more populations of cells having genomic sequences differing at given loci in an individual who is derived from a single zygote. It is a characteristic of multicellular organisms that plays a crucial role in normal development and disease. To study the nature and extent of somatic mosaicism in autism spectrum disorder, bipolar disorder, focal cortical dysplasia, schizophrenia, and Tourette syndrome, a multi-institutional consortium called the Brain Somatic Mosaicism Network (BSMN) was formed through the National Institute of Mental Health (NIMH). In addition to genomic data of affected and neurotypical brains, the BSMN also developed and validated a best practices somatic single nucleotide variant calling workflow through the analysis of reference brain tissue. These resources, which include >400 terabytes of data from 1087 subjects, are now available to the research community via the NIMH Data Archive (NDA) and are described here.

%B Scientific Data %G eng %0 Journal Article %J Nature Genetics %D 2023 %T A pan-tissue survey of mosaic chromosomal alterations in 948 individuals %A Gao, Teng %A Kastriti, Maria Eleni %A Ljungström, Viktor %A Heinzel, Andreas %A Tischler, A S %A Oberbauer, Rainer %A Loh, Po-Ru %A Adameyko, Igor %A Park, Peter J** %A Kharchenko, P** %X Genetic mutations accumulate in an organism’s body throughout its lifetime. While somatic single-nucleotide variants have been well characterized in the human body, the patterns and consequences of large chromosomal alterations in normal tissues remain largely unknown. Here, we present a pan-tissue survey of mosaic chromosomal alterations (mCAs) in 948 healthy individuals from the Genotype-Tissue Expression project, augmenting RNA-based allelic imbalance estimation with haplotype phasing. We found that approximately a quarter of the individuals carry a clonally-expanded mCA in at least one tissue, with incidence strongly correlated with age. The prevalence and genome-wide patterns of mCAs vary considerably across tissue types, suggesting tissue-specific mutagenic exposure and selection pressures. The mCA landscapes in normal adrenal and pituitary glands resemble those in tumors arising from these tissues, whereas the same is not true for the esophagus and skin. Together, our findings show a widespread age-dependent emergence of mCAs across normal human tissues with intricate connections to tumorigenesis.
%B Nature Genetics %G eng %0 Journal Article %J Nature Methods %D 2023 %T Chromoscope: interactive multiscale visualization for structural variation in human genomes %A L'Yi, Sehi %A Maziec, D %A Stevens, V %A Manz, T %A Veit, Alexander %A Berselli, Michele %A Park, Peter J** %A Głodzik, D** %A Gehlenborg, N** %B Nature Methods %G eng %0 Journal Article %J Nucleic Acids Research %D 2023 %T The landscape of human SVA retrotransposons %A Chu, Chong %A Lin, Eric W %A Tran, Antuan %A Jin, Hu %A Ho, Natalie I %A Veit, Alexander %A Cortes-Ciriano, Isidro %A Burns, Kathleen H %A Ting, David T %A Park, Peter J %X SINE-VNTR-Alu (SVA) retrotransposons are evolutionarily young and still-active transposable elements (TEs) in the human genome. Several pathogenic SVA insertions have been identified that directly mutate host genes to cause neurodegenerative and other types of diseases. However, due to their sequence heterogeneity and complex structures as well as limitations in sequencing techniques and analysis, SVA insertions have been less well studied compared to other mobile element insertions. Here, we identified polymorphic SVA insertions from 3646 whole-genome sequencing (WGS) samples of >150 diverse populations and constructed a polymorphic SVA insertion reference catalog. Using 20 long-read samples, we also assembled reference and polymorphic SVA sequences and characterized the internal hexamer/variable-number-tandem-repeat (VNTR) expansions as well as differing SVA activity for SVA subfamilies and human populations. In addition, we developed a module to annotate both reference and polymorphic SVA copies. By characterizing the landscape of both reference and polymorphic SVA retrotransposons, our study enables more accurate genotyping of these elements and facilitates the discovery of pathogenic SVA insertions.
%B Nucleic Acids Research %G eng %0 Journal Article %J Clinical Cancer Research %D 2023 %T Hyper-Dependence on NHEJ Enables Synergy Between DNA-PK Inhibitors and Low-Dose Doxorubicin in Leiomyosarcoma %A Mariño-Enríquez, Adrián %A Philipp Novotny, Jan %A Gulhan, Doga C %A Klooster, Isabella %A Tran, Antuan V %A Kasbo, M %A Lundberg, M Z %A Ou, Wen-Bin %A Tao, Derrick L %A Pilco-Janeta, D F %A Mao, Victor Y %A Zenke, Frank T %A Leeper, B A %A Gokhale, P C %A Cowley, G S %A Baker, L H %A Ballman, K V %A Root, David E %A Albers, J %A Park, Peter J %A George, Suzanne %A Fletcher, J A %X

Purpose: Leiomyosarcoma (LMS) is an aggressive sarcoma for which standard chemotherapies achieve response rates under 30%. There are no effective targeted therapies against LMS. Most LMS are characterized by chromosomal instability (CIN), resulting in part from TP53 and RB1 co-inactivation and DNA damage repair defects. We sought to identify therapeutic targets that could exacerbate intrinsic CIN and DNA damage in LMS, inducing lethal genotoxicity.

Experimental design: We performed clinical targeted sequencing in 287 LMS and genome-wide loss-of-function screens in 3 patient-derived LMS cell lines, to identify LMS-specific dependencies. We validated candidate targets by biochemical and cell-response assays in vitro and in 7 mouse models.

Results: Clinical targeted sequencing revealed a high burden of somatic copy number alterations (median fraction of the genome altered=0.62) and demonstrated homologous recombination deficiency signatures in 35% of LMS. Genome-wide shRNA screens demonstrated PRKDC (DNA-PKcs) and RPA2 essentiality, consistent with compensatory non-homologous end joining hyper-dependence. DNA-PK inhibitor combinations with unconventionally low-dose doxorubicin had synergistic activity in LMS in vitro models. Combination therapy with peposertib and low-dose doxorubicin (standard or liposomal formulations) inhibited growth of 5 of 7 LMS mouse models without toxicity.

Conclusion: Combinations of DNA-PK inhibitors with unconventionally low, sensitizing, doxorubicin dosing showed synergistic effects in LMS in vitro and in vivo models, without discernable toxicity. These findings underscore the relevance of DNA damage repair alterations in LMS pathogenesis and identify dependence on NHEJ as a clinically actionable vulnerability in LMS.

%B Clinical Cancer Research %G eng %0 Journal Article %J Molecular Cell %D 2023 %T Spatial and temporal organization of the genome: Current state and future aims of the 4D nucleome project %A Dekker, J %A Alber, F %A Aufmkolk, S %A Beliveau, Brian J %A Bruneau, Benoit G %A Belmont, A S %A Bintu, L %A Boettiger, A %A Calandrelli, R %A Disteche, C M %A Gilbert, D M %A Gregor, T %A Hansen, A S %A Huang, Bo %A Huangfu, Danwei %A Kalhor, R %A Leslie, C S %A Li, W %A Li, Y %A Ma, J %A Noble, W S %A Park, Peter J %A Phillips-Cremins, J E %A Pollard, K S %A Rafelski, S M %A Ren, B %A Ruan, Y %A Shav-Tal, Y %A Shen, Y %A Shendure, J %A Shu, X %A Strambio-De-Castillia, Caterina %A Vertii, A %A Zhang, H %A Zhong, S %X The four-dimensional nucleome (4DN) consortium studies the architecture of the genome and the nucleus in space and time. We summarize progress by the consortium and highlight the development of technologies for (1) mapping genome folding and identifying roles of nuclear components and bodies, proteins, and RNA, (2) characterizing nuclear organization with time or single-cell resolution, and (3) imaging of nuclear organization. With these tools, the consortium has provided over 2,000 public datasets. Integrative computational models based on these data are starting to reveal connections between genome structure and function. We then present a forward-looking perspective and outline current aims to (1) delineate dynamics of nuclear architecture at different timescales, from minutes to weeks as cells differentiate, in populations and in single cells, (2) characterize cis-determinants and trans-modulators of genome organization, (3) test functional consequences of changes in cis- and trans-regulators, and (4) develop predictive models of genome structure and function.
%B Molecular Cell %V 83 %P 2624-2640 %G eng %N 15 %0 Journal Article %J Nature %D 2023 %T A framework for individualized splice-switching oligonucleotide therapy %A Kim, Jinkuk %A Woo, Sijae %A de Gusmao, Claudio M %A Zhao, Boxun %A Chin, Diana H %A DiDonato, Renata L %A Nguyen, Minh A %A Nakayama, Tojo %A Hu, Chunguang April %A Soucy, A %A Kuniholm, A %A Thornton, J K %A Riccardi, O %A Friedman, Danielle A %A Moufawad El Achkar, Christelle %A Dash, Zane %A Cornelissen, L %A Donado, C %A Faour, Kamli N W %A Bush, Lynn W %A Suslovitch, V %A Lentucci, C %A Park, Peter J %A Lee, Eunjung Alice %A Patterson, A %A Philippakis, A A %A Margus, B %A Berde, C B %A Yu, T W %X Splice-switching antisense oligonucleotides (ASOs) could be used to treat a subset of individuals with genetic diseases, but the systematic identification of such individuals remains a challenge. Here we performed whole-genome sequencing analyses to characterize genetic variation in 235 individuals (from 209 families) with ataxia-telangiectasia, a severely debilitating and life-threatening recessive genetic disorder, yielding a complete molecular diagnosis in almost all individuals. We developed a predictive taxonomy to assess the amenability of each individual to splice-switching ASO intervention; 9% and 6% of the individuals had variants that were ‘probably’ or ‘possibly’ amenable to ASO splice modulation, respectively. Most amenable variants were in deep intronic regions that are inaccessible to exon-targeted sequencing. We developed ASOs that successfully rescued mis-splicing and ATM cellular signalling in patient fibroblasts for two recurrent variants. In a pilot clinical study, one of these ASOs was used to treat a child who had been diagnosed with ataxia-telangiectasia soon after birth, and showed good tolerability without serious adverse events for three years.
Our study provides a framework for the prospective identification of individuals with genetic diseases who might benefit from a therapeutic approach involving splice-switching ASOs. %B Nature %V 619 %P 828-836 %G eng %0 Journal Article %J Nature %D 2023 %T ERα-associated translocations underlie oncogene amplifications in breast cancer %A Lee, Jake June-Koo %A Jung, Youngsook Lucy %A Cheong, Taek-Chin %A Valle-Inclan, Jose Espejo %A Chu, Chong %A Gulhan, Doga C %A Ljungström, Viktor %A Jin, Hu %A Viswanadham, Vinayak V %A Watson, Emma V %A Cortés-Ciriano, Isidro %A Elledge, Stephen J %A Chiarle, Roberto %A Pellman, David %A Park, Peter J %X

Focal copy-number amplification is an oncogenic event. Although recent studies have revealed the complex structure and the evolutionary trajectories of oncogene amplicons, their origin remains poorly understood. Here we show that focal amplifications in breast cancer frequently derive from a mechanism—which we term translocation–bridge amplification—involving inter-chromosomal translocations that lead to dicentric chromosome bridge formation and breakage. In 780 breast cancer genomes, we observe that focal amplifications are frequently connected to each other by inter-chromosomal translocations at their boundaries. Subsequent analysis indicates the following model: the oncogene neighbourhood is translocated in G1 creating a dicentric chromosome, the dicentric chromosome is replicated, and as dicentric sister chromosomes segregate during mitosis, a chromosome bridge is formed and then broken, with fragments often being circularized in extrachromosomal DNAs. This model explains the amplifications of key oncogenes, including ERBB2 and CCND1. Recurrent amplification boundaries and rearrangement hotspots correlate with oestrogen receptor binding in breast cancer cells. Experimentally, oestrogen treatment induces DNA double-strand breaks in the oestrogen receptor target regions that are repaired by translocations, suggesting a role of oestrogen in generating the initial translocations. A pan-cancer analysis reveals tissue-specific biases in mechanisms initiating focal amplifications, with the breakage–fusion–bridge cycle prevalent in some and the translocation–bridge amplification in others, probably owing to the different timing of DNA break repair. Our results identify a common mode of oncogene amplification and propose oestrogen as its mechanistic origin in breast cancer.

News coverage on this paper:

%B Nature %G eng %U https://hms.harvard.edu/news/how-breast-cancer-arises %0 Journal Article %J Nat Genetics %D 2023 %T Comprehensive multi-omic profiling of somatic mutations in malformations of cortical development %A Chung, Changuk %A Yang, Xiaoxu %A Bae, Taejeong %A Vong, KI %A Mittal, S %A Donkels, C %A Phillips, H Westley %A Li, Z. %A Marsh, Ashley Pl %A Breuss, Martin W %A Ball, Laurel L %A Garcia, Camila Araújo Bernardino %A George, Renee D %A Gu, Jing %A Xu, M %A Barrows, C %A James, Kiely N %A Stanley, V %A Nidhiry, Anna S %A Khoury, Sami %A Howe, G %A Riley, E %A Xu, Xin %A Copeland, B %A Wang, Yifan %A Kim, Se Hoon %A Kang, Hoon-Chul %A Schulze-Bonhage, Andreas %A Haas, Carola A %A Urbach, Horst %A Prinz, Marco %A Limbrick Jr, David D %A Gurnett, Christina A %A Smyth, Matthew D %A Sattar, S %A Nespeca, M %A Gonda, David D %A Imai, Katsumi %A Takahashi, Y. %A Chen, Hsin-Hung %A Tsai, Jin-Wu %A Conti, Valerio %A Guerrini, Renzo %A Devinsky, O %A Silva Jr, Wilson A %A Machado, Helio R %A Mathern, Gary W %A Abyzov, A %A Baldassari, Sara %A Baulac, S %A Focal Cortical Dysplasia Neurogenetics Consortium %A Brain Somatic Mosaicism Network, BSM %A Gleeson, Joseph G %X Malformations of cortical development (MCD) are neurological conditions involving focal disruptions of cortical architecture and cellular organization that arise during embryogenesis, largely from somatic mosaic mutations, and cause intractable epilepsy. Identifying the genetic causes of MCD has been a challenge, as mutations remain at low allelic fractions in brain tissue resected to treat condition-related epilepsy. Here we report a genetic landscape from 283 brain resections, identifying 69 mutated genes through intensive profiling of somatic mutations, combining whole-exome and targeted-amplicon sequencing with functional validation including in utero electroporation of mice and single-nucleus RNA sequencing.
Genotype–phenotype correlation analysis elucidated specific MCD gene sets associated with distinct pathophysiological and clinical phenotypes. The unique single-cell level spatiotemporal expression patterns of mutated genes in control and patient brains indicate critical roles in excitatory neurogenic pools during brain development and in promoting neuronal hyperexcitability after birth. %B Nat Genetics %V 55 %P 209-220 %G eng %0 Journal Article %J Cancer Discovery %D 2023 %T Genomic patterns of malignant peripheral nerve sheath tumor (MPNST) evolution correlate with clinical outcome and are detectable in cell-free DNA %A Cortes-Ciriano, Isidro %A Steele, Christopher D %A Piculell, Katherine %A Al-Ibraheemi, Alyaa %A Eulo, Vanessa %A Bui, Marilyn M %A Chatzipli, Aikaterini %A Dickson, Brendan C %A Borcherding, Dana C %A Feber, Andrew %A Galor, Alon %A Jones, Kevin B %A Jordan, Justin T %A Kim, Raymond H %A Lindsay, Daniel %A Miller, C %A Nishida, Y %A Proszek, Paula Z %A Serrano, J %A Sundby, R Taylor %A Szymanski, Jeffery J %A Ullrich, Nicole J %A Viskochil, David %A Wang, Xia %A Snuderl, M %A Park, Peter J %A Flanagan, Adrienne M %A Hirbe, Angela C %A Pillay, N %A Miller, David T %X

Malignant peripheral nerve sheath tumor (MPNST), an aggressive soft-tissue sarcoma, occurs in people with neurofibromatosis type 1 (NF1) and sporadically. Whole-genome and multiregional exome sequencing, transcriptomic, and methylation profiling of 95 tumor samples revealed the order of genomic events in tumor evolution. Following biallelic inactivation of NF1, loss of CDKN2A or TP53 with or without inactivation of polycomb repressive complex 2 (PRC2) leads to extensive somatic copy-number aberrations (SCNA). Distinct pathways of tumor evolution are associated with inactivation of PRC2 genes and H3K27 trimethylation (H3K27me3) status. Tumors with H3K27me3 loss evolve through extensive chromosomal losses followed by whole-genome doubling and chromosome 8 amplification, and show lower levels of immune cell infiltration. Retention of H3K27me3 leads to extensive genomic instability, but an immune cell-rich phenotype. Specific SCNAs detected in both tumor samples and cell-free DNA (cfDNA) act as a surrogate for H3K27me3 loss and immune infiltration, and predict prognosis.

Significance:

MPNST is the most common cause of death and morbidity for individuals with NF1, a relatively common tumor predisposition syndrome. Our results suggest that somatic copy-number and methylation profiling of tumor or cfDNA could serve as a biomarker for early diagnosis and to stratify patients into prognostic and treatment-related subgroups.

%B Cancer Discovery %V 13 %P 654-671 %G eng %N 3 %0 Journal Article %J Clinical Cancer Research %D 2022 %T Mutational Signature 3 Detected from Clinical Panel Sequencing is Associated with Responses to Olaparib in Breast and Ovarian Cancers %A Batalini, Felipe %A Gulhan, Doga C %A Mao, Victor %A Tran, Antuan %A Polak, Madeline %A Xiong, Niya %A Tayob, Nabihah %A Tung, Nadine M %A Winer, Eric P %A Mayer, Erica L %A Knappskog, Stian %A Lønning, Per E %A Matulonis, Ursula A %A Konstantinopoulos, Panagiotis A %A Solit, David B %A Won, Helen %A Eikesdal, Hans P %A Park, Peter J %A Wulf, Gerburg M %X

Purpose: The identification of patients with homologous recombination deficiency (HRD) beyond BRCA1/2 mutations is an urgent task, as they may benefit from PARP inhibitors. We have previously developed a method to detect mutational signature 3 (Sig3), termed SigMA, associated with HRD from clinical panel sequencing data, that is able to reliably detect HRD from the limited sequencing data derived from gene-focused panel sequencing.

Experimental design: We apply this method to patients from two independent datasets: (i) high-grade serous ovarian cancer and triple-negative breast cancer (TNBC) from a phase Ib trial of the PARP inhibitor olaparib in combination with the PI3K inhibitor buparlisib (BKM120; NCT01623349), and (ii) TNBC patients who received neoadjuvant olaparib in the phase II PETREMAC trial (NCT02624973).

Results: We find that Sig3 as detected by SigMA is positively associated with improved progression-free survival and objective responses. In addition, comparison of Sig3 detection in panel and exome-sequencing data from the same patient samples demonstrated highly concordant results and superior performance in comparison with the genomic instability score.

Conclusions: Our analyses demonstrate that HRD can be detected reliably from panel-sequencing data that are obtained as part of routine clinical care, and that this approach can identify patients beyond those with germline BRCA1/2mut who might benefit from PARP inhibitors. Prospective clinical utility testing is warranted.

%B Clinical Cancer Research %V 28 %P 4714-4723 %G eng %N 21 %0 Journal Article %J Nature Genetics %D 2022 %T Single-cell genome sequencing of human neurons identifies somatic point mutation and indel enrichment in regulatory elements %A Luquette, Lovelace J %A Miller, Michael B %A Zhou, Zinan %A Bohrson, Craig L %A Zhao, Yifan %A Jin, Hu %A Gulhan, Doga %A Ganz, Javier %A Bizzotto, Sara %A Kirkham, Samantha %A Hochepied, Tino %A Libert, Claude %A Galor, Alon %A Kim, Junho %A Lodato, Michael A %A Garaycoechea, Juan I %A Gawad, Charles %A West, Jay %A Walsh, Christopher A %A Park, Peter J %X Accurate somatic mutation detection from single-cell DNA sequencing is challenging due to amplification-related artifacts. To reduce this artifact burden, an improved amplification technique, primary template-directed amplification (PTA), was recently introduced. We analyzed whole-genome sequencing data from 52 PTA-amplified single neurons using SCAN2, a new genotyper we developed to leverage mutation signatures and allele balance in identifying somatic single-nucleotide variants (SNVs) and small insertions and deletions (indels) in PTA data. Our analysis confirms an increase in nonclonal somatic mutation in single neurons with age, but revises the estimated rate of this accumulation to 16 SNVs per year. We also identify artifacts in other amplification methods. Most importantly, we show that somatic indels increase by at least three per year per neuron and are enriched in functional regions of the genome such as enhancers and promoters. Our data suggest that indels in gene-regulatory elements have a considerable effect on genome integrity in human neurons. 
%B Nature Genetics %V 54 %P 1564-1571 %G eng %0 Journal Article %J Science %D 2022 %T Analysis of somatic mutations in 131 human brains reveals aging-associated hypermutability %A Bae, Taejeong %A Fasching, Liana %A Wang, Yifan %A Shin, Joo Heon %A Suvakov, Milovan %A Jang, Yeongjun %A Norton, Scott %A Dias, Caroline %A Mariani, Jessica %A Jourdon, Alexandre %A Wu, Feinan %A Panda, Arijit %A Pattni, Reenal %A Chahine, Yasmine %A Yeh, Rebecca %A Roberts, Rosalinda C %A Huttner, Anita %A Kleinman, Joel E %A Hyde, Thomas M %A Straub, Richard E %A Walsh, Christopher A %A Brain Somatic Mosaicism Network, BSM %A Urban, Alexander E %A Leckman, James F %A Weinberger, Daniel R %A Vaccarino, Flora M %A Abyzov, Alexej %X We analyzed 131 human brains (44 neurotypical, 19 with Tourette syndrome, 9 with schizophrenia, and 59 with autism) for somatic mutations after whole genome sequencing to a depth of more than 200×. Typically, brains had 20 to 60 detectable single-nucleotide mutations, but ~6% of brains harbored hundreds of somatic mutations. Hypermutability was associated with age and damaging mutations in genes implicated in cancers and, in some brains, reflected in vivo clonal expansions. Somatic duplications, likely arising during development, were found in ~5% of normal and diseased brains, reflecting background mutagenesis. Brains with autism were associated with mutations creating putative transcription factor binding motifs in enhancer-like regions in the developing brain. The top-ranked affected motifs corresponded to MEIS (myeloid ectopic viral integration site) transcription factors, suggesting a potential link between their involvement in gene regulation and autism. 
%B Science %V 377 %P 511-517 %G eng %N 6605 %0 Journal Article %J Nature Communications %D 2022 %T The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data %A Reiff, Sarah B %A Schroeder, Andrew J %A Kırlı, Koray %A Cosolo, Andrea %A Bakker, Clara %A Mercado, Luisa %A Lee, Soohyun %A Veit, Alexander D %A Balashov, Alexander K %A Vitzthum, Carl %A Ronchetti, William %A Pitman, Kent M %A Johnson, Jeremy %A Ehmsen, Shannon R %A Kerpedjiev, Peter %A Abdennur, Nezar %A Imakaev, Maxim %A Öztürk, Serkan Utku %A Çamoğlu, Uğur %A Mirny, Leonid A %A Gehlenborg, N* %A Alver, Burak H* %A Park, Peter J* %X The 4D Nucleome (4DN) Network aims to elucidate the complex structure and organization of chromosomes in the nucleus and the impact of their disruption in disease biology. We present the 4DN Data Portal ( https://data.4dnucleome.org/ ), a repository for datasets generated in the 4DN network and relevant external datasets. Datasets were generated with a wide range of experiments, including chromosome conformation capture assays such as Hi-C and other innovative sequencing and microscopy-based assays probing chromosome architecture. All together, the 4DN data portal hosts more than 1800 experiment sets and 36000 files. Results of sequencing-based assays from different laboratories are uniformly processed and quality-controlled. The portal interface allows easy browsing, filtering, and bulk downloads, and the integrated HiGlass genome browser allows interactive visualization and comparison of multiple datasets. The 4DN data portal represents a primary resource for chromosome contact and other nuclear architecture data for the scientific community. 
%B Nature Communications %V 13 %P 2365 %8 2022 May 02 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/35501320?dopt=Abstract %R 10.1038/s41467-022-29697-4 %0 Journal Article %J Nature Reviews Genetics %D 2022 %T Computational analysis of cancer genome sequencing data %A Cortés-Ciriano, Isidro %A Gulhan, Doga C %A Lee, Jake June-Koo %A Melloni, Giorgio EM %A Park, Peter J* %K Chromosome Mapping %K Computational Biology %K DNA Copy Number Variations %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Mutation %K Neoplasms %X Distilling biologically meaningful information from cancer genome sequencing data requires comprehensive identification of somatic alterations using rigorous computational methods. As the amount and complexity of sequencing data have increased, so has the number of tools for analysing them. Here, we describe the main steps involved in the bioinformatic analysis of cancer genomes, review key algorithmic developments and highlight popular tools and emerging technologies. These tools include those that identify point mutations, copy number alterations, structural variations and mutational signatures in cancer genomes. We also discuss issues in experimental design, the strengths and limitations of sequencing modalities and methodological challenges for the future. %B Nature Reviews Genetics %V 23 %P 298-314 %8 2022 May %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/34880424?dopt=Abstract %R 10.1038/s41576-021-00431-y %0 Journal Article %J Bioinformatics %D 2022 %T Pairs and Pairix: a file format and a tool for efficient storage and retrieval for Hi-C read pairs %A Lee, Soohyun %A Bakker, Clara %A Vitzthum, Carl %A Alver, Burak H %A Park, Peter J* %X SUMMARY: As the amount of three-dimensional chromosomal interaction data continues to increase, storing and accessing such data efficiently becomes paramount. 
We introduce Pairs, a block-compressed text file format for storing paired genomic coordinates from Hi-C data, and Pairix, an open-source C application to index and query Pairs files. Pairix (also available in Python and R) extends the functionalities of Tabix to paired coordinates data. We have also developed PairsQC, a collapsible HTML quality control report generator for Pairs files. AVAILABILITY: The format specification and source code are available at https://github.com/4dn-dcic/pairix, https://github.com/4dn-dcic/Rpairix and https://github.com/4dn-dcic/pairsqc. %B Bioinformatics %8 2022 Jan 03 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/34978573?dopt=Abstract %R 10.1093/bioinformatics/btab870 %0 Journal Article %J Cancer Discov %D 2022 %T Reverse Transcriptase Inhibition Disrupts Repeat Element Life Cycle in Colorectal Cancer %A Rajurkar, Mihir %A Parikh, Aparna R %A Solovyov, Alexander %A You, Eunae %A Kulkarni, Anupriya S %A Chu, Chong %A Xu, Katherine H %A Jaicks, Christopher %A Taylor, Martin S %A Wu, Connie %A Alexander, Katherine A %A Good, Charly R %A Szabolcs, Annamaria %A Gerstberger, Stefanie %A Tran, Antuan V %A Xu, Nova %A Ebright, Richard Y %A Van Seventer, Emily E %A Vo, Kevin D %A Tai, Eric C %A Lu, Chenyue %A Joseph-Chazan, Jasmin %A Raabe, Michael J %A Nieman, Linda T %A Desai, Niyati %A Arora, Kshitij S %A Ligorio, Matteo %A Thapar, Vishal %A Cohen, Limor %A Garden, Padric M %A Senussi, Yasmeen %A Zheng, Hui %A Allen, Jill N %A Blaszkowsky, Lawrence S %A Clark, Jeffrey W %A Goyal, Lipika %A Wo, Jennifer Y %A Ryan, David P %A Corcoran, Ryan B %A Deshpande, Vikram %A Rivera, Miguel N %A Aryee, Martin J %A Hong, Theodore S %A Berger, Shelley L %A Walt, David R %A Burns, Kathleen H %A Park, Peter J %A Greenbaum, Benjamin D %A Ting, David T %X Altered RNA expression of repetitive sequences and retrotransposition are frequently seen in colorectal cancer (CRC) implicating a functional importance of repeat activity in cancer progression. 
We show the nucleoside reverse transcriptase inhibitor 3TC targets activities of these repeat elements in CRC pre-clinical models with a preferential effect in P53 mutant cell lines linked with direct binding of P53 to repeat elements. We translate these findings to a human Phase 2 trial of single agent 3TC treatment in metastatic CRC with demonstration of clinical benefit in 9 of 32 patients. Analysis of 3TC effects on CRC tumorspheres demonstrates accumulation of immunogenic RNA:DNA hybrids linked with induction of interferon response genes and DNA damage response. Epigenetic and DNA damaging agents induce repeat RNAs and have enhanced cytotoxicity with 3TC. These findings identify a vulnerability in CRC by targeting the viral mimicry of repeat elements. %B Cancer Discov %8 2022 Mar 23 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/35320348?dopt=Abstract %R 10.1158/2159-8290.CD-21-1117 %0 Journal Article %J Nat Commun %D 2022 %T Single-cell gene fusion detection by scFusion %A Jin, Zijie %A Huang, Wenjian %A Shen, Ning %A Li, Juan %A Wang, Xiaochen %A Dong, Jiqiao %A Park, Peter J %A Xi, Ruibin %K Gene Fusion %K Sequence Analysis, RNA %K Single-Cell Analysis %K Software %X Gene fusions can play important roles in tumor initiation and progression. While fusion detection so far has been from bulk samples, full-length single-cell RNA sequencing (scRNA-seq) offers the possibility of detecting gene fusions at the single-cell level. However, scRNA-seq data have a high noise level and contain various technical artifacts that can lead to spurious fusion discoveries. Here, we present a computational tool, scFusion, for gene fusion detection based on scRNA-seq. We evaluate the performance of scFusion using simulated and five real scRNA-seq datasets and find that scFusion can efficiently and sensitively detect fusions with a low false discovery rate. 
In a T cell dataset, scFusion detects the invariant TCR gene recombinations in mucosal-associated invariant T cells that many methods developed for bulk data fail to detect; in a multiple myeloma dataset, scFusion detects the known recurrent fusion IgH-WHSC1, which is associated with overexpression of the WHSC1 oncogene. Our results demonstrate that scFusion can be used to investigate cellular heterogeneity of gene fusions and their transcriptional impact at the single-cell level. %B Nat Commun %V 13 %P 1084 %8 2022 02 28 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/35228538?dopt=Abstract %R 10.1038/s41467-022-28661-6 %0 Journal Article %J Nature %D 2022 %T Somatic mosaicism reveals clonal distributions of neocortical development %A Breuss, Martin W %A Yang, Xiaoxu %A Schlachetzki, Johannes C M %A Antaki, Danny %A Lana, Addison J %A Xu, Xin %A Chung, Changuk %A Chai, Guoliang %A Stanley, Valentina %A Song, Qiong %A Newmeyer, Traci F %A Nguyen, An %A O'Brien, Sydney %A Hoeksema, Marten A %A Cao, Beibei %A Nott, Alexi %A McEvoy-Venneri, Jennifer %A Pasillas, Martina P %A Barton, Scott T %A Copeland, Brett R %A Nahas, Shareef %A Van Der Kraan, Lucitia %A Ding, Yan %A NIMH Brain Somatic Mosaicism Network %A Glass, Christopher K %A Gleeson, Joseph G %K Cells, Cultured %K Clone Cells %K Microglia %K Mosaicism %K Neocortex %X The structure of the human neocortex underlies species-specific traits and reflects intricate developmental programs. Here we sought to reconstruct processes that occur during early development by sampling adult human tissues. We analysed neocortical clones in a post-mortem human brain through a comprehensive assessment of brain somatic mosaicism, acting as neutral lineage recorders. We combined the sampling of 25 distinct anatomic locations with deep whole-genome sequencing in a neurotypical deceased individual and confirmed results with 5 samples collected from each of three additional donors. 
We identified 259 bona fide mosaic variants from the index case, then deconvolved distinct geographical, cell-type and clade organizations across the brain and other organs. We found that clones derived after the accumulation of 90-200 progenitors in the cerebral cortex tended to respect the midline axis, well before the anterior-posterior or ventral-dorsal axes, representing a secondary hierarchy following the overall patterning of forebrain and hindbrain domains. Clones across neocortically derived cells were consistent with a dual origin from both dorsal and ventral cellular populations, similar to rodents, whereas the microglia lineage appeared distinct from other resident brain cells. Our data provide a comprehensive analysis of brain somatic mosaicism across the neocortex and demonstrate cellular origins and progenitor distribution patterns within the human brain. %B Nature %V 604 %P 689-696 %8 2022 04 %G eng %N 7907 %1 http://www.ncbi.nlm.nih.gov/pubmed/35444276?dopt=Abstract %R 10.1038/s41586-022-04602-7 %0 Journal Article %J Nat Methods %D 2021 %T Micro-Meta App: an interactive tool for collecting microscopy metadata based on community specifications %A Rigano, Alessandro %A Ehmsen, Shannon %A Öztürk, Serkan Utku %A Ryan, Joel %A Balashov, Alexander %A Hammer, Mathias %A Kirli, Koray %A Boehm, Ulrike %A Brown, Claire M %A Bellve, Karl %A Chambers, James J %A Cosolo, Andrea %A Coleman, Robert A %A Faklaris, Orestis %A Fogarty, Kevin E %A Guilbert, Thomas %A Hamacher, Anna B %A Itano, Michelle S %A Keeley, Daniel P %A Kunis, Susanne %A Lacoste, Judith %A Laude, Alex %A Ma, Willa Y %A Marcello, Marco %A Montero-Llopis, Paula %A Nelson, Glyn %A Nitschke, Roland %A Pimentel, Jaime A %A Weidtkamp-Peters, Stefanie %A Park, Peter J %A Alver, Burak H %A Grunwald, David %A Strambio-De-Castillia, Caterina %X For quality, interpretation, reproducibility and sharing value, microscopy images should be accompanied by detailed descriptions of the conditions that were used 
to produce them. Micro-Meta App is an intuitive, highly interoperable, open-source software tool that was developed in the context of the 4D Nucleome (4DN) consortium and is designed to facilitate the extraction and collection of relevant microscopy metadata as specified by the recent 4DN-BINA-OME tiered-system of Microscopy Metadata specifications. In addition to substantially lowering the burden of quality assurance, the visual nature of Micro-Meta App makes it particularly suited for training purposes. %B Nat Methods %V 18 %P 1489-1495 %8 2021 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/34862503?dopt=Abstract %R 10.1038/s41592-021-01315-z %0 Journal Article %J Mobile DNA %D 2021 %T Whole-genome analysis reveals the contribution of non-coding de novo transposon insertions to autism spectrum disorder %A Borges-Monroy, Rebeca %A Chu, Chong %A Dias, Caroline %A Choi, Jaejoon %A Lee, Soohyun %A Gao, Yue %A Shin, Taehwan %A Park, Peter J %A Walsh, Christopher A %A Lee, Eunjung Alice %X BACKGROUND: Retrotransposons have been implicated as causes of Mendelian disease, but their role in autism spectrum disorder (ASD) has not been systematically defined, because they are only called with adequate sensitivity from whole genome sequencing (WGS) data and a large enough cohort for this analysis has only recently become available. RESULTS: We analyzed WGS data from a cohort of 2288 ASD families from the Simons Simplex Collection by establishing a scalable computational pipeline for retrotransposon insertion detection. We report 86,154 polymorphic retrotransposon insertions (including >60% not previously reported) and 158 de novo retrotransposition events. The overall burden of de novo events was similar between ASD individuals and unaffected siblings, with 1 de novo insertion per 29, 117, and 206 births for Alu, L1, and SVA respectively, and 1 de novo insertion per 21 births total. However, ASD cases showed more de novo L1 insertions than expected in ASD genes. 
Additionally, we observed exonic insertions in loss-of-function intolerant genes, including a likely pathogenic exonic insertion in CSDE1, only in ASD individuals. CONCLUSIONS: These findings suggest a modest, but important, impact of intronic and exonic retrotransposon insertions in ASD, show the importance of WGS for their analysis, and highlight the utility of specific bioinformatic tools for high-throughput detection of retrotransposon insertions. %B Mobile DNA %V 12 %P 28 %8 2021 Nov 27 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/34838103?dopt=Abstract %R 10.1186/s13100-021-00256-w %0 Journal Article %J Development %D 2021 %T Cis-regulatory dissection of cone development reveals a broad role for Otx2 and Oc transcription factors %A Lonfat, Nicolas* %A Wang, Su* %A Lee, Changhee %A Garcia, Mauricio %A Choi, Jiho %A Park, Peter J %A Cepko, Connie %X The vertebrate retina is generated by retinal progenitor cells (RPCs), which produce >100 cell types. Although some RPCs produce many cell types, other RPCs produce restricted types of daughter cells, such as a cone photoreceptor and a horizontal cell (HC). We used genome-wide assays of chromatin structure to compare the profiles of a restricted cone/HC RPC and those of other RPCs in chicks. These data nominated regions of regulatory activity, which were tested in tissue, leading to the identification of many cis-regulatory modules (CRMs) active in cone/HC RPCs and developing cones. Two transcription factors, Otx2 and Oc1, were found to bind to many of these CRMs, including those near genes important for cone development and function, and their binding sites were required for activity. We also found that Otx2 has a predicted autoregulatory CRM. These results suggest that Otx2, Oc1 and possibly other Onecut proteins have a broad role in coordinating cone development and function. The many newly discovered CRMs for cones are potentially useful reagents for gene therapy of cone diseases. 
%B Development %V 148 %8 2021 May 01 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/33929509?dopt=Abstract %R 10.1242/dev.198549 %0 Journal Article %J Genome Biology %D 2021 %T Comprehensive identification of somatic nucleotide variants in human brain tissue %A Wang, Yifan %A Bae, Taejeong %A Thorpe, Jeremy %A Sherman, Maxwell A %A Jones, Attila G %A Cho, Sean %A Daily, Kenneth %A Dou, Yanmei %A Ganz, Javier %A Galor, Alon %A Lobon, Irene %A Pattni, Reenal %A Rosenbluh, Chaggai %A Tomasi, Simone %A Tomasini, Livia %A Yang, Xiaoxu %A Zhou, Bo %A Akbarian, Schahram %A Ball, Laurel L %A Bizzotto, Sara %A Emery, Sarah B %A Doan, Ryan %A Fasching, Liana %A Jang, Yeongjun %A Juan, David %A Lizano, Esther %A Luquette, Lovelace J %A Moldovan, John B %A Narurkar, Rujuta %A Oetjens, Matthew T %A Rodin, Rachel E %A Sekar, Shobana %A Shin, Joo Heon %A Soriano, Eduardo %A Straub, Richard E %A Zhou, Weichen %A Chess, Andrew %A Gleeson, Joseph G %A Marquès-Bonet, Tomas %A Park, Peter J %A Peters, Mette A %A Pevsner, Jonathan %A Walsh, Christopher A %A Weinberger, Daniel R %A Brain Somatic Mosaicism Network %A Vaccarino, Flora M %A Moran, John V %A Urban, Alexander E %A Kidd, Jeffrey M %A Mills, Ryan E %A Abyzov, Alexej %X BACKGROUND: Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants only are present in a small fraction of cells. 
RESULTS: Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from ~ 0.005 to ~ 0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees. CONCLUSIONS: This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases. 
%B Genome Biology %V 22 %P 92 %8 2021 03 29 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/33781308?dopt=Abstract %R 10.1186/s13059-021-02285-3 %0 Journal Article %J Nat Commun %D 2021 %T Comprehensive identification of transposable element insertions using multiple sequencing technologies %A Chu, Chong %A Borges-Monroy, Rebeca %A Viswanadham, Vinayak V %A Lee, Soohyun %A Li, Heng %A Lee, Eunjung Alice** %A Park, Peter J** %K DNA Transposable Elements %K Gene Rearrangement %K Genetic Variation %K Genome, Human %K Genomics %K Haplotypes %K Humans %K Molecular Sequence Annotation %K Mutagenesis, Insertional %K Pseudogenes %K Whole Genome Sequencing %X Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea . 
%B Nat Commun %V 12 %P 3836 %8 2021 06 22 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/34158502?dopt=Abstract %R 10.1038/s41467-021-24041-8 %0 Journal Article %J Genes Dev %D 2021 %T Essential histone chaperones collaborate to regulate transcription and chromatin integrity %A Viktorovskaya, Olga %A Chuang, James %A Jain, Dhawal %A Reim, Natalia I %A López-Rivera, Francheska %A Murawska, Magdalena %A Spatt, Dan %A Churchman, L Stirling %A Park, Peter J %A Winston, Fred %X Histone chaperones are critical for controlling chromatin integrity during transcription, DNA replication, and DNA repair. Three conserved and essential chaperones, Spt6, Spn1/Iws1, and FACT, associate with elongating RNA polymerase II and interact with each other physically and/or functionally; however, there is little understanding of their individual functions or their relationships with each other. In this study, we selected for suppressors of a temperature-sensitive spt6 mutation that disrupts the Spt6-Spn1 physical interaction and that also causes both transcription and chromatin defects. This selection identified novel mutations in FACT. Surprisingly, suppression by FACT did not restore the Spt6-Spn1 interaction, based on coimmunoprecipitation, ChIP, and mass spectrometry experiments. Furthermore, suppression by FACT bypassed the complete loss of Spn1. Interestingly, the FACT suppressor mutations cluster along the FACT-nucleosome interface, suggesting that they alter FACT-nucleosome interactions. In agreement with this observation, we showed that the spt6 mutation that disrupts the Spt6-Spn1 interaction caused an elevated level of FACT association with chromatin, while the FACT suppressors reduced the level of FACT-chromatin association, thereby restoring a normal Spt6-FACT balance on chromatin. Taken together, these studies reveal previously unknown regulation between histone chaperones that is critical for their essential in vivo functions. 
%B Genes Dev %V 35 %P 698-712 %8 2021 05 01 %G eng %N 9-10 %1 http://www.ncbi.nlm.nih.gov/pubmed/33888559?dopt=Abstract %R 10.1101/gad.348431.121 %0 Journal Article %J Dev Cell %D 2021 %T Negative elongation factor regulates muscle progenitor expansion for efficient myofiber repair and stem cell pool repopulation %A Robinson, Daniel C L %A Ritso, Morten %A Nelson, Geoffrey M %A Mokhtari, Zeinab %A Nakka, Kiran %A Bandukwala, Hina %A Goldman, Seth R %A Park, Peter J %A Mounier, Rémi %A Chazaud, Bénédicte %A Brand, Marjorie %A Rudnicki, Michael A %A Adelman, Karen %A Dilworth, F Jeffrey %K Animals %K Cell Differentiation %K Cells, Cultured %K Eye Proteins %K Mice %K Mice, Inbred C57BL %K Mice, Knockout %K Muscle Development %K Muscle, Skeletal %K Nerve Growth Factors %K Regeneration %K Satellite Cells, Skeletal Muscle %K Serpins %K Signal Transduction %K Transcription Factors %K Transcriptome %K Tumor Suppressor Protein p53 %X Negative elongation factor (NELF) is a critical transcriptional regulator that stabilizes paused RNA polymerase to permit rapid gene expression changes in response to environmental cues. Although NELF is essential for embryonic development, its role in adult stem cells remains unclear. In this study, through a muscle-stem-cell-specific deletion, we showed that NELF is required for efficient muscle regeneration and stem cell pool replenishment. Mechanistic studies using PRO-seq, single-cell trajectory analyses and myofiber cultures revealed that NELF works at a specific stage of regeneration whereby it modulates p53 signaling to permit massive expansion of muscle progenitors. Strikingly, transplantation experiments indicated that these progenitors are also necessary for stem cell pool repopulation, implying that they are able to return to quiescence. 
Thus, we identified a critical role for NELF in the expansion of muscle progenitors in response to injury and revealed that progenitors returning to quiescence are major contributors to the stem cell pool repopulation. %B Dev Cell %V 56 %P 1014-1029.e7 %8 2021 04 05 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/33735618?dopt=Abstract %R 10.1016/j.devcel.2021.02.025 %0 Journal Article %J Cell Res %D 2021 %T Somatic mutation accumulation seen through a single-molecule lens %A Luquette, Lovelace J %A Park, Peter J %B Cell Res %V 31 %P 949-950 %8 2021 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/34316001?dopt=Abstract %R 10.1038/s41422-021-00537-2 %0 Journal Article %J Nature Communications %D 2021 %T The origins and genetic interactions of KRAS mutations are allele- and tissue-specific %A Cook, Joshua H* %A Melloni, Giorgio E. M.* %A Gulhan, Doga C. %A Park, Peter J.** %A Haigis, Kevin M.** %X Mutational activation of KRAS promotes the initiation and progression of cancers, especially in the colorectum, pancreas, lung, and blood plasma, with varying prevalence of specific activating missense mutations. Although epidemiological studies connect specific alleles to clinical outcomes, the mechanisms underlying the distinct clinical characteristics of mutant KRAS alleles are unclear. Here, we analyze 13,492 samples from these four tumor types to examine allele- and tissue-specific genetic properties associated with oncogenic KRAS mutations. The prevalence of known mutagenic mechanisms partially explains the observed spectrum of KRAS activating mutations. However, there are substantial differences between the observed and predicted frequencies for many alleles, suggesting that biological selection underlies the tissue-specific frequencies of mutant alleles. 
Consistent with experimental studies that have identified distinct signaling properties associated with each mutant form of KRAS, our genetic analysis reveals that each KRAS allele is associated with a distinct tissue-specific comutation network. Moreover, we identify tissue-specific genetic dependencies associated with specific mutant KRAS alleles. Overall, this analysis demonstrates that the genetic interactions of oncogenic KRAS mutations are allele- and tissue-specific, underscoring the complexity that drives their clinical consequences. %B Nature Communications %V 12 %G eng %N 1808 %0 Journal Article %J Science %D 2021 %T Landmarks of human embryonic development inscribed in somatic mutations %A Bizzotto, Sara* %A Dou, Yanmei* %A Ganz, Javier* %A Doan, Ryan N. %A Kwon, Minseok %A Bohrson, Craig L. %A Kim, Sonia N. %A Bae, Taejeong %A Abyzov, Alexej %A NIMH Brain Somatic Mosaicism Network %A Park, Peter J** %A Walsh, Christopher A** %X Although cell lineage information is fundamental to understanding organismal development, very little direct information is available for humans. We performed high-depth (250×) whole-genome sequencing of multiple tissues from three individuals to identify hundreds of somatic single-nucleotide variants (sSNVs). Using these variants as "endogenous barcodes" in single cells, we reconstructed early embryonic cell divisions. Targeted sequencing of clonal sSNVs in different organs (about 25,000×) and in more than 1000 cortical single cells, as well as single-nucleus RNA sequencing and single-nucleus assay for transposase-accessible chromatin sequencing of ~100,000 cortical single cells, demonstrated asymmetric contributions of early progenitors to extraembryonic tissues, distinct germ layers, and organs. Our data suggest onset of gastrulation at an effective progenitor pool of about 170 cells and about 50 to 100 founders for the forebrain. 
Thus, mosaic mutations provide a permanent record of human embryonic development at very high resolution. %B Science %V 371 %P 1249-1253 %G eng %N 6535 %0 Journal Article %J Bioinformatics %D 2021 %T BamSnap: a lightweight viewer for sequencing reads in BAM files %A Kwon, Minseok %A Lee, Soohyun %A Berselli, Michele %A Chu, Chong %A Park, Peter J %X SUMMARY: Despite the improvement in variant detection algorithms, visual inspection of the read-level data remains an essential step for accurate identification of variants in genome analysis. We developed BamSnap, an efficient BAM file viewer utilizing a graphics library and BAM indexing. In contrast to existing viewers, BamSnap can generate high-quality snapshots rapidly, with customized tracks and layout. As an example, we produced read-level images at 1000 genomic loci for >2500 whole-genomes. AVAILABILITY: BamSnap is freely available at https://github.com/parklab/bamsnap. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics %V 37 %P 263-264 %8 2021 Jan 08 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/33416869?dopt=Abstract %R 10.1093/bioinformatics/btaa1101 %0 Journal Article %J EMBO Mol Med %D 2021 %T Deletions in CWH43 cause idiopathic normal pressure hydrocephalus %A Yang, Hong Wei %A Lee, Semin %A Yang, Dejun %A Dai, Huijun %A Zhang, Yan %A Han, Lei %A Zhao, Sijun %A Zhang, Shuo %A Ma, Yan %A Johnson, Marciana F %A Rattray, Anna K %A Johnson, Tatyana A %A Wang, George %A Zheng, Shaokuan %A Carroll, Rona S %A Park, Peter J %A Johnson, Mark D %X Idiopathic normal pressure hydrocephalus (iNPH) is a neurological disorder that occurs in about 1% of individuals over age 60 and is characterized by enlarged cerebral ventricles, gait difficulty, incontinence, and cognitive decline. The cause and pathophysiology of iNPH are largely unknown. We performed whole exome sequencing of DNA obtained from 53 unrelated iNPH patients. 
Two recurrent heterozygous loss of function deletions in CWH43 were observed in 15% of iNPH patients and were significantly enriched 6.6-fold and 2.7-fold, respectively, when compared to the general population. Cwh43 modifies the lipid anchor of glycosylphosphatidylinositol-anchored proteins. Mice heterozygous for CWH43 deletion appeared grossly normal but displayed hydrocephalus, gait and balance abnormalities, decreased numbers of ependymal cilia, and decreased localization of glycosylphosphatidylinositol-anchored proteins to the apical surfaces of choroid plexus and ependymal cells. Our findings provide novel mechanistic insights into the origins of iNPH and demonstrate that it represents a distinct disease entity. %B EMBO Mol Med %P e13249 %8 2021 Jan 18 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/33459505?dopt=Abstract %R 10.15252/emmm.202013249 %0 Journal Article %J Cancer Res %D 2021 %T Heterogeneity and Clonal Evolution of Acquired PARP Inhibitor Resistance in TP53 and BRCA1-Deficient Cells %A Färkkilä, Anniina %A Rodríguez, Alfredo %A Oikkonen, Jaana %A Gulhan, Doga C %A Nguyen, Huy %A Domínguez, Julieta %A Ramos, Sandra %A Mills, Caitlin E %A Pérez-Villatoro, Fernando %A Lazaro, Jean-Bernard %A Zhou, Jia %A Clairmont, Connor S %A Moreau, Lisa A %A Park, Peter J %A Sorger, Peter K %A Hautaniemi, Sampsa %A Frias, Sara %A D'Andrea, Alan D %X Homologous recombination (HR)-deficient cancers are sensitive to poly-ADP ribose polymerase inhibitors (PARPi), which have shown clinical efficacy in the treatment of high-grade serous cancers (HGSC). However, the majority of patients will relapse, and acquired PARPi resistance is emerging as a pressing clinical problem. Here we generated seven single-cell clones with acquired PARPi resistance derived from a PARPi-sensitive TP53 -/- and BRCA1 -/- epithelial cell line generated using CRISPR/Cas9. 
These clones showed diverse resistance mechanisms, and some clones presented with multiple mechanisms of resistance at the same time. Genomic analysis of the clones revealed unique transcriptional and mutational profiles and increased genomic instability in comparison with a PARPi-sensitive cell line. Clonal evolutionary analyses suggested that acquired PARPi resistance arose via clonal selection from an intrinsically unstable and heterogenous cell population in the sensitive cell line, which contained preexisting drug-tolerant cells. Similarly, clonal and spatial heterogeneity in tumor biopsies from a clinical patient with BRCA1-mutant HGSC with acquired PARPi resistance was observed. In an imaging-based drug screening, the clones showed heterogenous responses to targeted therapeutic agents, indicating that not all PARPi-resistant clones can be targeted with just one therapy. Furthermore, PARPi-resistant clones showed mechanism-dependent vulnerabilities to the selected agents, demonstrating that a deeper understanding on the mechanisms of resistance could lead to improved targeting and biomarkers for HGSC with acquired PARPi resistance. SIGNIFICANCE: This study shows that BRCA1-deficient cells can give rise to multiple genomically and functionally heterogenous PARPi-resistant clones, which are associated with various vulnerabilities that can be targeted in a mechanism-specific manner. 
%B Cancer Res %V 81 %P 2774-2787 %8 2021 May 15 %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/33514515?dopt=Abstract %R 10.1158/0008-5472.CAN-20-2912 %0 Journal Article %J Bioinformatics %D 2021 %T HiTea: a computational pipeline to identify non-reference transposable element insertions in Hi-C data %A Jain, Dhawal %A Chu, Chong %A Alver, Burak Han %A Lee, Soohyun %A Lee, Eunjung Alice %A Park, Peter J* %K Chromatin %K Chromosomes %K DNA Transposable Elements %K Humans %K Molecular Conformation %K Whole Genome Sequencing %X Hi-C is a common technique for assessing 3D chromatin conformation. Recent studies have shown that long-range interaction information in Hi-C data can be used to generate chromosome-length genome assemblies and identify large-scale structural variations. Here, we demonstrate the use of Hi-C data in detecting mobile transposable element (TE) insertions genome-wide. Our pipeline Hi-C-based TE analyzer (HiTea) capitalizes on clipped Hi-C reads and is aided by a high proportion of discordant read pairs in Hi-C data to detect insertions of three major families of active human TEs. Despite the uneven genome coverage in Hi-C data, HiTea is competitive with the existing callers based on whole-genome sequencing (WGS) data and can supplement the WGS-based characterization of the TE-insertion landscape. We employ the pipeline to identify TE-insertions from human cell-line Hi-C samples. AVAILABILITY AND IMPLEMENTATION: HiTea is available at https://github.com/parklab/HiTea and as a Docker image. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. 
%B Bioinformatics %V 37 %P 1045-1051 %8 2021 05 23 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/33136153?dopt=Abstract %R 10.1093/bioinformatics/btaa923 %0 Journal Article %J Nat Neurosci %D 2021 %T The landscape of somatic mutation in cerebral cortex of autistic and neurotypical individuals revealed by ultra-deep whole-genome sequencing %A Rodin, Rachel E* %A Dou, Yanmei* %A Kwon, Minseok %A Sherman, Maxwell A %A D'Gama, Alissa M %A Doan, Ryan N %A Rento, Lariza M %A Girskis, Kelly M %A Bohrson, Craig L %A Kim, Sonia N %A Nadig, Ajay %A Luquette, Lovelace J %A Gulhan, Doga C %A Brain Somatic Mosaicism Network, BSM %A Park, Peter J** %A Walsh, Christopher A** %X We characterize the landscape of somatic mutations (mutations occurring after fertilization) in the human brain using ultra-deep (~250×) whole-genome sequencing of prefrontal cortex from 59 donors with autism spectrum disorder (ASD) and 15 control donors. We observe a mean of 26 somatic single-nucleotide variants per brain present in ≥4% of cells, with enrichment of mutations in coding and putative regulatory regions. Our analysis reveals that the first cell division after fertilization produces ~3.4 mutations, followed by 2-3 mutations in subsequent generations. This suggests that a typical individual possesses ~80 somatic single-nucleotide variants present in ≥2% of cells (comparable to the number of de novo germline mutations per generation), with about half of individuals having at least one potentially function-altering somatic mutation somewhere in the cortex. ASD brains show an excess of somatic mutations in neural enhancer sequences compared with controls, suggesting that mosaic enhancer mutations may contribute to ASD risk. 
%B Nat Neurosci %V 24 %P 176-185 %8 2021 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/33432195?dopt=Abstract %R 10.1038/s41593-020-00765-6 %0 Journal Article %J Nat Neurosci %D 2021 %T Large mosaic copy number variations confer autism risk %A Sherman, Maxwell A %A Rodin, Rachel E %A Genovese, Giulio %A Dias, Caroline %A Barton, Alison R %A Mukamel, Ronen E %A Berger, Bonnie %A Park, Peter J** %A Walsh, Christopher A** %A Loh, Po-Ru** %X Although germline de novo copy number variants (CNVs) are known causes of autism spectrum disorder (ASD), the contribution of mosaic (early-developmental) copy number variants (mCNVs) has not been explored. In this study, we assessed the contribution of mCNVs to ASD by ascertaining mCNVs in genotype array intensity data from 12,077 probands with ASD and 5,500 unaffected siblings. We detected 46 mCNVs in probands and 19 mCNVs in siblings, affecting 2.8-73.8% of cells. Probands carried a significant burden of large (>4-Mb) mCNVs, which were detected in 25 probands but only one sibling (odds ratio = 11.4, 95% confidence interval = 1.5-84.2, P = 7.4 × 10^-4). Event size positively correlated with severity of ASD symptoms (P = 0.016). Surprisingly, we did not observe mosaic analogues of the short de novo CNVs recurrently observed in ASD (e.g., 16p11.2). We further experimentally validated two mCNVs in postmortem brain tissue from 59 additional probands. These results indicate that mCNVs contribute a previously unexplained component of ASD risk.
%B Nat Neurosci %V 24 %P 197-203 %8 2021 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/33432194?dopt=Abstract %R 10.1038/s41593-020-00766-5 %0 Journal Article %J Curr Opin Genet Dev %D 2021 %T Resources and challenges for integrative analysis of nuclear architecture data %A Jung, Youngsook L %A Kirli, Koray %A Alver, Burak H %A Park, Peter J %X A large amount of genomic data for profiling three-dimensional genome architecture have accumulated from large-scale consortium projects as well as from individual laboratories. In this review, we summarize recent landmark datasets and collections in the field. We describe the challenges in collection, annotation, and analysis of these data, particularly for integration of sequencing and microscopy data. We introduce efforts from consortia and independent groups to harmonize diverse datasets. As the resolution and throughput of sequencing and imaging technologies continue to increase, more efficient utilization and integration of collected data will be critical for a better understanding of nuclear architecture. %B Curr Opin Genet Dev %V 67 %P 103-110 %8 2021 Jan 12 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/33450522?dopt=Abstract %R 10.1016/j.gde.2020.12.009 %0 Journal Article %J Nature Genetics %D 2020 %T Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing %A Cortés-Ciriano, Isidro %A Lee, Jake June Koo %A Xi, Ruibin %A Jain, Dhawal %A Jung, Youngsook L. %A Yang, Lixing %A Gordenin, Dmitry %A Klimczak, Leszek J. %A Zhang, Cheng Zhong %A Pellman, David S. %A PCAWG Structural Variation Working Group %A Park, Peter J. %A PCAWG Consortium %X Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data.
Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer. %B Nature Genetics %V 52 %P 331-341 %G eng %N 3 %0 Journal Article %J Nature Biotechnology %D 2020 %T Accurate detection of mosaic variants in sequencing data without matched controls %A Dou, Yanmei %A Kwon, Minseok %A Rodin, Rachel E. %A Cortés-Ciriano, Isidro %A Doan, Ryan %A Luquette, Lovelace J. %A Galor, Alon %A Bohrson, Craig %A Walsh, Christopher A. %A Park, P J %X

Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80–90% of the mosaic single-nucleotide variants and 60–80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.

%B Nature Biotechnology %V 38 %P 314-319 %G eng %N 3 %0 Journal Article %J JCO Precis Oncol %D 2020 %T Systematic Assessment of Tumor Purity and Its Clinical Implications %A Haider, Syed %A Tyekucheva, Svitlana %A Prandi, Davide %A Fox, Natalie S %A Ahn, Jaeil %A Xu, Andrew Wei %A Pantazi, Angeliki %A Park, Peter J %A Laird, Peter W %A Sander, Chris %A Wang, Wenyi %A Demichelis, Francesca %A Loda, Massimo %A Boutros, Paul C %A Cancer Genome Atlas Research Network %X PURPOSE: The tumor microenvironment is complex, comprising heterogeneous cellular populations. As molecular profiles are frequently generated using bulk tissue sections, they represent an admixture of multiple cell types (including immune, stromal, and cancer cells) interacting with each other. Therefore, these molecular profiles are confounded by signals emanating from many cell types. Accurate assessment of residual cancer cell fraction is crucial for parameterization and interpretation of genomic analyses, as well as for accurately interpreting the clinical properties of the tumor. MATERIALS AND METHODS: To benchmark cancer cell fraction estimation methods, 10 estimators were applied to a clinical cohort of 333 patients with prostate cancer. These methods include gold-standard multiobserver pathology estimates, as well as estimates inferred from genome, epigenome, and transcriptome data. In addition, two methods based on genomic and transcriptomic profiles were used to quantify tumor purity in 4,497 tumors across 12 cancer types. Bulk mRNA and microRNA profiles were subject to in silico deconvolution to estimate cancer cell-specific mRNA and microRNA profiles. RESULTS: We present a systematic comparison of 10 tumor purity estimation methods on a cohort of 333 prostate tumors. We quantify variation among purity estimation methods and demonstrate how this influences interpretation of clinico-genomic analyses. 
Our data show poor concordance between pathologic and molecular purity estimates, necessitating caution when interpreting molecular results. Limited concordance between DNA- and mRNA-derived purity estimates remained a general pan-cancer phenomenon when tested in an additional 4,497 tumors spanning 12 cancer types. CONCLUSION: The choice of tumor purity estimation method may have a profound impact on the interpretation of genomic assays. Taken together, these data highlight the need for improved assessment of tumor purity and quantitation of its influences on the molecular hallmarks of cancers. %B JCO Precis Oncol %V 4 %8 2020 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/33015524?dopt=Abstract %R 10.1200/PO.20.00016 %0 Journal Article %J Nucleic Acids Res %D 2020 %T The conserved elongation factor Spn1 is required for normal transcription, histone modifications, and splicing in Saccharomyces cerevisiae %A Reim, Natalia I* %A Chuang, James* %A Jain, Dhawal* %A Alver, Burak H %A Park, Peter J %A Winston, Fred %X Spn1/Iws1 is a conserved protein involved in transcription and chromatin dynamics, yet its general in vivo requirement for these functions is unknown. Using a Spn1 depletion system in Saccharomyces cerevisiae, we demonstrate that Spn1 broadly influences several aspects of gene expression on a genome-wide scale. We show that Spn1 is globally required for normal mRNA levels and for normal splicing of ribosomal protein transcripts. Furthermore, Spn1 maintains the localization of H3K36 and H3K4 methylation across the genome and is required for normal histone levels at highly expressed genes. Finally, we show that the association of Spn1 with the transcription machinery is strongly dependent on its binding partner, Spt6, while the association of Spt6 and Set2 with transcribed regions is partially dependent on Spn1. 
Taken together, our results show that Spn1 affects multiple aspects of gene expression and provide additional evidence that it functions as a histone chaperone in vivo. %B Nucleic Acids Res %8 2020 Sep 17 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/32941642?dopt=Abstract %R 10.1093/nar/gkaa745 %0 Journal Article %J Genome Biol %D 2020 %T Dysregulation of cancer genes by recurrent intergenic fusions %A Yun, Jae Won %A Yang, Lixing %A Park, Hye-Young %A Lee, Chang-Woo %A Cha, Hongui %A Shin, Hyun-Tae %A Noh, Ka-Won %A Choi, Yoon-La %A Park, Woong-Yang** %A Park, Peter J** %X BACKGROUND: Gene fusions have been studied extensively, as frequent drivers of tumorigenesis as well as potential therapeutic targets. In many well-known cases, breakpoints occur at two intragenic positions, leading to in-frame gene-gene fusions that generate chimeric mRNAs. However, fusions often occur with intergenic breakpoints, and the role of such fusions has not been carefully examined. RESULTS: We analyze whole-genome sequencing data from 268 patients to catalog gene-intergenic and intergenic-intergenic fusions and characterize their impact. First, we discover that, in contrast to the common assumption, chimeric oncogenic transcripts, such as those involving ETV4, ERG, RSPO3, and PIK3CA, can be generated by gene-intergenic fusions through splicing of the intervening region. Second, we find that over-expression of an upstream or downstream gene by a fusion-mediated repositioning of a regulatory sequence is much more common than previously suspected, with enhancers sometimes located megabases away. We detect a number of recurrent fusions, such as those involving ANO3, RGS9, FUT5, CHI3L1, OR1D4, and LIPG in breast; IGF2 in colon; ETV1 in prostate; and IGF2BP3 and SIX2 in thyroid cancers. CONCLUSION: Our findings elucidate the potential oncogenic function of intergenic fusions and highlight the wide-ranging consequences of structural rearrangements in cancer genomes.
%B Genome Biol %V 21 %P 166 %8 2020 Jul 06 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/32631391?dopt=Abstract %R 10.1186/s13059-020-02076-2 %0 Journal Article %J Science Advances %D 2020 %T Epigenetic transcriptional reprogramming by WT1 mediates a repair response during podocyte injury %A Ettou, Sandrine* %A Jung, Youngsook L* %A Miyoshi, Tomoya %A Jain, Dhawal %A Hiratsuka, Ken %A Schumacher, Valerie %A Taglienti, Mary E %A Morizane, Ryuji %A Park, Peter J** %A Kreidberg, Jordan A** %X In the context of human disease, the mechanisms whereby transcription factors reprogram gene expression in reparative responses to injury are not well understood. We have studied the mechanisms of transcriptional reprogramming in disease using murine kidney podocytes as a model for tissue injury. Podocytes are a crucial component of glomeruli, the filtration units of each nephron. Podocyte injury is the initial event in many processes that lead to end-stage kidney disease. Wilms tumor-1 (WT1) is a master regulator of gene expression in podocytes, binding nearly all genes known to be crucial for maintenance of the glomerular filtration barrier. Using murine models and human kidney organoids, we investigated WT1-mediated transcriptional reprogramming during the course of podocyte injury. Reprogramming the transcriptome involved highly dynamic changes in the binding of WT1 to target genes during a reparative injury response, affecting chromatin state and expression levels of target genes.
%B Science Advances %V 6 %P eabb5460 %8 2020 Jul %G eng %N 30 %1 http://www.ncbi.nlm.nih.gov/pubmed/32754639?dopt=Abstract %R 10.1126/sciadv.abb5460 %0 Journal Article %J JCO Precis Oncol %D 2020 %T Genomic Determinants of De Novo Resistance to Immune Checkpoint Blockade in Mismatch Repair-Deficient Endometrial Cancer %A Gulhan, Doga C %A Garcia, Elizabeth %A Lee, Elizabeth K %A Lindemann, Neal I %A Liu, Joyce F %A Matulonis, Ursula A %A Park, Peter J %A Konstantinopoulos, Panagiotis A %B JCO Precis Oncol %V 4 %P 492-497 %8 2020 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/32494760?dopt=Abstract %R 10.1200/po.20.00009 %0 Journal Article %J Genes %D 2020 %T Genomics of MPNST (GeM) Consortium: Rationale and Study Design for Multi-Omic Characterization of NF1-Associated and Sporadic MPNSTs %A Miller, David T %A Cortés-Ciriano, Isidro %A Pillay, Nischalan %A Hirbe, Angela C %A Snuderl, Matija %A Bui, Marilyn M %A Piculell, Katherine %A Al-Ibraheemi, Alyaa %A Dickson, Brendan C %A Hart, Jesse %A Jones, Kevin %A Jordan, Justin T %A Kim, Raymond H %A Lindsay, Daniel %A Nishida, Yoshihiro %A Ullrich, Nicole J %A Wang, Xia %A Park, Peter J %A Flanagan, Adrienne M %X The Genomics of Malignant Peripheral Nerve Sheath Tumor (GeM) Consortium is an international collaboration focusing on multi-omic analysis of malignant peripheral nerve sheath tumors (MPNSTs), the most aggressive tumor associated with neurofibromatosis type 1 (NF1). Here we present a summary of current knowledge gaps, a description of our consortium and the cohort we have assembled, and an overview of our plans for multi-omic analysis of these tumors. We propose that our analysis will lead to a better understanding of the order and timing of genetic events related to MPNST initiation and progression. Our ten institutions have assembled 96 fresh frozen NF1-related (63%) and sporadic MPNST specimens from 86 subjects with corresponding clinical and pathological data. 
Clinical data have been collected as part of the International MPNST Registry. We will characterize these tumors with bulk whole genome sequencing, RNAseq, and DNA methylation profiling. In addition, we will perform multiregional analysis and temporal sampling, with the same methodologies, on a subset of nine subjects with NF1-related MPNSTs to assess tumor heterogeneity and cancer evolution. Subsequent multi-omic analyses of additional archival specimens will include deep exome sequencing (500×) and high density copy number arrays, both to validate results based on fresh frozen tumors and to assess further tumor heterogeneity and evolution. Digital pathology images are being collected in a cloud-based platform for consensus review. The result of these efforts will be the largest MPNST multi-omic dataset with correlated clinical and pathological information ever assembled. %B Genes %V 11 %8 2020 04 02 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/32252413?dopt=Abstract %R 10.3390/genes11040387 %0 Journal Article %J Curr Protoc Hum Genet %D 2020 %T Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data %A Chu, Chong %A Zhao, Boxun %A Park, Peter J %A Lee, Eunjung Alice %X Transposable element (TE) mobilization is a significant source of genomic variation and has been associated with various human diseases. The exponential growth of population-scale whole-genome sequencing and rapid innovations in long-read sequencing technologies provide unprecedented opportunities to study TE insertions and their functional impact in human health and disease. Identifying TE insertions, however, is challenging due to the repetitive nature of the TE sequences. Here, we review computational approaches to detecting and genotyping TE insertions using short- and long-read sequencing and discuss the strengths and weaknesses of different approaches.
%B Curr Protoc Hum Genet %V 107 %P e102 %8 2020 09 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/32662945?dopt=Abstract %R 10.1002/cphg.102 %0 Journal Article %J Nature %D 2020 %T Mechanisms and therapeutic implications of hypermutation in gliomas %A Touat, Mehdi %A Li, Yvonne Y %A Boynton, Adam N %A Spurr, Liam F %A Iorgulescu, J Bryan %A Bohrson, Craig L %A Cortes-Ciriano, Isidro %A Birzu, Cristina %A Geduldig, Jack E %A Pelton, Kristine %A Lim-Fat, Mary Jane %A Pal, Sangita %A Ferrer-Luna, Ruben %A Ramkissoon, Shakti H %A Dubois, Frank %A Bellamy, Charlotte %A Currimjee, Naomi %A Bonardi, Juliana %A Qian, Kenin %A Ho, Patricia %A Malinowski, Seth %A Taquet, Leon %A Jones, Robert E %A Shetty, Aniket %A Chow, Kin-Hoe %A Sharaf, Radwa %A Pavlick, Dean %A Albacker, Lee A %A Younan, Nadia %A Baldini, Capucine %A Verreault, Maïté %A Giry, Marine %A Guillerm, Erell %A Ammari, Samy %A Beuvon, Frédéric %A Mokhtari, Karima %A Alentorn, Agusti %A Dehais, Caroline %A Houillier, Caroline %A Laigle-Donadey, Florence %A Psimaras, Dimitri %A Lee, Eudocia Q %A Nayak, Lakshmi %A McFaline-Figueroa, J Ricardo %A Carpentier, Alexandre %A Cornu, Philippe %A Capelle, Laurent %A Mathon, Bertrand %A Barnholtz-Sloan, Jill S %A Chakravarti, Arnab %A Bi, Wenya Linda %A Chiocca, E Antonio %A Fehnel, Katie Pricola %A Alexandrescu, Sanda %A Chi, Susan N %A Haas-Kogan, Daphne %A Batchelor, Tracy T %A Frampton, Garrett M %A Alexander, Brian M %A Huang, Raymond Y %A Ligon, Azra H %A Coulet, Florence %A Delattre, Jean-Yves %A Hoang-Xuan, Khê %A Meredith, David M %A Santagata, Sandro %A Duval, Alex %A Sanson, Marc %A Cherniack, Andrew D %A Wen, Patrick Y %A Reardon, David A %A Marabelle, Aurélien %A Park, Peter J %A Idbaih, Ahmed %A Beroukhim, Rameen %A Bandopadhayay, Pratiti %A Bielle, Franck %A Ligon, Keith L %K Animals %K Antineoplastic Agents, Alkylating %K Brain Neoplasms %K DNA Mismatch Repair %K Gene Frequency %K Genome, Human %K Glioma %K Humans %K Male %K Mice %K Microsatellite Repeats
%K Mutagenesis %K Mutation %K Phenotype %K Prognosis %K Programmed Cell Death 1 Receptor %K Sequence Analysis, DNA %K Temozolomide %K Xenograft Model Antitumor Assays %X A high tumour mutational burden (hypermutation) is observed in some gliomas; however, the mechanisms by which hypermutation develops and whether it predicts the response to immunotherapy are poorly understood. Here we comprehensively analyse the molecular determinants of mutational burden and signatures in 10,294 gliomas. We delineate two main pathways to hypermutation: a de novo pathway associated with constitutional defects in DNA polymerase and mismatch repair (MMR) genes, and a more common post-treatment pathway, associated with acquired resistance driven by MMR defects in chemotherapy-sensitive gliomas that recur after treatment with the chemotherapy drug temozolomide. Experimentally, the mutational signature of post-treatment hypermutated gliomas was recapitulated by temozolomide-induced damage in cells with MMR deficiency. MMR-deficient gliomas were characterized by a lack of prominent T cell infiltrates, extensive intratumoral heterogeneity, poor patient survival and a low rate of response to PD-1 blockade. Moreover, although bulk analyses did not detect microsatellite instability in MMR-deficient gliomas, single-cell whole-genome sequencing analysis of post-treatment hypermutated glioma cells identified microsatellite mutations. These results show that chemotherapy can drive the acquisition of hypermutated populations without promoting a response to PD-1 blockade and support the diagnostic use of mutational burden and signatures in cancer.
%B Nature %V 580 %P 517-523 %8 2020 04 %G eng %N 7804 %1 http://www.ncbi.nlm.nih.gov/pubmed/32322066?dopt=Abstract %R 10.1038/s41586-020-2209-9 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2020 %T Parallel RNA and DNA analysis after deep sequencing (PRDD-seq) reveals cell type-specific lineage patterns in human brain %A Huang, August Yue %A Li, Pengpeng %A Rodin, Rachel E %A Kim, Sonia N %A Dou, Yanmei %A Kenny, Connor J %A Akula, Shyam K %A Hodge, Rebecca D %A Bakken, Trygve E %A Miller, Jeremy A %A Lein, Ed S %A Park, Peter J %A Lee, Eunjung Alice %A Walsh, Christopher A %K Cell Lineage %K Cerebral Cortex %K High-Throughput Nucleotide Sequencing %K Humans %K Mutation Accumulation %K Neural Stem Cells %K neurogenesis %K Sequence Analysis, DNA %K Single-Cell Analysis %X Elucidating the lineage relationships among different cell types is key to understanding human brain development. Here we developed parallel RNA and DNA analysis after deep sequencing (PRDD-seq), which combines RNA analysis of neuronal cell types with analysis of nested spontaneous DNA somatic mutations as cell lineage markers, identified from joint analysis of single-cell and bulk DNA sequencing by single-cell MosaicHunter (scMH). PRDD-seq enables simultaneous reconstruction of neuronal cell type, cell lineage, and sequential neuronal formation ("birthdate") in postmortem human cerebral cortex. Analysis of two human brains showed remarkable quantitative details that relate mutation mosaic frequency to clonal patterns, confirming an early divergence of precursors for excitatory and inhibitory neurons, and an "inside-out" layer formation of excitatory neurons as seen in other species. In addition, our analysis allows an estimate of excitatory neuron-restricted precursors (about 10) that generate the excitatory neurons within a cortical column.
Inhibitory neurons showed complex, subtype-specific patterns of neurogenesis, including some patterns of development conserved relative to mouse, but also some aspects of primate cortical interneuron development not seen in mouse. PRDD-seq can be broadly applied to characterize cell identity and lineage from diverse archival samples with single-cell resolution and in potentially any developmental or disease condition. %B Proc Natl Acad Sci U S A %V 117 %P 13886-13895 %8 2020 06 23 %G eng %N 25 %1 http://www.ncbi.nlm.nih.gov/pubmed/32522880?dopt=Abstract %R 10.1073/pnas.2006163117 %0 Journal Article %J Nat Commun %D 2020 %T A user guide for the online exploration and visualization of PCAWG data %A Goldman, Mary J* %A Zhang, Junjun* %A Fonseca, Nuno A* %A Cortés-Ciriano, Isidro* %A Xiang, Qian %A Craft, Brian %A Piñeiro-Yáñez, Elena %A O'Connor, Brian D %A Bazant, Wojciech %A Barrera, Elisabet %A Muñoz-Pomer, Alfonso %A Petryszak, Robert %A Füllgrabe, Anja %A Al-Shahrour, Fatima %A Keays, Maria %A Haussler, David %A Weinstein, John N %A Huber, Wolfgang %A Valencia, Alfonso %A Park, Peter J %A Papatheodorou, Irene %A Zhu, Jingchun %A Ferretti, Vincent %A Vazquez, Miguel %K Chromothripsis %K Computational Biology %K Data Analysis %K Databases, Genetic %K Genome, Human %K Genomics %K Humans %K Internet %K Mutation %K Neoplasms %K Software %K User-Computer Interface %K Whole Genome Sequencing %X The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. 
We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation. %B Nat Commun %V 11 %P 3400 %8 2020 07 07 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/32636365?dopt=Abstract %R 10.1038/s41467-020-16785-6 %0 Journal Article %J Genome Biology %D 2020 %T HiNT: a computational method for detecting copy number variations and translocations from Hi-C data %A Wang, Su %A Lee, Soohyun %A Chu, Chong %A Jain, Dhawal %A Kerpedjiev, Peter %A Nelson, Geoffrey M. %A Walsh, Jennifer M. %A Alver, Burak H. %A Park, Peter J. %X The three-dimensional conformation of a genome can be profiled using Hi-C, a technique that combines chromatin conformation capture with high-throughput sequencing. However, structural variations often yield features that can be mistaken for chromosomal interactions. Here, we describe a computational method HiNT (Hi-C for copy Number variation and Translocation detection), which detects copy number variations and interchromosomal translocations within Hi-C data with breakpoints at single base-pair resolution. We demonstrate that HiNT outperforms existing methods on both simulated and real data. We also show that Hi-C can supplement whole-genome sequencing in structural variant detection by locating breakpoints in repetitive regions. %B Genome Biology %V 21 %P 73 %G eng %N 1 %0 Journal Article %J Nature Communications %D 2020 %T Immunogenomic profiling determines responses to combined PARP and PD-1 inhibition in ovarian cancer %A Färkkilä, Anniina %A Gulhan, Doga C. %A Casado, Julia %A Jacobson, Connor A.
%A Nguyen, Huy %A Kochupurakkal, Bose %A Maliga, Zoltan %A Yapp, Clarence %A Chen, Yu-An %A Schapiro, Denis %A Zhou, Yinghui %A Graham, Julie R. %A Dezube, Bruce J. %A Munster, Pamela %A Santagata, Sandro %A Garcia, Elizabeth %A Rodig, Scott %A Lako, Ana %A Chowdhury, Dipanjan %A Shapiro, Geoffrey I. %A Matulonis, Ursula A. %A Park, Peter J. %A Hautaniemi, Sampsa %A Sorger, Peter K. %A Swisher, Elizabeth M. %A D'Andrea, Alan D. %A Konstantinopoulos, Panagiotis A. %X Combined PARP and immune checkpoint inhibition has yielded encouraging results in ovarian cancer, but predictive biomarkers are lacking. We performed immunogenomic profiling and highly multiplexed single-cell imaging on tumor samples from patients enrolled in a Phase I/II trial of niraparib and pembrolizumab in ovarian cancer (NCT02657889). We identify two determinants of response: mutational signature 3 reflecting defective homologous recombination DNA repair, and positive immune score as a surrogate of interferon-primed exhausted CD8+ T-cells in the tumor microenvironment. Presence of one or both features associates with an improved outcome while concurrent absence yields no responses. Single-cell spatial analysis reveals prominent interactions of exhausted CD8+ T-cells and PD-L1+ macrophages and PD-L1+ tumor cells as mechanistic determinants of response. Furthermore, spatial analysis of two extreme responders shows differential clustering of exhausted CD8+ T-cells with PD-L1+ macrophages in the first, and exhausted CD8+ T-cells with cancer cells harboring genomic PD-L1 and PD-L2 amplification in the second. %B Nature Communications %V 11 %P 1459 %G eng %N 1 %0 Journal Article %J Nature %D 2020 %T Pan-cancer analysis of whole genomes %A Pan Cancer Analysis of Whole Genomes Consortium, ICGC/TCGA %X Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale1-3.
Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter4; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation5,6; analyses timings and patterns of tumour evolution7; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity8,9; and evaluates a range of more-specialized features of cancer genomes8,10-18. %B Nature %V 578 %P 82-93 %G eng %N 7793 %0 Journal Article %J Nature %D 2020 %T Patterns of somatic structural variation in human cancer genomes %A Li, Yilong %A Roberts, Nicola D. %A Wala, Jeremiah A.
%A Shapira, Ofer %A Schumacher, Steven E. %A Kumar, Kiran %A Khurana, Ekta %A Waszak, Sebastian %A Korbel, Jan O. %A Haber, James E. %A Imielinski, Marcin %A PCAWG Structural Variation Working Group %A Weischenfeldt, Joachim %A Beroukhim, Rameen %A Campbell, Peter J. %A Pan Cancer Analysis of Whole Genomes Consortium %X A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes1-7. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types8. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2-7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications, and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.
%B Nature %V 578 %P 112-121 %G eng %N 7793 %0 Journal Article %J Nature Genetics %D 2020 %T Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition %A Rodriguez-Martin, Bernardo %A Alvarez, Eva G. %A Baez-Ortega, Adrian %A Zamora, Jorge %A Supek, Fran %A Demeulemeester, Jonas %A Santamarina, Martin %A Ju, Young Seok %A Temes, Javier %A Garcia-Souto, Daniel %A Detering, Harald %A Li, Yilong %A Rodriguez-Castro, Jorge %A Dueso-Barroso, Ana %A Bruzos, Alicia L. %A Dentro, Stefan C. %A Blanco, Miguel G. %A Contino, Gianmarco %A Ardeljan, Daniel %A Tojo, Marta %A Roberts, Nicola D. %A Zumalave, Sonia %A Edwards, Paul A.W. %A Weischenfeldt, Joachim %A Puiggròs, Montserrat %A Chong, Zechen %A Chen, Ken %A Lee, Eunjung Alice %A Wala, Jeremiah A. %A Raine, Keiran %A Butler, Adam %A Waszak, Sebastian M. %A Navarro, Fabio C.P. %A Schumacher, Steven E. %A Monlong, Jean %A Maura, Francesco %A Bolli, Niccolo %A Bourque, Guillaume %A Gerstein, Mark %A Park, Peter J. %A Wedge, David C. %A Beroukhim, Rameen %A Torrents, David %A Korbel, Jan O. %A Martincorena, Inigo %A Fitzgerald, Rebecca C. %A Van Loo, Peter %A Kazazian, Haig H. %A Burns, Kathleen H. %A PCAWG Structural Variation Working Group %A Campbell, Peter J. %A Tubio, Jose M.C. %A PCAWG Consortium %X About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types.
Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage-fusion-bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors. %B Nature Genetics %V 52 %P 306-319 %G eng %N 3 %0 Journal Article %J Nature Communications %D 2020 %T Genomic footprints of activated telomere maintenance mechanisms in cancer %A Sieverling, Lina %A Hong, Chen %A Koser, Sandra D. %A Ginsbach, Philip %A Kleinheinz, Kortine %A Hutter, Barbara %A Braun, Delia M. %A Cortés-Ciriano, Isidro %A Xi, Ruibin %A Kabbe, Rolf %A Park, Peter J. %A Eils, Roland %A Schlesner, Matthias %A PCAWG Structural Variation Working Group %A Brors, Benedikt %A Rippe, Karsten %A Jones, David T.W. %A Feuerbach, Lars %A PCAWG Consortium %X Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types aggregated within the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium to characterize the genomic footprints of these mechanisms.
While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXXtrunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXXtrunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer. %B Nature Communications %V 11 %G eng %N 733 %0 Journal Article %J Bioinformatics %D 2020 %T GiniQC: a measure for quantifying noise in single-cell Hi-C data %A Horton, Connor A. %A Alver, Burak %A Park, Peter J. %X Single-cell Hi-C (scHi-C) allows the study of cell-to-cell variability in chromatin structure and dynamics. However, the high level of noise inherent in current scHi-C protocols necessitates careful assessment of data quality before biological conclusions can be drawn. Here we present GiniQC, which quantifies unevenness in the distribution of inter-chromosomal reads in the scHi-C contact matrix to measure the level of noise. Our examples show the utility of GiniQC in assessing the quality of scHi-C data as a complement to existing quality control measures. We also demonstrate how GiniQC can help inform the impact of various data processing steps on data quality.
%B Bioinformatics %G eng %0 Journal Article %J N Engl J Med %D 2019 %T Patient-Customized Oligonucleotide Therapy for a Rare Genetic Disease %A Kim, Jinkuk %A Hu, Chunguang %A Moufawad El Achkar, Christelle %A Black, Lauren E %A Douville, Julie %A Larson, Austin %A Pendergast, Mary K %A Goldkind, Sara F %A Lee, Eunjung A %A Kuniholm, Ashley %A Soucy, Aubrie %A Vaze, Jai %A Belur, Nandkishore R %A Fredriksen, Kristina %A Stojkovska, Iva %A Tsytsykova, Alla %A Armant, Myriam %A DiDonato, Renata L %A Choi, Jaejoon %A Cornelissen, Laura %A Pereira, Luis M %A Augustine, Erika F %A Genetti, Casie A %A Dies, Kira %A Barton, Brenda %A Williams, Lucinda %A Goodlett, Benjamin D %A Riley, Bobbie L %A Pasternak, Amy %A Berry, Emily R %A Pflock, Kelly A %A Chu, Stephen %A Reed, Chantal %A Tyndall, Kimberly %A Agrawal, Pankaj B %A Beggs, Alan H %A Grant, P Ellen %A Urion, David K %A Snyder, Richard O %A Waisbren, Susan E. %A Poduri, Annapurna %A Park, Peter J %A Patterson, Al %A Biffi, Alessandra %A Mazzulli, Joseph R %A Bodamer, Olaf %A Berde, Charles B %A Yu, Timothy W %X Genome sequencing is often pivotal in the diagnosis of rare diseases, but many of these conditions lack specific treatments. We describe how molecular diagnosis of a rare, fatal neurodegenerative condition led to the rational design, testing, and manufacture of milasen, a splice-modulating antisense oligonucleotide drug tailored to a particular patient. Proof-of-concept experiments in cell lines from the patient served as the basis for launching an "N-of-1" study of milasen within 1 year after first contact with the patient. There were no serious adverse events, and treatment was associated with objective reduction in seizures (determined by electroencephalography and parental reporting). This study offers a possible template for the rapid development of patient-customized treatments. (Funded by Mila's Miracle Foundation and others.). 
%B N Engl J Med %8 2019 Oct 09 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/31597037?dopt=Abstract %R 10.1056/NEJMoa1813279 %0 Journal Article %J Nature Communications %D 2019 %T Identification of somatic mutations in single cell DNA sequencing data using a spatial model of allelic imbalance %A Luquette, Joe L. %A Bohrson, Craig L. %A Sherman, Max %A Park, Peter J. %X Recent advances in single cell technology have enabled dissection of cellular heterogeneity in great detail. However, analysis of single cell DNA sequencing data remains challenging due to bias and artifacts that arise during DNA extraction and whole-genome amplification, including allelic imbalance and dropout. Here, we present a framework for statistical estimation of allele-specific amplification imbalance at any given position in single cell whole-genome sequencing data by utilizing the allele frequencies of heterozygous single nucleotide polymorphisms in the neighborhood. The resulting allelic imbalance profile is critical for determining whether the variant allele fraction of an observed mutation is consistent with the expected fraction for a true variant. This method, implemented in SCAN-SNV (Single Cell ANalysis of SNVs), substantially improves the identification of somatic variants in single cells. Our allele balance framework is broadly applicable to genotype analysis of any variant type in any data that might exhibit allelic imbalance. %B Nature Communications %V 10 %P 3908 %G eng %N 1 %0 Journal Article %J Nature Genetics %D 2019 %T Detecting the mutational signature of homologous recombination deficiency in clinical samples %A Gulhan, Doga C. %A Lee, Jake June-Koo %A Melloni, Giorgio E. M. %A Cortés-Ciriano, Isidro %A Park, Peter J. %X Mutations in BRCA1 and/or BRCA2 (BRCA1/2) are the most common indication of deficiency in the homologous recombination (HR) DNA repair pathway.
However, recent genome-wide analyses have shown that the same pattern of mutations found in BRCA1/2-mutant tumors is also present in several other tumors. Here, we present a new computational tool called Signature Multivariate Analysis (SigMA), which can be used to accurately detect the mutational signature associated with HR deficiency from targeted gene panels. Whereas previous methods require whole-genome or whole-exome data, our method detects the HR-deficiency signature even from low mutation counts, by using a likelihood-based measure combined with machine-learning techniques. Cell lines that we identify as HR deficient show a significant response to poly(ADP-ribose) polymerase (PARP) inhibitors; patients with ovarian cancer whom we found to be HR deficient show a significantly longer overall survival with platinum regimens. By enabling panel-based identification of mutational signatures, our method substantially increases the number of patients that may be considered for treatments targeting HR deficiency. %B Nature Genetics %V 51 %P 912-919 %G eng %N 5 %0 Journal Article %J Nature Genetics %D 2019 %T Linked-read analysis identifies mutations in single-cell DNA-sequencing data %A Bohrson, Craig L. %A Barton, Alison R. %A Lodato, Michael A. %A Rodin, Rachel E. %A Luquette, Lovelace J %A Viswanadham, Vinay V. %A Gulhan, Doga C. %A Cortés-Ciriano, Isidro %A Sherman, Maxwell A. %A Kwon, Minseok %A Coulter, Michael E. %A Galor, Alon %A Walsh, Christopher A. %A Park, Peter J. %X Whole-genome sequencing of DNA from single cells has the potential to reshape our understanding of mutational heterogeneity in normal and diseased tissues. However, a major difficulty is distinguishing amplification artifacts from biologically derived somatic mutations.
Here, we describe linked-read analysis (LiRA), a method that accurately identifies somatic single-nucleotide variants (sSNVs) by using read-level phasing with nearby germline heterozygous polymorphisms, thereby enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells. %B Nature Genetics %V 51 %P 749-754 %G eng %0 Journal Article %J Cell Rep %D 2019 %T Small-Molecule and CRISPR Screening Converge to Reveal Receptor Tyrosine Kinase Dependencies in Pediatric Rhabdoid Tumors %A Oberlick, Elaine M %A Rees, Matthew G %A Seashore-Ludlow, Brinton %A Vazquez, Francisca %A Nelson, Geoffrey M %A Dharia, Neekesh V %A Weir, Barbara A %A Tsherniak, Aviad %A Ghandi, Mahmoud %A Krill-Burger, John M %A Meyers, Robin M %A Wang, Xiaofeng %A Montgomery, Phil %A Root, David E %A Bieber, Jake M %A Radko, Sandi %A Cheah, Jaime H %A Hon, C Suk-Yee %A Shamji, Alykhan F %A Clemons, Paul A %A Park, Peter J %A Dyer, Michael A %A Golub, Todd R %A Stegmaier, Kimberly %A Hahn, William C %A Stewart, Elizabeth A %A Schreiber, Stuart L %A Roberts, Charles W M %K Animals %K Antineoplastic Agents %K Cell Line, Tumor %K CRISPR-Cas Systems %K Female %K HEK293 Cells %K Humans %K Mice %K Mice, Nude %K Mutation %K Protein Kinase Inhibitors %K Protein Tyrosine Phosphatase, Non-Receptor Type 11 %K Rhabdoid Tumor %K Small Molecule Libraries %X Cancer is often seen as a disease of mutations and chromosomal abnormalities. However, some cancers, including pediatric rhabdoid tumors (RTs), lack recurrent alterations targetable by current drugs and need alternative, informed therapeutic options. To nominate potential targets, we performed a high-throughput small-molecule screen complemented by a genome-scale CRISPR-Cas9 gene-knockout screen in a large number of RT and control cell lines.
These approaches converged to reveal several receptor tyrosine kinases (RTKs) as therapeutic targets, with RTK inhibition effective in suppressing RT cell growth in vitro and against a xenograft model in vivo. RT cell lines highly express and activate (phosphorylate) different RTKs, creating dependency without mutation or amplification. Downstream of RTK signaling, we identified PTPN11, encoding the pro-growth signaling protein SHP2, as a shared dependency across all RT cell lines. This study demonstrates that large-scale perturbational screening can uncover vulnerabilities in cancers with "quiet" genomes. %B Cell Rep %V 28 %P 2331-2344.e8 %8 2019 08 27 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/31461650?dopt=Abstract %R 10.1016/j.celrep.2019.07.021 %0 Journal Article %J Genome Biol %D 2019 %T Global impact of somatic structural variation on the DNA methylome of human cancers %A Zhang, Yiqun %A Yang, Lixing %A Kucherlapati, Melanie %A Hadjipanayis, Angela %A Pantazi, Angeliki %A Bristow, Christopher A %A Lee, Eunjung Alice %A Mahadeshwar, Harshad S %A Tang, Jiabin %A Zhang, Jianhua %A Seth, Sahil %A Lee, Semin %A Ren, Xiaojia %A Song, Xingzhi %A Sun, Huandong %A Seidman, Jonathan %A Luquette, Lovelace J %A Xi, Ruibin %A Chin, Lynda %A Protopopov, Alexei %A Park, Peter J %A Kucherlapati, Raju %A Creighton, Chad J %X BACKGROUND: Genomic rearrangements exert a heavy influence on the molecular landscape of cancer. New analytical approaches integrating somatic structural variants (SSVs) with altered gene features represent a framework by which we can assign global significance to a core set of genes, analogous to established methods that identify genes non-randomly targeted by somatic mutation or copy number alteration. While recent studies have defined broad patterns of association involving gene transcription and nearby SSV breakpoints, global alterations in DNA methylation in the context of SSVs remain largely unexplored. 
RESULTS: By data integration of whole genome sequencing, RNA sequencing, and DNA methylation arrays from more than 1400 human cancers, we identify hundreds of genes and associated CpG islands (CGIs) for which the nearby presence of a somatic structural variant (SSV) breakpoint is recurrently associated with altered expression or DNA methylation, respectively, independently of copy number alterations. CGIs with SSV-associated increased methylation are predominantly promoter-associated, while CGIs with SSV-associated decreased methylation are enriched for gene body CGIs. Rearrangement of genomic regions normally having higher or lower methylation is often involved in SSV-associated CGI methylation alterations. Across cancers, the overall structural variation burden is associated with a global decrease in methylation, increased expression in methyltransferase genes and DNA damage response genes, and decreased immune cell infiltration. CONCLUSION: Genomic rearrangement appears to have a major role in shaping the cancer DNA methylome, to be considered alongside commonly accepted mechanisms including histone modifications and disruption of DNA methyltransferases. %B Genome Biol %V 20 %P 209 %8 2019 Oct 15 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/31610796?dopt=Abstract %R 10.1186/s13059-019-1818-9 %0 Journal Article %J Genome Biology %D 2019 %T An enhanced genetic model of colorectal cancer progression history %A Yang, Lixing %A Wang, Su %A Lee, Jake June-Koo %A Lee, Semin %A Lee, Eunjung %A Shinbrot, E %A Wheeler, DA %A Kucherlapati, R %A Park, Peter J %X

Background: The classical genetic model of colorectal cancer presents APC mutations as the earliest genomic alterations, followed by KRAS and TP53 mutations. However, the timing and relative order of clonal expansion and other types of genomic alterations, such as genomic rearrangements, are still unclear.

Results: Here, we perform comprehensive bioinformatic analysis to dissect the relative timing of somatic genetic alterations in 63 colorectal cancers with whole-genome sequencing data. Utilizing allele fractions of somatic single nucleotide variants as molecular clocks while accounting for the presence of copy number changes and structural alterations, we identify key events in the evolution of colorectal tumors. We find that driver point mutations, gene fusions, and arm-level copy losses typically arise early in tumorigenesis; different mechanisms act on distinct genomic regions to drive DNA copy changes; and chromothripsis (clustered rearrangements previously thought to occur as a single catastrophic event) is frequent and may occur multiple times independently in the same tumor through different mechanisms. Furthermore, our computational approach reveals that, in contrast to recent studies, selection is often present on subclones and that multiple evolutionary models can operate in a single tumor at different stages.

Conclusion: Combining these results, we present a refined tumor progression model which significantly expands our understanding of the tumorigenic process of human colorectal cancer.

Keywords: Aneuploidy; Kataegis; Tumor evolution; Tumor heterogeneity.

%B Genome Biology %V 20 %P 168 %G eng %N 1 %0 Journal Article %J Bioinformatics %D 2019 %T Tibanna: software for scalable execution of portable pipelines on the cloud %A Lee, Soohyun %A Johnson, Jeremy %A Vitzthum, Carl %A Kirli, Koray %A Alver, Burak H. %A Park, Peter J. %X We introduce Tibanna, an open-source software tool for automated execution of bioinformatics pipelines on Amazon Web Services (AWS). Tibanna accepts reproducible and portable pipeline standards including Common Workflow Language (CWL), Workflow Description Language (WDL) and Docker. It adopts a strategy of isolation and optimization of individual executions, combined with a serverless scheduling approach. Pipelines are executed and monitored using local commands or the Python Application Programming Interface (API) and cloud configuration is automatically handled. Tibanna is well suited for projects with a range of computational requirements, including those with large and widely fluctuating loads. Notably, it has been used to process terabytes of data for the 4D Nucleome (4DN) Network. %B Bioinformatics %G eng %0 Journal Article %J Cell %D 2019 %T Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma %A Lee, Jake June-Koo %A Park, Seongyeol %A Park, Hansol %A Kim, Sehui %A Lee, Jongkeun %A Lee, Junehawk %A Youk, Jeonghwan %A Yi, Kijong %A An, Yohan %A Park, In Kyu %A Kang, Chang Hyun %A Chung, Doo Hyun %A Kim, Tae Min %A Jeon, Yoon Kyung %A Hong, Dongwan %A Park, Peter J. %A Ju, Young Seok %A Kim, Young Tae %X Mutational processes giving rise to lung adenocarcinomas (LADCs) in non-smokers remain elusive. We analyzed 138 LADC whole genomes, including 83 cases with minimal contribution of smoking-associated mutational signature. Genomic rearrangements were not correlated with smoking-associated mutations and frequently served as driver events of smoking-signature-low LADCs.
Complex genomic rearrangements, including chromothripsis and chromoplexy, generated 74% of known fusion oncogenes, including EML4-ALK, CD74-ROS1, and KIF5B-RET. Unlike other collateral rearrangements, these fusion-oncogene-associated rearrangements were frequently copy-number-balanced, representing a genomic signature of early oncogenesis. Analysis of mutation timing revealed that fusions and point mutations of canonical oncogenes were often acquired in the early decades of life. During a long latency, cancer-related genes were disrupted or amplified by complex rearrangements. The genomic landscape was different between subgroups: EGFR-mutant LADCs had frequent whole-genome duplications with p53 mutations, whereas fusion-oncogene-driven LADCs had frequent SETD2 mutations. Our study highlights LADC oncogenesis driven by endogenous mutational processes. %B Cell %V 177 %P 1842-1857 %G eng %N 7 %0 Journal Article %J Cancer Research %D 2019 %T MDM2 and MDM4 Are Therapeutic Vulnerabilities in Malignant Rhabdoid Tumors %A Howard, Thomas P. %A Arnoff, Taylor E. %A Song, Melinda R. %A Giacomelli, Andrew O. %A Wang, Xiaofeng %A Hong, Andrew L. %A Dharia, Neekesh V. %A Wang, Su %A Vazquez, Francisca %A Pham, Minh-Tam %A Morgan, Ann M. %A Wachter, Franziska %A Bird, Gregory H. %A Kugener, Guillaume %A Oberlick, Elaine M. %A Rees, Matthew G. %A Tiv, Hong L. %A Hwang, Justin H. %A Walsh, Katherine H. %A Cook, April %A Krill-Burger, John M. %A Tsherniak, Aviad %A Gokhale, Prafulla C. %A Park, Peter J. %A Stegmaier, Kimberly %A Walensky, Loren D. %A Hahn, William C. %A Roberts, Charles W.M. %X Malignant rhabdoid tumors (MRT) are highly aggressive pediatric cancers that respond poorly to current therapies. In this study, we screened several MRT cell lines with large-scale RNAi, CRISPR-Cas9, and small-molecule libraries to identify potential drug targets specific for these cancers. We discovered MDM2 and MDM4, the canonical negative regulators of p53, as significant vulnerabilities.
Using two compounds currently in clinical development, idasanutlin (MDM2-specific) and ATSP-7041 (MDM2/4-dual), we show that MRT cells were more sensitive than other p53 wild-type cancer cell lines to inhibition of MDM2 alone as well as dual inhibition of MDM2/4. These compounds caused significant upregulation of the p53 pathway in MRT cells, and sensitivity was ablated by CRISPR-Cas9–mediated inactivation of TP53. We show that loss of SMARCB1, a subunit of the SWI/SNF (BAF) complex mutated in nearly all MRTs, sensitized cells to MDM2 and MDM2/4 inhibition by enhancing p53-mediated apoptosis. Both MDM2 and MDM2/4 inhibition slowed MRT xenograft growth in vivo, with a 5-day idasanutlin pulse causing marked regression of all xenografts, including durable complete responses in 50% of mice. Together, these studies identify a genetic connection between mutations in the SWI/SNF chromatin-remodeling complex and the tumor suppressor gene TP53 and provide preclinical evidence to support the targeting of MDM2 and MDM4 in this often-fatal pediatric cancer. %B Cancer Research %V 79 %G eng %N 9 %0 Journal Article %J Journal of Experimental & Clinical Cancer Research %D 2019 %T MicroRNA-29a activates a multicomponent growth and invasion program in glioblastoma %A Zhao, Yun %A Huang, Wei %A Kim, Tae-Min %A Jung, Yuchae %A Menon, Lata G. %A Xing, Hongyan %A Li, Hongwei %A Carroll, Rona S. %A Park, Peter J. %A Yang, Hong Wei %A Johnson, Mark D. %X Glioblastoma is a malignant brain tumor characterized by rapid growth, diffuse invasion and therapeutic resistance. We recently used microRNA expression profiles to subclassify glioblastoma into five genetically and clinically distinct subclasses, and showed that microRNAs both define and contribute to the phenotypes of these subclasses. Here we show that miR-29a activates a multi-faceted growth and invasion program that promotes glioblastoma aggressiveness.
%B Journal of Experimental & Clinical Cancer Research %V 38 %G eng %N 36 %0 Journal Article %J Genome Research %D 2019 %T A dynamic and integrated epigenetic program at distal regions orchestrates transcriptional responses to VEGFA %A Wang, Shiyan %A Chen, Jiahuan %A Garcia, Sara P. %A Liang, Xiaodong %A Zhang, Fang %A Yan, Pengyi %A Yu, Huijing %A Wei, Weiting %A Li, Zixuan %A Wang, Jingfang %A Le, Huangying %A Han, Zeguang %A Luo, Xusheng %A Day, Daniel S. %A Stevens, Sean M. %A Zhang, Yan %A Park, Peter J. %A Liu, Zhi-jie %A Sun, Kun %A Yuan, Guo-Cheng %A Pu, William T. %A Zhang, Bing %X Cell behaviors are dictated by epigenetic and transcriptional programs. Little is known about how extracellular stimuli modulate these programs to reshape gene expression and control cell behavioral responses. Here, we interrogated the epigenetic and transcriptional response of endothelial cells to VEGFA treatment and found rapid chromatin changes that mediate broad transcriptomic alterations. VEGFA-responsive genes were associated with active promoters, but changes in promoter histone marks were not tightly linked to gene expression changes. VEGFA altered transcription factor occupancy and the distal epigenetic landscape, which profoundly contributed to VEGFA-dependent changes in gene expression. Integration of gene expression, dynamic enhancer, and transcription factor occupancy changes induced by VEGFA yielded a VEGFA-regulated transcriptional regulatory network, which revealed that the small MAF transcription factors are master regulators of the VEGFA transcriptional program and angiogenesis. Collectively, these results revealed that extracellular stimuli rapidly reconfigure the chromatin landscape to coordinately regulate biological responses.
%B Genome Research %V 29 %P 193-207 %G eng %0 Journal Article %J Nature Communications %D 2019 %T BRD9 defines a SWI/SNF sub-complex and constitutes a specific vulnerability in malignant rhabdoid tumors %A Wang, Xiaofeng %A Wang, Su %A Troisi, Emma C. %A Howard, Thomas P. %A Haswell, Jeffrey R. %A Wolf, Bennett K. %A Hawk, William H. %A Ramos, Pilar %A Oberlick, Elaine M. %A Tzvetkov, Evgeni P. %A Vazquez, Francisca %A Hahn, William C. %A Park, Peter J.** %A Roberts, Charles W.M.** %X Bromodomain-containing protein 9 (BRD9) is a recently identified subunit of SWI/SNF(BAF) chromatin remodeling complexes, yet its function is poorly understood. Here, using a genome-wide CRISPR-Cas9 screen, we show that BRD9 is a specific vulnerability in pediatric malignant rhabdoid tumors (RTs), which are driven by inactivation of the SMARCB1 subunit of SWI/SNF. We find that BRD9 exists in a unique SWI/SNF sub-complex that lacks SMARCB1, which has been considered a core subunit. While SMARCB1-containing SWI/SNF complexes are bound preferentially at enhancers, we show that BRD9-containing complexes exist at both promoters and enhancers. Mechanistically, we show that SMARCB1 loss causes increased BRD9 incorporation into SWI/SNF, thus providing insight into BRD9 vulnerability in RTs. Underlying the dependency, while its bromodomain is dispensable, the DUF3512 domain of BRD9 is essential for SWI/SNF integrity in the absence of SMARCB1. Collectively, our results reveal a BRD9-containing SWI/SNF subcomplex is required for the survival of SMARCB1-mutant RTs.
%B Nature Communications %G eng %0 Journal Article %J Science %D 2018 %T Aging and neurodegeneration are associated with increased mutations in single human neurons %A Lodato, Michael A* %A Rodin, Rachel E* %A Bohrson, Craig L* %A Coulter, Michael E* %A Barton, Alison R* %A Kwon, Minseok* %A Sherman, Maxwell A %A Vitzthum, Carl M %A Luquette, Lovelace J %A Yandava, Chandri %A Yang, Pengwei %A Chittenden, Thomas W %A Hatem, Nicole E %A Ryu, Steven C %A Woodworth, Mollie B %A Park, Peter J** %A Walsh, Christopher A** %X It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of fifteen normal individuals (aged 4 months to 82 years) as well as nine individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and Xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures, and may be important in other human age-associated conditions.
%B Science %V 359 %P 555-559 %8 2017 Dec 07 %G eng %N 6375 %1 http://www.ncbi.nlm.nih.gov/pubmed/29217584?dopt=Abstract %R 10.1126/science.aao4426 %0 Journal Article %J Genome Biol %D 2018 %T HiGlass: web-based visual exploration and analysis of genome interaction maps %A Kerpedjiev, Peter %A Abdennur, Nezar %A Lekschas, Fritz %A McCallum, Chuck %A Dinkla, Kasper %A Strobelt, Hendrik %A Luber, Jacob M %A Ouellette, Scott B %A Azhir, Alaleh %A Kumar, Nikhil %A Hwang, Jeewon %A Lee, Soohyun %A Alver, Burak H %A Pfister, Hanspeter %A Mirny, Leonid A %A Park, Peter J %A Gehlenborg, Nils %K Chromosome Mapping %K Genome %K Internet %K User-Computer Interface %X We present HiGlass, an open source visualization tool built on web technologies that provides a rich interface for rapid, multiplex, and multiscale navigation of 2D genomic maps alongside 1D genomic tracks, allowing users to combine various data types, synchronize multiple visualization modalities, and share fully customizable views with others. We demonstrate its utility in exploring different experimental conditions, comparing the results of analyses, and creating interactive snapshots to share with collaborators and the broader public. HiGlass is accessible online at http://higlass.io and is also available as a containerized application that can be run on any platform. %B Genome Biol %V 19 %P 125 %8 2018 08 24 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/30143029?dopt=Abstract %R 10.1186/s13059-018-1486-1 %0 Journal Article %J Genome Research %D 2018 %T Linking transcriptional and genetic tumor heterogeneity through allele analysis of single-cell RNA-seq data. %A Fan, J.* %A Lee, H. O.* %A Lee, S. %A Ryu, D. E. %A Lee, S. %A Xue, C. %A Kim, S. J. %A Kim, K. %A Barkas, N. %A Park, Peter J. %A Park, W. Y. %A Kharchenko, P. V.
%X Characterization of intratumoral heterogeneity is critical to cancer therapy, as the presence of phenotypically diverse cell populations commonly fuels relapse and resistance to treatment. Although genetic variation is a well-studied source of intratumoral heterogeneity, the functional impact of most genetic alterations remains unclear. Even less understood is the relative importance of other factors influencing heterogeneity, such as epigenetic state or tumor microenvironment. To investigate the relationship between genetic and transcriptional heterogeneity in a context of cancer progression, we devised a computational approach called HoneyBADGER to identify copy number variation and loss of heterozygosity in individual cells from single-cell RNA-sequencing data. By integrating allele and normalized expression information, HoneyBADGER is able to identify and infer the presence of subclone-specific alterations in individual cells and reconstruct the underlying subclonal architecture. By examining several tumor types, we show that HoneyBADGER is effective at identifying deletions, amplifications, and copy-neutral loss-of-heterozygosity events and is capable of robustly identifying subclonal focal alterations as small as 10 megabases. We further apply HoneyBADGER to analyze single cells from a progressive multiple myeloma patient to identify major genetic subclones that exhibit distinct transcriptional signatures relevant to cancer progression. Other prominent transcriptional subpopulations within these tumors did not line up with the genetic subclonal structure and were likely driven by alternative, nonclonal mechanisms. These results highlight the need for integrative analysis to understand the molecular and phenotypic heterogeneity in cancer. %B Genome Research %V 28 %P 1217-1227 %G eng %N 8 %0 Journal Article %J Development %D 2018 %T EED, a member of the polycomb group, is required for nephron differentiation and the maintenance of nephron progenitor cells. 
%A Zhang, L. %A Ettou, S. %A Khalid, M. %A Taglienti, M. %A Jain, D. %A Jung, Y. L. %A Seager, C. %A Liu, Y. %A Ng, K. H. %A Park, Peter J. %A Kreidberg, J. A. %X Epigenetic regulation of gene expression has a crucial role in allowing for the self-renewal and differentiation of stem and progenitor populations during organogenesis. The mammalian kidney maintains a population of self-renewing stem cells that differentiate to give rise to thousands of nephrons, which are the functional units that carry out filtration to maintain physiological homeostasis. The polycomb repressive complex 2 (PRC2) epigenetically represses gene expression during development by placing the H3K27me3 mark on histone H3 at promoter and enhancer sites, resulting in gene silencing. To understand the role of PRC2 in nephron differentiation, we conditionally inactivated the Eed gene, which encodes a nonredundant component of the PRC2 complex, in nephron progenitor cells. Resultant kidneys were smaller and showed premature loss of progenitor cells. The progenitors in Eed mutant mice that were induced to differentiate did not develop into properly formed nephrons. Lhx1, normally expressed in the renal vesicle, was overexpressed in kidneys of Eed mutant mice. Thus, PRC2 has a crucial role in suppressing the expression of genes that maintain the progenitor state, allowing nephron differentiation to proceed. %B Development %V 145 %G eng %N 14 %0 Journal Article %J BMC Pediatrics %D 2018 %T The BabySeq project: implementing genomic sequencing in newborns. %A Holm, I. A. %A Agrawal, P. B. %A Ceyhan-Birsoy, O. %A Christensen, K. D. %A Fayer, S. %A Frankel, L. A. %A Genetti, C. A. %A Krier, J. B. %A LaMay, R. C. %A Levy, H. L. %A McGuire, A. L. %A Parad, R. B. %A Park, Peter J. %A Pereira, S. %A Rehm, H. L. %A Schwartz, T. S. %A Waisbren, S. E. %A Yu, T. W. %A BabySeq Project Team %A Green, R. C. %A Beggs, A. H. %X

BACKGROUND:

The greatest opportunity for lifelong impact of genomic sequencing is during the newborn period. The "BabySeq Project" is a randomized trial that explores the medical, behavioral, and economic impacts of integrating genomic sequencing into the care of healthy and sick newborns.

METHODS:

Families of newborns are enrolled from Boston Children's Hospital and Brigham and Women's Hospital nurseries, and half are randomized to receive genomic sequencing and a report that includes monogenic disease variants, recessive carrier variants for childhood onset or actionable disorders, and pharmacogenomic variants. All families participate in a disclosure session, which includes the return of results for those in the sequencing arm. Outcomes are collected through review of medical records and surveys of parents and health care providers and include the rationale for choice of genes and variants to report; what genomic data adds to the medical management of sick and healthy babies; and the medical, behavioral, and economic impacts of integrating genomic sequencing into the care of healthy and sick newborns.

DISCUSSION:

The BabySeq Project will provide empirical data about the risks, benefits and costs of newborn genomic sequencing and will inform policy decisions related to universal genomic screening of newborns.

TRIAL REGISTRATION:

The study is registered at ClinicalTrials.gov, Identifier: NCT02422511. Registration date: 10 April 2015.

KEYWORDS:

Ethical, legal, social implications; Methods; Newborn screening; Newborn sequencing; Randomized trial; Whole exome sequencing

%B BMC Pediatrics %V 18 %P 225 %G eng %N 1 %0 Journal Article %J Cell Reports %D 2018 %T A Pan-Cancer Compendium of Genes Deregulated by Somatic Genomic Rearrangement across More Than 1,400 Cases %A Zhang, Yiqun %A Yang, Lixing %A Kucherlapati, Melanie %A Chen, Fengju %A Hadjipanayis, Angela %A Pantazi, Angeliki %A Bristow, Christopher A. %A Lee, Eunjung A. %A Mahadeshwar, Harshad S. %A Tang, Jiabin %A Zhang, Jianhua %A Seth, Sahil %A Lee, Semin %A Ren, Xiaojia %A Song, Xingzhi %A Sun, Huandong %A Seidman, Jonathan %A Luquette, Lovelace J. %A Xi, Ruibin %A Chin, Lynda %A Protopopov, Alexei %A Li, Wei %A Park, Peter J. %A Kucherlapati, Raju %A Creighton, Chad J. %X A systematic cataloging of genes affected by genomic rearrangement, using multiple patient cohorts and cancer types, can provide insight into cancer-relevant alterations outside of exomes. By integrative analysis of whole-genome sequencing (predominantly low pass) and gene expression data from 1,448 cancers involving 18 histopathological types in The Cancer Genome Atlas, we identified hundreds of genes for which the nearby presence (within 100 kb) of a somatic structural variant (SV) breakpoint is associated with altered expression. While genomic rearrangements are associated with widespread copy-number alteration (CNA) patterns, approximately 1,100 genes, including overexpressed cancer driver genes (e.g., TERT, ERBB2, CDK12, CDK4) and underexpressed tumor suppressors (e.g., TP53, RB1, PTEN, STK11), show SV-associated deregulation independent of CNA. SVs associated with the disruption of topologically associated domains, enhancer hijacking, or fusion transcripts are implicated in gene upregulation. For cancer-relevant pathways, SVs considerably expand our understanding of how genes are affected beyond point mutation or CNA. %B Cell Reports %V 24 %P 515-527 %G eng %N 2 %0 Journal Article %J Trends in Genetics %D 2018 %T Detecting Somatic Mutations in Normal Cells.
%A Dou, Yanmei* %A Gold, Heather D.* %A Luquette, Lovelace J.* %A Park, Peter J. %X Somatic mutations have been studied extensively in the context of cancer. Recent studies have demonstrated that high-throughput sequencing data can be used to detect somatic mutations in non-tumor cells. Analysis of such mutations allows us to better understand the mutational processes in normal cells, explore cell lineages in development, and examine potential associations with age-related disease. We describe here approaches for characterizing somatic mutations in normal and non-tumor disease tissues. We discuss several experimental designs and common pitfalls in somatic mutation detection, as well as more recent developments such as phasing and linked-read technology. With the dramatically increasing numbers of samples undergoing genome sequencing, bioinformatic analysis will enable the characterization of somatic mutations and their impact on non-cancer tissues. %B Trends in Genetics %V 35 %P 545-557 %G eng %N 7 %0 Journal Article %J Cell %D 2018 %T Comprehensive Characterization of Cancer Driver Genes and Mutations. %A Bailey, M. H. %A Tokheim, C. %A Porta-Pardo, E. %A Sengupta, S. %A Bertrand, D. %A Weerasinghe, A. %A Colaprico, A. %A Wendl, M. C. %A Kim, J. %A Reardon, B. %A Ng, P. K. %A Jeong, K. J. %A Cao, S. %A Wang, Z. %A Gao, J. %A Gao, Q. %A Wang, F. %A Liu, E. M. %A Mularoni, L. %A Rubio-Perez, C. %A Nagarajan, N. %A Cortes-Ciriano, I. %A Zhou, D. C. %A Liang, W. W. %A Hess, J. M. %A Yellapantula, V. D. %A Tamborero, D. %A Gonzalez-Perez, A. %A Suphavilai, C. %A Ko, J. Y. %A Khurana, E. %A Park, Peter J. %A Van Allen, E. M. %A Liang, H. %A MC3 Working Group %A Cancer Genome Atlas Research Network %A Lawrence, M. S. %A Godzik, A. %A Lopez-Bigas, N. %A Stuart, J. %A Wheeler, D. %A Getz, G. %A Chen, K. %A Lazar, A. J. %A Mills, G. B. %A Karchin, R. %A Ding, L. %X Identifying molecular cancer drivers is critical for precision oncology.
Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors. %B Cell %V 173 %P 371-385 %G eng %N 2 %0 Journal Article %J Nucleic Acids Research %D 2018 %T PaSD-qc: quality control for single cell whole-genome sequencing data using power spectral density estimation. %A Sherman, Maxwell A. %A Barton, Allison R. %A Lodato, Michael A. %A Vitzthum, Carl %A Coulter, Michael E. %A Walsh, Christopher A. %A Park, Peter J. %X Single cell whole-genome sequencing (scWGS) is providing novel insights into the nature of genetic heterogeneity in normal and diseased cells. However, the whole-genome amplification process required for scWGS introduces biases into the resulting sequencing that can confound downstream analysis. Here, we present a statistical method, with an accompanying package PaSD-qc (Power Spectral Density-qc), that evaluates the properties and quality of single cell libraries.
It uses a modified power spectral density to assess amplification uniformity, amplicon size distribution, autocovariance and inter-sample consistency as well as to identify chromosomes with aberrant read-density profiles due either to copy alterations or poor amplification. These metrics provide a standard way to compare the quality of single cell samples as well as yield information necessary to improve variant calling strategies. We demonstrate the usefulness of this tool in comparing the properties of scWGS protocols, identifying potential chromosomal copy number variation, determining chromosomal and subchromosomal regions of poor amplification, and selecting high-quality libraries from low-coverage data for deep sequencing. The software is available free and open-source at https://github.com/parklab/PaSDqc. %B Nucleic Acids Research %V 46 %P e20 %G eng %N 4 %0 Journal Article %J Nat Commun %D 2017 %T A molecular portrait of microsatellite instability across multiple cancers %A Cortes-Ciriano, Isidro* %A Lee, Sejoon* %A Park, Woong-Yang %A Kim, Tae-Min** %A Park, Peter J** %X Microsatellite instability (MSI) refers to the hypermutability of short repetitive sequences in the genome caused by impaired DNA mismatch repair. Although MSI has been studied for decades, the large amounts of sequencing data now available allow us to examine the molecular fingerprints of MSI in greater detail. Here, we analyse ∼8,000 exomes and ∼1,000 whole genomes of cancer patients across 23 cancer types. Our analysis reveals that the frequency of MSI events is highly variable within and across tumour types. We also identify genes in DNA repair and oncogenic pathways recurrently subject to MSI and uncover non-coding loci that frequently display MSI. Finally, we propose a highly accurate exome-based predictive model for the MSI phenotype.
These results advance our understanding of the genomic drivers and consequences of MSI, and our comprehensive catalogue of tumour-type-specific MSI loci will enable panel-based MSI testing to identify patients who are likely to benefit from immunotherapy. %B Nat Commun %V 8 %P 15180 %8 2017 Jun 06 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/28585546?dopt=Abstract %R 10.1038/ncomms15180 %0 Journal Article %J Nature %D 2017 %T The 4D nucleome project %A Dekker, Job %A Belmont, Andrew S %A Guttman, Mitchell %A Leshyk, Victor O %A Lis, John T %A Lomvardas, Stavros %A Mirny, Leonid A %A O'Shea, Clodagh C %A Park, Peter J %A Ren, Bing %A Politz, Joan C Ritland %A Shendure, Jay %A Zhong, Sheng %A 4D Nucleome Network %X The 4D Nucleome Network aims to develop and apply approaches to map the structure and dynamics of the human and mouse genomes in space and time with the goal of gaining deeper mechanistic insights into how the nucleus is organized and functions. The project will develop and benchmark experimental and computational approaches for measuring genome conformation and nuclear organization, and investigate how these contribute to gene regulation and other genome functions. Validated experimental technologies will be combined with biophysical approaches to generate quantitative models of spatial genome organization in different biological states, both in cell populations and in single cells. 
%B Nature %V 549 %P 219-226 %8 2017 Sep 13 %G eng %N 7671 %1 http://www.ncbi.nlm.nih.gov/pubmed/28905911?dopt=Abstract %R 10.1038/nature23884 %0 Journal Article %J Genes Dev %D 2017 %T Bivalent complexes of PRC1 with orthologs of BRD4 and MOZ/MORF target developmental genes in Drosophila %A Kang, Hyuckjoon* %A Jung, Youngsook L* %A McElroy, Kyle A %A Zee, Barry M %A Wallace, Heather A %A Woolnough, Jessica L %A Park, Peter J %A Kuroda, Mitzi I %K Acetylation %K Animals %K Binding Sites %K Cell Differentiation %K Cells, Cultured %K Drosophila melanogaster %K Drosophila Proteins %K Embryo, Nonmammalian %K Gene Expression Regulation, Developmental %K Gene Silencing %K Genes, Developmental %K Human Embryonic Stem Cells %K Humans %K Multiprotein Complexes %K Polycomb Repressive Complex 1 %K Protein Binding %X Regulatory decisions in Drosophila require Polycomb group (PcG) proteins to maintain the silent state and Trithorax group (TrxG) proteins to oppose silencing. Since PcG and TrxG are ubiquitous and lack apparent sequence specificity, a long-standing model is that targeting occurs via protein interactions; for instance, between repressors and PcG proteins. Instead, we found that Pc-repressive complex 1 (PRC1) purifies with coactivators Fs(1)h [female sterile (1) homeotic] and Enok/Br140 during embryogenesis. Fs(1)h is a TrxG member and the ortholog of BRD4, a bromodomain protein that binds to acetylated histones and is a key transcriptional coactivator in mammals. Enok and Br140, another bromodomain protein, are orthologous to subunits of a mammalian MOZ/MORF acetyltransferase complex. Here we confirm PRC1-Br140 and PRC1-Fs(1)h interactions and identify their genomic binding sites. PRC1-Br140 bind developmental genes in fly embryos, with analogous co-occupancy of PRC1 and a Br140 ortholog, BRD1, at bivalent loci in human embryonic stem (ES) cells. 
We propose that identification of PRC1-Br140 "bivalent complexes" in fly embryos supports and extends the bivalency model posited in mammalian cells, in which the coexistence of H3K4me3 and H3K27me3 at developmental promoters represents a poised transcriptional state. We further speculate that local competition between acetylation and deacetylation may play a critical role in the resolution of bivalent protein complexes during development. %B Genes Dev %V 31 %P 1988-2002 %8 2017 Oct 01 %G eng %N 19 %1 http://www.ncbi.nlm.nih.gov/pubmed/29070704?dopt=Abstract %R 10.1101/gad.305987.117 %0 Journal Article %J Oncologist %D 2017 %T Clinical Application of Targeted Deep Sequencing in Solid-Cancer Patients and Utility for Biomarker-Selected Clinical Trials %A Kim, Seung Tae %A Kim, Kyoung-Mee %A Kim, Nayoung K D %A Park, Joon Oh %A Ahn, Soomin %A Yun, Jae-Won %A Kim, Kyu-Tae %A Park, Se Hoon %A Park, Peter J %A Kim, Hee Cheol %A Sohn, Tae Sung %A Choi, Dong Il %A Cho, Jong Ho %A Heo, Jin Seok %A Kwon, Wooil %A Lee, Hyuk %A Min, Byung-Hoon %A Hong, Sung No %A Park, Young Suk %A Lim, Ho Yeong %A Kang, Won Ki %A Park, Woong-Yang %A Lee, Jeeyun %X Molecular profiling of actionable mutations in refractory cancer patients has the potential to enable "precision medicine," wherein individualized therapies are guided based on genomic profiling. The molecular-screening program was intended to route participants to different candidate drugs in trials based on clinical-sequencing reports. In this screening program, we used a custom target-enrichment panel consisting of cancer-related genes to interrogate single-nucleotide variants, insertions and deletions, copy number variants, and a subset of gene fusions. From August 2014 through April 2015, 654 patients consented to participate in the program at Samsung Medical Center. 
Of these patients, 588 passed the quality control process for the 381-gene cancer-panel test, and 418 patients were included in the final analysis as being eligible for any anticancer treatment (127 gastric cancer, 122 colorectal cancer, 62 pancreatic/biliary tract cancer, 67 sarcoma/other cancer, and 40 genitourinary cancer patients). Of the 418 patients, 55 (12%) harbored a biomarker that guided them to a biomarker-selected clinical trial, and 184 (44%) patients harbored at least one genomic alteration that was potentially targetable. This study demonstrated that the panel-based sequencing program resulted in an increased rate of trial enrollment of metastatic cancer patients into biomarker-selected clinical trials. Given the expanding list of biomarker-selected trials, the guidance percentage to matched trials is anticipated to increase. IMPLICATIONS FOR PRACTICE: This study demonstrated that the panel-based sequencing program resulted in an increased rate of trial enrollment of metastatic cancer patients into biomarker-selected clinical trials. Given the expanding list of biomarker-selected trials, the guidance percentage to matched trials is anticipated to increase. 
%B Oncologist %V 22 %P 1169-1177 %8 2017 Oct %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/28701572?dopt=Abstract %R 10.1634/theoncologist.2017-0020 %0 Journal Article %J Nat Commun %D 2017 %T Prevalence and detection of low-allele-fraction variants in clinical cancer samples %A Shin, Hyun-Tae* %A Choi, Yoon-La* %A Yun, Jae Won* %A Kim, Nayoung K D* %A Kim, Sook-Young %A Jeon, Hyo Jeong %A Nam, Jae-Yong %A Lee, Chung %A Ryu, Daeun %A Kim, Sang Cheol %A Park, Kyunghee %A Lee, Eunjin %A Bae, Joon Seol %A Son, Dae Soon %A Joung, Je-Gun %A Lee, Jeeyun %A Kim, Seung Tae %A Ahn, Myung-Ju %A Lee, Se-Hoon %A Ahn, Jin Seok %A Lee, Woo Yong %A Oh, Bo Young %A Park, Yeon Hee %A Lee, Jeong Eon %A Lee, Kwang Hyuk %A Kim, Hee Cheol %A Kim, Kyoung-Mee %A Im, Young-Hyuck %A Park, Keunchil %A Park, Peter J** %A Park, Woong-Yang** %X Accurate detection of genomic alterations using high-throughput sequencing is an essential component of precision cancer medicine. We characterize the variant allele fractions (VAFs) of somatic single nucleotide variants and indels across 5095 clinical samples profiled using a custom panel, CancerSCAN. Our results demonstrate that a significant fraction of clinically actionable variants have low VAFs, often due to low tumor purity and treatment-induced mutations. The percentages of mutations under 5% VAF across hotspots in EGFR, KRAS, PIK3CA, and BRAF are 16%, 11%, 12%, and 10%, respectively, with 24% for EGFR T790M and 17% for PIK3CA E545. For clinical relevance, we describe two patients for whom targeted therapy achieved remission despite low VAF mutations. We also characterize the read depths necessary to achieve sensitivity and specificity comparable to current laboratory assays. These results show that capturing low VAF mutations at hotspots by sufficient sequencing coverage and carefully tuned algorithms is imperative for a clinical assay. 
%B Nat Commun %V 8 %P 1377 %8 2017 Nov 09 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/29123093?dopt=Abstract %R 10.1038/s41467-017-01470-y %0 Journal Article %J Nat Commun %D 2017 %T VEGF amplifies transcription through ETS1 acetylation to enable angiogenesis %A Chen, Jiahuan %A Fu, Yi %A Day, Daniel S %A Sun, Ye %A Wang, Shiyan %A Liang, Xiaodong %A Gu, Fei %A Zhang, Fang %A Stevens, Sean M %A Zhou, Pingzhu %A Li, Kai %A Zhang, Yan %A Lin, Ruei-Zeng %A Smith, Lois E H %A Zhang, Jin %A Sun, Kun %A Melero-Martin, Juan M %A Han, Zeguang %A Park, Peter J %A Zhang, Bing %A Pu, William T %X Release of promoter-proximally paused RNA polymerase II (RNAPII) is a recently recognized transcriptional regulatory checkpoint. The biological roles of RNAPII pause release and the mechanisms by which extracellular signals control it are incompletely understood. Here we show that VEGF stimulates RNAPII pause release by stimulating acetylation of ETS1, a master endothelial cell transcriptional regulator. In endothelial cells, ETS1 binds transcribed gene promoters and stimulates their expression by broadly increasing RNAPII pause release. VEGF enhances ETS1 chromatin occupancy and increases ETS1 acetylation, enhancing its binding to BRD4, which recruits the pause release machinery and increases RNAPII pause release. Endothelial cell angiogenic responses in vitro and in vivo require ETS1-mediated transduction of VEGF signaling to release paused RNAPII. Our results define an angiogenic pathway in which VEGF enhances ETS1-BRD4 interaction to broadly promote RNAPII pause release and drive angiogenesis. Promoter proximal RNAPII pausing is a rate-limiting transcriptional mechanism. Chen et al. show that this process is essential in angiogenesis by demonstrating that the endothelial master transcription factor ETS1 promotes global RNAPII pause release, and that this process is governed by VEGF.
%B Nat Commun %V 8 %P 383 %8 2017 Aug 29 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/28851877?dopt=Abstract %R 10.1038/s41467-017-00405-x %0 Journal Article %J J Clin Oncol %D 2017 %T Clonal History and Genetic Predictors of Transformation Into Small-Cell Carcinomas From Lung Adenocarcinomas %A Lee, June-Koo %A Lee, Junehawk %A Kim, Sehui %A Kim, Soyeon %A Youk, Jeonghwan %A Park, Seongyeol %A An, Yohan %A Keam, Bhumsuk %A Kim, Dong-Wan %A Heo, Dae Seog %A Kim, Young Tae %A Kim, Jin-Soo %A Kim, Se Hyun %A Lee, Jong Seok %A Lee, Se-Hoon %A Park, Keunchil %A Ku, Ja-Lok %A Jeon, Yoon Kyung %A Chung, Doo Hyun %A Park, Peter J %A Kim, Joon %A Kim, Tae Min %A Ju, Young Seok %X Purpose Histologic transformation of EGFR mutant lung adenocarcinoma (LADC) into small-cell lung cancer (SCLC) has been described as one of the major resistance mechanisms for epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (TKIs). However, the molecular pathogenesis is still unclear. Methods We investigated 21 patients with advanced EGFR-mutant LADCs that were transformed into EGFR TKI-resistant SCLCs. Among them, whole genome sequencing was applied for nine tumors acquired at various time points from four patients to reconstruct their clonal evolutionary history and to detect genetic predictors for small-cell transformation. The findings were validated by immunohistochemistry in 210 lung cancer tissues. Results We identified that EGFR TKI-resistant LADCs and SCLCs share a common clonal origin and undergo branched evolutionary trajectories. The clonal divergence of SCLC ancestors from the LADC cells occurred before the first EGFR TKI treatments, and the complete inactivation of both RB1 and TP53 was observed from the early LADC stages in sequenced tumors.
We extended the findings by immunohistochemistry in the early-stage LADC tissues of 75 patients treated with EGFR TKIs; inactivation of both Rb and p53 was strikingly more frequent in the small-cell-transformed group than in the nontransformed group (82% v 3%; odds ratio, 131; 95% CI, 19.9 to 859). Among patients registered in a predefined cohort (n = 65), an EGFR mutant LADC that harbored completely inactivated Rb and p53 had a 43× greater risk of small-cell transformation (relative risk, 42.8; 95% CI, 5.88 to 311). Branch-specific mutational signature analysis revealed that apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC)-induced hypermutation was frequent in the branches toward small-cell transformation. Conclusion EGFR TKI-resistant SCLCs are branched out early from the LADC clones that harbor completely inactivated RB1 and TP53. The evaluation of RB1 and TP53 status in EGFR TKI-treated LADCs is informative in predicting small-cell transformation. %B J Clin Oncol %P JCO2016719096 %8 2017 May 12 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/28498782?dopt=Abstract %R 10.1200/JCO.2016.71.9096 %0 Journal Article %J Cancer Res %D 2017 %T Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer %A Lu, Hengyu %A Villafane, Nicole %A Dogruluk, Turgut %A Grzeskowiak, Caitlin L %A Kong, Kathleen %A Tsang, Yiu Huen %A Zagorodna, Oksana %A Pantazi, Angeliki %A Yang, Lixing %A Neill, Nicholas J %A Kim, Young Won %A Creighton, Chad J %A Verhaak, Roel G %A Mills, Gordon B %A Park, Peter J %A Kucherlapati, Raju %A Scott, Kenneth L %X Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. 
Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in The Cancer Genome Atlas datasets. In addition to confirming oncogenic activity of the known fusion oncogenes engineered by our construction strategy, we validated five novel fusion genes involving MET, NTRK2, and BRAF kinases that exhibited potent transforming activity and conferred sensitivity to FDA-approved kinase inhibitors. Our fusion construction strategy also enabled domain-function studies of BRAF fusion genes. Our results confirmed other reports that the transforming activity of BRAF fusions results from truncation-mediated loss of inhibitory domains within the N-terminus of the BRAF protein. BRAF mutations residing within this inhibitory region may provide a means for BRAF activation in cancer; therefore, we leveraged the modular design of our fusion gene construction methodology to screen N-terminal domain mutations discovered in tumors that are wild-type at the BRAF mutation hotspot, V600. We identified an oncogenic mutation, F247L, whose expression robustly activated the MAPK pathway and sensitized cells to BRAF and MEK inhibitors. When applied broadly, these tools will facilitate rapid fusion gene construction for subsequent functional characterization and translation into personalized treatment strategies. Cancer Res; 77(13); 1-11. ©2017 AACR.
%B Cancer Res %8 2017 May 16 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/28512244?dopt=Abstract %R 10.1158/0008-5472.CAN-16-2745 %0 Journal Article %J Cancer Cell %D 2017 %T A Pan-Cancer Proteogenomic Atlas of PI3K/AKT/mTOR Pathway Alterations %A Zhang, Yiqun %A Kwok-Shing Ng, Patrick %A Kucherlapati, Melanie %A Chen, Fengju %A Liu, Yuexin %A Tsang, Yiu Huen %A De Velasco, Guillermo %A Jeong, Kang Jin %A Akbani, Rehan %A Hadjipanayis, Angela %A Pantazi, Angeliki %A Bristow, Christopher A %A Lee, Eunjung %A Mahadeshwar, Harshad S %A Tang, Jiabin %A Zhang, Jianhua %A Yang, Lixing %A Seth, Sahil %A Lee, Semin %A Ren, Xiaojia %A Song, Xingzhi %A Sun, Huandong %A Seidman, Jonathan %A Luquette, Lovelace J %A Xi, Ruibin %A Chin, Lynda %A Protopopov, Alexei %A Westbrook, Thomas F %A Shelley, Carl Simon %A Choueiri, Toni K %A Ittmann, Michael %A Van Waes, Carter %A Weinstein, John N %A Liang, Han %A Henske, Elizabeth P %A Godwin, Andrew K %A Park, Peter J %A Kucherlapati, Raju %A Scott, Kenneth L %A Mills, Gordon B %A Kwiatkowski, David J %A Creighton, Chad J %X Molecular alterations involving the PI3K/AKT/mTOR pathway (including mutation, copy number, protein, or RNA) were examined across 11,219 human cancers representing 32 major types. Within specific mutated genes, frequency, mutation hotspot residues, in silico predictions, and functional assays were all informative in distinguishing the subset of genetic variants more likely to have functional relevance. Multiple oncogenic pathways including PI3K/AKT/mTOR converged on similar sets of downstream transcriptional targets. In addition to mutation, structural variations and partial copy losses involving PTEN and STK11 showed evidence for having functional relevance. A substantial fraction of cancers showed high mTOR pathway activity without an associated canonical genetic or genomic alteration, including cancers harboring IDH1 or VHL mutations, suggesting multiple mechanisms for pathway activation. 
%B Cancer Cell %V 31 %P 820-832.e3 %8 2017 Jun 12 %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/28528867?dopt=Abstract %R 10.1016/j.ccell.2017.04.013 %0 Journal Article %J Cell Stem Cell %D 2017 %T DUSP9 Modulates DNA Hypomethylation in Female Mouse Pluripotent Stem Cells %A Choi, Jiho %A Clement, Kendell %A Huebner, Aaron J %A Webster, Jamie %A Rose, Christopher M %A Brumbaugh, Justin %A Walsh, Ryan M %A Lee, Soohyun %A Savol, Andrej %A Etchegaray, Jean-Pierre %A Gu, Hongcang %A Boyle, Patrick %A Elling, Ulrich %A Mostoslavsky, Raul %A Sadreyev, Ruslan %A Park, Peter J %A Gygi, Steven P %A Meissner, Alexander %A Hochedlinger, Konrad %X

Blastocyst-derived embryonic stem cells (ESCs) and gonad-derived embryonic germ cells (EGCs) represent two classic types of pluripotent cell lines, yet their molecular equivalence remains incompletely understood. Here, we compare genome-wide methylation patterns between isogenic ESC and EGC lines to define epigenetic similarities and differences. Surprisingly, we find that sex rather than cell type drives methylation patterns in ESCs and EGCs. Cell fusion experiments further reveal that the ratio of X chromosomes to autosomes dictates methylation levels, with female hybrids being hypomethylated and male hybrids being hypermethylated. We show that the X-linked MAPK phosphatase DUSP9 is upregulated in female compared to male ESCs, and its heterozygous loss in female ESCs leads to male-like methylation levels. However, male and female blastocysts are similarly hypomethylated, indicating that sex-specific methylation differences arise in culture. Collectively, our data demonstrate the epigenetic similarity of sex-matched ESCs and EGCs and identify DUSP9 as a regulator of female-specific hypomethylation.

%B Cell Stem Cell %V 20 %P 706-719.e7 %8 2017 May 04 %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/28366588?dopt=Abstract %R 10.1016/j.stem.2017.03.002 %0 Journal Article %J Science %D 2017 %T Intersection of diverse neuronal genomes and neuropsychiatric disease: The Brain Somatic Mosaicism Network %A McConnell, Michael J %A Moran, John V %A Abyzov, Alexej %A Akbarian, Schahram %A Bae, Taejeong %A Cortes-Ciriano, Isidro %A Erwin, Jennifer A %A Fasching, Liana %A Flasch, Diane A %A Freed, Donald %A Ganz, Javier %A Jaffe, Andrew E %A Kwan, Kenneth Y %A Kwon, Minseok %A Lodato, Michael A %A Mills, Ryan E %A Paquola, Apua C M %A Rodin, Rachel E %A Rosenbluh, Chaggai %A Sestan, Nenad %A Sherman, Maxwell A %A Shin, Joo Heon %A Song, Saera %A Straub, Richard E %A Thorpe, Jeremy %A Weinberger, Daniel R %A Urban, Alexander E %A Zhou, Bo %A Gage, Fred H %A Lehner, Thomas %A Senthil, Geetha %A Walsh, Christopher A %A Chess, Andrew %A Courchesne, Eric %A Gleeson, Joseph G %A Kidd, Jeffrey M %A Park, Peter J %A Pevsner, Jonathan %A Vaccarino, Flora M %A Brain Somatic Mosaicism Network, BSM %X

Neuropsychiatric disorders have a complex genetic architecture. Human genetic population-based studies have identified numerous heritable sequence and structural genomic variants associated with susceptibility to neuropsychiatric disease. However, these germline variants do not fully account for disease risk. During brain development, progenitor cells undergo billions of cell divisions to generate the ~80 billion neurons in the brain. The failure to accurately repair DNA damage arising during replication, transcription, and cellular metabolism amid this dramatic cellular expansion can lead to somatic mutations. Somatic mutations that alter subsets of neuronal transcriptomes and proteomes can, in turn, affect cell proliferation and survival and lead to neurodevelopmental disorders. The long life span of individual neurons and the direct relationship between neural circuits and behavior suggest that somatic mutations in small populations of neurons can significantly affect individual neurodevelopment. The Brain Somatic Mosaicism Network has been founded to study somatic mosaicism both in neurotypical human brains and in the context of complex neuropsychiatric disorders.

%B Science %V 356 %8 2017 Apr 28 %G eng %N 6336 %1 http://www.ncbi.nlm.nih.gov/pubmed/28450582?dopt=Abstract %R 10.1126/science.aal1641 %0 Journal Article %J Nucleic Acids Res %D 2017 %T NGSCheckMate: software for validating sample identity in next-generation sequencing studies within and across data types %A Lee, Sejoon* %A Lee, Soohyun* %A Ouellette, Scott %A Park, Woong-Yang %A Lee, Eunjung A** %A Park, Peter J** %X

In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies. AVAILABILITY: https://github.com/parklab/NGSCheckMate.

%B Nucleic Acids Res %8 2017 Mar 23 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/28369524?dopt=Abstract %R 10.1093/nar/gkx193 %0 Journal Article %J Mol Cell %D 2017 %T Spt5 Plays Vital Roles in the Control of Sense and Antisense Transcription Elongation %A Shetty, Ameet* %A Kallgren, Scott P* %A Demel, Carina %A Maier, Kerstin C %A Spatt, Dan %A Alver, Burak H %A Cramer, Patrick %A Park, Peter J %A Winston, Fred %X

Spt5 is an essential and conserved factor that functions in transcription and co-transcriptional processes. However, many aspects of the requirement for Spt5 in transcription are poorly understood. We have analyzed the consequences of Spt5 depletion in Schizosaccharomyces pombe using four genome-wide approaches. Our results demonstrate that Spt5 is crucial for a normal rate of RNA synthesis and distribution of RNAPII over transcription units. In the absence of Spt5, RNAPII localization changes dramatically, with reduced levels and a relative accumulation over the first ∼500 bp, suggesting that Spt5 is required for transcription past a barrier. Spt5 depletion also results in widespread antisense transcription initiating within this barrier region. Deletions of this region alter the distribution of RNAPII on the sense strand, suggesting that the barrier observed after Spt5 depletion is normally a site at which Spt5 stimulates elongation. Our results reveal a global requirement for Spt5 in transcription elongation.

%B Mol Cell %V 66 %P 77-88.e5 %8 2017 Apr 06 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/28366642?dopt=Abstract %R 10.1016/j.molcel.2017.02.023 %0 Journal Article %J Nat Commun %D 2017 %T The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. %A Alver, Burak H* %A Kim, Kimberly H* %A Lu, Ping %A Wang, Xiaofeng %A Manchester, Haley E %A Wang, Weishan %A Haswell, Jeffrey R %A Park, Peter J** %A Roberts, Charles W M** %X

Genes encoding subunits of SWI/SNF (BAF) chromatin remodelling complexes are collectively altered in over 20% of human malignancies, but the mechanisms by which these complexes alter chromatin to modulate transcription and cell fate are poorly understood. Utilizing mouse embryonic fibroblast and cancer cell line models, here we show via ChIP-seq and biochemical assays that SWI/SNF complexes are preferentially targeted to distal lineage specific enhancers and interact with p300 to modulate histone H3 lysine 27 acetylation. We identify a greater requirement for SWI/SNF at typical enhancers than at most super-enhancers and at enhancers in untranscribed regions than in transcribed regions. Our data further demonstrate that SWI/SNF-dependent distal enhancers are essential for controlling expression of genes linked to developmental processes. Our findings thus establish SWI/SNF complexes as regulators of the enhancer landscape and provide insight into the roles of SWI/SNF in cellular fate control.

%B Nat Commun %V 8 %P 14648 %8 2017 Mar 06 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/28262751?dopt=Abstract %R 10.1038/ncomms14648 %0 Journal Article %J Nat Genet %D 2017 %T SMARCB1-mediated SWI/SNF complex function is essential for enhancer regulation %A Wang, Xiaofeng* %A Lee, Ryan S* %A Alver, Burak H* %A Haswell, Jeffrey R %A Wang, Su %A Mieczkowski, Jakub %A Drier, Yotam %A Gillespie, Shawn M %A Archer, Tenley C %A Wu, Jennifer N %A Tzvetkov, Evgeni P %A Troisi, Emma C %A Pomeroy, Scott L %A Biegel, Jaclyn A %A Tolstorukov, Michael Y %A Bernstein, Bradley E** %A Park, Peter J** %A Roberts, Charles W M** %X

SMARCB1 (also known as SNF5, INI1, and BAF47), a core subunit of the SWI/SNF (BAF) chromatin-remodeling complex, is inactivated in nearly all pediatric rhabdoid tumors. These aggressive cancers are among the most genomically stable, suggesting an epigenetic mechanism by which SMARCB1 loss drives transformation. Here we show that, despite having indistinguishable mutational landscapes, human rhabdoid tumors exhibit distinct enhancer H3K27ac signatures, which identify remnants of differentiation programs. We show that SMARCB1 is required for the integrity of SWI/SNF complexes and that its loss alters enhancer targeting, markedly impairing SWI/SNF binding to typical enhancers, particularly those required for differentiation, while maintaining SWI/SNF binding at super-enhancers. We show that these retained super-enhancers are essential for rhabdoid tumor survival, including some that are shared by all subtypes, such as SPRY1, and other lineage-specific super-enhancers, such as SOX2 in brain-derived rhabdoid tumors. Taken together, our findings identify a new chromatin-based epigenetic mechanism underlying the tumor-suppressive activity of SMARCB1.

%B Nat Genet %V 49 %P 289-295 %8 2017 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/27941797?dopt=Abstract %R 10.1038/ng.3746 %0 Journal Article %J Nat Genet %D 2017 %T ARID1A loss impairs enhancer-mediated gene regulation and drives colon cancer in mice %A Mathur, Radhika %A Alver, Burak H %A San Roman, Adrianna K %A Wilson, Boris G %A Wang, Xiaofeng %A Agoston, Agoston T %A Park, Peter J %A Shivdasani, Ramesh A %A Roberts, Charles W M %X

Genes encoding subunits of SWI/SNF (BAF) chromatin-remodeling complexes are collectively mutated in ∼20% of all human cancers. Although ARID1A is the most frequent target of mutations, the mechanism by which its inactivation promotes tumorigenesis is unclear. Here we demonstrate that Arid1a functions as a tumor suppressor in the mouse colon, but not the small intestine, and that invasive ARID1A-deficient adenocarcinomas resemble human colorectal cancer (CRC). These tumors lack deregulation of APC/β-catenin signaling components, which are crucial gatekeepers in common forms of intestinal cancer. We find that ARID1A normally targets SWI/SNF complexes to enhancers, where they function in coordination with transcription factors to facilitate gene activation. ARID1B preserves SWI/SNF function in ARID1A-deficient cells, but defects in SWI/SNF targeting and control of enhancer activity cause extensive dysregulation of gene expression. These findings represent an advance in colon cancer modeling and implicate enhancer-mediated gene regulation as a principal tumor-suppressor function of ARID1A.

%B Nat Genet %V 49 %P 296-302 %8 2017 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/27941798?dopt=Abstract %R 10.1038/ng.3744 %0 Journal Article %J G3 (Bethesda) %D 2017 %T upSET, the Drosophila homologue of SET3, Is Required for Viability and the Proper Balance of Active and Repressive Chromatin Marks %A McElroy, Kyle A %A Jung, Youngsook L %A Zee, Barry M %A Wang, Charlotte I %A Park, Peter J %A Kuroda, Mitzi I %X

Chromatin plays a critical role in faithful implementation of gene expression programs. Different post-translational modifications (PTMs) of histone proteins reflect the underlying state of gene activity, and many chromatin proteins write, erase, bind, or are repelled by, these histone marks. One such protein is UpSET, the Drosophila homolog of yeast Set3 and mammalian KMT2E (MLL5). Here, we show that UpSET is necessary for the proper balance between active and repressed states. Using CRISPR/Cas-9 editing, we generated S2 cells that are mutant for upSET. We found that loss of UpSET is tolerated in S2 cells, but that heterochromatin is misregulated, as evidenced by a strong decrease in H3K9me2 levels assessed by bulk histone PTM quantification. To test whether this finding was consistent in the whole organism, we deleted the upSET coding sequence using CRISPR/Cas-9, which we found to be lethal in both sexes in flies. We were able to rescue this lethality using a tagged upSET transgene, and found that UpSET protein localizes to transcriptional start sites (TSS) of active genes throughout the genome. Misregulated heterochromatin is apparent by suppressed position effect variegation of the w(m4) allele in heterozygous upSET-deleted flies. Using nascent-RNA sequencing in the upSET-mutant S2 lines, we show that this result applies to heterochromatin genes generally. Our findings support a critical role for UpSET in maintaining heterochromatin, perhaps by delimiting the active chromatin environment.

%B G3 (Bethesda) %V 7 %P 625-635 %8 2017 Feb 09 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/28064188?dopt=Abstract %R 10.1534/g3.116.037788 %0 Journal Article %J Nature %D 2017 %T Integrated genomic and molecular characterization of cervical cancer %A Cancer Genome Atlas Research Network, TCGA %X

Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, the largest comprehensive genomic study of cervical cancer to date. We observed striking APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A, and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered novel amplifications in immune targets CD274/PD-L1 and PDCD1LG2/PD-L2, and the BCAR4 lncRNA that has been associated with response to lapatinib. HPV integration was observed in all HPV18-related cases and 76% of HPV16-related cases, and was associated with structural aberrations and increased target gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumors with high frequencies of KRAS, ARID1A, and PTEN mutations. Integrative clustering of 178 samples identified Keratin-low Squamous, Keratin-high Squamous, and Adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.

%B Nature %V 543 %P 378-84 %8 2017 Jan 23 %G eng %N 7645 %1 http://www.ncbi.nlm.nih.gov/pubmed/28112728?dopt=Abstract %R 10.1038/nature21386 %0 Journal Article %J Nature %D 2017 %T Integrated genomic characterization of oesophageal carcinoma %A Cancer Genome Atlas Research Network, TCGA %X

Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

%B Nature %V 541 %P 169-175 %8 2017 Jan 12 %G eng %N 7636 %1 http://www.ncbi.nlm.nih.gov/pubmed/28052061?dopt=Abstract %R 10.1038/nature20805 %0 Journal Article %J PLoS genetics %D 2016 %T The impact of environmental and endogenous damage on somatic mutation load in human skin fibroblasts. %A Saini, Natalie %A Roberts, Steven A %A Klimczak, Leszek J %A Chan, Kin %A Grimm, Sara A %A Dai, Shuangshuang %A Fargo, David C %A Boyer, Jayne C %A Kaufmann, William K %A Taylor, Jack A %A Lee, Eunjung %A Cortes-Ciriano, Isidro %A Park, Peter J %A Schurman, Shepherd H %A Malc, Ewa P %A Mieczkowski, Piotr A %A Gordenin, Dmitry A %B PLoS genetics %V 12 %P e1006385 %G eng %N 10 %0 Journal Article %J Genome Biol %D 2016 %T Comprehensive analysis of promoter-proximal RNA polymerase II pausing across mammalian cell types. %A Day, Daniel S* %A Zhang, Bing* %A Stevens, Sean M %A Ferrari, Francesco %A Larschan, Erica N %A Park, Peter J** %A Pu, William T** %X

BACKGROUND: For many genes, RNA polymerase II stably pauses before transitioning to productive elongation. Although polymerase II pausing has been shown to be a mechanism for regulating transcriptional activation, the extent to which it is involved in control of mammalian gene expression and its relationship to chromatin structure remain poorly understood. RESULTS: Here, we analyze 85 RNA polymerase II chromatin immunoprecipitation (ChIP)-sequencing experiments from 35 different murine and human samples, as well as related genome-wide datasets, to gain new insights into the relationship between polymerase II pausing and gene regulation. Across cell and tissue types, paused genes (pausing index > 2) comprise approximately 60% of expressed genes and are repeatedly associated with specific biological functions. Paused genes also have lower cell-to-cell expression variability. Increased pausing has a non-linear effect on gene expression levels, with moderately paused genes being expressed more highly than other paused genes. The highest gene expression levels are often achieved through a novel pause-release mechanism driven by high polymerase II initiation. In three datasets examining the impact of extracellular signals, genes responsive to stimulus have slightly lower pausing index on average than non-responsive genes, and rapid gene activation is linked to conditional pause-release. Both chromatin structure and local sequence composition near the transcription start site influence pausing, with divergent features between mammals and Drosophila. Most notably, in mammals pausing is positively correlated with histone H2A.Z occupancy at promoters. CONCLUSIONS: Our results provide new insights into the contribution of RNA polymerase II pausing in mammalian gene regulation and chromatin structure.

%B Genome Biol %V 17 %P 120 %8 2016 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/27259512?dopt=Abstract %R 10.1186/s13059-016-0984-2 %0 Journal Article %J Nucleic Acids Res %D 2016 %T Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants. %A Xi, Ruibin %A Lee, Semin %A Xia, Yuchao %A Kim, Tae-Min %A Park, Peter J %X

Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline CNVs accurately. Analysis of simulation data showed that this method outperforms existing methods. We apply this algorithm to low coverage whole-genome sequencing data from peripheral blood of nearly a thousand patients across eleven cancer types in The Cancer Genome Atlas (TCGA) to identify cancer-predisposing CNV regions. We confirm known regions and discover new ones including those covering KMT2C, GOLPH3, ERBB2 and PLAG1. Analysis of colorectal cancer genomes in particular reveals novel recurrent CNVs including deletions at two chromatin-remodeling genes RERE and NPM2. This method will be useful to many researchers interested in profiling CNVs from whole-genome sequencing data.

%B Nucleic Acids Res %8 2016 Jun 3 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/27260798?dopt=Abstract %R 10.1093/nar/gkw491 %0 Journal Article %J Am J Hum Genet %D 2016 %T Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing. %A Yang, Lixing* %A Lee, Mi-Sook* %A Lu, Hengyu* %A Oh, Doo-Yi %A Kim, Yeon Jeong %A Park, Donghyun %A Park, Gahee %A Ren, Xiaojia %A Bristow, Christopher A %A Haseley, Psalm S %A Lee, Soohyun %A Pantazi, Angeliki %A Kucherlapati, Raju %A Park, Woong-Yang %A Scott, Kenneth L** %A Choi, Yoon-La** %A Park, Peter J** %X

Although exome sequencing data are generated primarily to detect single-nucleotide variants and indels, they can also be used to identify a subset of genomic rearrangements whose breakpoints are located in or near exons. Using >4,600 tumor and normal pairs across 15 cancer types, we identified over 9,000 high confidence somatic rearrangements, including a large number of gene fusions. We find that the 5' fusion partners of functional fusions are often housekeeping genes, whereas the 3' fusion partners are enriched in tyrosine kinases. We establish the oncogenic potential of ROR1-DNAJC6 and CEP85L-ROS1 fusions by showing that they can promote cell proliferation in vitro and tumor formation in vivo. Furthermore, we found that ∼4% of the samples have massively rearranged chromosomes, many of which are associated with upregulation of oncogenes such as ERBB2 and TERT. Although the sensitivity of detecting structural alterations from exomes is considerably lower than that from whole genomes, this approach will be fruitful for the multitude of exomes that have been and will be generated, both in cancer and in other diseases.

%B Am J Hum Genet %V 98 %P 843-56 %8 2016 May 5 %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/27153396?dopt=Abstract %R 10.1016/j.ajhg.2016.03.017 %0 Journal Article %J Nat Commun %D 2016 %T MNase titration reveals differences between nucleosome occupancy and chromatin accessibility. %A Mieczkowski, Jakub %A Cook, April %A Bowman, Sarah K %A Mueller, Britta %A Alver, Burak H %A Kundu, Sharmistha %A Deaton, Aimee M %A Urban, Jennifer A %A Larschan, Erica %A Park, Peter J %A Kingston, Robert E %A Tolstorukov, Michael Y %X

Chromatin accessibility plays a fundamental role in gene regulation. Nucleosome placement, usually measured by quantifying protection of DNA from enzymatic digestion, can regulate accessibility. We introduce a metric that uses micrococcal nuclease (MNase) digestion in a novel manner to measure chromatin accessibility by combining information from several digests of increasing depths. This metric, MACC (MNase accessibility), quantifies the inherent heterogeneity of nucleosome accessibility in which some nucleosomes are seen preferentially at high MNase and some at low MNase. MACC interrogates each genomic locus, measuring both nucleosome location and accessibility in the same assay. MACC can be performed either with or without a histone immunoprecipitation step, and thereby compares histone and non-histone protection. We find that changes in accessibility at enhancers, promoters and other regulatory regions do not correlate with changes in nucleosome occupancy. Moreover, high nucleosome occupancy does not necessarily preclude high accessibility, which reveals novel principles of chromatin regulation.

%B Nat Commun %V 7 %P 11485 %8 2016 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/27151365?dopt=Abstract %R 10.1038/ncomms11485 %0 Journal Article %J BMC Genomics %D 2016 %T Next-generation sequencing-based detection of germline L1-mediated transductions. %A Tica, Jelena* %A Lee, Eunjung* %A Untergasser, Andreas %A Meiers, Sascha %A Garfield, David A %A Gokcumen, Omer %A Furlong, Eileen E M %A Park, Peter J %A Stütz, Adrian M** %A Korbel, Jan O** %X

BACKGROUND: While active LINE-1 (L1) elements possess the ability to mobilize flanking sequences to different genomic loci through a process termed transduction influencing genomic content and structure, an approach for detecting polymorphic germline non-reference transductions in massively-parallel sequencing data has been lacking. RESULTS: Here we present the computational approach TIGER (Transduction Inference in GERmline genomes), enabling the discovery of non-reference L1-mediated transductions by combining L1 discovery with detection of unique insertion sequences and detailed characterization of insertion sites. We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species. CONCLUSIONS: By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.

%B BMC Genomics %V 17 %P 342 %8 2016 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/27161561?dopt=Abstract %R 10.1186/s12864-016-2670-x %0 Journal Article %J Annu Rev Pathol %D 2016 %T Mechanisms and Consequences of Cancer Genome Instability: Lessons from Genome Sequencing Studies. %A Lee, June-Koo %A Choi, Yoon-La %A Kwon, Mijung %A Park, Peter J %X

During tumor evolution, cancer cells can accumulate numerous genetic alterations, ranging from single nucleotide mutations to whole-chromosomal changes. Although a great deal of progress has been made in the past decades in characterizing genomic alterations, recent cancer genome sequencing studies have provided a wealth of information on the detailed molecular profiles of such alterations in various types of cancers. Here, we review our current understanding of the mechanisms and consequences of cancer genome instability, focusing on the findings uncovered through analysis of exome and whole-genome sequencing data. These analyses have shown that most cancers have evidence of genome instability, and the degree of instability is variable within and between cancer types. Importantly, we describe some recent evidence supporting the idea that chromosomal instability could be a major driving force in tumorigenesis and cancer evolution, actively shaping the genomes of cancer cells to maximize their survival advantage. Expected final online publication date for the Annual Review of Pathology: Mechanisms of Disease Volume 11 is May 23, 2016.

%B Annu Rev Pathol %8 2016 Feb 22 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/26907526?dopt=Abstract %R 10.1146/annurev-pathol-012615-044446 %0 Journal Article %J Elife %D 2016 %T Resolving rates of mutation in the brain using single-neuron genomics. %A Evrony, Gilad D* %A Lee, Eunjung* %A Park, Peter J** %A Walsh, Christopher A** %X

Whether somatic mutations contribute functional diversity to brain cells is a long-standing question. Single-neuron genomics enables direct measurement of somatic mutation rates in human brain and promises to answer this question. A recent study (Upton et al., 2015) reported high rates of somatic LINE-1 element (L1) retrotransposition in the hippocampus and cerebral cortex that would have major implications for normal brain function, and further claimed these mutation events preferentially impact genes important for neuronal function. We identify errors in single-cell sequencing approach, bioinformatic analysis, and validation methods that led to thousands of false-positive artifacts being mistakenly interpreted as somatic mutation events. Our reanalysis of the data supports a corrected mutation frequency (0.2 per cell) more than fifty-fold lower than reported, inconsistent with the authors' conclusion of 'ubiquitous' L1 mosaicism, but consistent with L1 elements mobilizing occasionally. Through consideration of the challenges and pitfalls identified, we provide a foundation and framework for designing single-cell genomics studies.

%B Elife %V 5 %8 2016 Feb 22 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/26901440?dopt=Abstract %R 10.7554/eLife.12966 %0 Journal Article %J Brief Bioinform %D 2016 %T Evaluation of somatic copy number estimation tools for whole-exome sequencing data %A Nam, Jae-Yong %A Kim, Nayoung K D %A Kim, Sang Cheol %A Joung, Je-Gun %A Xi, Ruibin %A Lee, Semin %A Park, Peter J** %A Park, Woong-Yang** %K Algorithms %K Chromosome Mapping %K DNA Copy Number Variations %K Exome %K High-Throughput Nucleotide Sequencing %K Humans %K Reproducibility of Results %K Sensitivity and Specificity %K Sequence Analysis, DNA %K Software %X

Whole-exome sequencing (WES) has become a standard method for detecting genetic variants in human diseases. Although the primary use of WES data has been the identification of single nucleotide variations and indels, these data also offer a possibility of detecting copy number variations (CNVs) at high resolution. However, WES data have uneven read coverage along the genome owing to the target capture step, and the development of a robust WES-based CNV tool is challenging. Here, we evaluate six WES somatic CNV detection tools: ADTEx, CONTRA, Control-FREEC, EXCAVATOR, ExomeCNV and Varscan2. Using WES data from 50 kidney chromophobe, 50 bladder urothelial carcinoma, and 50 stomach adenocarcinoma patients from The Cancer Genome Atlas, we compared the CNV calls from the six tools with a reference CNV set that was identified by both single nucleotide polymorphism array 6.0 and whole-genome sequencing data. We found that these algorithms gave highly variable results: visual inspection reveals significant differences between the WES-based segmentation profiles and the reference profile, as well as among the WES-based profiles. Using a 50% overlap criterion, 13-77% of WES CNV calls were covered by CNVs from the reference set, up to 21% of the copy gains were called as losses or vice versa, and dramatic differences in CNV sizes and CNV numbers were observed. Overall, ADTEx and EXCAVATOR had the best performance with relatively high precision and sensitivity. We suggest that the current algorithms for somatic CNV detection from WES data are limited in their performance and that more robust algorithms are needed.

%B Brief Bioinform %V 17 %P 185-92 %8 2016 Mar %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/26210357?dopt=Abstract %R 10.1093/bib/bbv055 %0 Journal Article %J Cancer Cell %D 2016 %T Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma %A Zheng, Siyuan %A Cherniack, Andrew D %A Dewal, Ninad %A Moffitt, Richard A %A Danilova, Ludmila %A Murray, Bradley A %A Lerario, Antonio M %A Else, Tobias %A Knijnenburg, Theo A %A Ciriello, Giovanni %A Kim, Seungchan %A Assie, Guillaume %A Morozova, Olena %A Akbani, Rehan %A Shih, Juliann %A Hoadley, Katherine A %A Choueiri, Toni K %A Waldmann, Jens %A Mete, Ozgur %A Robertson, A Gordon %A Wu, Hsin-Ta %A Raphael, Benjamin J %A Shao, Lina %A Meyerson, Matthew %A Demeure, Michael J %A Beuschlein, Felix %A Gill, Anthony J %A Sidhu, Stan B %A Almeida, Madson Q %A Fragoso, Maria C B V %A Cope, Leslie M %A Kebebew, Electron %A Habra, Mouhammed A %A Whitsett, Timothy G %A Bussey, Kimberly J %A Rainey, William E %A Asa, Sylvia L %A Bertherat, Jérôme %A Fassnacht, Martin %A Wheeler, David A %A Cancer Genome Atlas Research Network %A Hammer, Gary D %A Giordano, Thomas J %A Verhaak, Roel G W %B Cancer Cell %V 30 %P 363 %8 2016 Aug 08 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/27505681?dopt=Abstract %R 10.1016/j.ccell.2016.07.013 %0 Journal Article %J Fly %D 2016 %T Correspondence of Drosophila Polycomb Group proteins with broad H3K27me3 silent domains. %A Jung, Youngsook L* %A Kang, Hyuckjoon* %A Park, Peter J %A Kuroda, Mitzi I %X

The Polycomb group (PcG) proteins are key conserved regulators of development, initially discovered in Drosophila and now strongly implicated in human disease. Nevertheless, differing silencing properties between the Drosophila and mammalian PcG systems have been observed. While specific DNA targeting sites for PcG proteins called Polycomb response elements (PREs) have been identified only in Drosophila, involvement of non-coding RNAs for PcG targeting has been favored in mammals. Another difference lies in the distribution patterns of PcG proteins. In mouse and human cells, PcG proteins show broad distributions, significantly overlapping with H3K27me3 domains. In contrast, only sharp peaks on PRE regions are observed for most PcG proteins in Drosophila, raising the question of how large domains of H3K27me3, up to many tens of kilobases, are formed and maintained in Drosophila. In this Extra View, we provide evidence that PcG distributions on silent chromatin in Drosophila are considerably broader than previously detected. Using BioTAP-XL, a chromatin crosslinking and tandem affinity purification approach, we find a broad, rather than PRE-limited overlap of PcG proteins with H3K27me3, suggesting a conserved spreading mechanism for PcG in flies and mammals.

%B Fly %8 2016 Mar 3 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/26940990?dopt=Abstract %R 10.1080/19336934.2016.1151988 %0 Journal Article %J Mod Pathol %D 2016 %T Intravenous leiomyomatosis: an unusual intermediate between benign and malignant uterine smooth muscle tumors. %A Ordulu, Zehra %A Nucci, Marisa R %A Dal Cin, Paola %A Hollowell, Monica L %A Otis, Christopher N %A Hornick, Jason L %A Park, Peter J %A Kim, Tae-Min %A Quade, Bradley J %A Morton, Cynthia C %X

Intravenous leiomyomatosis is an unusual smooth muscle neoplasm with quasi-malignant intravascular growth but a histologically banal appearance. Herein, we report expression and molecular cytogenetic analyses of a series of 12 intravenous leiomyomatosis cases to better understand the pathogenesis of intravenous leiomyomatosis. All cases were analyzed for the expression of HMGA2, MDM2, and CDK4 proteins by immunohistochemistry based on our previous finding of der(14)t(12;14)(q14.3;q24) in intravenous leiomyomatosis. Seven of 12 (58%) intravenous leiomyomatosis cases expressed HMGA2, and none expressed MDM2 or CDK4. Colocalization of hybridization signals for probes from the HMGA2 locus (12q14.3) and from 14q24 by interphase fluorescence in situ hybridization (FISH) was detected in a mean of 89.2% of nuclei in HMGA2-positive cases by immunohistochemistry, but in only 12.4% of nuclei in negative cases, indicating an association of HMGA2 expression and this chromosomal rearrangement (P = 8.24 × 10^-10). Four HMGA2-positive cases had greater than two HMGA2 hybridization signals per cell. No cases showed loss of a hybridization signal by interphase FISH for the frequently deleted region of 7q22 in uterine leiomyomata. One intravenous leiomyomatosis case analyzed by array comparative genomic hybridization revealed complex copy number variations. Finally, expression profiling was performed on three intravenous leiomyomatosis cases. Interestingly, hierarchical cluster analysis of the expression profiles revealed segregation of the intravenous leiomyomatosis cases with leiomyosarcoma rather than with myometrium, uterine leiomyoma of the usual histological type, or plexiform leiomyoma. These findings suggest that intravenous leiomyomatosis cases share some molecular cytogenetic characteristics with uterine leiomyoma, and expression profiles similar to that of leiomyosarcoma cases, further supporting their intermediate, quasi-malignant behavior. Modern Pathology advance online publication, 19 February 2016; doi:10.1038/modpathol.2016.36.

%B Mod Pathol %8 2016 Feb 19 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/26892441?dopt=Abstract %R 10.1038/modpathol.2016.36 %0 Journal Article %J Cell %D 2016 %T Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. %A Ceccarelli, Michele %A Barthel, Floris P %A Malta, Tathiane M %A Sabedot, Thais S %A Salama, Sofie R %A Murray, Bradley A %A Morozova, Olena %A Newton, Yulia %A Radenbaugh, Amie %A Pagnotta, Stefano M %A Anjum, Samreen %A Wang, Jiguang %A Manyam, Ganiraju %A Zoppoli, Pietro %A Ling, Shiyun %A Rao, Arjun A %A Grifford, Mia %A Cherniack, Andrew D %A Zhang, Hailei %A Poisson, Laila %A Carlotti, Carlos Gilberto %A Tirapelli, Daniela Pretti da Cunha %A Rao, Arvind %A Mikkelsen, Tom %A Lau, Ching C %A Yung, W K Alfred %A Rabadan, Raul %A Huse, Jason %A Brat, Daniel J %A Lehman, Norman L %A Barnholtz-Sloan, Jill S %A Zheng, Siyuan %A Hess, Kenneth %A Rao, Ganesh %A Meyerson, Matthew %A Beroukhim, Rameen %A Cooper, Lee %A Akbani, Rehan %A Wrensch, Margaret %A Haussler, David %A Aldape, Kenneth D %A Laird, Peter W %A Gutmann, David H %A TCGA Research Network %A Noushmehr, Houtan %A Iavarone, Antonio %A Verhaak, Roel G W %X

Therapy development for adult diffuse glioma is hindered by incomplete knowledge of somatic glioma driving alterations and suboptimal disease classification. We defined the complete set of genes associated with 1,122 diffuse grade II-III-IV gliomas from The Cancer Genome Atlas and used molecular profiles to improve disease classification, identify molecular correlations, and provide insights into the progression from low- to high-grade disease. Whole-genome sequencing data analysis determined that ATRX but not TERT promoter mutations are associated with increased telomere length. Recent advances in glioma classification based on IDH mutation and 1p/19q co-deletion status were recapitulated through analysis of DNA methylation profiles, which identified clinically relevant molecular subsets. A subtype of IDH mutant glioma was associated with DNA demethylation and poor outcome; a group of IDH-wild-type diffuse glioma showed molecular similarity to pilocytic astrocytoma and relatively favorable survival. Understanding of cohesive disease groups may aid improved clinical outcomes.

%B Cell %V 164 %P 550-63 %8 2016 Jan 28 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/26824661?dopt=Abstract %R 10.1016/j.cell.2015.12.028 %0 Journal Article %J Cell Rep %D 2016 %T Multilevel Genomics-Based Taxonomy of Renal Cell Carcinoma. %A Chen, Fengju %A Zhang, Yiqun %A Şenbabaoğlu, Yasin %A Ciriello, Giovanni %A Yang, Lixing %A Reznik, Ed %A Shuch, Brian %A Micevic, Goran %A De Velasco, Guillermo %A Shinbrot, Eve %A Noble, Michael S %A Lu, Yiling %A Covington, Kyle R %A Xi, Liu %A Drummond, Jennifer A %A Muzny, Donna %A Kang, Hyojin %A Lee, Junehawk %A Tamboli, Pheroze %A Reuter, Victor %A Shelley, Carl Simon %A Kaipparettu, Benny A %A Bottaro, Donald P %A Godwin, Andrew K %A Gibbs, Richard A %A Getz, Gad %A Kucherlapati, Raju %A Park, Peter J %A Sander, Chris %A Henske, Elizabeth P %A Zhou, Jane H %A Kwiatkowski, David J %A Ho, Thai H %A Choueiri, Toni K %A Hsieh, James J %A Akbani, Rehan %A Mills, Gordon B %A Hakimi, A Ari %A Wheeler, David A %A Creighton, Chad J %X

On the basis of multidimensional and comprehensive molecular characterization (including DNA methylation and copy number, RNA, and protein expression), we classified 894 renal cell carcinomas (RCCs) of various histologic types into nine major genomic subtypes. Site of origin within the nephron was one major determinant in the classification, reflecting differences among clear cell, chromophobe, and papillary RCC. Widespread molecular changes associated with TFE3 gene fusion or chromatin modifier genes were present within a specific subtype and spanned multiple subtypes. Differences in patient survival and in alteration of specific pathways (including hypoxia, metabolism, MAP kinase, NRF2-ARE, Hippo, immune checkpoint, and PI3K/AKT/mTOR) could further distinguish the subtypes. Immune checkpoint markers and molecular signatures of T cell infiltrates were both highest in the subtype associated with aggressive clear cell RCC. Differences between the genomic subtypes suggest that therapeutic strategies could be tailored to each RCC disease subset.

%B Cell Rep %8 2016 Mar 2 %G ENG %1 http://www.ncbi.nlm.nih.gov/pubmed/26947078?dopt=Abstract %R 10.1016/j.celrep.2016.02.024 %0 Journal Article %J Nature %D 2015 %T The histone chaperone CAF-1 safeguards somatic cell identity. %A Cheloufi, Sihem %A Elling, Ulrich %A Hopfgartner, Barbara %A Jung, Youngsook L %A Murn, Jernej %A Ninova, Maria %A Hubmann, Maria %A Badeaux, Aimee I %A Euong Ang, Cheen %A Tenen, Danielle %A Wesche, Daniel J %A Abazova, Nadezhda %A Hogue, Max %A Tasdemir, Nilgun %A Brumbaugh, Justin %A Rathert, Philipp %A Jude, Julian %A Ferrari, Francesco %A Blanco, Andres %A Fellner, Michaela %A Wenzel, Daniel %A Zinner, Marietta %A Vidal, Simon E %A Bell, Oliver %A Stadtfeld, Matthias %A Chang, Howard Y %A Almouzni, Genevieve %A Lowe, Scott W %A Rinn, John %A Wernig, Marius %A Aravin, Alexei %A Shi, Yang %A Park, Peter J %A Penninger, Josef M %A Zuber, Johannes %A Hochedlinger, Konrad %X

Cellular differentiation involves profound remodelling of chromatin landscapes, yet the mechanisms by which somatic cell identity is subsequently maintained remain incompletely understood. To further elucidate regulatory pathways that safeguard the somatic state, we performed two comprehensive RNA interference (RNAi) screens targeting chromatin factors during transcription-factor-mediated reprogramming of mouse fibroblasts to induced pluripotent stem cells (iPS cells). Subunits of the chromatin assembly factor-1 (CAF-1) complex, including Chaf1a and Chaf1b, emerged as the most prominent hits from both screens, followed by modulators of lysine sumoylation and heterochromatin maintenance. Optimal modulation of both CAF-1 and transcription factor levels increased reprogramming efficiency by several orders of magnitude and facilitated iPS cell formation in as little as 4 days. Mechanistically, CAF-1 suppression led to a more accessible chromatin structure at enhancer elements early during reprogramming. These changes were accompanied by a decrease in somatic heterochromatin domains, increased binding of Sox2 to pluripotency-specific targets and activation of associated genes. Notably, suppression of CAF-1 also enhanced the direct conversion of B cells into macrophages and fibroblasts into neurons. Together, our findings reveal the histone chaperone CAF-1 to be a novel regulator of somatic cell identity during transcription-factor-induced cell-fate transitions and provide a potential strategy to modulate cellular plasticity in a regenerative setting.

%B Nature %V 528 %P 218-24 %8 2015 Dec 10 %G eng %N 7581 %1 http://www.ncbi.nlm.nih.gov/pubmed/26659182?dopt=Abstract %R 10.1038/nature15749 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2015 %T Pericentromeric satellite repeat expansions through RNA-derived DNA intermediates in cancer. %A Bersani, Francesca %A Lee, Eunjung %A Kharchenko, Peter V %A Xu, Andrew W %A Liu, Mingzhu %A Xega, Kristina %A MacKenzie, Olivia C %A Brannigan, Brian W %A Wittner, Ben S %A Jung, Hyunchul %A Ramaswamy, Sridhar %A Park, Peter J %A Maheswaran, Shyamala %A Ting, David T %A Haber, Daniel A %X

Aberrant transcription of the pericentromeric human satellite II (HSATII) repeat is present in a wide variety of epithelial cancers. In deriving experimental systems to study its deregulation, we observed that HSATII expression is induced in colon cancer cells cultured as xenografts or under nonadherent conditions in vitro, but it is rapidly lost in standard 2D cultures. Unexpectedly, physiological induction of endogenous HSATII RNA, as well as introduction of synthetic HSATII transcripts, generated cDNA intermediates in the form of DNA/RNA hybrids. Single molecule sequencing of tumor xenografts showed that HSATII RNA-derived DNA (rdDNA) molecules are stably incorporated within pericentromeric loci. Suppression of RT activity using small molecule inhibitors reduced HSATII copy gain. Analysis of whole-genome sequencing data revealed that HSATII copy number gain is a common feature in primary human colon tumors and is associated with a lower overall survival. Together, our observations suggest that cancer-associated derepression of specific repetitive sequences can promote their RNA-driven genomic expansion, with potential implications on pericentromeric architecture.

%B Proc Natl Acad Sci U S A %V 112 %P 15148-53 %8 2015 Dec 8 %G eng %N 49 %1 http://www.ncbi.nlm.nih.gov/pubmed/26575630?dopt=Abstract %R 10.1073/pnas.1518008112 %0 Journal Article %J Cell %D 2015 %T The Molecular Taxonomy of Primary Prostate Cancer. %A Cancer Genome Atlas Research Network, TCGA %X

There is substantial heterogeneity among primary prostate cancers, evident in the spectrum of molecular abnormalities and its variable clinical course. As part of The Cancer Genome Atlas (TCGA), we present a comprehensive molecular analysis of 333 primary prostate carcinomas. Our results revealed a molecular taxonomy in which 74% of these tumors fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, and FLI1) or mutations (SPOP, FOXA1, and IDH1). Epigenetic profiles showed substantial heterogeneity, including an IDH1 mutant subset with a methylator phenotype. Androgen receptor (AR) activity varied widely and in a subtype-specific manner, with SPOP and FOXA1 mutant tumors having the highest levels of AR-induced transcripts. 25% of the prostate cancers had a presumed actionable lesion in the PI3K or MAPK signaling pathways, and DNA repair genes were inactivated in 19%. Our analysis reveals molecular heterogeneity among primary prostate cancers, as well as potentially actionable molecular defects.

%B Cell %V 163 %P 1011-25 %8 2015 Nov 5 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/26544944?dopt=Abstract %R 10.1016/j.cell.2015.10.025 %0 Journal Article %J Nat Biotechnol %D 2015 %T A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. %A Choi, Jiho* %A Lee, Soohyun* %A Mallard, William %A Clement, Kendell %A Tagliazucchi, Guidantonio Malagoli %A Lim, Hotae %A Choi, In Young %A Ferrari, Francesco %A Tsankov, Alexander M %A Pop, Ramona %A Lee, Gabsang %A Rinn, John L %A Meissner, Alexander %A Park, Peter J** %A Hochedlinger, Konrad** %X

The equivalence of human induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs) remains controversial. Here we use genetically matched hESC and hiPSC lines to assess the contribution of cellular origin (hESC vs. hiPSC), the Sendai virus (SeV) reprogramming method and genetic background to transcriptional and DNA methylation patterns while controlling for cell line clonality and sex. We find that transcriptional and epigenetic variation originating from genetic background dominates over variation due to cellular origin or SeV infection. Moreover, the 49 differentially expressed genes we detect between genetically matched hESCs and hiPSCs neither predict functional outcome nor distinguish an independently derived, larger set of unmatched hESC and hiPSC lines. We conclude that hESCs and hiPSCs are molecularly and functionally equivalent and cannot be distinguished by a consistent gene expression signature. Our data further imply that genetic background variation is a major confounding factor for transcriptional and epigenetic comparisons of pluripotent cell lines, explaining some of the previously observed differences between genetically unmatched hESCs and hiPSCs.

%B Nat Biotechnol %V 33 %P 1173-81 %8 2015 Oct 26 %G ENG %N 11 %1 http://www.ncbi.nlm.nih.gov/pubmed/26501951?dopt=Abstract %R 10.1038/nbt.3388 %0 Journal Article %J Nat Genet %D 2015 %T Intron retention is a widespread mechanism of tumor-suppressor inactivation %A Jung, Hyunchul %A Lee, Donghoon %A Lee, Jongkeun %A Park, Donghyun %A Kim, Yeon Jeong %A Park, Woong-Yang %A Hong, Dongwan** %A Park, Peter J** %A Lee, Eunjung** %X

A substantial fraction of disease-causing mutations are pathogenic through aberrant splicing. Although genome profiling studies have identified somatic single-nucleotide variants (SNVs) in cancer, the extent to which these variants trigger abnormal splicing has not been systematically examined. Here we analyzed RNA sequencing and exome data from 1,812 patients with cancer and identified ∼900 somatic exonic SNVs that disrupt splicing. At least 163 SNVs, including 31 synonymous ones, were shown to cause intron retention or exon skipping in an allele-specific manner, with ∼70% of the SNVs occurring on the last base of exons. Notably, SNVs causing intron retention were enriched in tumor suppressors, and 97% of these SNVs generated a premature termination codon, leading to loss of function through nonsense-mediated decay or truncated protein. We also characterized the genomic features predictive of such splicing defects. Overall, this work demonstrates that intron retention is a common mechanism of tumor-suppressor inactivation.

%B Nat Genet %V 47 %P 1242-8 %G eng %N 11 %0 Journal Article %J Science %D 2015 %T Somatic mutation in single human neurons tracks developmental and transcriptional history. %A Lodato, Michael A* %A Woodworth, Mollie B* %A Lee, Semin* %A Evrony, Gilad D %A Mehta, Bhaven K %A Karger, Amir %A Lee, Soohyun %A Chittenden, Thomas W %A D'Gama, Alissa M %A Cai, Xuyu %A Luquette, Lovelace J %A Lee, Eunjung %A Park, Peter J** %A Walsh, Christopher A** %X

Neurons live for decades in a postmitotic state, their genomes susceptible to DNA damage. Here we survey the landscape of somatic single-nucleotide variants (SNVs) in the human brain. We identified thousands of somatic SNVs by single-cell sequencing of 36 neurons from the cerebral cortex of three normal individuals. Unlike germline and cancer SNVs, which are often caused by errors in DNA replication, neuronal mutations appear to reflect damage during active transcription. Somatic mutations create nested lineage trees, allowing them to be dated relative to developmental landmarks and revealing a polyclonal architecture of the human cerebral cortex. Thus, somatic mutations in the brain represent a durable and ongoing record of neuronal life history, from development through postmitotic function.

%B Science %V 350 %P 94-8 %8 2015 Oct 2 %G eng %N 6256 %1 http://www.ncbi.nlm.nih.gov/pubmed/26430121?dopt=Abstract %R 10.1126/science.aab1785 %0 Journal Article %J N Engl J Med %D 2015 %T Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas %A Cancer Genome Atlas Research Network, TCGA %X

BACKGROUND: Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas.

METHODS: We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes.

RESULTS: Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma.

CONCLUSIONS: The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q codeletion or carried a TP53 mutation. Most lower-grade gliomas without an IDH mutation were molecularly and clinically similar to glioblastoma. (Funded by the National Institutes of Health.)

%B N Engl J Med %V 372 %P 2481-98 %G eng %N 26 %0 Journal Article %J Nature %D 2015 %T Comprehensive genomic characterization of head and neck squamous cell carcinomas. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Carcinoma, Squamous Cell %K DNA Copy Number Variations %K DNA, Neoplasm %K Female %K Gene Expression Regulation, Neoplastic %K Genome, Human %K Genomics %K Head and Neck Neoplasms %K Humans %K Male %K Molecular Targeted Therapy %K Mutation %K Oncogenes %K RNA, Neoplasm %K Signal Transduction %K Transcription Factors %X

The Cancer Genome Atlas profiled 279 head and neck squamous cell carcinomas (HNSCCs) to provide a comprehensive landscape of somatic genomic alterations. Here we show that human-papillomavirus-associated tumours are dominated by helical domain mutations of the oncogene PIK3CA, novel alterations involving loss of TRAF3, and amplification of the cell cycle gene E2F1. Smoking-related HNSCCs demonstrate near universal loss-of-function TP53 mutations and CDKN2A inactivation with frequent copy number alterations including amplification of 3q26/28 and 11q13/22. A subgroup of oral cavity tumours with favourable clinical outcomes displayed infrequent copy number alterations in conjunction with activating mutations of HRAS or PIK3CA, coupled with inactivating mutations of CASP8, NOTCH1 and TP53. Other distinct subgroups contained loss-of-function alterations of the chromatin modifier NSD1, WNT pathway genes AJUBA and FAT1, and activation of oxidative stress factor NFE2L2, mainly in laryngeal tumours. Therapeutic candidate alterations were identified in most HNSCCs.

%B Nature %V 517 %P 576-82 %8 2015 Jan 29 %G eng %N 7536 %1 http://www.ncbi.nlm.nih.gov/pubmed/25631445?dopt=Abstract %R 10.1038/nature14129 %0 Journal Article %J Cell %D 2015 %T Genomic Classification of Cutaneous Melanoma. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Databases, Genetic %K Humans %K Melanoma %K Mutation %K National Cancer Institute (U.S.) %K Skin Neoplasms %K United States %X

We describe the landscape of genomic alterations in cutaneous melanomas through DNA, RNA, and protein-based analysis of 333 primary and/or metastatic melanomas from 331 patients. We establish a framework for genomic classification into one of four subtypes based on the pattern of the most prevalent significantly mutated genes: mutant BRAF, mutant RAS, mutant NF1, and Triple-WT (wild-type). Integrative analysis reveals enrichment of KIT mutations and focal amplifications and complex structural rearrangements as a feature of the Triple-WT subtype. We found no significant outcome correlation with genomic classification, but samples assigned a transcriptomic subclass enriched for immune gene expression associated with lymphocyte infiltrate on pathology review and high LCK protein expression, a T cell marker, were associated with improved patient survival. This clinicopathological and multi-dimensional analysis suggests that the prognosis of melanoma patients with regional metastases is influenced by tumor stroma immunobiology, offering insights to further personalize therapeutic decision-making.

%B Cell %V 161 %P 1681-96 %8 2015 Jun 18 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/26091043?dopt=Abstract %R 10.1016/j.cell.2015.05.044 %0 Journal Article %J Nature %D 2015 %T Hallmarks of pluripotency. %A De Los Angeles, Alejandro %A Ferrari, Francesco %A Xi, Ruibin %A Fujiwara, Yuko %A Benvenisty, Nissim %A Deng, Hongkui %A Hochedlinger, Konrad %A Jaenisch, Rudolf %A Lee, Soohyun %A Leitch, Harry G %A Lensch, M William %A Lujan, Ernesto %A Pei, Duanqing %A Rossant, Janet %A Wernig, Marius %A Park, Peter J %A Daley, George Q %X

Stem cells self-renew and generate specialized progeny through differentiation, but vary in the range of cells and tissues they generate, a property called developmental potency. Pluripotent stem cells produce all cells of an organism, while multipotent or unipotent stem cells regenerate only specific lineages or tissues. Defining stem-cell potency relies upon functional assays and diagnostic transcriptional, epigenetic and metabolic states. Here we describe functional and molecular hallmarks of pluripotent stem cells, propose a checklist for their evaluation, and illustrate how forensic genomics can validate their provenance.

%B Nature %V 525 %P 469-78 %8 2015 Sep 23 %G eng %N 7570 %1 http://www.ncbi.nlm.nih.gov/pubmed/26399828?dopt=Abstract %R 10.1038/nature15515 %0 Journal Article %J BMC Bioinformatics %D 2015 %T EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering. %A Lee, Soohyun* %A Seo, Chae Hwa* %A Alver, Burak Han %A Hyuk Lee, Sang %A Park, Peter J %X

BACKGROUND: RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost. RESULTS: We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods. CONCLUSIONS: EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost. EMSAR is available at https://github.com/parklab/emsar.

%B BMC Bioinformatics %V 16 %P 278 %8 2015 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/26335049?dopt=Abstract %R 10.1186/s12859-015-0704-z %0 Journal Article %J Genes Dev %D 2015 %T Sex comb on midleg (Scm) is a functional link between PcG-repressive complexes in Drosophila. %A Kang, Hyuckjoon %A McElroy, Kyle A %A Jung, Youngsook Lucy %A Alekseyenko, Artyom A %A Zee, Barry M %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K Drosophila melanogaster %K Drosophila Proteins %K Histones %K Polycomb-Group Proteins %K Polytene Chromosomes %K Protein Binding %K Protein Transport %K Repressor Proteins %X

The Polycomb group (PcG) proteins are key regulators of development in Drosophila and are strongly implicated in human health and disease. How PcG complexes form repressive chromatin domains remains unclear. Using cross-linked affinity purifications of BioTAP-Polycomb (Pc) or BioTAP-Enhancer of zeste [E(z)], we captured all PcG-repressive complex 1 (PRC1) or PRC2 core components and Sex comb on midleg (Scm) as the only protein strongly enriched with both complexes. Although previously not linked to PRC2, we confirmed direct binding of Scm and PRC2 using recombinant protein expression and colocalization of Scm with PRC1, PRC2, and H3K27me3 in embryos and cultured cells using ChIP-seq (chromatin immunoprecipitation [ChIP] combined with deep sequencing). Furthermore, we found that RNAi knockdown of Scm and overexpression of the dominant-negative Scm-SAM (sterile α motif) domain both affected the binding pattern of E(z) on polytene chromosomes. Aberrant localization of the Scm-SAM domain in long contiguous regions on polytene chromosomes revealed its independent ability to spread on chromatin, consistent with its previously described ability to oligomerize in vitro. Pull-downs of BioTAP-Scm captured PRC1 and PRC2 and additional repressive complexes, including PhoRC, LINT, and CtBP. We propose that Scm is a key mediator connecting PRC1, PRC2, and transcriptional silencing. Combined with previous structural and genetic analyses, our results strongly suggest that Scm coordinates PcG complexes and polymerizes to produce broad domains of PcG silencing.

%B Genes Dev %V 29 %P 1136-50 %8 2015 Jun 1 %G eng %N 11 %1 http://www.ncbi.nlm.nih.gov/pubmed/26063573?dopt=Abstract %R 10.1101/gad.260562.115 %0 Journal Article %J Cancer Cell %D 2015 %T Spatiotemporal Evolution of the Primary Glioblastoma Genome. %A Kim, Jinkuk %A Lee, In-Hee %A Cho, Hee Jin %A Park, Chul-Kee %A Jung, Yang-Soon %A Kim, Yanghee %A Nam, So Hee %A Kim, Byung Sup %A Johnson, Mark D %A Kong, Doo-Sik %A Seol, Ho Jun %A Lee, Jung-Il %A Joo, Kyeung Min %A Yoon, Yeup %A Park, Woong-Yang %A Lee, Jeongwu %A Park, Peter J** %A Nam, Do-Hyun** %X

Tumor recurrence following treatment is the major cause of mortality for glioblastoma multiforme (GBM) patients. Thus, insights on the evolutionary process at recurrence are critical for improved patient care. Here, we describe our genomic analyses of the initial and recurrent tumor specimens from each of 38 GBM patients. A substantial divergence in the landscape of driver alterations was associated with distant appearance of a recurrent tumor from the initial tumor, suggesting that the genomic profile of the initial tumor can mislead targeted therapies for the distally recurred tumor. In addition, in contrast to IDH1-mutated gliomas, IDH1-wild-type primary GBMs rarely developed hypermutation following temozolomide (TMZ) treatment, indicating low risk for TMZ-induced hypermutation for these tumors under the standard regimen.

%B Cancer Cell %V 28 %P 318-28 %8 2015 Sep 14 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/26373279?dopt=Abstract %R 10.1016/j.ccell.2015.07.013 %0 Journal Article %J Neuron %D 2015 %T Cell lineage analysis in human brain using endogenous retroelements. %A Evrony, Gilad D* %A Lee, Eunjung* %A Mehta, Bhaven K %A Benjamini, Yuval %A Johnson, Robert M %A Cai, Xuyu %A Yang, Lixing %A Haseley, Psalm %A Lehmann, Hillel S %A Park, Peter J** %A Walsh, Christopher A** %X

Somatic mutations occur during brain development and are increasingly implicated as a cause of neurogenetic disease. However, the patterns in which somatic mutations distribute in the human brain are unknown. We used high-coverage whole-genome sequencing of single neurons from a normal individual to identify spontaneous somatic mutations as clonal marks to track cell lineages in human brain. Somatic mutation analyses in >30 locations throughout the nervous system identified multiple lineages and sublineages of cells marked by different LINE-1 (L1) retrotransposition events and subsequent mutation of poly-A microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, whereas a second distinct clone contained millions of cells distributed over the entire left hemisphere. These patterns mirror known somatic mutation disorders of brain development and suggest that focally distributed mutations are also prevalent in normal brains. Single-cell analysis of somatic mutation enables tracing of cell lineage clones in human brain.

%B Neuron %V 85 %P 49-59 %8 2015 Jan 7 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/25569347?dopt=Abstract %R 10.1016/j.neuron.2014.12.028 %0 Journal Article %J J Am Soc Nephrol %D 2015 %T Genome-Wide Analysis of Wilms' Tumor 1-Controlled Gene Expression in Podocytes Reveals Key Regulatory Mechanisms. %A Kann, Martin %A Ettou, Sandrine* %A Jung, Youngsook L* %A Lenz, Maximilian O %A Taglienti, Mary E %A Park, Peter J %A Schermer, Bernhard %A Benzing, Thomas %A Kreidberg, Jordan A %X

The transcription factor Wilms' tumor suppressor 1 (WT1) is key to podocyte development and viability; however, WT1 transcriptional networks in podocytes remain elusive. We provide a comprehensive analysis of the genome-wide WT1 transcriptional network in podocytes in vivo using chromatin immunoprecipitation followed by sequencing (ChIPseq) and RNA sequencing techniques. Our data show a specific role for WT1 in regulating the podocyte-specific transcriptome through binding to both promoters and enhancers of target genes. Furthermore, we inferred a podocyte transcription factor network consisting of WT1, LMX1B, TCF21, Fox-class and TEAD family transcription factors, and MAFB that uses tissue-specific enhancers to control podocyte gene expression. In addition to previously described WT1-dependent target genes, ChIPseq identified novel WT1-dependent signaling systems. These targets included components of the Hippo signaling system, underscoring the power of genome-wide transcriptional-network analyses. Together, our data elucidate a comprehensive gene regulatory network in podocytes suggesting that WT1 gene regulatory function and podocyte cell-type specification can best be understood in the context of transcription factor-regulatory element network interplay.

%B J Am Soc Nephrol %V 26 %P 2097-104 %8 2015 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/25636411?dopt=Abstract %R 10.1681/ASN.2014090940 %0 Journal Article %J Bioinformatics %D 2015 %T hiHMM: Bayesian non-parametric joint inference of chromatin state maps. %A Sohn, Kyung-Ah* %A Ho, Joshua W K* %A Djordjevic, Djordje %A Jeong, Hyun-Hwan %A Park, Peter J** %A Kim, Ju Han** %X

MOTIVATION: Genome-wide mapping of chromatin states is essential for defining regulatory elements and inferring their activities in eukaryotic genomes. A number of hidden Markov model (HMM)-based methods have been developed to infer chromatin state maps from genome-wide histone modification data for an individual genome. To perform a principled comparison of evolutionarily distant epigenomes, we must consider species-specific biases such as differences in genome size, strength of signal enrichment and co-occurrence patterns of histone modifications. RESULTS: Here, we present a new Bayesian non-parametric method called hierarchically linked infinite HMM (hiHMM) to jointly infer chromatin state maps in multiple genomes (different species, cell types and developmental stages) using genome-wide histone modification data. This flexible framework provides a new way to learn a consistent definition of chromatin states across multiple genomes, thus facilitating a direct comparison among them. We demonstrate the utility of this method using synthetic data as well as multiple modENCODE ChIP-seq datasets. CONCLUSION: The hierarchical and Bayesian non-parametric formulation in our approach is an important extension to the current set of methodologies for comparative chromatin landscape analysis. AVAILABILITY AND IMPLEMENTATION: Source codes are available at https://github.com/kasohn/hiHMM. Chromatin data are available at http://encode-x.med.harvard.edu/data_sets/chromatin/.

%B Bioinformatics %V 31 %P 2066-74 %8 2015 Jul 1 %G eng %N 13 %1 http://www.ncbi.nlm.nih.gov/pubmed/25725496?dopt=Abstract %R 10.1093/bioinformatics/btv117 %0 Journal Article %J Hum Mol Genet %D 2015 %T Htt CAG repeat expansion confers pleiotropic gains of mutant huntingtin function in chromatin regulation. %A Biagioli, Marta* %A Ferrari, Francesco* %A Mendenhall, Eric M %A Zhang, Yijing %A Erdin, Serkan %A Vijayvargia, Ravi %A Vallabh, Sonia M %A Solomos, Nicole %A Manavalan, Poornima %A Ragavendran, Ashok %A Ozsolak, Fatih %A Lee, Jong Min %A Talkowski, Michael E %A Gusella, James F %A Macdonald, Marcy E %A Park, Peter J %A Seong, Ihn Sik %X

The CAG repeat expansion in the Huntington's disease gene HTT extends a polyglutamine tract in mutant huntingtin that enhances its ability to facilitate polycomb repressive complex 2 (PRC2). To gain insight into this dominant gain of function, we mapped histone modifications genome-wide across an isogenic panel of mouse embryonic stem cell (ESC) and neuronal progenitor cell (NPC) lines, comparing the effects of Htt null and different size Htt CAG mutations. We found that Htt is required in ESC for the proper deposition of histone H3K27me3 at a subset of 'bivalent' loci but in NPC it is needed at 'bivalent' loci for both the proper maintenance and the appropriate removal of this mark. In contrast, Htt CAG size, though changing histone H3K27me3, is prominently associated with altered histone H3K4me3 at 'active' loci. The sets of ESC and NPC genes with altered histone marks delineated by the lack of huntingtin or the presence of mutant huntingtin, though distinct, are enriched in similar pathways with apoptosis specifically highlighted for the CAG mutation. Thus, the manner by which huntingtin function facilitates PRC2 may afford mutant huntingtin with multiple opportunities to impinge upon the broader machinery that orchestrates developmentally appropriate chromatin status.

%B Hum Mol Genet %8 2015 Jan 8 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/25574027?dopt=Abstract %R 10.1093/hmg/ddv006 %0 Journal Article %J Oncogene %D 2015 %T A small subunit processome protein promotes cancer by altering translation. %A Yang, H W %A Kim, T-M %A Song, S S %A Menon, L %A Jiang, X %A Huang, W %A Black, P M %A Park, P J %A Carroll, R S %A Johnson, M D %X

Dysregulation of ribosome biogenesis or translation can promote cancer, but the underlying mechanisms remain unclear. UTP18 is a component of the small subunit processome, a nucleolar multi-protein complex whose only known function is to cleave pre-ribosomal RNA to yield the 18S ribosomal RNA component of 40S ribosomal subunits. Here, we show that UTP18 also alters translation to promote stress resistance and growth, and that UTP18 is frequently gained and overexpressed in cancer. We observed that UTP18 localizes to the cytoplasm in a subset of cells, and that serum withdrawal increases cytoplasmic UTP18 localization. Cytoplasmic UTP18 associates with the translation complex and Hsp90 to upregulate the translation of IRES-containing transcripts such as HIF1a, Myc and VEGF, thereby inducing stress resistance. Hsp90 inhibition decreases cytoplasmic UTP18 and UTP18-induced increases in translation. Importantly, elevated UTP18 expression correlates with increased aggressiveness and decreased survival in numerous cancers. Enforced UTP18 overexpression promotes transformation and tumorigenesis, whereas UTP18 knockdown inhibits these processes. This stress adaptation mechanism is thus co-opted for growth by cancers, and its inhibition may represent a promising new therapeutic target.

%B Oncogene %V 34 %P 4471-81 %8 2015 Aug 20 %G eng %N 34 %1 http://www.ncbi.nlm.nih.gov/pubmed/25435373?dopt=Abstract %R 10.1038/onc.2014.376 %0 Journal Article %J Molecular Cancer %D 2015 %T Identification of rare germline copy number variations over-represented in five human cancer types. %A Park, Richard W* %A Kim, Tae-Min* %A Kasif, Simon %A Park, Peter J %X

BACKGROUND: Copy number variations (CNVs) are increasingly recognized as significant disease susceptibility markers in many complex disorders including cancer. The availability of a large number of chromosomal copy number profiles in both malignant and normal tissues in cancer patients presents an opportunity to characterize not only somatic alterations but also germline CNVs, which may confer increased risk for cancer. RESULTS: We explored the germline CNVs in five cancer cohorts from the Cancer Genome Atlas (TCGA) consisting of 351 brain, 336 breast, 342 colorectal, 370 renal, and 314 ovarian cancers, genotyped on Affymetrix SNP6.0 arrays. Comparing these to ~3000 normal controls from another study, our case-control association study revealed 39 genomic loci (9 brain, 3 breast, 4 colorectal, 11 renal, and 12 ovarian cancers) as potential candidates of tumor susceptibility loci. Many of these loci are new and in some cases are associated with a substantial increase in disease risk. The majority of the observed loci do not overlap with coding sequences; however, several observed genomic loci overlap with known cancer genes including RET in brain cancers, ERBB2 in renal cell carcinomas, and DCC in ovarian cancers, all of which have not been previously associated with germline changes in cancer. CONCLUSIONS: This large-scale genome-wide association study for CNVs across multiple cancer types identified several novel rare germline CNVs as cancer predisposing genomic loci. These loci can potentially serve as clinically useful markers conferring increased cancer risk.

%B Molecular Cancer %V 14 %P 25 %8 2015 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/25644941?dopt=Abstract %R 10.1186/s12943-015-0292-6 %0 Journal Article %J Nature %D 2015 %T Failure to replicate the STAP cell phenomenon %A De Los Angeles, Alejandro* %A Ferrari, Francesco* %A Fujiwara, Yuko %A Mathieu, Ronald %A Lee, Soohyun %A Lee, Semin %A Tu, Ho-Chou %A Ross, Samantha %A Chou, Stephanie %A Nguyen, Minh %A Wu, Zhaoting %A Theunissen, Thorold W %A Powell, Benjamin E %A Imsoonthornruksa, Sumeth %A Chen, Jiekai %A Borkent, Marti %A Krupalnik, Vladislav %A Lujan, Ernesto %A Wernig, Marius %A Hanna, Jacob H %A Hochedlinger, Konrad %A Pei, Duanqing %A Jaenisch, Rudolf %A Deng, Hongkui %A Orkin, Stuart H %A Park, Peter J** %A Daley, George Q** %B Nature %V 525 %P E6-9 %8 2015 Sep 24 %G eng %N 7570 %1 http://www.ncbi.nlm.nih.gov/pubmed/26399835?dopt=Abstract %R 10.1038/nature15513 %0 Journal Article %J J Am Soc Nephrol %D 2014 %T A bioinformatics approach identifies signal transducer and activator of transcription-3 and checkpoint kinase 1 as upstream regulators of kidney injury molecule-1 after kidney injury. %A Ajay, Amrendra Kumar %A Kim, Tae-Min %A Ramirez-Gonzalez, Victoria %A Park, Peter J %A Frank, David A %A Vaidya, Vishal S %K Acute Kidney Injury %K Animals %K Cell Adhesion Molecules %K Cell Line %K Computational Biology %K DNA Damage %K Gene Expression Regulation %K Humans %K Kidney %K Male %K Membrane Glycoproteins %K Oxidative Stress %K Phosphorylation %K Promoter Regions, Genetic %K Protein Binding %K Protein Kinases %K Rats %K Rats, Wistar %K Receptors, Virus %K Reperfusion Injury %K RNA, Messenger %K RNA, Small Interfering %K STAT3 Transcription Factor %X

Kidney injury molecule-1 (KIM-1)/T cell Ig and mucin domain-containing protein-1 (TIM-1) is upregulated more than other proteins after AKI, and it is highly expressed in renal damage of various etiologies. In this capacity, KIM-1/TIM-1 acts as a phosphatidylserine receptor on the surface of injured proximal tubular epithelial cells, mediating phagocytosis of apoptotic cells, and it may also act as a costimulatory molecule for immune cells. Despite recognition of KIM-1 as an important therapeutic target for kidney disease, the regulators of KIM-1 transcription in the kidney remain unknown. Using a bioinformatics approach, we identified upstream regulators of KIM-1 after AKI. In response to tubular injury in rat and human kidneys or oxidant stress in human proximal tubular epithelial cells (HPTECs), KIM-1 expression increased significantly in a manner that corresponded temporally and regionally with increased phosphorylation of checkpoint kinase 1 (Chk1) and STAT3. Both ischemic and oxidant stress resulted in a dramatic increase in reactive oxygen species that phosphorylated and activated Chk1, which subsequently bound to STAT3, phosphorylating it at S727. Furthermore, STAT3 bound to the KIM-1 promoter after ischemic and oxidant stress, and pharmacological or genetic induction of STAT3 in HPTECs increased KIM-1 mRNA and protein levels. Conversely, inhibition of STAT3 using siRNAs or dominant negative mutants reduced KIM-1 expression in a kidney cancer cell line (769-P) that expresses high basal levels of KIM-1. These observations highlight Chk1 and STAT3 as critical upstream regulators of KIM-1 expression after AKI and may suggest novel approaches for therapeutic intervention.

%B J Am Soc Nephrol %V 25 %P 105-18 %8 2014 Jan %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/24158981?dopt=Abstract %R 10.1681/ASN.2013020161 %0 Journal Article %J Cell %D 2014 %T Integrated genomic characterization of papillary thyroid carcinoma. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Carcinoma %K DNA Copy Number Variations %K Gene Fusion %K Humans %K Mutation %K Thyroid Gland %K Thyroid Neoplasms %X

Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D, and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors, and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease.

%B Cell %V 159 %P 676-90 %8 2014 Oct 23 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/25417114?dopt=Abstract %R 10.1016/j.cell.2014.09.050 %0 Journal Article %J Nature %D 2014 %T Comprehensive molecular characterization of gastric adenocarcinoma. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Adenocarcinoma %K Female %K Gene Expression Regulation, Neoplastic %K Genome, Human %K Herpesvirus 4, Human %K Humans %K Male %K Mutation %K Proteome %K Stomach Neoplasms %X

Gastric cancer is a leading cause of cancer deaths, but analysis of its molecular and clinical characteristics has been complicated by histological and aetiological heterogeneity. Here we describe a comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project. We propose a molecular classification dividing gastric cancer into four subtypes: tumours positive for Epstein-Barr virus, which display recurrent PIK3CA mutations, extreme DNA hypermethylation, and amplification of JAK2, CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2); microsatellite unstable tumours, which show elevated mutation rates, including mutations of genes encoding targetable oncogenic signalling proteins; genomically stable tumours, which are enriched for the diffuse histological variant and mutations of RHOA or fusions involving RHO-family GTPase-activating proteins; and tumours with chromosomal instability, which show marked aneuploidy and focal amplification of receptor tyrosine kinases. Identification of these subtypes provides a roadmap for patient stratification and trials of targeted therapies.

%B Nature %V 513 %P 202-9 %8 2014 Sep 11 %G eng %N 7517 %1 http://www.ncbi.nlm.nih.gov/pubmed/25079317?dopt=Abstract %R 10.1038/nature13480 %0 Journal Article %J Nature %D 2014 %T Comprehensive molecular characterization of urothelial bladder carcinoma. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Cell Cycle %K Chromatin %K Down-Regulation %K Gene Expression Regulation, Neoplastic %K Humans %K MicroRNAs %K Molecular Targeted Therapy %K Oxidative Stress %K Phosphatidylinositol 3-Kinases %K Protein Kinases %K Proto-Oncogene Proteins c-akt %K RNA, Messenger %K Signal Transduction %K TOR Serine-Threonine Kinases %K Urinary Bladder Neoplasms %K Virus Integration %X

Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. So far, no molecularly targeted agents have been approved for treatment of the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell-cycle regulation, chromatin regulation, and kinase signalling pathways, as well as 9 genes not previously reported as significantly mutated in any cancer. RNA sequencing revealed four expression subtypes, two of which (papillary-like and basal/squamous-like) were also evident in microRNA sequencing and protein data. Whole-genome and RNA sequencing identified recurrent in-frame activating FGFR3-TACC3 fusions and expression or integration of several viruses (including HPV16) that are associated with gene inactivation. Our analyses identified potential therapeutic targets in 69% of the tumours, including 42% with targets in the phosphatidylinositol-3-OH kinase/AKT/mTOR pathway and 45% with targets (including ERBB2) in the RTK/MAPK pathway. Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any other common cancer studied so far, indicating the future possibility of targeted therapy for chromatin abnormalities.

%B Nature %V 507 %P 315-22 %8 2014 Mar 20 %G eng %N 7492 %1 http://www.ncbi.nlm.nih.gov/pubmed/24476821?dopt=Abstract %R 10.1038/nature12965 %0 Journal Article %J Nature %D 2014 %T Comprehensive molecular profiling of lung adenocarcinoma. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Adenocarcinoma %K Cell Cycle Proteins %K Female %K Gene Dosage %K Gene Expression Regulation, Neoplastic %K Genomics %K Humans %K Lung Neoplasms %K Male %K Molecular Typing %K Mutation %K Oncogenes %K Sex Factors %K Transcriptome %X

Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.

%B Nature %V 511 %P 543-50 %8 2014 Jul 31 %G eng %N 7511 %1 http://www.ncbi.nlm.nih.gov/pubmed/25079552?dopt=Abstract %R 10.1038/nature13385 %0 Journal Article %J G3 %D 2014 %T Large-scale quality analysis of published ChIP-seq data. %A Marinov, Georgi K %A Kundaje, Anshul %A Park, Peter J %A Wold, Barbara J %K Animals %K Chromatin Immunoprecipitation %K Data Interpretation, Statistical %K Databases, Genetic %K High-Throughput Nucleotide Sequencing %K MyoD Protein %K Quality Control %K Sequence Analysis, DNA %K Transcription Factors %X

ChIP-seq has become the primary method for identifying in vivo protein-DNA interactions on a genome-wide scale, with nearly 800 publications involving the technique appearing in PubMed as of December 2012. Individually and in aggregate, these data are an important and information-rich resource. However, uncertainties about data quality confound their use by the wider research community. Recently, the Encyclopedia of DNA Elements (ENCODE) project developed and applied metrics to objectively measure ChIP-seq data quality. The ENCODE quality analysis was useful for flagging datasets for closer inspection, eliminating or replacing poor data, and for driving changes in experimental pipelines. There had been no similarly systematic quality analysis of the large and disparate body of published ChIP-seq profiles. Here, we report a uniform analysis of vertebrate transcription factor ChIP-seq datasets in the Gene Expression Omnibus (GEO) repository as of April 1, 2012. The majority (55%) of datasets scored as being highly successful, but a substantial minority (20%) were of apparently poor quality, and another ∼25% were of intermediate quality. We discuss how different uses of ChIP-seq data are affected by specific aspects of data quality, and we highlight exceptional instances for which the metric values should not be taken at face value. Unexpectedly, we discovered that a significant subset of control datasets (i.e., no immunoprecipitation and mock immunoprecipitation samples) display an enrichment structure similar to successful ChIP-seq data. This can, in turn, affect peak calling and data interpretation. Published datasets identified here as high-quality comprise a large group that users can draw on for large-scale integrated analysis. 
In the future, ChIP-seq quality assessment similar to that used here could guide experimentalists at early stages in a study, provide useful input in the publication process, and be used to stratify ChIP-seq data for different community-wide uses.

%B G3 %V 4 %P 209-23 %8 2014 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/24347632?dopt=Abstract %R 10.1534/g3.113.008680 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2014 %T Characterization of HPV and host genome interactions in primary head and neck cancers. %A Parfenov, Michael %A Pedamallu, Chandra Sekhar %A Gehlenborg, Nils %A Freeman, Samuel S %A Danilova, Ludmila %A Bristow, Christopher A %A Lee, Semin %A Hadjipanayis, Angela G %A Ivanova, Elena V %A Wilkerson, Matthew D %A Protopopov, Alexei %A Yang, Lixing %A Seth, Sahil %A Song, Xingzhi %A Tang, Jiabin %A Ren, Xiaojia %A Zhang, Jianhua %A Pantazi, Angeliki %A Santoso, Netty %A Xu, Andrew W %A Mahadeshwar, Harshad %A Wheeler, David A %A Haddad, Robert I %A Jung, Joonil %A Ojesina, Akinyemi I %A Issaeva, Natalia %A Yarbrough, Wendell G %A Hayes, D Neil %A Grandis, Jennifer R %A El-Naggar, Adel K %A Meyerson, Matthew %A Park, Peter J %A Chin, Lynda %A Seidman, J G %A Hammerman, Peter S %A Kucherlapati, Raju %A Cancer Genome Atlas Network, The Cancer Genome Atlas %X

Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.

%B Proc Natl Acad Sci U S A %V 111 %P 15544-9 %8 2014 Oct 28 %G eng %N 43 %1 http://www.ncbi.nlm.nih.gov/pubmed/25313082?dopt=Abstract %R 10.1073/pnas.1416074111 %0 Journal Article %J Nature %D 2014 %T Comparative analysis of metazoan chromatin organization. %A Ho, Joshua W K* %A Jung, Youngsook L* %A Liu, Tao* %A Alver, Burak Han %A Lee, Soohyun %A Ikegami, Kohta %A Sohn, Kyung-Ah %A Minoda, Aki %A Tolstorukov, Michael Y %A Appert, Alex %A Parker, Stephen C J %A Gu, Tingting %A Kundaje, Anshul %A Riddle, Nicole C %A Bishop, Eric P %A Egelhofer, Thea A %A Hu, Sheng'en Shawn %A Alekseyenko, Artyom A %A Rechtsteiner, Andreas %A Asker, Dalal %A Belsky, Jason A %A Bowman, Sarah K %A Chen, Q Brent %A Chen, Ron A-J %A Day, Daniel S %A Dong, Yan %A Dose, Andrea C %A Duan, Xikun %A Epstein, Charles B %A Ercan, Sevinc %A Feingold, Elise A %A Ferrari, Francesco %A Garrigues, Jacob M %A Gehlenborg, Nils %A Good, Peter J %A Haseley, Psalm %A He, Daniel %A Herrmann, Moritz %A Hoffman, Michael M %A Jeffers, Tess E %A Kharchenko, Peter V %A Kolasinska-Zwierz, Paulina %A Kotwaliwale, Chitra V %A Kumar, Nischay %A Langley, Sasha A %A Larschan, Erica N %A Latorre, Isabel %A Libbrecht, Maxwell W %A Lin, Xueqiu %A Park, Richard %A Pazin, Michael J %A Pham, Hoang N %A Plachetka, Annette %A Qin, Bo %A Schwartz, Yuri B %A Shoresh, Noam %A Stempor, Przemyslaw %A Vielle, Anne %A Wang, Chengyang %A Whittle, Christina M %A Xue, Huiling %A Kingston, Robert E %A Kim, Ju Han %A Bernstein, Bradley E %A Dernburg, Abby F %A Pirrotta, Vincenzo %A Kuroda, Mitzi I %A Noble, William S %A Tullius, Thomas D %A Kellis, Manolis %A MacAlpine, David M** %A Strome, Susan** %A Elgin, Sarah C R** %A Liu, Xiaole Shirley** %A Lieb, Jason D** %A Ahringer, Julie** %A Karpen, Gary H** %A Park, Peter J** %K Animals %K Caenorhabditis elegans %K Cell Line %K Centromere %K Chromatin %K Chromatin Assembly and Disassembly %K DNA Replication %K Drosophila melanogaster %K Enhancer Elements, Genetic %K Epigenesis, Genetic 
%K Heterochromatin %K Histones %K Humans %K Molecular Sequence Annotation %K Nuclear Lamina %K Nucleosomes %K Promoter Regions, Genetic %K Species Specificity %X

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.

%B Nature %V 512 %P 449-52 %8 2014 Aug 28 %G eng %N 7515 %1 http://www.ncbi.nlm.nih.gov/pubmed/25164756?dopt=Abstract %R 10.1038/nature13415 %0 Journal Article %J Nature %D 2014 %T Comparative analysis of the transcriptome across distant species. %A Gerstein, Mark B* ** %A Rozowsky, Joel* %A Yan, Koon-Kiu* %A Wang, Daifeng* %A Cheng, Chao* %A Brown, James B* %A Davis, Carrie A* %A Hillier, LaDeana* %A Sisu, Cristina* %A Li, Jingyi Jessica* %A Pei, Baikang* %A Harmanci, Arif O* %A Duff, Michael O* %A Djebali, Sarah* %A Alexander, Roger P %A Alver, Burak Han %A Auerbach, Raymond %A Bell, Kimberly %A Bickel, Peter J %A Boeck, Max E %A Boley, Nathan P %A Booth, Benjamin W %A Cherbas, Lucy %A Cherbas, Peter %A Di, Chao %A Dobin, Alex %A Drenkow, Jorg %A Ewing, Brent %A Fang, Gang %A Fastuca, Megan %A Feingold, Elise A %A Frankish, Adam %A Gao, Guanjun %A Good, Peter J %A Guigó, Roderic %A Hammonds, Ann %A Harrow, Jen %A Hoskins, Roger A %A Howald, Cédric %A Hu, Long %A Huang, Haiyan %A Hubbard, Tim J P %A Huynh, Chau %A Jha, Sonali %A Kasper, Dionna %A Kato, Masaomi %A Kaufman, Thomas C %A Kitchen, Robert R %A Ladewig, Erik %A Lagarde, Julien %A Lai, Eric %A Leng, Jing %A Lu, Zhi %A MacCoss, Michael %A May, Gemma %A McWhirter, Rebecca %A Merrihew, Gennifer %A Miller, David M %A Mortazavi, Ali %A Murad, Rabi %A Oliver, Brian %A Olson, Sara %A Park, Peter J %A Pazin, Michael J %A Perrimon, Norbert %A Pervouchine, Dmitri %A Reinke, Valerie %A Reymond, Alexandre %A Robinson, Garrett %A Samsonova, Anastasia %A Saunders, Gary I %A Schlesinger, Felix %A Sethi, Anurag %A Slack, Frank J %A Spencer, William C %A Stoiber, Marcus H %A Strasbourger, Pnina %A Tanzer, Andrea %A Thompson, Owen A %A Wan, Kenneth H %A Wang, Guilin %A Wang, Huaien %A Watkins, Kathie L %A Wen, Jiayu %A Wen, Kejia %A Xue, Chenghai %A Yang, Li %A Yip, Kevin %A Zaleski, Chris %A Zhang, Yan %A Zheng, Henry %A Brenner, Steven E** %A Graveley, Brenton R** %A Celniker, Susan E** %A Gingeras, Thomas R** %A Waterston, Robert** %K Animals %K Caenorhabditis elegans %K Chromatin %K Cluster Analysis %K Drosophila melanogaster %K Gene Expression Profiling %K Gene Expression Regulation, Developmental %K Histones %K Humans %K Larva %K Models, Genetic %K Molecular Sequence Annotation %K Promoter Regions, Genetic %K Pupa %K RNA, Untranslated %K Sequence Analysis, RNA %K Transcriptome %X

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

%B Nature %V 512 %P 445-8 %8 2014 Aug 28 %G eng %N 7515 %1 http://www.ncbi.nlm.nih.gov/pubmed/25164755?dopt=Abstract %R 10.1038/nature13424 %0 Journal Article %J Cancer Res %D 2014 %T A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies. %A Kim, Tae-Min %A Park, Peter J %X

Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA-mismatch repair system. These somatic mutations can result in inactivation of tumor-suppressor genes or disrupt other noncoding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodologic aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events to chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets.

%B Cancer Res %V 74 %P 6377-82 %8 2014 Nov 15 %G eng %N 22 %1 http://www.ncbi.nlm.nih.gov/pubmed/25371413?dopt=Abstract %R 10.1158/0008-5472.CAN-14-1225 %0 Journal Article %J Nat Methods %D 2014 %T Guided visual exploration of genomic stratifications in cancer. %A Streit, Marc* %A Lex, Alexander* %A Gratzl, Samuel %A Partl, Christian %A Schmalstieg, Dieter %A Pfister, Hanspeter %A Park, Peter J** %A Gehlenborg, Nils** %K Algorithms %K Animals %K Chromosome Mapping %K Data Mining %K Database Management Systems %K Databases, Genetic %K Genetic Predisposition to Disease %K Genomics %K Humans %K Neoplasms %K Proteome %K Software %K User-Computer Interface %B Nat Methods %V 11 %P 884-5 %8 2014 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/25166867?dopt=Abstract %R 10.1038/nmeth.3088 %0 Journal Article %J Nucleic Acids Res %D 2014 %T Impact of sequencing depth in ChIP-seq experiments. %A Jung, Youngsook L %A Luquette, Lovelace J %A Ho, Joshua W K %A Ferrari, Francesco %A Tolstorukov, Michael %A Minoda, Aki %A Issner, Robbyn %A Epstein, Charles B %A Karpen, Gary H %A Kuroda, Mitzi I %A Park, Peter J %K Algorithms %K Animals %K Chromatin Immunoprecipitation %K Drosophila melanogaster %K Genome, Human %K Genome, Insect %K Genomic Library %K High-Throughput Nucleotide Sequencing %K Histones %K Humans %K Models, Genetic %K Protein Processing, Post-Translational %K Sequence Analysis, DNA %X

In a chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiment, an important consideration in experimental design is the minimum number of sequenced reads required to obtain statistically significant results. We present an extensive evaluation of the impact of sequencing depth on identification of enriched regions for key histone modifications (H3K4me3, H3K36me3, H3K27me3 and H3K9me2/me3) using deep-sequenced datasets in human and fly. We propose to define sufficient sequencing depth as the number of reads at which detected enrichment regions increase <1% for an additional million reads. Although the required depth depends on the nature of the mark and the state of the cell in each experiment, we observe that sufficient depth is often reached at <20 million reads for fly. For human, there are no clear saturation points for the examined datasets, but our analysis suggests 40-50 million reads as a practical minimum for most marks. We also devise a mathematical model to estimate the sufficient depth and total genomic coverage of a mark. Lastly, we find that the five algorithms tested do not agree well for broad enrichment profiles, especially at lower depths. Our findings suggest that sufficient sequencing depth and an appropriate peak-calling algorithm are essential for ensuring robustness of conclusions derived from ChIP-seq data.
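
The saturation criterion stated in this abstract (enrichment growing by less than 1% per additional million reads) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sufficient_depth`, its parameters, and the choice to measure enrichment as a simple region count at each subsampled depth are all assumptions made for illustration, and the input numbers in the usage note are invented.

```python
from typing import Optional, Sequence


def sufficient_depth(
    depths_millions: Sequence[float],
    enriched_regions: Sequence[float],
    threshold: float = 0.01,
) -> Optional[float]:
    """Return the first depth (in millions of reads) at which the detected
    enrichment grows by less than `threshold` (as a fraction) per extra
    million reads, or None if no such depth is observed.

    Hypothetical helper: both sequences are assumed to come from
    subsampling one deep dataset, sorted by increasing depth.
    """
    for i in range(len(depths_millions) - 1):
        d0, d1 = depths_millions[i], depths_millions[i + 1]
        r0, r1 = enriched_regions[i], enriched_regions[i + 1]
        if r0 <= 0 or d1 <= d0:
            continue  # skip points where a relative gain is undefined
        # Relative gain in detected enrichment, normalised per million reads.
        gain_per_million = (r1 - r0) / r0 / (d1 - d0)
        if gain_per_million < threshold:
            return d0
    return None
```

With made-up counts of 1000, 1500, 1800, 1880 and 1900 regions at 5, 10, 15, 20 and 25 million reads, the per-million gain first falls below 1% between 15 and 20 million reads, so the sketch reports 15. A curve that never flattens returns `None`, which corresponds to the abstract's remark that the human datasets showed no clear saturation point.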

%B Nucleic Acids Res %V 42 %P e74 %8 2014 May %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/24598259?dopt=Abstract %R 10.1093/nar/gku178 %0 Journal Article %J Cell %D 2014 %T Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. %A Hoadley, Katherine A %A Yau, Christina %A Wolf, Denise M %A Cherniack, Andrew D %A Tamborero, David %A Ng, Sam %A Leiserson, Max D M %A Niu, Beifang %A McLellan, Michael D %A Uzunangelov, Vladislav %A Zhang, Jiashan %A Kandoth, Cyriac %A Akbani, Rehan %A Shen, Hui %A Omberg, Larsson %A Chu, Andy %A Margolin, Adam A %A Van't Veer, Laura J %A Lopez-Bigas, Nuria %A Laird, Peter W %A Raphael, Benjamin J %A Ding, Li %A Robertson, A Gordon %A Byers, Lauren A %A Mills, Gordon B %A Weinstein, John N %A Van Waes, Carter %A Chen, Zhong %A Collisson, Eric A %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %A Benz, Christopher C %A Perou, Charles M %A Stuart, Joshua M %K Cluster Analysis %K Humans %K Neoplasms %K Transcriptome %X

Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multiplatform classification, while correlated with tissue-of-origin, provides independent information for predicting clinical outcomes. All data sets are available for data-mining from a unified resource to support further biological discoveries and insights into novel therapeutic strategies.

%B Cell %V 158 %P 929-44 %8 2014 Aug 14 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/25109877?dopt=Abstract %R 10.1016/j.cell.2014.06.049 %0 Journal Article %J Nat Commun %D 2014 %T Nucleosomal occupancy changes locally over key regulatory regions during cell differentiation and reprogramming. %A West, Jason A* %A Cook, April* %A Alver, Burak Han %A Stadtfeld, Matthias %A Deaton, Aimee M %A Hochedlinger, Konrad %A Park, Peter J** %A Tolstorukov, Michael Y** %A Kingston, Robert E** %X

Chromatin structure determines DNA accessibility. We compare nucleosome occupancy in mouse and human embryonic stem cells (ESCs), induced-pluripotent stem cells (iPSCs) and differentiated cell types using MNase-seq. To address variability inherent in this technique, we developed a bioinformatic approach to identify regions of difference (RoD) in nucleosome occupancy between pluripotent and somatic cells. Surprisingly, most chromatin remains unchanged; a majority of rearrangements appear to affect a single nucleosome. RoDs are enriched at genes and regulatory elements, including enhancers associated with pluripotency and differentiation. RoDs co-localize with binding sites of key developmental regulators, including the reprogramming factors Klf4, Oct4/Sox2 and c-Myc. Nucleosomal landscapes in ESC enhancers are extensively altered, exhibiting lower nucleosome occupancy in pluripotent cells than in somatic cells. Most changes are reset during reprogramming. We conclude that changes in nucleosome occupancy are a hallmark of cell differentiation and reprogramming and likely identify regulatory regions essential for these processes.

%B Nat Commun %V 5 %P 4719 %8 2014 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/25158628?dopt=Abstract %R 10.1038/ncomms5719 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2014 %T p53 prevents neurodegeneration by regulating synaptic genes. %A Merlo, Paola %A Frost, Bess %A Peng, Shouyong %A Yang, Yawei J %A Park, Peter J %A Feany, Mel %X

DNA damage has been implicated in neurodegenerative disorders, including Alzheimer's disease and other tauopathies, but the consequences of genotoxic stress to postmitotic neurons are poorly understood. Here we demonstrate that p53, a key mediator of the DNA damage response, plays a neuroprotective role in a Drosophila model of tauopathy. Further, through a whole-genome ChIP-chip analysis, we identify genes controlled by p53 in postmitotic neurons. We genetically validate a specific pathway, synaptic function, in p53-mediated neuroprotection. We then demonstrate that the control of synaptic genes by p53 is conserved in mammals. Collectively, our results implicate synaptic function as a central target in p53-dependent protection from neurodegeneration.

%B Proc Natl Acad Sci U S A %V 111 %P 18055-60 %8 2014 Dec 16 %G eng %N 50 %1 http://www.ncbi.nlm.nih.gov/pubmed/25453105?dopt=Abstract %R 10.1073/pnas.1419083111 %0 Journal Article %J Cancer Cell %D 2014 %T The somatic genomic landscape of chromophobe renal cell carcinoma. %A Davis, Caleb F* %A Ricketts, Christopher J* %A Wang, Min* %A Yang, Lixing* %A Cherniack, Andrew D %A Shen, Hui %A Buhay, Christian %A Kang, Hyojin %A Kim, Sang Cheol %A Fahey, Catherine C %A Hacker, Kathryn E %A Bhanot, Gyan %A Gordenin, Dmitry A %A Chu, Andy %A Gunaratne, Preethi H %A Biehl, Michael %A Seth, Sahil %A Kaipparettu, Benny A %A Bristow, Christopher A %A Donehower, Lawrence A %A Wallen, Eric M %A Smith, Angela B %A Tickoo, Satish K %A Tamboli, Pheroze %A Reuter, Victor %A Schmidt, Laura S %A Hsieh, James J %A Choueiri, Toni K %A Hakimi, A Ari %A Cancer Genome Atlas Research Network %A Chin, Lynda %A Meyerson, Matthew %A Kucherlapati, Raju %A Park, Woong-Yang %A Robertson, A Gordon %A Laird, Peter W %A Henske, Elizabeth P %A Kwiatkowski, David J %A Park, Peter J %A Morgan, Margaret %A Shuch, Brian %A Muzny, Donna %A Wheeler, David A %A Linehan, W Marston %A Gibbs, Richard A %A Rathmell, W Kimryn %A Creighton, Chad J %K Base Sequence %K Carcinoma, Renal Cell %K Chromosome Breakpoints %K Chromosome Deletion %K Chromosomes, Human %K DNA Copy Number Variations %K DNA Methylation %K DNA Mutational Analysis %K DNA, Mitochondrial %K Exome %K Genome, Human %K Humans %K Kidney Neoplasms %K Molecular Sequence Data %K Promoter Regions, Genetic %K Telomerase %K Transcriptome %X

We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) on the basis of multidimensional and comprehensive characterization, including mtDNA and whole-genome sequencing. The results are consistent with ChRCC originating from the distal nephron, in contrast to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within the TERT promoter region, which correlate with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT upregulation in cancer distinct from previously observed amplifications and point mutations.

%B Cancer Cell %V 26 %P 319-30 %8 2014 Sep 8 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/25155756?dopt=Abstract %R 10.1016/j.ccr.2014.07.014 %0 Journal Article %J Cell Cycle %D 2014 %T Rearranging the chromatin for pluripotency. %A Ferrari, Francesco* %A Apostolou, Effie* %A Park, Peter J** %A Hochedlinger, Konrad** %B Cell Cycle %V 13 %P 167-8 %8 2014 Jan 15 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/24241209?dopt=Abstract %R 10.4161/cc.27028 %0 Journal Article %J Nat Struct Mol Biol %D 2014 %T Transcriptional control of a whole chromosome: emerging models for dosage compensation. %A Ferrari, Francesco %A Alekseyenko, Artyom A %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K Caenorhabditis elegans %K Chromosomes %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Gene Expression Regulation %K Gene Silencing %K Genomics %K Mammals %K Models, Genetic %K X Chromosome Inactivation %X

Males and females of many animal species differ in their sex-chromosome karyotype, and this creates imbalances between X-chromosome and autosomal gene products that require compensation. Although distinct molecular mechanisms have evolved in three highly studied systems, they all achieve coordinate regulation of an entire chromosome by differential RNA-polymerase occupancy at X-linked genes. High-throughput genome-wide methods have been pivotal in driving the latest progress in the field. Here we review the emerging models for dosage compensation in mammals, flies and nematodes, with a focus on mechanisms affecting RNA polymerase II activity on the X chromosome.

%B Nat Struct Mol Biol %V 21 %P 118-25 %8 2014 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/24500429?dopt=Abstract %R 10.1038/nsmb.2763 %0 Journal Article %J Nature %D 2013 %T Comprehensive molecular characterization of clear cell renal cell carcinoma. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Acetyl-CoA Carboxylase %K AMP-Activated Protein Kinases %K Carcinoma, Renal Cell %K Chromatin %K Chromatin Assembly and Disassembly %K Citric Acid Cycle %K DNA Methylation %K DNA Mutational Analysis %K Epigenesis, Genetic %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Genome, Human %K Genomics %K GRB10 Adaptor Protein %K Histone-Lysine N-Methyltransferase %K Humans %K Metabolic Networks and Pathways %K MicroRNAs %K Mutation %K Pentose Phosphate Pathway %K Phosphatidylinositol 3-Kinases %K Proto-Oncogene Proteins c-akt %K PTEN Phosphohydrolase %K RNA, Neoplasm %K Signal Transduction %K Survival Analysis %X Genetic changes underlying clear cell renal cell carcinoma (ccRCC) include alterations in genes controlling cellular oxygen sensing (for example, VHL) and the maintenance of chromatin states (for example, PBRM1). We surveyed more than 400 tumours using different genomic platforms and identified 19 significantly mutated genes. The PI(3)K/AKT pathway was recurrently mutated, suggesting this pathway as a potential therapeutic target. Widespread DNA hypomethylation was associated with mutation of the H3K36 methyltransferase SETD2, and integrative analysis suggested that mutations involving the SWI/SNF chromatin remodelling complex (PBRM1, ARID1A, SMARCA4) could have far-reaching effects on other pathways. 
Aggressive cancers demonstrated evidence of a metabolic shift, involving downregulation of genes involved in the TCA cycle, decreased AMPK and PTEN protein levels, upregulation of the pentose phosphate pathway and the glutamine transporter genes, increased acetyl-CoA carboxylase protein, and altered promoter methylation of miR-21 (also known as MIR21) and GRB10. Remodelling cellular metabolism thus constitutes a recurrent pattern in ccRCC that correlates with tumour stage and severity and offers new views on the opportunities for disease treatment. %B Nature %V 499 %P 43-9 %8 2013 Jul 4 %G eng %N 7456 %1 http://www.ncbi.nlm.nih.gov/pubmed/23792563?dopt=Abstract %R 10.1038/nature12222 %0 Journal Article %J Nature %D 2013 %T Integrated genomic characterization of endometrial carcinoma. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Breast Neoplasms %K Chromosome Aberrations %K DNA Copy Number Variations %K DNA Mutational Analysis %K DNA Polymerase II %K DNA-Binding Proteins %K Endometrial Neoplasms %K Exome %K Female %K Gene Expression Regulation, Neoplastic %K Genome, Human %K Genomics %K Humans %K Ovarian Neoplasms %K Signal Transduction %K Transcription Factors %X We performed an integrated genomic, transcriptomic and proteomic characterization of 373 endometrial carcinomas using array- and sequencing-based technologies. Uterine serous tumours and ∼25% of high-grade endometrioid tumours had extensive copy number alterations, few DNA methylation changes, low oestrogen receptor/progesterone receptor levels, and frequent TP53 mutations. Most endometrioid tumours had few copy number alterations or TP53 mutations, but frequent mutations in PTEN, CTNNB1, PIK3CA, ARID1A and KRAS and novel mutations in the SWI/SNF chromatin remodelling complex gene ARID5B. A subset of endometrioid tumours that we identified had a markedly increased transversion mutation frequency and newly identified hotspot mutations in POLE. 
Our results classified endometrial cancers into four categories: POLE ultramutated, microsatellite instability hypermutated, copy-number low, and copy-number high. Uterine serous carcinomas share genomic features with ovarian serous and basal-like breast carcinomas. We demonstrated that the genomic features of endometrial carcinomas permit a reclassification that may affect post-surgical adjuvant treatment for women with aggressive tumours. %B Nature %V 497 %P 67-73 %8 2013 May 2 %G eng %N 7447 %1 http://www.ncbi.nlm.nih.gov/pubmed/23636398?dopt=Abstract %R 10.1038/nature12113 %0 Journal Article %J Cell %D 2013 %T The somatic genomic landscape of glioblastoma. %A Brennan, Cameron W %A Verhaak, Roel G W %A McKenna, Aaron %A Campos, Benito %A Noushmehr, Houtan %A Salama, Sofie R %A Zheng, Siyuan %A Chakravarty, Debyani %A Sanborn, J Zachary %A Berman, Samuel H %A Beroukhim, Rameen %A Bernard, Brady %A Wu, Chang-Jiun %A Genovese, Giannicola %A Shmulevich, Ilya %A Barnholtz-Sloan, Jill %A Zou, Lihua %A Vegesna, Rahulsimham %A Shukla, Sachet A %A Ciriello, Giovanni %A Yung, W K %A Zhang, Wei %A Sougnez, Carrie %A Mikkelsen, Tom %A Aldape, Kenneth %A Bigner, Darell D %A Van Meir, Erwin G %A Prados, Michael %A Sloan, Andrew %A Black, Keith L %A Eschbacher, Jennifer %A Finocchiaro, Gaetano %A Friedman, William %A Andrews, David W %A Guha, Abhijit %A Iacocca, Mary %A O'Neill, Brian P %A Foltz, Greg %A Myers, Jerome %A Weisenberger, Daniel J %A Penny, Robert %A Kucherlapati, Raju %A Perou, Charles M %A Hayes, D Neil %A Gibbs, Richard %A Marra, Marco %A Mills, Gordon B %A Lander, Eric %A Spellman, Paul %A Wilson, Richard %A Sander, Chris %A Weinstein, John %A Meyerson, Matthew %A Gabriel, Stacey %A Laird, Peter W %A Haussler, David %A Getz, Gad %A Chin, Lynda %A TCGA Research Network %K Brain Neoplasms %K Female %K Gene Expression Profiling %K Gene Regulatory Networks %K Glioblastoma %K Humans %K Male %K Mutation %K Proteome %K Signal Transduction %X We describe the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer. %B Cell %V 155 %P 462-77 %8 2013 Oct 10 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/24120142?dopt=Abstract %R 10.1016/j.cell.2013.09.034 %0 Journal Article %J Nat Genet %D 2013 %T The Cancer Genome Atlas Pan-Cancer analysis project. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %A Weinstein, John N %A Collisson, Eric A %A Mills, Gordon B %A Shaw, Kenna R Mills %A Ozenberger, Brad A %A Ellrott, Kyle %A Shmulevich, Ilya %A Sander, Chris %A Stuart, Joshua M %K Gene Expression Profiling %K Genome %K Humans %K Neoplasms %X

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

%B Nat Genet %V 45 %P 1113-20 %8 2013 Oct %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/24071849?dopt=Abstract %R 10.1038/ng.2764 %0 Journal Article %J Genes Dev %D 2013 %T The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. %A Soruco, Marcela M L* %A Chery, Jessica* %A Bishop, Eric P* %A Siggers, Trevor %A Tolstorukov, Michael Y %A Leydon, Alexander R %A Sugden, Arthur U %A Goebel, Karen %A Feng, Jessica %A Xia, Peng %A Vedenko, Anastasia %A Bulyk, Martha L %A Park, Peter J %A Larschan, Erica %K Animals %K Cell Line %K DNA-Binding Proteins %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Female %K Male %K Protein Binding %K X Chromosome %X

The Drosophila male-specific lethal (MSL) dosage compensation complex increases transcript levels on the single male X chromosome to equal the transcript levels in XX females. However, it is not known how the MSL complex is linked to its DNA recognition elements, the critical first step in dosage compensation. Here, we demonstrate that a previously uncharacterized zinc finger protein, CLAMP (chromatin-linked adaptor for MSL proteins), functions as the first link between the MSL complex and the X chromosome. CLAMP directly binds to the MSL complex DNA recognition elements and is required for the recruitment of the MSL complex. The discovery of CLAMP identifies a key factor required for the chromosome-specific targeting of dosage compensation, providing new insights into how subnuclear domains of coordinate gene regulation are formed within metazoan genomes.

%B Genes Dev %V 27 %P 1551-6 %8 2013 Jul 15 %G eng %N 14 %1 http://www.ncbi.nlm.nih.gov/pubmed/23873939?dopt=Abstract %R 10.1101/gad.214585.113 %0 Journal Article %J Genes Dev %D 2013 %T Conservation and de novo acquisition of dosage compensation on newly evolved sex chromosomes in Drosophila. %A Alekseyenko, Artyom A %A Ellison, Christopher E %A Gorchakov, Andrey A %A Zhou, Qi %A Kaiser, Vera B %A Toda, Nick %A Walton, Zaak %A Peng, Shouyong %A Park, Peter J %A Bachtrog, Doris %A Kuroda, Mitzi I %K Animals %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Evolution, Molecular %K Female %K Karyotype %K Male %K Molecular Sequence Data %K Sex Chromosomes %X

Dosage compensation has arisen in response to the evolution of distinct male (XY) and female (XX) karyotypes. In Drosophila melanogaster, the MSL complex increases male X transcription approximately twofold. X-specific targeting is thought to occur through sequence-dependent binding to chromatin entry sites (CESs), followed by spreading in cis to active genes. We tested this model by asking how newly evolving sex chromosome arms in Drosophila miranda acquired dosage compensation. We found evidence for the creation of new CESs, with sequence and spacing analogous to those in D. melanogaster, providing strong support for the spreading model in the establishment of dosage compensation.

%B Genes Dev %V 27 %P 853-8 %8 2013 Apr 15 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/23630075?dopt=Abstract %R 10.1101/gad.215426.113 %0 Journal Article %J Genome Res %D 2013 %T A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity. %A Zhang, Bing* %A Day, Daniel S* %A Ho, Joshua W %A Song, Lingyun %A Cao, Jingjing %A Christodoulou, Danos %A Seidman, Jonathan G %A Crawford, Gregory E %A Park, Peter J %A Pu, William T %K Acetylation %K Binding Sites %K Chromatin %K Chromatin Assembly and Disassembly %K Cluster Analysis %K E1A-Associated p300 Protein %K Endothelial Cells %K Enhancer Elements, Genetic %K Gene Expression Regulation %K Histones %K Human Umbilical Vein Endothelial Cells %K Humans %K Nucleotide Motifs %K Protein Binding %K Response Elements %K Transcription Factors %K Vascular Endothelial Growth Factor A %X

Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes have been little studied on a genome-wide scale. Vascular endothelial growth factor A (VEGFA), a major regulator of angiogenesis, triggers changes in transcriptional activity of human umbilical vein endothelial cells (HUVECs). Here, we used chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) to measure genome-wide changes in histone H3 acetylation at lysine 27 (H3K27ac), a marker of active enhancers, in unstimulated HUVECs and HUVECs stimulated with VEGFA for 1, 4, and 12 h. We show that sites with the greatest H3K27ac change upon stimulation were associated tightly with EP300, a histone acetyltransferase. Using the variation of H3K27ac as a novel epigenetic signature, we identified transcriptional regulatory elements that are functionally linked to angiogenesis, participate in rapid VEGFA-stimulated changes in chromatin conformation, and mediate VEGFA-induced transcriptional responses. Dynamic H3K27ac deposition and associated changes in chromatin conformation required EP300 activity instead of altered nucleosome occupancy or changes in DNase I hypersensitivity. EP300 activity was also required for a subset of dynamic H3K27ac sites to loop into proximity of promoters. Our study identified thousands of endothelial, VEGFA-responsive enhancers, demonstrating that an epigenetic signature based on the variation of a chromatin feature is a productive approach to define signal-responsive genomic elements. Further, our study implicates global epigenetic modifications in rapid, signal-responsive transcriptional regulation.

%B Genome Res %V 23 %P 917-27 %8 2013 Jun %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/23547170?dopt=Abstract %R 10.1101/gr.149674.112 %0 Journal Article %J Cell Stem Cell %D 2013 %T Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. %A Apostolou, Effie* %A Ferrari, Francesco* %A Walsh, Ryan M %A Bar-Nur, Ori %A Stadtfeld, Matthias %A Cheloufi, Sihem %A Stuart, Hannah T %A Polo, Jose M %A Ohsumi, Toshiro K %A Borowsky, Mark L %A Kharchenko, Peter V %A Park, Peter J** %A Hochedlinger, Konrad** %K Animals %K Cellular Reprogramming %K Chromatin %K Genome %K Homeodomain Proteins %K Humans %K Mice %K Pluripotent Stem Cells %X

The chromatin state of pluripotency genes has been studied extensively in embryonic stem cells (ESCs) and differentiated cells, but their potential interactions with other parts of the genome remain largely unexplored. Here, we identified a genome-wide, pluripotency-specific interaction network around the Nanog promoter by adapting circular chromosome conformation capture sequencing. This network was rearranged during differentiation and restored in induced pluripotent stem cells. A large fraction of Nanog-interacting loci were bound by Mediator or cohesin in pluripotent cells. Depletion of these proteins from ESCs resulted in a disruption of contacts and the acquisition of a differentiation-specific interaction pattern prior to obvious transcriptional and phenotypic changes. Similarly, the establishment of Nanog interactions during reprogramming often preceded transcriptional upregulation of associated genes, suggesting a causative link. Our results document a complex, pluripotency-specific chromatin "interactome" for Nanog and suggest a functional role for long-range genomic interactions in the maintenance and induction of pluripotency.

%B Cell Stem Cell %V 12 %P 699-712 %8 2013 Jun 6 %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/23665121?dopt=Abstract %R 10.1016/j.stem.2013.04.013 %0 Journal Article %J Neoplasia %D 2013 %T Loss of Sh3gl2/endophilin A1 is a common event in urothelial carcinoma that promotes malignant behavior. %A Majumdar, Shyama %A Gong, Edward M %A Di Vizio, Dolores %A Dreyfuss, Jonathan M %A Degraff, David J %A Hager, Martin H %A Park, Peter J %A Bellmunt, Joaquim %A Matusik, Robert J %A Rosenberg, Jonathan E %A Adam, Rosalyn M %K Adaptor Proteins, Signal Transducing %K Animals %K Carcinoma %K Cell Line, Tumor %K Cell Movement %K Cell Proliferation %K Cell Transformation, Neoplastic %K Disease Progression %K Gene Expression Profiling %K Gene Silencing %K Humans %K Mice %K Receptor, Epidermal Growth Factor %K Signal Transduction %K src-Family Kinases %K STAT3 Transcription Factor %K Tumor Burden %K Urinary Bladder Neoplasms %K Xenograft Model Antitumor Assays %X

Urothelial carcinoma (UC) causes substantial morbidity and mortality worldwide. However, the molecular mechanisms underlying urothelial cancer development and tumor progression are still largely unknown. Using informatics analysis, we identified Sh3gl2 (endophilin A1) as a bladder urothelium-enriched transcript. The gene encoding Sh3gl2 is located on chromosome 9p, a region frequently altered in UC. Sh3gl2 is known to regulate endocytosis of receptor tyrosine kinases implicated in oncogenesis, such as the epidermal growth factor receptor (EGFR) and c-Met. However, its role in UC pathogenesis is unknown. Informatics analysis of expression profiles as well as immunohistochemical staining of tissue microarrays revealed Sh3gl2 expression to be decreased in UC specimens compared to nontumor tissues. Loss of Sh3gl2 was associated with increasing tumor grade and with muscle invasion, which is a reliable predictor of metastatic disease and cancer-derived mortality. Sh3gl2 expression was undetectable in 19 of 20 human UC cell lines but preserved in the low-grade cell line RT4. Stable silencing of Sh3gl2 in RT4 cells by RNA interference 1) enhanced proliferation and colony formation in vitro, 2) inhibited EGF-induced EGFR internalization and increased EGFR activation, 3) stimulated phosphorylation of Src family kinases and STAT3, and 4) promoted growth of RT4 xenografts in subrenal capsule tissue recombination experiments. Conversely, forced re-expression of Sh3gl2 in T24 cells and silenced RT4 clones attenuated oncogenic behaviors, including growth and migration. Together, these findings identify loss of Sh3gl2 as a frequent event in UC development that promotes disease progression.

%B Neoplasia %V 15 %P 749-60 %8 2013 Jul %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/23814487?dopt=Abstract %0 Journal Article %J Int J Cancer %D 2013 %T Overcoming evasive resistance from vascular endothelial growth factor A inhibition in sarcomas by genetic or pharmacologic targeting of hypoxia-inducible factor 1α. %A Kim, Yeo-Jung %A Lee, Hae-June %A Kim, Tae-Min %A Eisinger-Mathason, T S Karin %A Zhang, Alexia Y %A Schmidt, Benjamin %A Karl, Daniel L %A Nakazawa, Michael S %A Park, Peter J %A Simon, M Celeste %A Yoon, Sam S %K Animals %K Antibodies, Monoclonal %K Antibodies, Monoclonal, Humanized %K Antigens, Neoplasm %K Apoptosis %K Carbonic Anhydrases %K Cell Hypoxia %K Cell Line %K Cell Line, Tumor %K DNA-Binding Proteins %K Doxorubicin %K Endothelial Cells %K Genetic Therapy %K Humans %K Hypoxia-Inducible Factor 1, alpha Subunit %K Mice %K Sarcoma %K Vascular Endothelial Growth Factor A %X

Increased levels of hypoxia and hypoxia-inducible factor 1α (HIF-1α) in human sarcomas correlate with tumor progression and radiation resistance. Prolonged antiangiogenic therapy of tumors not only delays tumor growth but may also increase hypoxia and HIF-1α activity. In our recent clinical trial, treatment with the vascular endothelial growth factor A (VEGF-A) antibody, bevacizumab, followed by a combination of bevacizumab and radiation led to near complete necrosis in nearly half of sarcomas. Gene Set Enrichment Analysis of microarrays from pretreatment biopsies found that the Gene Ontology category "Response to hypoxia" was upregulated in poor responders and that the hierarchical clustering based on 140 hypoxia-responsive genes reliably separated poor responders from good responders. The most commonly used chemotherapeutic drug for sarcomas, doxorubicin (Dox), was recently found to block HIF-1α binding to DNA at low metronomic doses. In four sarcoma cell lines, HIF-1α shRNA or Dox at low concentrations blocked HIF-1α induction of VEGF-A by 84-97% and carbonic anhydrase 9 by 83-93%. HT1080 sarcoma xenografts had increased hypoxia and/or HIF-1α activity with increasing tumor size and with anti-VEGF receptor antibody (DC101) treatment. Combining DC101 with HIF-1α shRNA or metronomic Dox had a synergistic effect in suppressing growth of HT1080 xenografts, at least in part via induction of tumor endothelial cell apoptosis. In conclusion, sarcomas respond to increased hypoxia by expressing HIF-1α target genes that may promote resistance to antiangiogenic and other therapies. HIF-1α inhibition blocks this evasive resistance and augments destruction of the tumor vasculature.

%B Int J Cancer %V 132 %P 29-41 %8 2013 Jan 1 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/22684860?dopt=Abstract %R 10.1002/ijc.27666 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2013 %T Primate genome architecture influences structural variation mechanisms and functional consequences. %A Gokcumen, Omer %A Tischler, Verena %A Tica, Jelena %A Zhu, Qihui %A Iskow, Rebecca C %A Lee, Eunjung %A Fritz, Markus Hsi-Yang %A Langdon, Amy %A Stütz, Adrian M %A Pavlidis, Pavlos %A Benes, Vladimir %A Mills, Ryan E %A Park, Peter J %A Lee, Charles %A Korbel, Jan O %K Animals %K Gene Duplication %K Gene Expression Profiling %K Gene Expression Regulation %K Genome %K Genomic Structural Variation %K Humans %K Nucleotides %K Organ Specificity %K Primates %K Species Specificity %X

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.

%B Proc Natl Acad Sci U S A %V 110 %P 15764-9 %8 2013 Sep 24 %G eng %N 39 %1 http://www.ncbi.nlm.nih.gov/pubmed/24014587?dopt=Abstract %R 10.1073/pnas.1305904110 %0 Journal Article %J Mol Cell Biol %D 2013 %T Variable requirements for DNA-binding proteins at polycomb-dependent repressive regions in human HOX clusters. %A Woo, Caroline J %A Kharchenko, Peter V %A Daheron, Laurence %A Park, Peter J %A Kingston, Robert E %K Animals %K Cells, Cultured %K DNA-Binding Proteins %K Gene Expression Regulation, Developmental %K Genes, Homeobox %K Homeodomain Proteins %K Humans %K Mesenchymal Stromal Cells %K Multigene Family %K Polycomb Repressive Complex 2 %K Polycomb-Group Proteins %K Regulatory Elements, Transcriptional %K Transcription, Genetic %X

Polycomb group (PcG)-mediated repression is an evolutionarily conserved process critical for cell fate determination and maintenance of gene expression during embryonic development. However, the mechanisms underlying PcG recruitment in mammals remain unclear since few regulatory sites have been identified. We report two novel prospective PcG-dependent regulatory elements within the human HOXB and HOXC clusters and compare their repressive activities to a previously identified element in the HOXD cluster. These regions recruited the PcG proteins BMI1 and SUZ12 to a reporter construct in mesenchymal stem cells and conferred repression that was dependent upon PcG expression. Furthermore, we examined the potential of two DNA-binding proteins, JARID2 and YY1, to regulate PcG activity at these three elements. JARID2 has differential requirements, whereas YY1 appears to be required for repressive activity at all 3 sites. We conclude that distinct elements of the mammalian HOX clusters can recruit components of the PcG complexes and confer repression, similar to what has been seen in Drosophila. These elements, however, have diverse requirements for binding factors, which, combined with previous data on other loci, speaks to the complexity of PcG targeting in mammals.

%B Mol Cell Biol %V 33 %P 3274-85 %8 2013 Aug %G eng %N 16 %1 http://www.ncbi.nlm.nih.gov/pubmed/23775117?dopt=Abstract %R 10.1128/MCB.00275-13 %0 Journal Article %J Science %D 2013 %T Comment on "Drosophila dosage compensation involves enhanced Pol II recruitment to male X-linked promoters". %A Ferrari, F* %A Jung, Y L* %A Kharchenko, P V %A Plachetka, A %A Alekseyenko, A A %A Kuroda, M I %A Park, P J %K Animals %K DNA Polymerase II %K Dosage Compensation, Genetic %K Drosophila %K Drosophila Proteins %K Female %K Genes, X-Linked %K Male %K Promoter Regions, Genetic %K X Chromosome %X

Conrad et al. (Reports, 10 August 2012, p. 742) reported a doubling of RNA polymerase II (Pol II) occupancy at X-linked promoters to support 5' recruitment as the key mechanism for dosage compensation in Drosophila. However, they employed an erroneous data-processing step, overestimating Pol II differences. Reanalysis of the data fails to support the authors' model for dosage compensation.

%B Science %V 340 %P 273 %8 2013 Apr 19 %G eng %N 6130 %1 http://www.ncbi.nlm.nih.gov/pubmed/23599463?dopt=Abstract %R 10.1126/science.1231815 %0 Journal Article %J Cell %D 2013 %T Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. %A Davoli, Teresa %A Xu, Andrew Wei %A Mengwasser, Kristen E %A Sack, Laura M %A Yoon, John C %A Park, Peter J %A Elledge, Stephen J %K Algorithms %K Aneuploidy %K Gene Dosage %K Genes, Tumor Suppressor %K Humans %K Neoplasms %K Oncogenes %X

Aneuploidy has been recognized as a hallmark of cancer for more than 100 years, yet no general theory to explain the recurring patterns of aneuploidy in cancer has emerged. Here, we develop Tumor Suppressor and Oncogene (TUSON) Explorer, a computational method that analyzes the patterns of mutational signatures in tumors and predicts the likelihood that any individual gene functions as a tumor suppressor (TSG) or oncogene (OG). By analyzing >8,200 tumor-normal pairs, we provide statistical evidence suggesting that many more genes possess cancer driver properties than anticipated, forming a continuum of oncogenic potential. Integrating our driver predictions with information on somatic copy number alterations, we find that the distribution and potency of TSGs (STOP genes), OGs, and essential genes (GO genes) on chromosomes can predict the complex patterns of aneuploidy and copy number variation characteristic of cancer genomes. We propose that the cancer genome is shaped through a process of cumulative haploinsufficiency and triplosensitivity.

%B Cell %V 155 %P 948-62 %8 2013 Nov 7 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/24183448?dopt=Abstract %R 10.1016/j.cell.2013.10.011 %0 Journal Article %J Cell %D 2013 %T Diverse mechanisms of somatic structural variations in human cancer genomes. %A Yang, Lixing %A Luquette, Lovelace J %A Gehlenborg, Nils %A Xi, Ruibin %A Haseley, Psalm S %A Hsieh, Chih-Heng %A Zhang, Chengsheng %A Ren, Xiaojia %A Protopopov, Alexei %A Chin, Lynda %A Kucherlapati, Raju %A Lee, Charles %A Park, Peter J %K Algorithms %K Chromosome Aberrations %K Genome, Human %K Genome-Wide Association Study %K Glioblastoma %K Humans %K Mutation %K Neoplasms %X

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.

%B Cell %V 153 %P 919-29 %8 2013 May 9 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/23663786?dopt=Abstract %R 10.1016/j.cell.2013.04.010 %0 Journal Article %J Genome Res %D 2013 %T Functional genomic analysis of chromosomal aberrations in a compendium of 8000 cancer genomes. %A Kim, Tae-Min %A Xi, Ruibin %A Luquette, Lovelace J %A Park, Richard W %A Johnson, Mark D %A Park, Peter J %K Chromosome Aberrations %K Cluster Analysis %K Comparative Genomic Hybridization %K DNA Copy Number Variations %K Genetic Association Studies %K Genetic Loci %K Genomic Instability %K Genomics %K Humans %K Neoplasms %X

A large database of copy number profiles from cancer genomes can facilitate the identification of recurrent chromosomal alterations that often contain key cancer-related genes. It can also be used to explore low-prevalence genomic events such as chromothripsis. In this study, we report an analysis of 8227 human cancer copy number profiles obtained from 107 array comparative genomic hybridization (CGH) studies. Our analysis reveals similarity of chromosomal arm-level alterations among developmentally related tumor types as well as a number of co-occurring pairs of arm-level alterations. Recurrent ("pan-lineage") focal alterations identified across diverse tumor types show an enrichment of known cancer-related genes and genes with relevant functions in cancer-associated phenotypes (e.g., kinase and cell cycle). Tumor type-specific ("lineage-restricted") alterations and their enriched functional categories were also identified. Furthermore, we developed an algorithm for detecting regions in which the copy number oscillates rapidly between fixed levels, indicative of chromothripsis. We observed these massive genomic rearrangements in 1%-2% of the samples with variable tumor type-specific incidence rates. Taken together, our comprehensive view of copy number alterations provides a framework for understanding the functional significance of various genomic alterations in cancer genomes.

%B Genome Res %V 23 %P 217-27 %8 2013 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/23132910?dopt=Abstract %R 10.1101/gr.140301.112 %0 Journal Article %J Cell Rep %D 2013 %T "Jump start and gain" model for dosage compensation in Drosophila based on direct sequencing of nascent transcripts. %A Ferrari, Francesco %A Plachetka, Annette %A Alekseyenko, Artyom A %A Jung, Youngsook L %A Ozsolak, Fatih %A Kharchenko, Peter V %A Park, Peter J %A Kuroda, Mitzi I %X

Dosage compensation in Drosophila is mediated by the MSL complex, which increases male X-linked gene expression approximately 2-fold. The MSL complex preferentially binds the bodies of active genes on the male X, depositing H4K16ac with a 3' bias. Two models have been proposed for the influence of the MSL complex on transcription: one based on promoter recruitment of RNA polymerase II (Pol II), and a second featuring enhanced transcriptional elongation. Here, we utilize nascent RNA sequencing to document dosage compensation during transcriptional elongation. We also compare X and autosomes from published data on paused and elongating polymerase in order to assess the role of Pol II recruitment. Our results support a model for differentially regulated elongation, starting with release from 5' pausing and increasing through X-linked gene bodies. Our results highlight facilitated transcriptional elongation as a key mechanism for the coordinated regulation of a diverse set of genes.

%B Cell Rep %V 5 %P 629-36 %8 2013 Nov 14 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/24183666?dopt=Abstract %R 10.1016/j.celrep.2013.09.037 %0 Journal Article %J J Clin Invest %D 2013 %T KDM2B promotes pancreatic cancer via Polycomb-dependent and -independent transcriptional programs. %A Tzatsos, Alexandros %A Paskaleva, Polina* %A Ferrari, Francesco* %A Deshpande, Vikram %A Stoykova, Svetlana %A Contino, Gianmarco %A Wong, Kwok-Kin %A Lan, Fei %A Trojer, Patrick %A Park, Peter J %A Bardeesy, Nabeel %K Animals %K Carcinoma, Pancreatic Ductal %K Cell Line, Tumor %K Disease Models, Animal %K Epigenesis, Genetic %K F-Box Proteins %K Humans %K Jumonji Domain-Containing Histone Demethylases %K Mice %K Mice, SCID %K Oxidoreductases, N-Demethylating %K Pancreatic Neoplasms %K Polycomb-Group Proteins %K Proto-Oncogene Proteins p21(ras) %K Transcription, Genetic %K Up-Regulation %X

Epigenetic mechanisms mediate heritable control of cell identity in normal cells and cancer. We sought to identify epigenetic regulators driving the pathogenesis of pancreatic ductal adenocarcinoma (PDAC), one of the most lethal human cancers. We found that KDM2B (also known as Ndy1, FBXL10, and JHDM1B), an H3K36 histone demethylase implicated in bypass of cellular senescence and somatic cell reprogramming, is markedly overexpressed in human PDAC, with levels increasing with disease grade and stage, and highest expression in metastases. KDM2B silencing abrogated tumorigenicity of PDAC cell lines exhibiting loss of epithelial differentiation, whereas KDM2B overexpression cooperated with KrasG12D to promote PDAC formation in mouse models. Gain- and loss-of-function experiments coupled to genome-wide gene expression and ChIP studies revealed that KDM2B drives tumorigenicity through 2 different transcriptional mechanisms. KDM2B repressed developmental genes through cobinding with Polycomb group (PcG) proteins at transcriptional start sites, whereas it activated a module of metabolic genes, including mediators of protein synthesis and mitochondrial function, cobound by the MYC oncogene and the histone demethylase KDM5A. These results defined epigenetic programs through which KDM2B subverts cellular differentiation and drives the pathogenesis of an aggressive subset of PDAC.

%B J Clin Invest %V 123 %P 727-39 %8 2013 Feb 1 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/23321669?dopt=Abstract %R 10.1172/JCI64535 %0 Journal Article %J Cell %D 2013 %T The landscape of microsatellite instability in colorectal and endometrial cancer genomes. %A Kim, Tae-Min %A Laird, Peter W %A Park, Peter J %K Colorectal Neoplasms %K Endometrial Neoplasms %K Epigenesis, Genetic %K Female %K Frameshift Mutation %K Genome-Wide Association Study %K Humans %K Male %K Microsatellite Instability %X

Microsatellites, simple tandem repeats present at millions of sites in the human genome, can shorten or lengthen due to a defect in DNA mismatch repair. We present here a comprehensive genome-wide analysis of the prevalence, mutational spectrum, and functional consequences of microsatellite instability (MSI) in cancer genomes. We analyzed MSI in 277 colorectal and endometrial cancer genomes (including 57 microsatellite-unstable ones) using exome and whole-genome sequencing data. Recurrent MSI events in coding sequences showed tumor type specificity, elevated frameshift-to-inframe ratios, and lower transcript levels than wild-type alleles. Moreover, genome-wide analysis revealed differences in the distribution of MSI versus point mutations, including overrepresentation of MSI in euchromatic and intronic regions compared to heterochromatic and intergenic regions, respectively, and depletion of MSI at nucleosome-occupied sequences. Our results provide a panoramic view of MSI in cancer genomes, highlighting their tumor type specificity, impact on gene expression, and the role of chromatin organization.

%B Cell %V 155 %P 858-68 %8 2013 Nov 7 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/24209623?dopt=Abstract %R 10.1016/j.cell.2013.10.015 %0 Journal Article %J Bioinformatics %D 2013 %T Nozzle: a report generation toolkit for data analysis pipelines. %A Gehlenborg, Nils %A Noble, Michael S %A Getz, Gad %A Chin, Lynda %A Park, Peter J %K Computational Biology %K Genomics %K Humans %K Neoplasms %K Programming Languages %K Software %K User-Computer Interface %K Workflow %X

SUMMARY: We have developed Nozzle, an R package that provides an Application Programming Interface to generate HTML reports with dynamic user interface elements. Nozzle was designed to facilitate summarization and rapid browsing of complex results in data analysis pipelines where multiple analyses are performed frequently on big datasets. The package can be applied to any project where user-friendly reports need to be created. AVAILABILITY: The R package is available on CRAN at http://cran.r-project.org/package=Nozzle.R1. Examples and additional materials are available at http://gdac.broadinstitute.org/nozzle. The source code is also available at http://www.github.com/parklab/Nozzle. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 29 %P 1089-91 %8 2013 Apr 15 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/23419376?dopt=Abstract %R 10.1093/bioinformatics/btt085 %0 Journal Article %J Mol Cell Biol %D 2013 %T Spt6 regulates intragenic and antisense transcription, nucleosome positioning, and histone modifications genome-wide in fission yeast. %A DeGennaro, Christine M %A Alver, Burak Han %A Marguerat, Samuel %A Stepanova, Ekaterina %A Davis, Christopher P %A Bähler, Jürg %A Park, Peter J %A Winston, Fred %K Base Sequence %K Consensus Sequence %K Gene Expression Regulation, Fungal %K Genome, Fungal %K Histone Chaperones %K Histone-Lysine N-Methyltransferase %K Histones %K Methylation %K Nucleosomes %K Protein Multimerization %K Protein Processing, Post-Translational %K RNA Splicing %K RNA, Antisense %K RNA, Messenger %K Schizosaccharomyces %K Schizosaccharomyces pombe Proteins %K Sequence Analysis, DNA %K Transcriptome %X

Spt6 is a highly conserved histone chaperone that interacts directly with both RNA polymerase II and histones to regulate gene expression. To gain a comprehensive understanding of the roles of Spt6, we performed genome-wide analyses of transcription, chromatin structure, and histone modifications in a Schizosaccharomyces pombe spt6 mutant. Our results demonstrate dramatic changes to transcription and chromatin structure in the mutant, including elevated antisense transcripts at >70% of all genes and general loss of the +1 nucleosome. Furthermore, Spt6 is required for marks associated with active transcription, including trimethylation of histone H3 on lysine 4, previously observed in humans but not Saccharomyces cerevisiae, and lysine 36. Taken together, our results indicate that Spt6 is critical for the accuracy of transcription and the integrity of chromatin, likely via its direct interactions with RNA polymerase II and histones.

%B Mol Cell Biol %V 33 %P 4779-92 %8 2013 Dec %G eng %N 24 %1 http://www.ncbi.nlm.nih.gov/pubmed/24100010?dopt=Abstract %R 10.1128/MCB.01068-13 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2013 %T Swi/Snf chromatin remodeling/tumor suppressor complex establishes nucleosome occupancy at target promoters. %A Tolstorukov, Michael Y* %A Sansam, Courtney G* %A Lu, Ping* %A Koellhoffer, Edward C %A Helming, Katherine C %A Alver, Burak Han %A Tillman, Erik J %A Evans, Julia A %A Wilson, Boris G %A Park, Peter J** %A Roberts, Charles W M** %K Animals %K Cell Proliferation %K Chromatin %K Chromosomal Proteins, Non-Histone %K CpG Islands %K DNA Helicases %K Fibroblasts %K Gene Expression Regulation, Neoplastic %K Gene Knockdown Techniques %K Mice %K Neoplasms %K Nuclear Proteins %K Nucleosomes %K Primary Cell Culture %K Promoter Regions, Genetic %K Protein Binding %K Transcription Factors %K Transcriptional Activation %X

Precise nucleosome-positioning patterns at promoters are thought to be crucial for faithful transcriptional regulation. However, the mechanisms by which these patterns are established, are dynamically maintained, and subsequently contribute to transcriptional control are poorly understood. The switch/sucrose non-fermentable chromatin remodeling complex, also known as the Brg1 associated factors complex, is a master developmental regulator and tumor suppressor capable of mobilizing nucleosomes in biochemical assays. However, its role in establishing the nucleosome landscape in vivo is unclear. Here we have inactivated Snf5 and Brg1, core subunits of the mammalian Swi/Snf complex, to evaluate their effects on chromatin structure and transcription levels genomewide. We find that inactivation of either subunit leads to disruptions of specific nucleosome patterning combined with a loss of overall nucleosome occupancy at a large number of promoters, regardless of their association with CpG islands. These rearrangements are accompanied by gene expression changes that promote cell proliferation. Collectively, these findings define a direct relationship between chromatin-remodeling complexes, chromatin structure, and transcriptional regulation.

%B Proc Natl Acad Sci U S A %V 110 %P 10165-70 %8 2013 Jun 18 %G eng %N 25 %1 http://www.ncbi.nlm.nih.gov/pubmed/23723349?dopt=Abstract %R 10.1073/pnas.1302209110 %0 Book Section %B Tag-Based Next Generation Sequencing %D 2012 %T Genome-wide mapping of protein-DNA interactions by ChIP-seq %A Ho, Joshua W K %A Alekseyenko, Artyom A %A Kuroda, Mitzi I %A Park, Peter J %E Harbers, Matthias %E Kahl, Günter %B Tag-Based Next Generation Sequencing %I Wiley-VCH Verlag GmbH & Co. KGaA %C Weinheim, Germany %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/9783527644582.ch9/summary %0 Journal Article %J Computer Graphics Forum %D 2012 %T StratomeX: Visual analysis of large-scale heterogeneous genomics data for cancer subtype characterization %A Lex, Alexander %A Streit, Marc %A Schulz, H-J %A Partl, C %A Schmalstieg, D %A Park, Peter J %A Gehlenborg, Nils %B Computer Graphics Forum %V 31 %P 1175-1184 %G eng %U http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8659.2012.03110.x/abstract %N 3 %0 Journal Article %J Nature %D 2012 %T An integrated encyclopedia of DNA elements in the human genome. %A ENCODE Project, Consortium %K Alleles %K Animals %K Binding Sites %K Chromatin %K Chromatin Immunoprecipitation %K Chromosomes, Human %K Deoxyribonuclease I %K DNA %K DNA Footprinting %K DNA Methylation %K DNA-Binding Proteins %K Encyclopedias as Topic %K Exons %K Genetic Predisposition to Disease %K Genetic Variation %K Genome, Human %K Genome-Wide Association Study %K Genomics %K Histones %K Humans %K Mammals %K Molecular Sequence Annotation %K Neoplasms %K Polymorphism, Single Nucleotide %K Promoter Regions, Genetic %K Proteins %K Regulatory Sequences, Nucleic Acid %K Sequence Analysis, RNA %K Transcription Factors %K Transcription, Genetic %X

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

%B Nature %V 489 %P 57-74 %8 2012 Sep 6 %G eng %N 7414 %1 http://www.ncbi.nlm.nih.gov/pubmed/22955616?dopt=Abstract %R 10.1038/nature11247 %0 Journal Article %J Genome Res %D 2012 %T ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. %A Landt, Stephen G %A Marinov, Georgi K %A Kundaje, Anshul %A Kheradpour, Pouya %A Pauli, Florencia %A Batzoglou, Serafim %A Bernstein, Bradley E %A Bickel, Peter %A Brown, James B %A Cayting, Philip %A Chen, Yiwen %A DeSalvo, Gilberto %A Epstein, Charles %A Fisher-Aylor, Katherine I %A Euskirchen, Ghia %A Gerstein, Mark %A Gertz, Jason %A Hartemink, Alexander J %A Hoffman, Michael M %A Iyer, Vishwanath R %A Jung, Youngsook L %A Karmakar, Subhradip %A Kellis, Manolis %A Kharchenko, Peter V %A Li, Qunhua %A Liu, Tao %A Liu, X Shirley %A Ma, Lijia %A Milosavljevic, Aleksandar %A Myers, Richard M %A Park, Peter J %A Pazin, Michael J %A Perry, Marc D %A Raha, Debasish %A Reddy, Timothy E %A Rozowsky, Joel %A Shoresh, Noam %A Sidow, Arend %A Slattery, Matthew %A Stamatoyannopoulos, John A %A Tolstorukov, Michael Y %A White, Kevin P %A Xi, Simon %A Farnham, Peggy J %A Lieb, Jason D %A Wold, Barbara J %A Snyder, Michael %K Animals %K Chromatin Immunoprecipitation %K Databases, Genetic %K Genome %K Genomics %K Guidelines as Topic %K High-Throughput Nucleotide Sequencing %K Histones %K Humans %K Internet %K Transcription Factors %X

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

%B Genome Res %V 22 %P 1813-31 %8 2012 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/22955991?dopt=Abstract %R 10.1101/gr.136184.111 %0 Journal Article %J Nature %D 2012 %T Comprehensive genomic characterization of squamous cell lung cancers. %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Adenocarcinoma %K Carcinoma, Squamous Cell %K DNA Mutational Analysis %K Gene Deletion %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Genes, p16 %K Genes, p53 %K Genome, Human %K Genomics %K Humans %K Lung Neoplasms %K Molecular Targeted Therapy %K Mutation %K Mutation Rate %K Phosphatidylinositol 3-Kinases %K Signal Transduction %X

Lung squamous cell carcinoma is a common type of lung cancer, causing approximately 400,000 deaths per year worldwide. Genomic alterations in squamous cell lung cancers have not been comprehensively characterized, and no molecularly targeted agents have been specifically developed for its treatment. As part of The Cancer Genome Atlas, here we profile 178 lung squamous cell carcinomas to provide a comprehensive landscape of genomic and epigenomic alterations. We show that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour. We find statistically recurrent mutations in 11 genes, including mutation of TP53 in nearly all specimens. Previously unreported loss-of-function mutations are seen in the HLA-A class I major histocompatibility gene. Significantly altered pathways included NFE2L2 and KEAP1 in 34%, squamous differentiation genes in 44%, phosphatidylinositol-3-OH kinase pathway genes in 47%, and CDKN2A and RB1 in 72% of tumours. We identified a potential therapeutic target in most tumours, offering new avenues of investigation for the treatment of squamous cell lung cancers.

%B Nature %V 489 %P 519-25 %8 2012 Sep 27 %G eng %N 7417 %1 http://www.ncbi.nlm.nih.gov/pubmed/22960745?dopt=Abstract %R 10.1038/nature11404 %0 Journal Article %J Nature %D 2012 %T Comprehensive molecular characterization of human colon and rectal cancer. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Colonic Neoplasms %K DNA Copy Number Variations %K DNA Methylation %K Exome %K Gene Expression Profiling %K Humans %K Mutation %K Mutation Rate %K Polymorphism, Single Nucleotide %K Rectal Neoplasms %K Sequence Analysis, DNA %X

To characterize somatic alterations in colorectal carcinoma, we conducted a genome-scale analysis of 276 samples, analysing exome sequence, DNA copy number, promoter methylation and messenger RNA and microRNA expression. A subset of these samples (97) underwent low-depth-of-coverage whole-genome sequencing. In total, 16% of colorectal carcinomas were found to be hypermutated: three-quarters of these had the expected high microsatellite instability, usually with hypermethylation and MLH1 silencing, and one-quarter had somatic mismatch-repair gene and polymerase ε (POLE) mutations. Excluding the hypermutated cancers, colon and rectum cancers were found to have considerably similar patterns of genomic alteration. Twenty-four genes were significantly mutated, and in addition to the expected APC, TP53, SMAD4, PIK3CA and KRAS mutations, we found frequent mutations in ARID1A, SOX9 and FAM123B. Recurrent copy-number alterations include potentially drug-targetable amplifications of ERBB2 and newly discovered amplification of IGF2. Recurrent chromosomal translocations include the fusion of NAV2 and WNT pathway member TCF7L1. Integrative analyses suggest new markers for aggressive colorectal carcinoma and an important role for MYC-directed transcriptional activation and repression.

%B Nature %V 487 %P 330-7 %8 2012 Jul 19 %G eng %N 7407 %1 http://www.ncbi.nlm.nih.gov/pubmed/22810696?dopt=Abstract %R 10.1038/nature11252 %0 Journal Article %J Nature %D 2012 %T Comprehensive molecular portraits of human breast tumours. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Breast Neoplasms %K DNA Copy Number Variations %K DNA Methylation %K DNA Mutational Analysis %K Exome %K Female %K GATA3 Transcription Factor %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Genes, BRCA1 %K Genes, erbB-2 %K Genes, Neoplasm %K Genes, p53 %K Genetic Heterogeneity %K Genome, Human %K Genomics %K Humans %K MAP Kinase Kinase Kinase 1 %K MicroRNAs %K Mutation %K Oligonucleotide Array Sequence Analysis %K Ovarian Neoplasms %K Phosphatidylinositol 3-Kinases %K Protein Array Analysis %K Proteomics %K Receptors, Estrogen %K Retinoblastoma Protein %K RNA, Messenger %K RNA, Neoplasm %X

We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.

%B Nature %V 490 %P 61-70 %8 2012 Oct 4 %G eng %N 7418 %1 http://www.ncbi.nlm.nih.gov/pubmed/23000897?dopt=Abstract %R 10.1038/nature11412 %0 Journal Article %J Hum Mol Genet %D 2012 %T Expression profiling of uterine leiomyomata cytogenetic subgroups reveals distinct signatures in matched myometrium: transcriptional profiling of the t(12;14) and evidence in support of predisposing genetic heterogeneity. %A Hodge, Jennelle C %A Kim, Tae-Min %A Dreyfuss, Jonathan M %A Somasundaram, Priya %A Christacos, Nicole C %A Rousselle, Marissa %A Quade, Bradley J %A Park, Peter J %A Stewart, Elizabeth A %A Morton, Cynthia C %K Chromosome Aberrations %K Cluster Analysis %K Female %K Gene Expression Profiling %K Genetic Heterogeneity %K Genetic Predisposition to Disease %K HMGA2 Protein %K Humans %K Karyotyping %K Leiomyoma %K Myometrium %K Transcription, Genetic %K Uterine Neoplasms %X

Uterine leiomyomata (UL), the most common neoplasm in reproductive-age women, are classified into distinct genetic subgroups based on recurrent chromosome abnormalities. To develop a molecular signature of UL with t(12;14)(q14-q15;q23-q24), we took advantage of the multiple UL arising as independent clonal lesions within a single uterus. We compared genome-wide expression levels of t(12;14) UL to non-t(12;14) UL from each of nine women in a paired analysis, with each sample weighted for the percentage of t(12;14) cells to adjust for mosaicism with normal cells. This resulted in a transcriptional profile that confirmed HMGA2, known to be overexpressed in t(12;14) UL, as the most significantly altered gene. Pathway analysis of the differentially expressed genes showed significant association with cell proliferation, particularly G1/S checkpoint regulation. This is consistent with the known larger size of t(12;14) UL relative to karyotypically normal UL or to UL in the deletion 7q22 subgroup. Unsupervised hierarchical clustering demonstrated that patient variability is relatively dominant to the distinction of t(12;14) UL compared with non-t(12;14) UL or of t(12;14) UL compared with del(7q) UL. The paired design we employed is therefore important to produce an accurate t(12;14) UL-specific gene list by removing the confounding effects of genotype and environment. Interestingly, myometrium not only clustered away from the tumors, but generally separated based on associated t(12;14) versus del(7q) status. Nine genes were identified whose expression can distinguish the myometrium origin. This suggests an underlying constitutional genetic predisposition to these somatic changes which could potentially lead to improved personalized management and treatment.

%B Hum Mol Genet %V 21 %P 2312-29 %8 2012 May 15 %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/22343407?dopt=Abstract %R 10.1093/hmg/dds051 %0 Journal Article %J PLoS Genet %D 2012 %T Identification of chromatin-associated regulators of MSL complex targeting in Drosophila dosage compensation. %A Larschan, Erica %A Soruco, Marcela M L %A Lee, Ok-Kyung %A Peng, Shouyong %A Bishop, Eric P %A Chery, Jessica %A Goebel, Karen %A Feng, Jessica %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K Cell Line %K Chromatin %K DNA-Binding Proteins %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Female %K Gene Expression Regulation %K Male %K Nuclear Proteins %K RNA Interference %K Transcription Factors %K X Chromosome %X

Sex chromosome dosage compensation in Drosophila provides a model for understanding how chromatin organization can modulate coordinate gene regulation. Male Drosophila increase the transcript levels of genes on the single male X approximately two-fold to equal the gene expression in females, which have two X-chromosomes. Dosage compensation is mediated by the Male-Specific Lethal (MSL) histone acetyltransferase complex. Five core components of the MSL complex were identified by genetic screens for genes that are specifically required for male viability and are dispensable for females. However, because dosage compensation must interface with the general transcriptional machinery, it is likely that identifying additional regulators that are not strictly male-specific will be key to understanding the process at a mechanistic level. Such regulators would not have been recovered from previous male-specific lethal screening strategies. Therefore, we have performed a cell culture-based, genome-wide RNAi screen to search for factors required for MSL targeting or function. Here we focus on the discovery of proteins that function to promote MSL complex recruitment to "chromatin entry sites," which are proposed to be the initial sites of MSL targeting. We find that components of the NSL (Non-specific lethal) complex, and a previously unstudied zinc-finger protein, facilitate MSL targeting and display a striking enrichment at MSL entry sites. Identification of these factors provides new insight into how MSL complex establishes the specialized hyperactive chromatin required for dosage compensation in Drosophila.

%B PLoS Genet %V 8 %P e1002830 %8 2012 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/22844249?dopt=Abstract %R 10.1371/journal.pgen.1002830 %0 Journal Article %J Ann Surg %D 2012 %T Upregulation of proapoptotic microRNA mir-125a after massive small bowel resection in rats. %A Balakrishnan, Anita %A Stearns, Adam T %A Park, Peter J %A Dreyfuss, Jonathan M %A Ashley, Stanley W %A Rhoads, David B %A Tavakkolizadeh, Ali %K Animals %K Apoptosis %K Blotting, Western %K Cell Line %K Cell Proliferation %K Flow Cytometry %K Intestinal Mucosa %K Intestine, Small %K Laser Capture Microdissection %K Male %K MicroRNAs %K Myeloid Cell Leukemia Sequence 1 Protein %K Oligonucleotide Array Sequence Analysis %K Proto-Oncogene Proteins c-bcl-2 %K Rats %K Rats, Sprague-Dawley %K Real-Time Polymerase Chain Reaction %K Short Bowel Syndrome %K Up-Regulation %X

OBJECTIVE: Short bowel syndrome remains a condition of high morbidity and mortality, and current therapeutic options carry significant side effects. To identify new treatments, we focused on postresection changes in microRNAs--short noncoding RNAs, which suppress target genes--and suggest a previously undiscovered role for microRNA-125a (mir-125a) in intestinal adaptation. METHODS: Rats underwent either 80% massive small bowel resection or transection and were harvested after 48 hours. Jejunum was harvested for microRNA microarrays, laser capture microdissection, and RNA and protein analysis. Mir-125a was overexpressed in intestinal epithelium-6 (crypt-derived) cells (IEC-6) and effects on proliferation and apoptosis were determined using MTS and flow cytometry. Expression of potential targets of mir-125a in rat jejunum and IEC-6 cells was determined using quantitative real-time polymerase chain reaction (RNA) and Western blotting (protein). RESULTS: Resection upregulated mir-125a and mir-214 by 2.4-fold and 3.2-fold, respectively. Highest levels of expression were noted in the crypt fraction. Mir-125a overexpression induced apoptosis and resultant growth arrest in IEC-6 cells. The expression of the prosurvival Bcl-2 family member Mcl-1 was downregulated in both mir-125a-overexpressing IEC-6 cells and in jejunum of resected rats, confirming Mcl-1 as a previously undiscovered target of mir-125a. CONCLUSIONS: Upregulation of mir-125a suppresses the prosurvival protein Mcl-1, producing the increase in apoptosis known to accompany the proliferative changes characteristic of intestinal adaptation. Our data highlight a potential role for microRNAs as mediators of the adaptive process and may facilitate the development of new therapeutic options for short bowel syndrome.

%B Ann Surg %V 255 %P 747-53 %8 2012 Apr %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/22418008?dopt=Abstract %R 10.1097/SLA.0b013e31824b485a %0 Journal Article %J PLoS Genet %D 2012 %T Enrichment of HP1a on Drosophila chromosome 4 genes creates an alternate chromatin structure critical for regulation in this heterochromatic domain. %A Riddle, Nicole C* %A Jung, Youngsook L* %A Gu, Tingting* %A Alekseyenko, Artyom A %A Asker, Dalal %A Gui, Hongxing %A Kharchenko, Peter V %A Minoda, Aki %A Plachetka, Annette %A Schwartz, Yuri B %A Tolstorukov, Michael Y %A Kuroda, Mitzi I %A Pirrotta, Vincenzo %A Karpen, Gary H %A Park, Peter J** %A Elgin, Sarah C R** %K Animals %K Animals, Genetically Modified %K Chromosomal Proteins, Non-Histone %K Chromosomes %K DNA-Directed RNA Polymerases %K Drosophila melanogaster %K Drosophila Proteins %K Euchromatin %K Gene Expression Regulation %K Heterochromatin %K Histone-Lysine N-Methyltransferase %K Histones %K Humans %K Methylation %K Mutation %X

Chromatin environments differ greatly within a eukaryotic genome, depending on expression state, chromosomal location, and nuclear position. In genomic regions characterized by high repeat content and high gene density, chromatin structure must silence transposable elements but permit expression of embedded genes. We have investigated one such region, chromosome 4 of Drosophila melanogaster. Using chromatin-immunoprecipitation followed by microarray (ChIP-chip) analysis, we examined enrichment patterns of 20 histone modifications and 25 chromosomal proteins in S2 and BG3 cells, as well as the changes in several marks resulting from mutations in key proteins. Active genes on chromosome 4 are distinct from those in euchromatin or pericentric heterochromatin: while there is a depletion of silencing marks at the transcription start sites (TSSs), HP1a and H3K9me3, but not H3K9me2, are enriched strongly over gene bodies. Intriguingly, genes on chromosome 4 are less frequently associated with paused polymerase. However, when the chromatin is altered by depleting HP1a or POF, the RNA pol II enrichment patterns of many chromosome 4 genes shift, showing a significant decrease over gene bodies but not at TSSs, accompanied by lower expression of those genes. Chromosome 4 genes have a low incidence of TRL/GAGA factor binding sites and a low T(m) downstream of the TSS, characteristics that could contribute to a low incidence of RNA polymerase pausing. Our data also indicate that EGG and POF jointly regulate H3K9 methylation and promote HP1a binding over gene bodies, while HP1a targeting and H3K9 methylation are maintained at the repeats by an independent mechanism. The HP1a-enriched, POF-associated chromatin structure over the gene bodies may represent one type of adaptation for genes embedded in repetitive DNA.

%B PLoS Genet %V 8 %P e1002954 %8 2012 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/23028361?dopt=Abstract %R 10.1371/journal.pgen.1002954 %0 Journal Article %J Genome Res %D 2012 %T Nature and function of insulator protein binding sites in the Drosophila genome. %A Schwartz, Yuri B %A Linder-Basso, Daniela %A Kharchenko, Peter V %A Tolstorukov, Michael Y %A Kim, Maria %A Li, Hua-Bing %A Gorchakov, Andrey A %A Minoda, Aki %A Shanower, Gregory %A Alekseyenko, Artyom A %A Riddle, Nicole C %A Jung, Youngsook L %A Gu, Tingting %A Plachetka, Annette %A Elgin, Sarah C R %A Kuroda, Mitzi I %A Park, Peter J %A Savitsky, Mikhail %A Karpen, Gary H %A Pirrotta, Vincenzo %K Animals %K Binding Sites %K Drosophila melanogaster %K Drosophila Proteins %K Epigenesis, Genetic %K Genome, Insect %K Histones %K Insulator Elements %K Methylation %K Microtubule-Associated Proteins %K Nuclear Proteins %K Polycomb-Group Proteins %K Protein Processing, Post-Translational %K RNA, Small Interfering %K Transcription, Genetic %X

Chromatin insulator elements and associated proteins have been proposed to partition eukaryotic genomes into sets of independently regulated domains. Here we test this hypothesis by quantitative genome-wide analysis of insulator protein binding to Drosophila chromatin. We find distinct combinatorial binding of insulator proteins to different classes of sites and uncover a novel type of insulator element that binds CP190 but not any other known insulator proteins. Functional characterization of different classes of binding sites indicates that only a small fraction act as robust insulators in standard enhancer-blocking assays. We show that insulators restrict the spreading of the H3K27me3 mark but only at a small number of Polycomb target regions and only to prevent repressive histone methylation within adjacent genes that are already transcriptionally inactive. RNAi knockdown of insulator proteins in cultured cells does not lead to major alterations in genome expression. Taken together, these observations argue against the concept of a genome partitioned by specialized boundary elements and suggest that insulators are reserved for specific regulation of selected genes.

%B Genome Res %V 22 %P 2188-98 %8 2012 Nov %G eng %N 11 %1 http://www.ncbi.nlm.nih.gov/pubmed/22767387?dopt=Abstract %R 10.1101/gr.138156.112 %0 Journal Article %J Neoplasia %D 2012 %T Alternative splicing of CHEK2 and codeletion with NF2 promote chromosomal instability in meningioma. %A Yang, Hong Wei %A Kim, Tae-Min %A Song, Sydney S %A Shrinath, Nihal %A Park, Richard %A Kalamarides, Michel %A Park, Peter J %A Black, Peter M %A Carroll, Rona S %A Johnson, Mark D %K Alternative Splicing %K Blotting, Western %K Checkpoint Kinase 2 %K Chromosomal Instability %K Disease Progression %K Gene Deletion %K Genes, Neurofibromatosis 2 %K Humans %K Meningeal Neoplasms %K Meningioma %K Neoplasm Grading %K Protein-Serine-Threonine Kinases %K Reverse Transcriptase Polymerase Chain Reaction %X

Mutations of the NF2 gene on chromosome 22q are thought to initiate tumorigenesis in nearly 50% of meningiomas, and 22q deletion is the earliest and most frequent large-scale chromosomal abnormality observed in these tumors. In aggressive meningiomas, 22q deletions are generally accompanied by the presence of large-scale segmental abnormalities involving other chromosomes, but the reasons for this association are unknown. We find that large-scale chromosomal alterations accumulate during meningioma progression primarily in tumors harboring 22q deletions, suggesting 22q-associated chromosomal instability. Here we show frequent codeletion of the DNA repair and tumor suppressor gene, CHEK2, in combination with NF2 on chromosome 22q in a majority of aggressive meningiomas. In addition, tumor-specific splicing of CHEK2 in meningioma leads to decreased functional Chk2 protein expression. We show that enforced Chk2 knockdown in meningioma cells decreases DNA repair. Furthermore, Chk2 depletion increases centrosome amplification, thereby promoting chromosomal instability. Taken together, these data indicate that alternative splicing and frequent codeletion of CHEK2 and NF2 contribute to the genomic instability and associated development of aggressive biologic behavior in meningiomas.

%B Neoplasia %V 14 %P 20-8 %8 2012 Jan %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/22355270?dopt=Abstract %0 Journal Article %J Nat Genet %D 2012 %T Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. %A Stadtfeld, Matthias %A Apostolou, Effie %A Ferrari, Francesco %A Choi, Jiho %A Walsh, Ryan M %A Chen, Taiping %A Ooi, Steen S K %A Kim, Sang Yong %A Bestor, Timothy H %A Shioda, Toshi %A Park, Peter J %A Hochedlinger, Konrad %K Animals %K Ascorbic Acid %K B-Lymphocytes %K Cell Differentiation %K Chromatin %K DNA (Cytosine-5-)-Methyltransferase %K DNA Methylation %K Genomic Imprinting %K Induced Pluripotent Stem Cells %K Intercellular Signaling Peptides and Proteins %K Mice %K Nuclear Reprogramming %K Octamer Transcription Factor-3 %K Promoter Regions, Genetic %K Proteins %K RNA, Long Noncoding %X

The generation of induced pluripotent stem cells (iPSCs) often results in aberrant epigenetic silencing of the imprinted Dlk1-Dio3 gene cluster, compromising the ability to generate entirely iPSC-derived adult mice ('all-iPSC mice'). Here, we show that reprogramming in the presence of ascorbic acid attenuates hypermethylation of Dlk1-Dio3 by enabling a chromatin configuration that interferes with binding of the de novo DNA methyltransferase Dnmt3a. This approach allowed us to generate all-iPSC mice from mature B cells, which have until now failed to support the development of exclusively iPSC-derived postnatal animals. Our data show that transcription factor-mediated reprogramming can endow a defined, terminally differentiated cell type with a developmental potential equivalent to that of embryonic stem cells. More generally, these findings indicate that culture conditions during cellular reprogramming can strongly influence the epigenetic and biological properties of the resultant iPSCs.

%B Nat Genet %V 44 %P 398-405, S1-2 %8 2012 Apr %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/22387999?dopt=Abstract %R 10.1038/ng.1110 %0 Journal Article %J Toxicol Sci %D 2012 %T Expression, circulation, and excretion profile of microRNA-21, -155, and -18a following acute kidney injury. %A Saikumar, Janani %A Hoffmann, Dana %A Kim, Tae-Min %A Gonzalez, Victoria Ramirez %A Zhang, Qin %A Goering, Peter L %A Brown, Ronald P %A Bijol, Vanesa %A Park, Peter J %A Waikar, Sushrut S %A Vaidya, Vishal S %K Acute Kidney Injury %K Algorithms %K Animals %K Case-Control Studies %K Gentamicins %K Humans %K MicroRNAs %K Polymerase Chain Reaction %K Rats %K Rats, Wistar %X

MicroRNAs (miRNAs) are endogenous noncoding RNA molecules that are involved in post-transcriptional gene silencing. Using global miRNA expression profiling, we found miR-21, -155, and -18a to be highly upregulated in rat kidneys following tubular injury induced by ischemia/reperfusion (I/R) or gentamicin administration. miR-21 and -155 also showed decreased expression patterns in blood and urinary supernatants in both models of kidney injury. Furthermore, urinary levels of miR-21 increased 1.2-fold in patients with clinical diagnosis of acute kidney injury (AKI) (n = 22) as compared with healthy volunteers (n = 25) (p < 0.05), and miR-155 decreased 1.5-fold in patients with AKI (p < 0.01). We identified 29 messenger RNA core targets of these 3 miRNAs using the context likelihood of relatedness algorithm and found these predicted gene targets to be highly enriched for genes associated with apoptosis or cell proliferation. Taken together, these results suggest that miR-21 and -155 could potentially serve as translational biomarkers for detection of AKI and may play a critical role in the pathogenesis of kidney injury and tissue repair process.

%B Toxicol Sci %V 129 %P 256-67 %8 2012 Oct %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/22705808?dopt=Abstract %R 10.1093/toxsci/kfs210 %0 Journal Article %J Mol Cell %D 2012 %T Histone variant H2A.Bbd is associated with active transcription and mRNA processing in human cells. %A Tolstorukov, Michael Y* %A Goldman, Joseph A* %A Gilbert, Cristele %A Ogryzko, Vasily %A Kingston, Robert E** %A Park, Peter J** %K Cell Line, Tumor %K Chromatin %K Down-Regulation %K Exons %K Gene Expression %K Genetic Variation %K HeLa Cells %K Histones %K Humans %K Introns %K Nucleosomes %K Proteomics %K RNA Processing, Post-Transcriptional %K RNA Splicing %K RNA, Messenger %K Transcription, Genetic %X

Variation in chromatin composition and organization often reflects differences in genome function. Histone variants, for example, replace canonical histones to contribute to regulation of numerous nuclear processes including transcription, DNA repair, and chromosome segregation. Here we focus on H2A.Bbd, a rapidly evolving variant found in mammals but not in invertebrates. We report that in human cells, nucleosomes bearing H2A.Bbd form unconventional chromatin structures enriched within actively transcribed genes and characterized by shorter DNA protection and nucleosome spacing. Analysis of transcriptional profiles from cells depleted for H2A.Bbd demonstrated widespread changes in gene expression with a net downregulation of transcription and disruption of normal mRNA splicing patterns. In particular, we observed changes in exon inclusion rates and increased presence of intronic sequences in mRNA products upon H2A.Bbd depletion. Taken together, our results indicate that H2A.Bbd is involved in formation of a specific chromatin structure that facilitates both transcription and initial mRNA processing.

%B Mol Cell %V 47 %P 596-607 %8 2012 Aug 24 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/22795134?dopt=Abstract %R 10.1016/j.molcel.2012.06.011 %0 Journal Article %J Invest Ophthalmol Vis Sci %D 2012 %T iSyTE: integrated Systems Tool for Eye gene discovery. %A Lachke, Salil A %A Ho, Joshua W K %A Kryukov, Gregory V %A O'Connell, Daniel J %A Aboukhalil, Anton %A Bulyk, Martha L %A Park, Peter J %A Maas, Richard L %K Animals %K Cataract %K Gene Expression Profiling %K Gene Expression Regulation, Developmental %K In Situ Hybridization %K Lens, Crystalline %K Mice %K Microarray Analysis %X

PURPOSE: To facilitate the identification of genes associated with cataract and other ocular defects, the authors developed and validated a computational tool termed iSyTE (integrated Systems Tool for Eye gene discovery; http://bioinformatics.udel.edu/Research/iSyTE). iSyTE uses a mouse embryonic lens gene expression data set as a bioinformatics filter to select candidate genes from human or mouse genomic regions implicated in disease and to prioritize them for further mutational and functional analyses. METHODS: Microarray gene expression profiles were obtained for microdissected embryonic mouse lens at three key developmental time points in the transition from the embryonic day (E)10.5 stage of lens placode invagination to E12.5 lens primary fiber cell differentiation. Differentially regulated genes were identified by in silico comparison of lens gene expression profiles with those of whole embryo body (WB) lacking ocular tissue. RESULTS: Gene set analysis demonstrated that this strategy effectively removes highly expressed but nonspecific housekeeping genes from lens tissue expression profiles, allowing identification of less highly expressed lens disease-associated genes. Among 24 previously mapped human genomic intervals containing genes associated with isolated congenital cataract, the mutant gene is ranked within the top two iSyTE-selected candidates in approximately 88% of cases. Finally, in situ hybridization confirmed lens expression of several novel iSyTE-identified genes. CONCLUSIONS: iSyTE is a publicly available Web resource that can be used to prioritize candidate genes within mapped genomic intervals associated with congenital cataract for further investigation. Extension of this approach to other ocular tissue components will facilitate eye disease gene discovery.

%B Invest Ophthalmol Vis Sci %V 53 %P 1617-27 %8 2012 Mar %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/22323457?dopt=Abstract %R 10.1167/iovs.11-8839 %0 Journal Article %J Science %D 2012 %T Landscape of somatic retrotransposition in human cancers. %A Lee, Eunjung %A Iskow, Rebecca %A Yang, Lixing %A Gokcumen, Omer %A Haseley, Psalm %A Luquette, Lovelace J %A Lohr, Jens G %A Harris, Christopher C %A Ding, Li %A Wilson, Richard K %A Wheeler, David A %A Gibbs, Richard A %A Kucherlapati, Raju %A Lee, Charles %A Kharchenko, Peter V** %A Park, Peter J** %A Cancer Genome Atlas Research Network, The Cancer Genome Atlas %K Base Sequence %K Cell Transformation, Neoplastic %K Colorectal Neoplasms %K DNA Methylation %K Female %K Gene Expression Regulation, Neoplastic %K Genes, Neoplasm %K Genome, Human %K Glioblastoma %K Humans %K Long Interspersed Nucleotide Elements %K Male %K Microsatellite Instability %K Molecular Sequence Annotation %K Molecular Sequence Data %K Multiple Myeloma %K Mutagenesis, Insertional %K Mutation %K Ovarian Neoplasms %K Prostatic Neoplasms %K Retroelements %K Sequence Analysis, DNA %X

Transposable elements (TEs) are abundant in the human genome, and some are capable of generating new insertions through RNA intermediates. In cancer, the disruption of cellular mechanisms that normally suppress TE activity may facilitate mutagenic retrotranspositions. We performed single-nucleotide resolution analysis of TE insertions in 43 high-coverage whole-genome sequencing data sets from five cancer types. We identified 194 high-confidence somatic TE insertions, as well as thousands of polymorphic TE insertions in matched normal genomes. Somatic insertions were present in epithelial tumors but not in blood or brain cancers. Somatic L1 insertions tend to occur in genes that are commonly mutated in cancer, disrupt the expression of the target genes, and are biased toward regions of cancer-specific DNA hypomethylation, highlighting their potential impact in tumorigenesis.

%B Science %V 337 %P 967-71 %8 2012 Aug 24 %G eng %N 6097 %1 http://www.ncbi.nlm.nih.gov/pubmed/22745252?dopt=Abstract %R 10.1126/science.1222077 %0 Journal Article %J Stem Cells %D 2012 %T Numb regulates glioma stem cell fate and growth by altering epidermal growth factor receptor and Skp1-Cullin-F-box ubiquitin ligase activity. %A Jiang, Xiuli %A Xing, Hongyan %A Kim, Tae-Min %A Jung, Yuchae %A Huang, Wei %A Yang, Hong Wei %A Song, Shengye %A Park, Peter J %A Carroll, Rona S %A Johnson, Mark D %K Animals %K Antigens, CD %K Blotting, Western %K Cell Line %K Flow Cytometry %K Glioma %K Glycoproteins %K Humans %K Immunohistochemistry %K Immunoprecipitation %K Membrane Proteins %K Mice %K Mice, Nude %K Neoplastic Stem Cells %K Nerve Tissue Proteins %K Peptides %K Receptor, Epidermal Growth Factor %K Reverse Transcriptase Polymerase Chain Reaction %K SKP Cullin F-Box Protein Ligases %K Tumor Cells, Cultured %X

Glioblastoma contains a hierarchy of stem-like cancer cells, but how this hierarchy is established is unclear. Here, we show that asymmetric Numb localization specifies glioblastoma stem-like cell (GSC) fate in a manner that does not require Notch inhibition. Numb is asymmetrically localized to CD133-hi GSCs. The predominant Numb isoform, Numb4, decreases Notch and promotes a CD133-hi, radial glial-like phenotype. However, upregulation of a novel Numb isoform, Numb4 delta 7 (Numb4d7), increases Notch and AKT activation while nevertheless maintaining CD133-hi fate specification. Numb knockdown increases Notch and promotes growth while favoring a CD133-lo, glial progenitor-like phenotype. We report the novel finding that Numb4 (but not Numb4d7) promotes SCF(Fbw7) ubiquitin ligase assembly and activation to increase Notch degradation. However, both Numb isoforms decrease epidermal growth factor receptor (EGFR) expression, thereby regulating GSC fate. Small molecule inhibition of EGFR activity phenocopies the effect of Numb on CD133 and Pax6. Clinically, homozygous NUMB deletions and low Numb mRNA expression occur primarily in a subgroup of proneural glioblastomas. Higher Numb expression is found in classical and mesenchymal glioblastomas and correlates with decreased survival. Thus, decreased Numb promotes glioblastoma growth, but the remaining Numb establishes a phenotypically diverse stem-like cell hierarchy that increases tumor aggressiveness and therapeutic resistance.

%B Stem Cells %V 30 %P 1313-26 %8 2012 Jul %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/22553175?dopt=Abstract %R 10.1002/stem.1120 %0 Journal Article %J Drug Saf %D 2012 %T A pharmacoepidemiological network model for drug safety surveillance: statins and rhabdomyolysis. %A Reis, Ben Y %A Olson, Karen L %A Tian, Lu %A Bohn, Rhonda L %A Brownstein, John S %A Park, Peter J %A Cziraky, Mark J %A Wilson, Marcus D %A Mandl, Kenneth D %K Adverse Drug Reaction Reporting Systems %K Cohort Studies %K Drug Monitoring %K Female %K Humans %K Hydroxymethylglutaryl-CoA Reductase Inhibitors %K Male %K Middle Aged %K Models, Theoretical %K Pharmacoepidemiology %K Retrospective Studies %K Rhabdomyolysis %K Risk Factors %K United States %X

BACKGROUND: Recent withdrawals of major drugs have highlighted the critical importance of drug safety surveillance in the postmarketing phase. Limitations of spontaneous report data have led drug safety professionals to pursue alternative postmarketing surveillance approaches based on healthcare administrative claims data. These data are typically analysed by comparing the adverse event rates associated with a drug of interest to those of a single comparable reference drug. OBJECTIVE: The aim of this study was to determine whether adverse event detection can be improved by incorporating information from multiple reference drugs. We developed a pharmacological network model that implemented this approach and evaluated its performance. METHODS: We studied whether adverse event detection can be improved by incorporating information from multiple reference drugs, and describe two approaches for doing so. The first, reported previously, combines a set of related drugs into a single reference cohort. The second is a novel pharmacoepidemiological network model, which integrates multiple pair-wise comparisons across an entire set of related drugs into a unified consensus safety score for each drug. We also implemented a single reference drug approach for comparison with both multi-drug approaches. All approaches were applied within a sequential analysis framework, incorporating new information as it became available and addressing the issue of multiple testing over time. We evaluated all these approaches using statin (HMG-CoA reductase inhibitors) safety data from a large healthcare insurer in the US covering April 2000 through March 2005. RESULTS: We found that both multiple reference drug approaches offer earlier detection (6-13 months) than the single reference drug approach, without triggering additional false positives. CONCLUSIONS: Such combined approaches have the potential to be used with existing healthcare databases to improve the surveillance of therapeutics in the postmarketing phase over single-comparator methods. The proposed network approach also provides an integrated visualization framework enabling decision makers to understand the key high-level safety relationships amongst a group of related drugs.

%B Drug Saf %V 35 %P 395-406 %8 2012 May 1 %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/22506565?dopt=Abstract %R 10.2165/11596610-000000000-00000 %0 Journal Article %J PLoS Genet %D 2012 %T Sequence-specific targeting of dosage compensation in Drosophila favors an active chromatin context. %A Alekseyenko, Artyom A* %A Ho, Joshua W K* %A Peng, Shouyong* %A Gelbart, Marnie %A Tolstorukov, Michael Y %A Plachetka, Annette %A Kharchenko, Peter V %A Jung, Youngsook L %A Gorchakov, Andrey A %A Larschan, Erica %A Gu, Tingting %A Minoda, Aki %A Riddle, Nicole C %A Schwartz, Yuri B %A Elgin, Sarah C R %A Karpen, Gary H %A Pirrotta, Vincenzo %A Kuroda, Mitzi I** %A Park, Peter J** %K Acetylation %K Animals %K Base Composition %K Binding Sites %K Chromatin %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Gene Expression Regulation %K Genes, X-Linked %K Histones %K Male %K Nuclear Proteins %K Nucleosomes %K Nucleotide Motifs %K Protein-Serine-Threonine Kinases %K RNA Interference %K RNA-Binding Proteins %K Transcription Factors %K Transcription, Genetic %K X Chromosome %X

The Drosophila MSL complex mediates dosage compensation by increasing transcription of the single X chromosome in males approximately two-fold. This is accomplished through recognition of the X chromosome and subsequent acetylation of histone H4K16 on X-linked genes. Initial binding to the X is thought to occur at "entry sites" that contain a consensus sequence motif ("MSL recognition element" or MRE). However, this motif is only ∼2-fold enriched on X, and only a fraction of the motifs on X are initially targeted. Here we ask whether chromatin context could distinguish between utilized and non-utilized copies of the motif, by comparing their relative enrichment for histone modifications and chromosomal proteins mapped in the modENCODE project. Through a comparative analysis of the chromatin features in male S2 cells (which contain MSL complex) and female Kc cells (which lack the complex), we find that the presence of active chromatin modifications, together with an elevated local GC content in the surrounding sequences, has strong predictive value for functional MSL entry sites, independent of MSL binding. We tested these sites for function in Kc cells by RNAi knockdown of Sxl, resulting in induction of MSL complex. We show that ectopic MSL expression in Kc cells leads to H4K16 acetylation around these sites and a relative increase in X chromosome transcription. Collectively, our results support a model in which a pre-existing active chromatin environment, coincident with H3K36me3, contributes to MSL entry site selection. The consequences of MSL targeting of the male X chromosome include an increase in nucleosome lability, enrichment for H4K16 acetylation and JIL-1 kinase, and depletion of linker histone H1 on active X-linked genes. Our analysis can serve as a model for identifying chromatin and local sequence features that may contribute to selection of functional protein binding sites in the genome.

%B PLoS Genet %V 8 %P e1002646 %8 2012 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/22570616?dopt=Abstract %R 10.1371/journal.pgen.1002646 %0 Journal Article %J Cell %D 2012 %T Single-neuron sequencing analysis of L1 retrotransposition and somatic mutation in the human brain. %A Evrony, Gilad D* %A Cai, Xuyu* %A Lee, Eunjung %A Hills, L Benjamin %A Elhosary, Princess C %A Lehmann, Hillel S %A Parker, J J %A Atabay, Kutay D %A Gilmore, Edward C %A Poduri, Annapurna %A Park, Peter J %A Walsh, Christopher A %K Caudate Nucleus %K Cerebral Cortex %K Child %K Chromosomes, Human, Pair 18 %K Genome-Wide Association Study %K Humans %K Long Interspersed Nucleotide Elements %K Male %K Malformations of Cortical Development %K Mosaicism %K Mutation %K Neurons %K Proto-Oncogene Proteins c-akt %K Single-Cell Analysis %K Trisomy %X

A major unanswered question in neuroscience is whether there exists genomic variability between individual neurons of the brain, contributing to functional diversity or to an unexplained burden of neurological disease. To address this question, we developed a method to amplify genomes of single neurons from human brains. Because recent reports suggest frequent LINE-1 (L1) retrotransposition in human brains, we performed genome-wide L1 insertion profiling of 300 single neurons from cerebral cortex and caudate nucleus of three normal individuals, recovering >80% of germline insertions from single neurons. While we find somatic L1 insertions, we estimate <0.6 unique somatic insertions per neuron, and most neurons lack detectable somatic insertions, suggesting that L1 is not a major generator of neuronal diversity in cortex and caudate. We then genotyped single cortical cells to characterize the mosaicism of a somatic AKT3 mutation identified in a child with hemimegalencephaly. Single-neuron sequencing allows systematic assessment of genomic diversity in the human brain.

%B Cell %V 151 %P 483-96 %8 2012 Oct 26 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/23101622?dopt=Abstract %R 10.1016/j.cell.2012.09.035 %0 Journal Article %J Curr Protoc Hum Genet %D 2012 %T A survey of copy-number variation detection tools based on high-throughput sequencing data. %A Xi, Ruibin %A Lee, Semin %A Park, Peter J %K Algorithms %K Computational Biology %K DNA Copy Number Variations %K Genomics %K High-Throughput Nucleotide Sequencing %K Internet %X

Copy-number variation (CNV) is a major class of genomic variation with potentially important functional consequences in both normal and diseased populations. Remarkable advances in development of next-generation sequencing (NGS) platforms provide an unprecedented opportunity for accurate, high-resolution characterization of CNVs. In this unit, we give an overview of available computational tools for detection of CNVs and discuss comparative advantages and disadvantages of different approaches.

%B Curr Protoc Hum Genet %V Chapter 7 %P Unit7.19 %8 2012 Oct %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/23074071?dopt=Abstract %R 10.1002/0471142905.hg0719s75 %0 Journal Article %J Nat Biotechnol %D 2012 %T Systematic identification of synergistic drug pairs targeting HIV. %A Tan, Xu %A Hu, Long %A Luquette, Lovelace J %A Gao, Geng %A Liu, Yifang %A Qu, Hongjing %A Xi, Ruibin %A Lu, Zhi John %A Park, Peter J %A Elledge, Stephen J %K Algorithms %K Anti-HIV Agents %K Combinatorial Chemistry Techniques %K Drug Evaluation, Preclinical %K Drug Synergism %K HIV %K Humans %X

The systematic identification of effective drug combinations has been hindered by the unavailability of methods that can explore the large combinatorial search space of drug interactions. Here we present multiplex screening for interacting compounds (MuSIC), which expedites the comprehensive assessment of pairwise compound interactions. We examined ∼500,000 drug pairs from 1,000 US Food and Drug Administration (FDA)-approved or clinically tested drugs and identified drugs that synergize to inhibit HIV replication. Our analysis reveals an enrichment of anti-inflammatory drugs in drug combinations that synergize against HIV. As inflammation accompanies HIV infection, these findings indicate that inhibiting inflammation could curb HIV propagation. Multiple drug pairs identified in this study, including various glucocorticoids and nitazoxanide (NTZ), synergize by targeting different steps in the HIV life cycle. MuSIC can be applied to a wide variety of disease-relevant screens to facilitate efficient identification of compound combinations.

%B Nat Biotechnol %V 30 %P 1125-30 %8 2012 Nov %G eng %N 11 %1 http://www.ncbi.nlm.nih.gov/pubmed/23064238?dopt=Abstract %R 10.1038/nbt.2391 %0 Journal Article %J Science Signaling %D 2012 %T A Wnt-bmp feedback circuit controls intertissue signaling dynamics in tooth organogenesis. %A O'Connell, Daniel J* %A Ho, Joshua W K* %A Mammoto, Tadanori %A Turbe-Doan, Annick %A O'Connell, Joyce T %A Haseley, Psalm S %A Koo, Samuel %A Kamiya, Nobuhiro %A Ingber, Donald E %A Park, Peter J %A Maas, Richard L %K Animals %K Bone Morphogenetic Proteins %K Mice %K Organogenesis %K Signal Transduction %K Tooth %K Wnt Proteins %X

Many vertebrate organs form through the sequential and reciprocal exchange of signaling molecules between juxtaposed epithelial and mesenchymal tissues. We undertook a systems biology approach that combined the generation and analysis of large-scale spatiotemporal gene expression data with mouse genetic experiments to gain insight into the mechanisms that control epithelial-mesenchymal signaling interactions in the developing mouse molar tooth. We showed that the shift in instructive signaling potential from dental epithelium to dental mesenchyme was accompanied by temporally coordinated genome-wide changes in gene expression in both compartments. To identify the mechanism responsible, we developed a probabilistic technique that integrates regulatory evidence from gene expression data and from the literature to reconstruct a gene regulatory network for the epithelial and mesenchymal compartments in early tooth development. By integrating these epithelial and mesenchymal gene regulatory networks through the action of diffusible extracellular signaling molecules, we identified a key epithelial-mesenchymal intertissue Wnt-Bmp (bone morphogenetic protein) feedback circuit. We then validated this circuit in vivo with compound genetic mutations in mice that disrupted this circuit. Moreover, mathematical modeling demonstrated that the structure of the circuit accounted for the observed reciprocal signaling dynamics. Thus, we have identified a critical signaling circuit that controls the coordinated genome-wide expression changes and reciprocal signaling molecule dynamics that occur in interacting epithelial and mesenchymal compartments during organogenesis.

%B Science Signaling %V 5 %P ra4 %8 2012 Jan 10 %G eng %N 206 %1 http://www.ncbi.nlm.nih.gov/pubmed/22234613?dopt=Abstract %R 10.1126/scisignal.2002414 %0 Journal Article %J Nature %D 2011 %T Integrated genomic analyses of ovarian carcinoma. %A Cancer Genome Atlas Network, The Cancer Genome Atlas %K Aged %K Carcinoma %K DNA Methylation %K Female %K Gene Dosage %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Genomics %K Humans %K MicroRNAs %K Middle Aged %K Mutation %K Ovarian Neoplasms %K RNA, Messenger %X A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology. 
%B Nature %V 474 %P 609-15 %8 2011 Jun 30 %G eng %N 7353 %1 http://www.ncbi.nlm.nih.gov/pubmed/21720365?dopt=Abstract %R 10.1038/nature10166 %0 Journal Article %J Nat Struct Mol Biol %D 2011 %T An assessment of histone-modification antibody quality. %A Egelhofer, Thea A* %A Minoda, Aki* %A Klugman, Sarit* %A Lee, Kyungjoon %A Kolasinska-Zwierz, Paulina %A Alekseyenko, Artyom A %A Cheung, Ming-Sin %A Day, Daniel S %A Gadel, Sarah %A Gorchakov, Andrey A %A Gu, Tingting %A Kharchenko, Peter V %A Kuan, Samantha %A Latorre, Isabel %A Linder-Basso, Daniela %A Luu, Ying %A Ngo, Queminh %A Perry, Marc %A Rechtsteiner, Andreas %A Riddle, Nicole C %A Schwartz, Yuri B %A Shanower, Gregory A %A Vielle, Anne %A Ahringer, Julie %A Elgin, Sarah C R %A Kuroda, Mitzi I %A Pirrotta, Vincenzo %A Ren, Bing %A Strome, Susan %A Park, Peter J** %A Karpen, Gary H** %A Hawkins, R David** %A Lieb, Jason D** %K Animals %K Antibodies %K Antibody Specificity %K Blotting, Western %K Caenorhabditis elegans %K Caenorhabditis elegans Proteins %K Chromatin Immunoprecipitation %K Drosophila melanogaster %K Drosophila Proteins %K Histones %K Immunoblotting %K Protein Processing, Post-Translational %K Quality Control %K Reproducibility of Results %X

We have tested the specificity and utility of more than 200 antibodies raised against 57 different histone modifications in Drosophila melanogaster, Caenorhabditis elegans and human cells. Although most antibodies performed well, more than 25% failed specificity tests by dot blot or western blot. Among specific antibodies, more than 20% failed in chromatin immunoprecipitation experiments. We advise rigorous testing of histone-modification antibodies before use, and we provide a website for posting new test results (http://compbio.med.harvard.edu/antibodies/).

%B Nat Struct Mol Biol %V 18 %P 91-3 %8 2011 Jan %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/21131980?dopt=Abstract %R 10.1038/nsmb.1972 %0 Journal Article %J Nat Genet %D 2011 %T Evidence for dosage compensation between the X chromosome and autosomes in mammals. %A Kharchenko, Peter V %A Xi, Ruibin %A Park, Peter J %K Animals %K Dosage Compensation, Genetic %K Humans %K Sequence Analysis, RNA %K X Chromosome %B Nat Genet %V 43 %P 1167-9; author reply 1171-2 %8 2011 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/22120048?dopt=Abstract %R 10.1038/ng.991 %0 Journal Article %J Cell Stem Cell %D 2011 %T Lung stem cell self-renewal relies on BMI1-dependent control of expression at imprinted loci. %A Zacharek, Sima J %A Fillmore, Christine M %A Lau, Allison N %A Gludish, David W %A Chou, Alan %A Ho, Joshua W K %A Zamponi, Raffaella %A Gazit, Roi %A Bock, Christoph %A Jäger, Natalie %A Smith, Zachary D %A Kim, Tae-Min %A Saunders, Arven H %A Wong, Janice %A Lee, Joo-Hyeon %A Roach, Rebecca R %A Rossi, Derrick J %A Meissner, Alex %A Gimelbrant, Alexander A %A Park, Peter J %A Kim, Carla F %K Adult Stem Cells %K Animals %K Cell Survival %K Cells, Cultured %K Cyclin-Dependent Kinase Inhibitor p16 %K Gene Expression Profiling %K Gene Expression Regulation, Developmental %K Genes, p16 %K Genetic Loci %K Genomic Imprinting %K Lung %K Mice %K Mice, Mutant Strains %K Nuclear Proteins %K Polycomb Repressive Complex 1 %K Proto-Oncogene Proteins %K Regeneration %K Repressor Proteins %K RNA, Small Interfering %K S-Phase Kinase-Associated Proteins %X

BMI1 is required for the self-renewal of stem cells in many tissues including the lung epithelial stem cells, Bronchioalveolar Stem Cells (BASCs). Imprinted genes, which exhibit expression from only the maternally or paternally inherited allele, are known to regulate developmental processes, but what their role is in adult cells remains a fundamental question. Many imprinted genes were derepressed in Bmi1 knockout mice, and knockdown of Cdkn1c (p57) and other imprinted genes partially rescued the self-renewal defect of Bmi1 mutant lung cells. Expression of p57 and other imprinted genes was required for lung cell self-renewal in culture and correlated with repair of lung epithelial cell injury in vivo. Our data suggest that BMI1-dependent regulation of expressed alleles at imprinted loci, distinct from imprinting per se, is required for control of lung stem cells. We anticipate that the regulation and function of imprinted genes is crucial for self-renewal in diverse adult tissue-specific stem cells.

%B Cell Stem Cell %V 9 %P 272-81 %8 2011 Sep 2 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/21885022?dopt=Abstract %R 10.1016/j.stem.2011.07.007 %0 Journal Article %J Genome Res %D 2011 %T Plasticity in patterns of histone modifications and chromosomal proteins in Drosophila heterochromatin. %A Riddle, Nicole C* %A Minoda, Aki* %A Kharchenko, Peter V* %A Alekseyenko, Artyom A %A Schwartz, Yuri B %A Tolstorukov, Michael Y %A Gorchakov, Andrey A %A Jaffe, Jacob D %A Kennedy, Cameron %A Linder-Basso, Daniela %A Peach, Sally E %A Shanower, Gregory %A Zheng, Haiyan %A Kuroda, Mitzi I %A Pirrotta, Vincenzo %A Park, Peter J %A Elgin, Sarah C R** %A Karpen, Gary H** %K Animals %K Cell Line %K Chromosomal Proteins, Non-Histone %K DNA Transposable Elements %K Drosophila melanogaster %K Epigenomics %K Euchromatin %K Female %K Gene Expression Regulation %K Gene Silencing %K HeLa Cells %K Heterochromatin %K Histones %K Humans %K Male %K Protein Structure, Tertiary %X

Eukaryotic genomes are packaged in two basic forms, euchromatin and heterochromatin. We have examined the composition and organization of Drosophila melanogaster heterochromatin in different cell types using ChIP-array analysis of histone modifications and chromosomal proteins. As anticipated, the pericentric heterochromatin and chromosome 4 are on average enriched for the "silencing" marks H3K9me2, H3K9me3, HP1a, and SU(VAR)3-9, and are generally depleted for marks associated with active transcription. The locations of the euchromatin-heterochromatin borders identified by these marks are similar in animal tissues and most cell lines, although the amount of heterochromatin is variable in some cell lines. Combinatorial analysis of chromatin patterns reveals distinct profiles for euchromatin, pericentric heterochromatin, and the 4th chromosome. Both silent and active protein-coding genes in heterochromatin display complex patterns of chromosomal proteins and histone modifications; a majority of the active genes exhibit both "activation" marks (e.g., H3K4me3 and H3K36me3) and "silencing" marks (e.g., H3K9me2 and HP1a). The hallmark of active genes in heterochromatic domains appears to be a loss of H3K9 methylation at the transcription start site. We also observe complex epigenomic profiles of intergenic regions, repeated transposable element (TE) sequences, and genes in the heterochromatic extensions. An unexpectedly large fraction of sequences in the euchromatic chromosome arms exhibits a heterochromatic chromatin signature, which differs in size, position, and impact on gene expression among cell types. We conclude that patterns of heterochromatin/euchromatin packaging show greater complexity and plasticity than anticipated. This comprehensive analysis provides a foundation for future studies of gene activity and chromosomal functions that are influenced by or dependent upon heterochromatin.

%B Genome Res %V 21 %P 147-63 %8 2011 Feb %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/21177972?dopt=Abstract %R 10.1101/gr.110098.110 %0 Journal Article %J Nature %D 2011 %T X chromosome dosage compensation via enhanced transcriptional elongation in Drosophila. %A Larschan, Erica* %A Bishop, Eric P* %A Kharchenko, Peter V %A Core, Leighton J %A Lis, John T %A Park, Peter J** %A Kuroda, Mitzi I** %K Acetylation %K Animals %K Cell Line %K Chromosomes, Insect %K DNA-Binding Proteins %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Genes, Insect %K Genes, X-Linked %K Histones %K Male %K Nuclear Proteins %K RNA Polymerase II %K Sequence Analysis, DNA %K Transcription Factors %K Transcription, Genetic %K X Chromosome %X

The evolution of sex chromosomes has resulted in numerous species in which females inherit two X chromosomes but males have a single X, thus requiring dosage compensation. MSL (Male-specific lethal) complex increases transcription on the single X chromosome of Drosophila males to equalize expression of X-linked genes between the sexes. The biochemical mechanisms used for dosage compensation must function over a wide dynamic range of transcription levels and differential expression patterns. It has been proposed that the MSL complex regulates transcriptional elongation to control dosage compensation, a model subsequently supported by mapping of the MSL complex and MSL-dependent histone 4 lysine 16 acetylation to the bodies of X-linked genes in males, with a bias towards 3' ends. However, experimental analysis of MSL function at the mechanistic level has been challenging owing to the small magnitude of the chromosome-wide effect and the lack of an in vitro system for biochemical analysis. Here we use global run-on sequencing (GRO-seq) to examine the specific effect of the MSL complex on RNA Polymerase II (RNAP II) on a genome-wide level. Results indicate that the MSL complex enhances transcription by facilitating the progression of RNAP II across the bodies of active X-linked genes. Improving transcriptional output downstream of typical gene-specific controls may explain how dosage compensation can be imposed on the diverse set of genes along an entire chromosome.

%B Nature %V 471 %P 115-8 %8 2011 Mar 3 %G eng %N 7336 %1 http://www.ncbi.nlm.nih.gov/pubmed/21368835?dopt=Abstract %R 10.1038/nature09757 %0 Journal Article %J Wiley Interdiscip Rev Syst Biol Med %D 2011 %T Advances in analysis of transcriptional regulatory networks. %A Kim, Tae-Min %A Park, Peter J %K Chromatin Immunoprecipitation %K DNA %K DNA-Binding Proteins %K Gene Regulatory Networks %K Regulatory Elements, Transcriptional %K Transcription Factors %K Transcription, Genetic %X

A transcriptional regulatory network represents a molecular framework in which developmental or environmental cues are transformed into differential expression of genes. Transcriptional regulation is mediated by the combinatorial interplay between cis-regulatory DNA elements and trans-acting transcription factors, and is perhaps the most important mechanism for controlling gene expression. Recent innovations, most notably the method for detecting protein-DNA interactions genome-wide, can help provide a comprehensive catalog of cis-regulatory elements and their interaction with given trans-acting factors in a given condition. A transcriptional regulatory network that integrates such information can lead to a systems-level understanding of regulatory mechanisms. In this review, we will highlight the key aspects of current knowledge on eukaryotic transcriptional regulation, especially on known transcription factors and their interacting regulatory elements. Then we will review some recent technical advances for genome-wide mapping of DNA-protein interactions based on high-throughput sequencing. Finally, we will discuss the types of biological insights that can be obtained from a network-level understanding of transcription regulation as well as future challenges in the field.

%B Wiley Interdiscip Rev Syst Biol Med %V 3 %P 21-35 %8 2011 Jan-Feb %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/21069662?dopt=Abstract %R 10.1002/wsbm.105 %0 Journal Article %J Hum Mol Genet %D 2011 %T Amniocytes can serve a dual function as a source of iPS cells and feeder layers. %A Anchan, Raymond M %A Quaas, Philipp %A Gerami-Naini, Behzad %A Bartake, Hrishikesh %A Griffin, Adam %A Zhou, Yilan %A Day, Daniel S %A Eaton, Jennifer L %A George, Liji L %A Naber, Catherine %A Turbe-Doan, Annick %A Park, Peter J %A Hornstein, Mark D %A Maas, Richard L %K Amnion %K Animals %K Cell Culture Techniques %K Cell Differentiation %K Cells, Cultured %K Embryonic Stem Cells %K Female %K Humans %K Induced Pluripotent Stem Cells %K Mice %K Mice, Inbred C57BL %K Octamer Transcription Factors %X

Clinical barriers to stem-cell therapy include the need for efficient derivation of histocompatible stem cells and the zoonotic risk inherent to human stem-cell xenoculture on mouse feeder cells. We describe a system for efficiently deriving induced pluripotent stem (iPS) cells from human and mouse amniocytes, and for maintaining the pluripotency of these iPS cells on mitotically inactivated feeder layers prepared from the same amniocytes. Both cellular components of this system are thus autologous to a single donor. Moreover, the use of human feeder cells reduces the risk of zoonosis. Generation of iPS cells using retroviral vectors from short- or long-term cultured human and mouse amniocytes using four factors, or two factors in mouse, occurs in 5-7 days with 0.5% efficiency. This efficiency is greater than that reported for mouse and human fibroblasts using similar viral infection approaches, and does not appear to result from selective reprogramming of Oct4(+) or c-Kit(+) amniocyte subpopulations. Derivation of amniocyte-derived iPS (AdiPS) cell colonies, which express pluripotency markers and exhibit appropriate microarray expression and DNA methylation properties, was facilitated by live immunostaining. AdiPS cells also generate embryoid bodies in vitro and teratomas in vivo. Furthermore, mouse and human amniocytes can serve as feeder layers for iPS cells and for mouse and human embryonic stem (ES) cells. Thus, human amniocytes provide an efficient source of autologous iPS cells and, as feeder cells, can also maintain iPS and ES cell pluripotency without the safety concerns associated with xenoculture.

%B Hum Mol Genet %V 20 %P 962-74 %8 2011 Mar 1 %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/21156717?dopt=Abstract %R 10.1093/hmg/ddq542 %0 Journal Article %J BMC Genomics %D 2011 %T ChIP-chip versus ChIP-seq: lessons for experimental design and data analysis. %A Ho, Joshua W K %A Bishop, Eric P %A Kharchenko, Peter V %A Nègre, Nicolas %A White, Kevin P %A Park, Peter J %K Algorithms %K Animals %K Chromatin Immunoprecipitation %K Drosophila melanogaster %K Gene Expression Profiling %K Gene Expression Regulation, Developmental %K Gene Library %K Genome, Insect %K Histones %K Oligonucleotide Array Sequence Analysis %K Reproducibility of Results %K RNA Polymerase II %K Sequence Analysis, DNA %K Terminator Regions, Genetic %K Transcription Initiation Site %X

BACKGROUND: Chromatin immunoprecipitation (ChIP) followed by microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) allows genome-wide discovery of protein-DNA interactions such as transcription factor binding and histone modifications. Previous reports only compared a small number of profiles, and little has been done to compare histone modification profiles generated by the two technologies or to assess the impact of input DNA libraries in ChIP-seq analysis. Here, we performed a systematic analysis of a modENCODE dataset consisting of 31 pairs of ChIP-chip/ChIP-seq profiles of the coactivator CBP, RNA polymerase II (RNA PolII), and six histone modifications across four developmental stages of Drosophila melanogaster. RESULTS: Both technologies produce highly reproducible profiles within each platform; ChIP-seq generally produces profiles with a better signal-to-noise ratio and allows detection of more and narrower peaks. The set of peaks identified by the two technologies can be significantly different, but the extent to which they differ varies depending on the factor and the analysis algorithm. Importantly, we found that there is a significant variation among multiple sequencing profiles of input DNA libraries and that this variation most likely arises from differences in both experimental conditions and sequencing depth. We further show that using an inappropriate input DNA profile can impact the average signal profiles around genomic features and peak calling results, highlighting the importance of having high quality input DNA data for normalization in ChIP-seq analysis. CONCLUSIONS: Our findings highlight the biases present in each of the platforms, show the variability that can arise from both technology and analysis methods, and emphasize the importance of obtaining high quality and deeply sequenced input DNA libraries for ChIP-seq analysis.

%B BMC Genomics %V 12 %P 134 %8 2011 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/21356108?dopt=Abstract %R 10.1186/1471-2164-12-134 %0 Journal Article %J Nature %D 2011 %T Comprehensive analysis of the chromatin landscape in Drosophila melanogaster. %A Kharchenko, Peter V %A Alekseyenko, Artyom A %A Schwartz, Yuri B %A Minoda, Aki %A Riddle, Nicole C %A Ernst, Jason %A Sabo, Peter J %A Larschan, Erica %A Gorchakov, Andrey A %A Gu, Tingting %A Linder-Basso, Daniela %A Plachetka, Annette %A Shanower, Gregory %A Tolstorukov, Michael Y %A Luquette, Lovelace J %A Xi, Ruibin %A Jung, Youngsook L %A Park, Richard W %A Bishop, Eric P %A Canfield, Theresa K %A Sandstrom, Richard %A Thurman, Robert E %A MacAlpine, David M %A Stamatoyannopoulos, John A %A Kellis, Manolis %A Elgin, Sarah C R %A Kuroda, Mitzi I %A Pirrotta, Vincenzo %A Karpen, Gary H** %A Park, Peter J** %K Animals %K Cell Line %K Chromatin %K Chromatin Immunoprecipitation %K Chromosomal Proteins, Non-Histone %K Deoxyribonuclease I %K Drosophila melanogaster %K Drosophila Proteins %K Exons %K Gene Expression Regulation %K Genes, Insect %K Genome, Insect %K Histones %K Male %K Molecular Sequence Annotation %K Oligonucleotide Array Sequence Analysis %K Polycomb Repressive Complex 1 %K RNA %K Sequence Analysis %K Transcription, Genetic %X

Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are regulated, and will serve as a resource for future experimental investigations of genome structure and function.

%B Nature %V 471 %P 480-5 %8 2011 Mar 24 %G eng %N 7339 %1 http://www.ncbi.nlm.nih.gov/pubmed/21179089?dopt=Abstract %R 10.1038/nature09725 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2011 %T Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion. %A Xi, Ruibin %A Hadjipanayis, Angela G %A Luquette, Lovelace J %A Kim, Tae-Min %A Lee, Eunjung %A Zhang, Jianhua %A Johnson, Mark D %A Muzny, Donna M %A Wheeler, David A %A Gibbs, Richard A %A Kucherlapati, Raju %A Park, Peter J %K Algorithms %K Bayes Theorem %K Brain Neoplasms %K Comparative Genomic Hybridization %K Computer Simulation %K DNA Copy Number Variations %K Female %K Gene Dosage %K Genome %K Genome, Human %K Glioblastoma %K Humans %K Models, Genetic %K Models, Statistical %K Sequence Analysis, DNA %X

DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but the next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs via minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10× coverage, whereas we could only detect large CNVs (> 15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples as well as deriving error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.

%B Proc Natl Acad Sci U S A %V 108 %P E1128-36 %8 2011 Nov 15 %G eng %N 46 %1 http://www.ncbi.nlm.nih.gov/pubmed/22065754?dopt=Abstract %R 10.1073/pnas.1110574108 %0 Journal Article %J Cancer Res %D 2011 %T A developmental taxonomy of glioblastoma defined and maintained by MicroRNAs. %A Kim, Tae-Min %A Huang, Wei %A Park, Richard %A Park, Peter J** %A Johnson, Mark D** %K Brain Neoplasms %K DNA Modification Methylases %K DNA Repair Enzymes %K Gene Dosage %K Gene Expression Profiling %K Glioblastoma %K Humans %K MicroRNAs %K Mutation %K Neoplastic Stem Cells %K Promoter Regions, Genetic %K RNA, Messenger %K Tumor Suppressor Proteins %X

mRNA expression profiling has suggested the existence of multiple glioblastoma subclasses, but their number and characteristics vary among studies and the etiology underlying their development is unclear. In this study, we analyzed 261 microRNA expression profiles from The Cancer Genome Atlas (TCGA), identifying five clinically and genetically distinct subclasses of glioblastoma that each related to a different neural precursor cell type. These microRNA-based glioblastoma subclasses displayed microRNA and mRNA expression signatures resembling those of radial glia, oligoneuronal precursors, neuronal precursors, neuroepithelial/neural crest precursors, or astrocyte precursors. Each subclass was determined to be genetically distinct, based on the significant differences they displayed in terms of patient race, age, treatment response, and survival. We also identified several microRNAs as potent regulators of subclass-specific gene expression networks in glioblastoma. Foremost among these is miR-9, which suppresses mesenchymal differentiation in glioblastoma by downregulating expression of JAK kinases and inhibiting activation of STAT3. Our findings suggest that microRNAs are important determinants of glioblastoma subclasses through their ability to regulate developmental growth and differentiation programs in several transformed neural precursor cell types. Taken together, our results define developmental microRNA expression signatures that both characterize and contribute to the phenotypic diversity of glioblastoma subclasses, thereby providing an expanded framework for understanding the pathogenesis of glioblastoma in a human neurodevelopmental context.

%B Cancer Res %V 71 %P 3387-99 %8 2011 May 1 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/21385897?dopt=Abstract %R 10.1158/0008-5472.CAN-10-4117 %0 Journal Article %J Cell Metab %D 2011 %T Expression of the splicing factor gene SFRS10 is reduced in human obesity and contributes to enhanced lipogenesis. %A Pihlajamäki, Jussi %A Lerin, Carles %A Itkonen, Paula %A Boes, Tanner %A Floss, Thomas %A Schroeder, Joshua %A Dearie, Farrell %A Crunkhorn, Sarah %A Burak, Furkan %A Jimenez-Chillaron, Josep C %A Kuulasmaa, Tiina %A Miettinen, Pekka %A Park, Peter J %A Nasser, Imad %A Zhao, Zhenwen %A Zhang, Zhaiyi %A Xu, Yan %A Wurst, Wolfgang %A Ren, Hongmei %A Morris, Andrew J %A Stamm, Stefan %A Goldfine, Allison B %A Laakso, Markku %A Patti, Mary Elizabeth %K Adult %K Aged %K Animals %K Cell Line, Tumor %K Female %K Gene Expression Regulation %K Humans %K Lipids %K Lipogenesis %K Liver %K Male %K Mice %K Mice, Inbred ICR %K Mice, Transgenic %K Middle Aged %K Muscle, Skeletal %K Nerve Tissue Proteins %K Obesity %K Phosphatidate Phosphatase %K RNA Splicing %K RNA-Binding Proteins %X

Alternative mRNA splicing provides transcript diversity and may contribute to human disease. We demonstrate that expression of several genes regulating RNA processing is decreased in both liver and skeletal muscle of obese humans. We evaluated a representative splicing factor, SFRS10, downregulated in both obese human liver and muscle and in high-fat-fed mice, and determined metabolic impact of reduced expression. SFRS10-specific siRNA induces lipogenesis and lipid accumulation in hepatocytes. Moreover, Sfrs10 heterozygous mice have increased hepatic lipogenic gene expression, VLDL secretion, and plasma triglycerides. We demonstrate that LPIN1, a key regulator of lipid metabolism, is a splicing target of SFRS10; reduced SFRS10 favors the lipogenic β isoform of LPIN1. Importantly, LPIN1β-specific siRNA abolished lipogenic effects of decreased SFRS10 expression. Together, our results indicate that reduced expression of SFRS10, as observed in tissues from obese humans, alters LPIN1 splicing, induces lipogenesis, and therefore contributes to metabolic phenotypes associated with obesity.

%B Cell Metab %V 14 %P 208-18 %8 2011 Aug 3 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/21803291?dopt=Abstract %R 10.1016/j.cmet.2011.06.007 %0 Journal Article %J PLoS One %D 2011 %T Gene expression analysis reveals the cell cycle and kinetochore genes participating in ischemia reperfusion injury and early development in kidney. %A Kim, Tae-Min %A Ramírez, Victoria %A Barrera-Chimal, Jonatan %A Bobadilla, Norma A %A Park, Peter J %A Vaidya, Vishal S %K Animals %K Cell Cycle %K Gene Expression Profiling %K Kidney %K Kinetochores %K Male %K Mice %K Oligonucleotide Array Sequence Analysis %K Rats %K Regeneration %K Reperfusion Injury %X

BACKGROUND: The molecular mechanisms that mediate the ischemia-reperfusion (I/R) injury in kidney are not completely understood. It is also largely unknown whether such mechanisms overlap with those governing the early development of kidney. METHODOLOGY/PRINCIPAL FINDINGS: We performed gene expression analysis to investigate the transcriptome changes during regeneration after I/R injury in the rat (0 hr, 6 hr, 24 hr, and 120 hr after reperfusion) and early development of mouse kidney (embryonic day 16 p.c. and postnatal 1 and 7 day). Pathway analysis revealed a wide spectrum of molecular functions that may participate in the regeneration and developmental processes of kidney as well as the functional association between them. While the genes associated with cell cycle, immunity, inflammation, and apoptosis were globally activated during the regeneration after I/R injury, the genes encoding various transporters and metabolic enzymes were down-regulated. We also observed that these injury-associated molecular functions largely overlap with those of early kidney development. In particular, the up-regulation of kinases and kinesins with roles in cell division was common during regeneration and early developmental kidney as validated by real-time PCR and immunohistochemistry. CONCLUSIONS: In addition to the candidate genes whose up-regulation constitutes an overlapping expression signature between kidney regeneration and development, this study lays a foundation for studying the functional relationship between two biological processes.

%B PLoS One %V 6 %P e25679 %8 2011 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/21980527?dopt=Abstract %R 10.1371/journal.pone.0025679 %0 Journal Article %J Nat Struct Mol Biol %D 2011 %T Impact of chromatin structure on sequence variability in the human genome. %A Tolstorukov, Michael Y %A Volfovsky, Natalia %A Stephens, Robert M %A Park, Peter J %K Chromatin %K Epigenesis, Genetic %K Genome, Human %K Humans %K Polymorphism, Single Nucleotide %X

DNA sequence variations in individual genomes give rise to different phenotypes within the same species. One mechanism in this process is the alteration of chromatin structure due to sequence variation that influences gene regulation. We composed a high-confidence collection of human single-nucleotide polymorphisms and indels based on analysis of publicly available sequencing data and investigated whether the DNA loci associated with stable nucleosome positions are protected against mutations. We addressed how the sequence variation reflects the occupancy profiles of nucleosomes bearing different epigenetic modifications on genome scale. We found that indels are depleted around nucleosome positions of all considered types, whereas single-nucleotide polymorphisms are enriched around the positions of bulk nucleosomes but depleted around the positions of epigenetically modified nucleosomes. These findings indicate an increased level of conservation for the sequences associated with epigenetically modified nucleosomes, highlighting complex organization of the human chromatin.

%B Nat Struct Mol Biol %V 18 %P 510-5 %8 2011 Apr %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/21399641?dopt=Abstract %R 10.1038/nsmb.2012 %0 Journal Article %J Int J Radiat Oncol Biol Phys %D 2011 %T Phase II study of neoadjuvant bevacizumab and radiotherapy for resectable soft tissue sarcomas. %A Yoon, Sam S %A Duda, Dan G %A Karl, Daniel L %A Kim, Tae-Min %A Kambadakone, Avinash R %A Chen, Yen-Lin %A Rothrock, Courtney %A Rosenberg, Andrew E %A Nielsen, G Petur %A Kirsch, David G %A Choy, Edwin %A Harmon, David C %A Hornicek, Francis J %A Dreyfuss, Jonathan M %A Ancukiewicz, Marek %A Sahani, Dushyant V %A Park, Peter J %A Jain, Rakesh K %A Delaney, Thomas F %K Adult %K Aged %K Angiogenesis Inhibitors %K Antibodies, Monoclonal, Humanized %K Cell Proliferation %K Female %K Gene Expression Profiling %K Humans %K Male %K Microvessels %K Middle Aged %K Neoadjuvant Therapy %K Neoplasm Recurrence, Local %K Postoperative Complications %K Radiotherapy Dosage %K Sarcoma %K Soft Tissue Neoplasms %K Treatment Outcome %K Tumor Burden %K Tumor Markers, Biological %X

PURPOSE: Numerous preclinical studies have demonstrated that angiogenesis inhibitors can increase the efficacy of radiotherapy (RT). We sought to examine the safety and efficacy of bevacizumab (BV) and RT in soft tissue sarcomas and explore biomarkers to help determine the treatment response. METHODS AND MATERIALS: Patients with ≥5 cm, intermediate- or high-grade soft tissue sarcomas at significant risk of local recurrence received neoadjuvant BV alone followed by BV plus RT before surgical resection. Correlative science studies included analysis of the serial blood and tumor samples and serial perfusion computed tomography scans. RESULTS: The 20 patients had a median tumor size of 8.25 cm, with 13 extremity, 1 trunk, and 6 retroperitoneal/pelvis tumors. The neoadjuvant treatment was well tolerated, with only 4 patients having Grade 3 toxicities (hypertension, liver function test elevation). BV plus RT resulted in ≥80% pathologic necrosis in 9 (45%) of 20 tumors, more than double the historical rate seen with RT alone. Three patients had a complete pathologic response. The median microvessel density decreased 53% after BV alone (p <.05). After combination therapy, the median tumor cell proliferation decreased by 73%, apoptosis increased 10.4-fold, and the blood flow, blood volume, and permeability surface area decreased by 62-72% (p <.05). Analysis of gene expression microarrays of untreated tumors identified a 24-gene signature for treatment response. The microvessel density and circulating progenitor cells at baseline and the reduction in microvessel density and plasma soluble c-KIT with BV therapy also correlated with a good pathologic response (p <.05). After a median follow-up of 20 months, only 1 patient had developed local recurrence. CONCLUSIONS: The results from the present exploratory study indicated that BV increases the efficacy of RT against soft tissue sarcomas and might reduce the incidence of local recurrence. Thus, this regimen warrants additional investigation. Gene expression profiles and other tissue and circulating biomarkers showed promising correlations with treatment response.

%B Int J Radiat Oncol Biol Phys %V 81 %P 1081-90 %8 2011 Nov 15 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/20932656?dopt=Abstract %R 10.1016/j.ijrobp.2010.07.024 %0 Journal Article %J Genetika %D 2010 %T [Dosage compensation in drosophila: sequence-specific initiation and sequence-independent spreading of MSL complex to the active genes on the male X chromosome]. %A Gorchakov, A A %A Alekseenko, A A %A Kharchenko, P V %A Park, P %A Kuroda, M %K Animals %K Chromosomes, Insect %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Male %K Ribonucleoproteins %K Transcription, Genetic %K X Chromosome %X For the dosage compensation to occur, genes on the single male X chromosomes in Drosophila must be selectively bound and acetylated by the ribonucleoprotein complex called MSL complex. It remained unknown how such exquisite specificity is achieved, and whether specific DNA sequences were involved. In the present work we demonstrate that it is transcription of the gene on the X chromosome that is important for MSL targeting, irrespective of gene origin and DNA sequence. %B Genetika %V 46 %P 1430-4 %8 2010 Oct %G rus %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/21254570?dopt=Abstract %0 Journal Article %J Molecular Cell %D 2010 %T CpG islands recruit a histone H3 lysine 36 demethylase. %A Blackledge, Neil P %A Zhou, Jin C %A Tolstorukov, Michael Y %A Farcas, Anca M %A Park, Peter J %A Klose, Robert J %K Amino Acid Sequence %K Binding Sites %K CpG Islands %K DNA Methylation %K DNA-Binding Proteins %K F-Box Proteins %K Histone Demethylases %K Histones %K Humans %K Jumonji Domain-Containing Histone Demethylases %K Lysine %K Molecular Sequence Data %K Mutation %K Oxidoreductases, N-Demethylating %K Protein Binding %K Protein Structure, Tertiary %K Recombinant Proteins %K Sequence Homology, Amino Acid %X

In higher eukaryotes, up to 70% of genes have high levels of nonmethylated cytosine/guanine base pairs (CpGs) surrounding promoters and gene regulatory units. These features, called CpG islands, were identified over 20 years ago, but there remains little mechanistic evidence to suggest how these enigmatic elements contribute to promoter function, except that they are refractory to epigenetic silencing by DNA methylation. Here we show that CpG islands directly recruit the H3K36-specific lysine demethylase enzyme KDM2A. Nucleation of KDM2A at these elements results in removal of H3K36 methylation, creating CpG island chromatin that is uniquely depleted of this modification. KDM2A utilizes a zinc finger CxxC (ZF-CxxC) domain that preferentially recognizes nonmethylated CpG DNA, and binding is blocked when the CpG DNA is methylated, thus constraining KDM2A to nonmethylated CpG islands. These data expose a straightforward mechanism through which KDM2A delineates a unique architecture that differentiates CpG island chromatin from bulk chromatin.

%B Molecular Cell %V 38 %P 179-90 %8 2010 Apr 23 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/20417597?dopt=Abstract %R 10.1016/j.molcel.2010.04.009 %0 Journal Article %J Clin Cancer Res %D 2010 %T Genomic profiling reveals alternative genetic pathways of meningioma malignant progression dependent on the underlying NF2 status. %A Goutagny, Stéphane %A Yang, Hong Wei %A Zucman-Rossi, Jessica %A Chan, Jennifer %A Dreyfuss, Jonathan M %A Park, Peter J %A Black, Peter M %A Giovannini, Marco %A Carroll, Rona S %A Kalamarides, Michel %K Adult %K Aged %K Disease Progression %K Female %K Gene Dosage %K Gene Expression %K Gene Expression Profiling %K Genes, Neurofibromatosis 2 %K Genotype %K Humans %K Loss of Heterozygosity %K Male %K Meningeal Neoplasms %K Meningioma %K Middle Aged %K Mutation %K Oligonucleotide Array Sequence Analysis %K Polymorphism, Single Nucleotide %K Reverse Transcriptase Polymerase Chain Reaction %X

PURPOSE: Meningiomas are the most common central nervous system tumors in the population of age 35 and older. WHO defines three grades predictive of the risk of recurrence. Clinical data supporting histologic malignant progression of meningiomas are sparse and underlying molecular mechanisms are not clearly depicted. EXPERIMENTAL DESIGN: We identified genetic alterations associated with histologic progression of 36 paired meningioma samples in 18 patients using 500K SNP genotyping arrays and NF2 gene sequencing. RESULTS: The most frequent chromosome alterations observed in progressing meningioma samples are early alterations (i.e., present both in lower- and higher-grade samples of a single patient). In our series, NF2 gene inactivation was an early and frequent event in progressing meningioma samples (73%). Chromosome alterations acquired during progression from grade I to grade II meningioma were not recurrent. Progression to grade III was characterized by recurrent genomic alterations, the most frequent being CDKN2A/CDKN2B locus loss on 9p. CONCLUSION: Meningiomas displayed different patterns of genetic alterations during progression according to their NF2 status: NF2-mutated meningiomas showed higher chromosome instability during progression than NF2-nonmutated meningiomas, which had very few imbalanced chromosome segments. This pattern of alterations could thus be used as markers in clinical practice to identify tumors prone to progress among grade I meningiomas.

%B Clin Cancer Res %V 16 %P 4155-64 %8 2010 Aug 15 %G eng %N 16 %1 http://www.ncbi.nlm.nih.gov/pubmed/20682713?dopt=Abstract %R 10.1158/1078-0432.CCR-10-0891 %0 Journal Article %J Science %D 2010 %T Identification of functional elements and regulatory circuits by Drosophila modENCODE. %A modENCODE Consortium, * %A Roy, Sushmita* %A Ernst, Jason* %A Kharchenko, Peter V* %A Kheradpour, Pouya* %A Negre, Nicolas* %A Eaton, Matthew L* %A Landolin, Jane M* %A Bristow, Christopher A* %A Ma, Lijia* %A Lin, Michael F* %A Washietl, Stefan* %A Arshinoff, Bradley I %A Ay, Ferhat %A Meyer, Patrick E %A Robine, Nicolas %A Washington, Nicole L %A Di Stefano, Luisa %A Berezikov, Eugene %A Brown, Christopher D %A Candeias, Rogerio %A Carlson, Joseph W %A Carr, Adrian %A Jungreis, Irwin %A Marbach, Daniel %A Sealfon, Rachel %A Tolstorukov, Michael Y %A Will, Sebastian %A Alekseyenko, Artyom A %A Artieri, Carlo %A Booth, Benjamin W %A Brooks, Angela N %A Dai, Qi %A Davis, Carrie A %A Duff, Michael O %A Feng, Xin %A Gorchakov, Andrey A %A Gu, Tingting %A Henikoff, Jorja G %A Kapranov, Philipp %A Li, Renhua %A MacAlpine, Heather K %A Malone, John %A Minoda, Aki %A Nordman, Jared %A Okamura, Katsutomo %A Perry, Marc %A Powell, Sara K %A Riddle, Nicole C %A Sakai, Akiko %A Samsonova, Anastasia %A Sandler, Jeremy E %A Schwartz, Yuri B %A Sher, Noa %A Spokony, Rebecca %A Sturgill, David %A van Baren, Marijke %A Wan, Kenneth H %A Yang, Li %A Yu, Charles %A Feingold, Elise %A Good, Peter %A Guyer, Mark %A Lowdon, Rebecca %A Ahmad, Kami %A Andrews, Justen %A Berger, Bonnie %A Brenner, Steven E %A Brent, Michael R %A Cherbas, Lucy %A Elgin, Sarah C R %A Gingeras, Thomas R %A Grossman, Robert %A Hoskins, Roger A %A Kaufman, Thomas C %A Kent, William %A Kuroda, Mitzi I %A Orr-Weaver, Terry %A Perrimon, Norbert %A Pirrotta, Vincenzo %A Posakony, James W %A Ren, Bing %A Russell, Steven %A Cherbas, Peter %A Graveley, Brenton R %A Lewis, Suzanna %A Micklem, Gos %A Oliver, Brian %A Park, Peter J %A Celniker, Susan E** %A Henikoff, Steven** %A Karpen, Gary H** %A Lai, Eric C** %A MacAlpine, David M** %A Stein, Lincoln D** %A White, Kevin P** %A Kellis, Manolis** %K Animals %K Binding Sites %K Chromatin %K Computational Biology %K Drosophila melanogaster %K Drosophila Proteins %K Epigenesis, Genetic %K Gene Expression Regulation %K Gene Regulatory Networks %K Genes, Insect %K Genome, Insect %K Genomics %K Histones %K Molecular Sequence Annotation %K Nucleosomes %K Promoter Regions, Genetic %K RNA, Small Untranslated %K Transcription Factors %K Transcription, Genetic %X

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

%B Science %V 330 %P 1787-97 %8 2010 Dec 24 %G eng %N 6012 %1 http://www.ncbi.nlm.nih.gov/pubmed/21177974?dopt=Abstract %R 10.1126/science.1198374 %0 Journal Article %J Nat Med %D 2010 %T Loss of the tumor suppressor Snf5 leads to aberrant activation of the Hedgehog-Gli pathway. %A Jagani, Zainab %A Mora-Blanco, E Lorena %A Sansam, Courtney G %A McKenna, Elizabeth S %A Wilson, Boris %A Chen, Dongshu %A Klekota, Justin %A Tamayo, Pablo %A Nguyen, Phuong T L %A Tolstorukov, Michael %A Park, Peter J %A Cho, Yoon-Jae %A Hsiao, Kathy %A Buonamici, Silvia %A Pomeroy, Scott L %A Mesirov, Jill P %A Ruffner, Heinz %A Bouwmeester, Tewis %A Luchansky, Sarah J %A Murtie, Joshua %A Kelleher, Joseph F %A Warmuth, Markus %A Sellers, William R %A Roberts, Charles W M %A Dorsch, Marion %K Animals %K Cell Line, Tumor %K Chromatin Immunoprecipitation %K Chromosomal Proteins, Non-Histone %K DNA Primers %K DNA-Binding Proteins %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Immunoblotting %K In Situ Hybridization %K Mass Spectrometry %K Mice %K Microarray Analysis %K Rhabdoid Tumor %K Signal Transduction %K Transcription Factors %X

Aberrant activation of the Hedgehog (Hh) pathway can drive tumorigenesis. To investigate the mechanism by which glioma-associated oncogene family zinc finger-1 (GLI1), a crucial effector of Hh signaling, regulates Hh pathway activation, we searched for GLI1-interacting proteins. We report that the chromatin remodeling protein SNF5 (encoded by SMARCB1, hereafter called SNF5), which is inactivated in human malignant rhabdoid tumors (MRTs), interacts with GLI1. We show that Snf5 localizes to Gli1-regulated promoters and that loss of Snf5 leads to activation of the Hh-Gli pathway. Conversely, re-expression of SNF5 in MRT cells represses GLI1. Consistent with this, we show the presence of a Hh-Gli-activated gene expression profile in primary MRTs and show that GLI1 drives the growth of SNF5-deficient MRT cells in vitro and in vivo. Therefore, our studies reveal that SNF5 is a key mediator of Hh signaling and that aberrant activation of GLI1 is a previously undescribed targetable mechanism contributing to the growth of MRT cells.

%B Nat Med %V 16 %P 1429-33 %8 2010 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/21076395?dopt=Abstract %R 10.1038/nm.2251 %0 Journal Article %J Exp Cell Res %D 2010 %T MicroRNA mir-16 is anti-proliferative in enterocytes and exhibits diurnal rhythmicity in intestinal crypts. %A Balakrishnan, Anita %A Stearns, Adam T %A Park, Peter J %A Dreyfuss, Jonathan M %A Ashley, Stanley W %A Rhoads, David B %A Tavakkolizadeh, Ali %K Animals %K Cell Enlargement %K Cell Line %K Cell Proliferation %K Circadian Rhythm %K Cyclin-Dependent Kinase 6 %K Cyclins %K DNA %K Enterocytes %K G1 Phase %K Gene Expression %K Intestinal Mucosa %K Jejunum %K Male %K MicroRNAs %K Muscle, Smooth %K Photoperiod %K Rats %K Rats, Sprague-Dawley %K Transfection %X

BACKGROUND AND AIMS: The intestine exhibits profound diurnal rhythms in function and morphology, in part due to changes in enterocyte proliferation. The regulatory mechanisms behind these rhythms remain largely unknown. We hypothesized that microRNAs are involved in mediating these rhythms, and studied the role of microRNAs specifically in modulating intestinal proliferation. METHODS: Diurnal rhythmicity of microRNAs in rat jejunum was analyzed by microarrays and validated by qPCR. Temporal expression of diurnally rhythmic mir-16 was further quantified in intestinal crypts, villi, and smooth muscle using laser capture microdissection and qPCR. Morphological changes in rat jejunum were assessed by histology and proliferation by immunostaining for bromodeoxyuridine. In IEC-6 cells stably overexpressing mir-16, proliferation was assessed by cell counting and MTS assay, cell cycle progression and apoptosis by flow cytometry, and cell cycle gene expression by qPCR and immunoblotting. RESULTS: mir-16 peaked 6 hours after light onset (HALO 6) with diurnal changes restricted to crypts. Crypt depth and villus height peaked at HALO 13-14 in antiphase to mir-16. Overexpression of mir-16 in IEC-6 cells suppressed specific G1/S regulators (cyclins D1-3, cyclin E1 and cyclin-dependent kinase 6) and produced G1 arrest. Protein expression of these genes exhibited diurnal rhythmicity in rat jejunum, peaking between HALO 11 and 17 in antiphase to mir-16. CONCLUSIONS: This is the first report of circadian rhythmicity of specific microRNAs in rat jejunum. Our data provide a link between anti-proliferative mir-16 and the intestinal proliferation rhythm and point to mir-16 as an important regulator of proliferation in jejunal crypts. This function may be essential to match proliferation and absorptive capacity with nutrient availability.

%B Exp Cell Res %V 316 %P 3512-21 %8 2010 Dec 10 %G eng %N 20 %1 http://www.ncbi.nlm.nih.gov/pubmed/20633552?dopt=Abstract %R 10.1016/j.yexcr.2010.07.007 %0 Journal Article %J BMC Bioinformatics %D 2010 %T Quantized correlation coefficient for measuring reproducibility of ChIP-chip data. %A Peng, Shouyong %A Kuroda, Mitzi I %A Park, Peter J %K Animals %K Chromatin Immunoprecipitation %K Drosophila %K Female %K Gene Expression Profiling %K Humans %K Male %K Nucleic Acid Hybridization %K Oligonucleotide Array Sequence Analysis %K Oligonucleotide Probes %X

BACKGROUND: Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome-scale. To ensure data quality, these experiments are usually performed in replicates, and a correlation coefficient between replicates is used often to assess reproducibility. However, the correlation coefficient can be misleading because it is affected not only by the reproducibility of the signal but also by the amount of binding signal present in the data. RESULTS: We develop the Quantized correlation coefficient (QCC) that is much less dependent on the amount of signal. This involves discretization of data into set of quantiles (quantization), a merging procedure to group the background probes, and recalculation of the Pearson correlation coefficient. This procedure reduces the influence of the background noise on the statistic, which then properly focuses more on the reproducibility of the signal. The performance of this procedure is tested in both simulated and real ChIP-chip data. For replicates with different levels of enrichment over background and coverage, we find that QCC reflects reproducibility more accurately and is more robust than the standard Pearson or Spearman correlation coefficients. The quantization and the merging procedure can also suggest a proper quantile threshold for separating signal from background for further analysis. CONCLUSIONS: To measure reproducibility of ChIP-chip data correctly, a correlation coefficient that is robust to the amount of signal present should be used. QCC is one such measure. The QCC statistic can also be applied in a variety of other contexts for measuring reproducibility, including analysis of array CGH data for DNA copy number and gene expression data.

%B BMC Bioinformatics %V 11 %P 399 %8 2010 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/20663215?dopt=Abstract %R 10.1186/1471-2105-11-399 %0 Journal Article %J Cell %D 2010 %T A region of the human HOXD cluster that confers polycomb-group responsiveness. %A Woo, Caroline J %A Kharchenko, Peter V %A Daheron, Laurence %A Park, Peter J %A Kingston, Robert E %K Cell Differentiation %K Chromatin %K Embryonic Stem Cells %K Gene Knockdown Techniques %K Genes, Homeobox %K Homeodomain Proteins %K Humans %K Intracellular Signaling Peptides and Proteins %K Mesenchymal Stromal Cells %K Nuclear Proteins %K Polycomb Repressive Complex 1 %K Polycomb-Group Proteins %K Proto-Oncogene Proteins %K Regulatory Elements, Transcriptional %K Repressor Proteins %X

Polycomb group (PcG) proteins are essential for accurate axial body patterning during embryonic development. PcG-mediated repression is conserved in metazoans and is targeted in Drosophila by Polycomb response elements (PREs). However, targeting sequences in humans have not been described. While analyzing chromatin architecture in the context of human embryonic stem cell (hESC) differentiation, we discovered a 1.8kb region between HOXD11 and HOXD12 (D11.12) that is associated with PcG proteins, becomes nuclease hypersensitive, and then shows alteration in nuclease sensitivity as hESCs differentiate. The D11.12 element repressed luciferase expression from a reporter construct and full repression required a highly conserved region and YY1 binding sites. Furthermore, repression was dependent on the PcG proteins BMI1 and EED and a YY1-interacting partner, RYBP. We conclude that D11.12 is a Polycomb-dependent regulatory region with similarities to Drosophila PREs, indicating conservation in the mechanisms that target PcG function in mammals and flies.

%B Cell %V 140 %P 99-110 %8 2010 Jan 8 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/20085705?dopt=Abstract %R 10.1016/j.cell.2009.12.022 %0 Journal Article %J Epigenomics %D 2010 %T Analysis of primary structure of chromatin with next-generation sequencing. %A Tolstorukov, Michael Y %A Kharchenko, Peter V %A Park, Peter J %X

The recent development of next-generation sequencing technology has enabled significant progress in chromatin structure analysis. Here, we review the experimental and bioinformatic approaches to studying nucleosome positioning and histone modification profiles on a genome scale using this technology. These studies advanced our knowledge of the nucleosome positioning patterns of both epigenetically modified and bulk nucleosomes and elucidated the role of such patterns in regulation of gene expression. The identification and analysis of large sets of nucleosome-bound DNA sequences allowed better understanding of the rules that govern nucleosome positioning in organisms of various complexity. We also discuss the existing challenges and prospects of using next-generation sequencing for nucleosome positioning analysis and outline the importance of such studies for the entire chromatin structure field.

%B Epigenomics %V 2 %P 187-197 %8 2010 Apr %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/22022339?dopt=Abstract %R 10.2217/epi.09.48 %0 Journal Article %J Brief Funct Genomics %D 2010 %T Detecting structural variations in the human genome using next generation sequencing. %A Xi, Ruibin %A Kim, Tae-Min %A Park, Peter J %K Algorithms %K Chromosome Mapping %K Genetic Markers %K Genetic Variation %K Genome, Human %K Humans %K Sequence Analysis, DNA %X

Structural variations are widespread in the human genome and can serve as genetic markers in clinical and evolutionary studies. With the advances in the next-generation sequencing technology, recent methods allow for identification of structural variations with unprecedented resolution and accuracy. They also provide opportunities to discover variants that could not be detected on conventional microarray-based platforms, such as dosage-invariant chromosomal translocations and inversions. In this review, we will describe some of the sequencing-based algorithms for detection of structural variations and discuss the key issues in future development.

%B Brief Funct Genomics %V 9 %P 405-15 %8 2010 Dec %G eng %N 5-6 %1 http://www.ncbi.nlm.nih.gov/pubmed/21216738?dopt=Abstract %R 10.1093/bfgp/elq025 %0 Journal Article %J Genome Biol %D 2010 %T Estimating enrichment of repetitive elements from high-throughput sequence data. %A Day, Daniel S %A Luquette, Lovelace J %A Park, Peter J %A Kharchenko, Peter V %K Animals %K Base Sequence %K CD4-Positive T-Lymphocytes %K Cell Line %K Databases, Nucleic Acid %K Embryo, Mammalian %K Fibroblasts %K High-Throughput Screening Assays %K Histones %K Humans %K Mice %K Phylogeny %K Protein Processing, Post-Translational %K Repetitive Sequences, Nucleic Acid %K Sequence Analysis, DNA %X

We describe computational methods for analysis of repetitive elements from short-read sequencing data, and apply them to study histone modifications associated with the repetitive elements in human and mouse cells. Our results demonstrate that while accurate enrichment estimates can be obtained for individual repeat types and small sets of repeat instances, there are distinct combinatorial patterns of chromatin marks associated with major annotated repeat families, including H3K27me3/H3K9me3 differences among the endogenous retroviral element classes.

%B Genome Biol %V 11 %P R69 %8 2010 %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/20584328?dopt=Abstract %R 10.1186/gb-2010-11-6-r69 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2010 %T Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship. %A Kim, Hyunsoo %A Huang, Wei %A Jiang, Xiuli %A Pennicooke, Brenton %A Park, Peter J %A Johnson, Mark D %K Animals %K Base Sequence %K Brain Neoplasms %K Cell Line, Tumor %K Chromosomes, Human, Pair 12 %K Cyclin-Dependent Kinase 4 %K Databases, Nucleic Acid %K DNA Primers %K Gene Dosage %K Genomics %K Glioblastoma %K GTP-Binding Proteins %K GTPase-Activating Proteins %K Humans %K MAP Kinase Signaling System %K Mice %K Mice, Nude %K MicroRNAs %K Multigene Family %K Neoplasm Transplantation %K Oncogenes %K PTEN Phosphohydrolase %K Retinoblastoma Protein %K RNA Interference %K Signal Transduction %K Transplantation, Heterologous %X

Using a multidimensional genomic data set on glioblastoma from The Cancer Genome Atlas, we identified hsa-miR-26a as a cooperating component of a frequently occurring amplicon that also contains CDK4 and CENTG1, two oncogenes that regulate the RB1 and PI3 kinase/AKT pathways, respectively. By integrating DNA copy number, mRNA, microRNA, and DNA methylation data, we identified functionally relevant targets of miR-26a in glioblastoma, including PTEN, RB1, and MAP3K2/MEKK2. We demonstrate that miR-26a alone can transform cells and it promotes glioblastoma cell growth in vitro and in the mouse brain by decreasing PTEN, RB1, and MAP3K2/MEKK2 protein expression, thereby increasing AKT activation, promoting proliferation, and decreasing c-JUN N-terminal kinase-dependent apoptosis. Overexpression of miR-26a in PTEN-competent and PTEN-deficient glioblastoma cells promoted tumor growth in vivo, and it further increased growth in cells overexpressing CDK4 or CENTG1. Importantly, glioblastoma patients harboring this amplification displayed markedly decreased survival. Thus, hsa-miR-26a, CDK4, and CENTG1 comprise a functionally integrated oncomir/oncogene DNA cluster that promotes aggressiveness in human cancers by cooperatively targeting the RB1, PI3K/AKT, and JNK pathways.

%B Proc Natl Acad Sci U S A %V 107 %P 2183-8 %8 2010 Feb 2 %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/20080666?dopt=Abstract %R 10.1073/pnas.0909896107 %0 Journal Article %J Nature %D 2010 %T The Lkb1 metabolic sensor maintains haematopoietic stem cell survival. %A Gurumurthy, Sushma %A Xie, Stephanie Z %A Alagesan, Brinda %A Kim, Judith %A Yusuf, Rushdia Z %A Saez, Borja %A Tzatsos, Alexandros %A Ozsolak, Fatih %A Milos, Patrice %A Ferrari, Francesco %A Park, Peter J %A Shirihai, Orian S %A Scadden, David T %A Bardeesy, Nabeel %K Adenosine Triphosphate %K AMP-Activated Protein Kinases %K Animals %K Apoptosis %K Autophagy %K Bone Marrow %K Cell Cycle %K Cell Proliferation %K Cell Survival %K Energy Metabolism %K Enzyme Activation %K Female %K Hematopoiesis %K Hematopoietic Stem Cells %K Homeostasis %K Lipid Metabolism %K Male %K Membrane Potential, Mitochondrial %K Mice %K Mice, Inbred C57BL %K Mitochondria %K Multiprotein Complexes %K Protein-Serine-Threonine Kinases %K Proteins %K TOR Serine-Threonine Kinases %K Tumor Suppressor Proteins %X

Haematopoietic stem cells (HSCs) can convert between growth states that have marked differences in bioenergetic needs. Although often quiescent in adults, these cells become proliferative upon physiological demand. Balancing HSC energetics in response to nutrient availability and growth state is poorly understood, yet essential for the dynamism of the haematopoietic system. Here we show that the Lkb1 tumour suppressor is critical for the maintenance of energy homeostasis in haematopoietic cells. Lkb1 inactivation in adult mice causes loss of HSC quiescence followed by rapid depletion of all haematopoietic subpopulations. Lkb1-deficient bone marrow cells exhibit mitochondrial defects, alterations in lipid and nucleotide metabolism, and depletion of cellular ATP. The haematopoietic effects are largely independent of Lkb1 regulation of AMP-activated protein kinase (AMPK) and mammalian target of rapamycin (mTOR) signalling. Instead, these data define a central role for Lkb1 in restricting HSC entry into cell cycle and in broadly maintaining energy homeostasis in haematopoietic cells through a novel metabolic checkpoint.

%B Nature %V 468 %P 659-63 %8 2010 Dec 2 %G eng %N 7324 %1 http://www.ncbi.nlm.nih.gov/pubmed/21124451?dopt=Abstract %R 10.1038/nature09572 %0 Journal Article %J BMC Bioinformatics %D 2010 %T rSW-seq: algorithm for detection of copy number alterations in deep sequencing data. %A Kim, Tae-Min %A Luquette, Lovelace J %A Xi, Ruibin %A Park, Peter J %K Algorithms %K Base Sequence %K Computer Simulation %K DNA Copy Number Variations %K DNA, Neoplasm %K Genome, Human %K Humans %K Oligonucleotide Array Sequence Analysis %K Sequence Analysis, DNA %K Tumor Cells, Cultured %X

BACKGROUND: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy. RESULTS: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results. CONCLUSION: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.

%B BMC Bioinformatics %V 11 %P 432 %8 2010 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/20718989?dopt=Abstract %R 10.1186/1471-2105-11-432 %0 Journal Article %J Nat Struct Mol Biol %D 2009 %T Drosophila MSL complex globally acetylates H4K16 on the male X chromosome for dosage compensation. %A Gelbart, Marnie E %A Larschan, Erica %A Peng, Shouyong %A Park, Peter J %A Kuroda, Mitzi I %K Acetylation %K Animals %K Blotting, Western %K Cell Line %K Chromatin Immunoprecipitation %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Female %K Gene Expression Profiling %K Histone Acetyltransferases %K Histones %K Lysine %K Male %K Mutation %K RNA Interference %K X Chromosome %X

The Drosophila melanogaster male-specific lethal (MSL) complex binds the single male X chromosome to upregulate gene expression to equal that from the two female X chromosomes. However, it has been puzzling that approximately 25% of transcribed genes on the X chromosome do not stably recruit MSL complex. Here we find that almost all active genes on the X chromosome are associated with robust H4 Lys16 acetylation (H4K16ac), the histone modification catalyzed by the MSL complex. The distribution of H4K16ac is much broader than that of the MSL complex, and our results favor the idea that chromosome-wide H4K16ac reflects transient association of the MSL complex, occurring through spreading or chromosomal looping. Our results parallel those of localized Polycomb repressive complex and its more broadly distributed chromatin mark, trimethylated histone H3 Lys27 (H3K27me3), suggesting a common principle for the establishment of active and silenced chromatin domains.

%B Nat Struct Mol Biol %V 16 %P 825-32 %8 2009 Aug %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/19648925?dopt=Abstract %R 10.1038/nsmb.1644 %0 Journal Article %J Genes Chromosomes Cancer %D 2009 %T Identifying the molecular signature of the interstitial deletion 7q subgroup of uterine leiomyomata using a paired analysis. %A Hodge, Jennelle C %A Park, Peter J %A Dreyfuss, Jonathan M %A Assil-Kishawi, Iman %A Somasundaram, Priya %A Semere, Luwam G %A Quade, Bradley J %A Lynch, Allison M %A Stewart, Elizabeth A %A Morton, Cynthia C %K Adult %K Chromosome Aberrations %K Chromosome Deletion %K Chromosomes, Human, Pair 7 %K Comparative Genomic Hybridization %K DNA-Binding Proteins %K Female %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K In Situ Hybridization, Fluorescence %K Karyotyping %K Leiomyoma %K Middle Aged %K Mosaicism %K Polymerase Chain Reaction %K Reproducibility of Results %K Uterine Neoplasms %K Uterus %X

Uterine leiomyomata (UL), the most common neoplasm in reproductive-age women, have recurrent cytogenetic abnormalities including interstitial deletion of 7q. To develop a molecular signature, matched del(7q) and non-del(7q) tumors identified by FISH or karyotyping from 11 women were profiled with expression arrays. Our analysis using paired t tests demonstrates this matched design is critical to eliminate the confounding effects of genotype and environment that underlie patient variation. A gene list ordered by genome-wide significance showed enrichment for the 7q22 target region. Modification of the gene list by weighting each sample for percent of del(7q) cells to account for the mosaic nature of these tumors further enhanced the frequency of 7q22 genes. Pathway analysis revealed two of the 19 significant functional networks were associated with development and the most represented pathway was protein ubiquitination, which can influence tumor development by stabilizing oncoproteins and destabilizing tumor suppressor proteins. Array CGH (aCGH) studies determined the only consistent genomic imbalance was deletion of 9.5 megabases from 7q22-7q31.1. Combining the aCGH data with the del(7q) UL mosaicism-weighted expression analysis resulted in a list of genes that are commonly deleted and whose copy number is correlated with significantly decreased expression. These genes include the proliferation inhibitor HBP1, the loss of expression of which has been associated with invasive breast cancer, as well as the mitosis integrity-maintenance tumor suppressor RINT1. This study provides a molecular signature of the del(7q) UL subgroup and will serve as a platform for future studies of tumor pathogenesis.

%B Genes Chromosomes Cancer %V 48 %P 865-85 %8 2009 Oct %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/19603527?dopt=Abstract %R 10.1002/gcc.20692 %0 Journal Article %J Genes Dev %D 2009 %T Long-range spreading of dosage compensation in Drosophila captures transcribed autosomal genes inserted on X. %A Gorchakov, Andrey A %A Alekseyenko, Artyom A %A Kharchenko, Peter %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Gene Expression Regulation %K Male %K X Chromosome %X

Dosage compensation in Drosophila melanogaster males is achieved via targeting of male-specific lethal (MSL) complex to X-linked genes. This is proposed to involve sequence-specific recognition of the X at approximately 150-300 chromatin entry sites, and subsequent spreading to active genes. Here we ask whether the spreading step requires transcription and is sequence-independent. We find that MSL complex binds, acetylates, and up-regulates autosomal genes inserted on X, but only if transcriptionally active. We conclude that a long-sought specific DNA sequence within X-linked genes is not obligatory for MSL binding. Instead, linkage and transcription play the pivotal roles in MSL targeting irrespective of gene origin and DNA sequence.

%B Genes Dev %V 23 %P 2266-71 %8 2009 Oct 1 %G eng %N 19 %1 http://www.ncbi.nlm.nih.gov/pubmed/19797766?dopt=Abstract %R 10.1101/gad.1840409 %0 Journal Article %J Mol Cancer %D 2009 %T Meta-analysis of glioblastoma multiforme versus anaplastic astrocytoma identifies robust gene markers. %A Dreyfuss, Jonathan M %A Johnson, Mark D %A Park, Peter J %K Astrocytoma %K Brain Neoplasms %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Gene Regulatory Networks %K Glioblastoma %K Humans %K Hypoxia-Inducible Factor 1, alpha Subunit %K Tumor Markers, Biological %K Vascular Endothelial Growth Factor A %X

BACKGROUND: Anaplastic astrocytoma (AA) and its more aggressive counterpart, glioblastoma multiforme (GBM), are the most common intrinsic brain tumors in adults and are almost universally fatal. A deeper understanding of the molecular relationship of these tumor types is necessary to derive insights into the diagnosis, prognosis, and treatment of gliomas. Although genomewide profiling of expression levels with microarrays can be used to identify differentially expressed genes between these tumor types, comparative studies so far have resulted in gene lists that show little overlap. RESULTS: To achieve a more accurate and stable list of the differentially expressed genes and pathways between primary GBM and AA, we performed a meta-analysis using publicly available genome-scale mRNA data sets. There were four data sets with sufficiently large sample sizes of both GBMs and AAs, all of which coincidentally used human U133 platforms from Affymetrix, allowing for easier and more precise integration of data. After scoring genes and pathways within each data set, we combined the statistics across studies using the nonparametric rank sum method to identify the features that differentiate GBMs and AAs. We found >900 statistically significant probe sets after correction for multiple testing from the >22,000 tested. We also used the rank sum approach to select >20 significant Biocarta pathways after correction for multiple testing out of >175 pathways examined. The most significant pathway was the hypoxia-inducible factor (HIF) pathway. Our analysis suggests that many of the most statistically significant genes work together in a HIF1A/VEGF-regulated network to increase angiogenesis and invasion in GBM when compared to AA. CONCLUSION: We have performed a meta-analysis of genome-scale mRNA expression data for 289 human malignant gliomas and have identified a list of >900 probe sets and >20 pathways that are significantly different between GBM and AA. These feature lists could be utilized to aid in diagnosis, prognosis, and grade reduction of high-grade gliomas and to identify genes that were not previously suspected of playing an important role in glioma biology. More generally, this approach suggests that combined analysis of existing data sets can reveal new insights and that the large amount of publicly available cancer data sets should be further utilized in a similar manner.

%B Mol Cancer %V 8 %P 71 %8 2009 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/19732454?dopt=Abstract %R 10.1186/1476-4598-8-71 %0 Journal Article %J Comput Stat Data Anal %D 2009 %T A permutation test for determining significance of clusters with applications to spatial and gene expression data. %A Park, P J %A Manjourides, J %A Bonetti, M %A Pagano, M %X

Hierarchical clustering is a common procedure for identifying structure in a data set, and this is frequently used for organizing genomic data. Although more advanced clustering algorithms are available, the simplicity and visual appeal of hierarchical clustering have made it ubiquitous in gene expression data analysis. Hence, even minor improvements in this framework would have significant impact. There is currently no simple and systematic way of assessing and displaying the significance of various clusters in a resulting dendrogram without making certain distributional assumptions or ignoring gene-specific variances. In this work, we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division. Given these values, one can also modify the dendrogram by combining non-significant branches. By adjusting the cut-off level of significance for branches, one can produce dendrograms with a desired level of detail for ease of interpretation. We demonstrate the usefulness of this approach by applying it to illustrative data sets.

%B Comput Stat Data Anal %V 53 %P 4290-4300 %8 2009 Oct 1 %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/21258660?dopt=Abstract %0 Journal Article %J Blood %D 2009 %T Surface antigen phenotypes of hematopoietic stem cells from embryos and murine embryonic stem cells. %A McKinney-Freeman, Shannon L %A Naveiras, Olaia %A Yates, Frank %A Loewer, Sabine %A Philitas, Marsha %A Curran, Matthew %A Park, Peter J %A Daley, George Q %K Animals %K Antigens, CD %K Cells, Cultured %K Embryo, Mammalian %K Embryonic Stem Cells %K Female %K Hematopoietic Stem Cells %K Mice %K Mice, Inbred C57BL %K Phenotype %K Placenta %X

Surface antigens on hematopoietic stem cells (HSCs) enable prospective isolation and characterization. Here, we compare the cell-surface phenotype of hematopoietic repopulating cells from murine yolk sac, aorta-gonad-mesonephros, placenta, fetal liver, and bone marrow with that of HSCs derived from the in vitro differentiation of murine embryonic stem cells (ESC-HSCs). Whereas c-Kit marks all HSC populations, CD41, CD45, CD34, and CD150 were developmentally regulated: the earliest embryonic HSCs express CD41 and CD34 and lack CD45 and CD150, whereas more mature HSCs lack CD41 and CD34 and express CD45 and CD150. ESC-HSCs express CD41 and CD150, lack CD34, and are heterogeneous for CD45. Finally, although CD48 was absent from all in vivo HSCs examined, ESC-HSCs were heterogeneous for the expression of this molecule. This unique phenotype signifies a developmentally immature population of cells with features of both primitive and mature HSC. The prospective fractionation of ESC-HSCs will facilitate studies of HSC maturation essential for normal functional engraftment in irradiated adults.

%B Blood %V 114 %P 268-78 %8 2009 Jul 9 %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/19420357?dopt=Abstract %R 10.1182/blood-2008-12-193888 %0 Journal Article %J J Clin Endocrinol Metab %D 2009 %T Thyroid hormone-related regulation of gene expression in human fatty liver. %A Pihlajamäki, Jussi %A Boes, Tanner %A Kim, Eun-Young %A Dearie, Farrell %A Kim, Brian W %A Schroeder, Joshua %A Mun, Edward %A Nasser, Imad %A Park, Peter J %A Bianco, Antonio C %A Goldfine, Allison B %A Patti, Mary Elizabeth %K Adult %K Animals %K Diabetes Mellitus, Type 2 %K Fatty Liver %K Female %K Gene Expression Regulation %K Heat-Shock Proteins %K Humans %K Insulin Resistance %K Iodide Peroxidase %K Liver %K Male %K Mice %K Mice, Inbred C57BL %K Middle Aged %K Receptors, Leptin %K Transcription Factors %K Triiodothyronine %X

CONTEXT: Fatty liver is an important complication of obesity; however, regulatory mechanisms mediating altered gene expression patterns have not been identified. OBJECTIVE: The aim of the study was to identify novel transcriptional changes in human liver that could contribute to hepatic lipid accumulation and associated insulin resistance, type 2 diabetes, and nonalcoholic steatohepatitis. DESIGN: We evaluated gene expression in surgical liver biopsies from 13 obese (nine with type 2 diabetes) and five control subjects using Affymetrix U133A microarrays. PCR validation was performed in liver biopsies using an additional 16 subjects. We also tested thyroid hormone responses in mice fed chow or high-fat diet. SETTING: Recruitment was performed in an academic medical center. PARTICIPANTS: Individuals undergoing elective surgery for obesity or gallstones participated in the study. RESULTS: The top-ranking gene set, down-regulated in obese subjects, was comprised of genes previously demonstrated to be positively regulated by T(3) in human skeletal muscle (n = 399; P < 0.001; false discovery rate = 0.07). This gene set included genes related to RNA metabolism (SNRPE, HNRPH3, TIA1, and SFRS2), protein catabolism (PSMA1, PSMD12, USP9X, UBE2B, USP16, and PCMT1), and energy metabolism (ATP5C1, COX7C, UQCRB). We verified thyroid hormone regulation of these genes in the liver after injection of C57BL/6J mice with T(3) (100 microg/100 g body weight); furthermore, T(3)-induced increases in expression of these genes were abolished by high-fat diet. In agreement, expression of these genes inversely correlated with liver fat content in humans. CONCLUSIONS: These data suggest that impaired thyroid hormone action may contribute to altered patterns of gene expression in fatty liver.

%B J Clin Endocrinol Metab %V 94 %P 3521-9 %8 2009 Sep %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/19549744?dopt=Abstract %R 10.1210/jc.2009-0212 %0 Journal Article %J Development %D 2009 %T Apc inhibition of Wnt signaling regulates supernumerary tooth formation during embryogenesis and throughout adulthood. %A Wang, Xiu-Ping %A O'Connell, Daniel J %A Lund, Jennifer J %A Saadi, Irfan %A Kuraguchi, Mari %A Turbe-Doan, Annick %A Cavallesco, Resy %A Kim, Hyunsoo %A Park, Peter J %A Harada, Hidemitsu %A Kucherlapati, Raju %A Maas, Richard L %K Adenomatous Polyposis Coli Protein %K Animals %K beta Catenin %K Cells, Cultured %K Embryonic Development %K Fibroblast Growth Factor 8 %K Mice %K Mice, Transgenic %K MSX1 Transcription Factor %K Signal Transduction %K Tooth, Supernumerary %K Wnt Proteins %X

The ablation of Apc function or the constitutive activation of beta-catenin in embryonic mouse oral epithelium results in supernumerary tooth formation, but the underlying mechanisms and whether adult tissues retain this potential are unknown. Here we show that supernumerary teeth can form from multiple regions of the jaw and that they are properly mineralized, vascularized, innervated and can start to form roots. Even adult dental tissues can form new teeth in response to either epithelial Apc loss-of-function or beta-catenin activation, and the effect of Apc deficiency is mediated by beta-catenin. The formation of supernumerary teeth via Apc loss-of-function is non-cell-autonomous. A small number of Apc-deficient cells is sufficient to induce surrounding wild-type epithelial and mesenchymal cells to participate in the formation of new teeth. Strikingly, Msx1, which is necessary for endogenous tooth development, is dispensable for supernumerary tooth formation. In addition, we identify Fgf8, a known tooth initiation marker, as a direct target of Wnt/beta-catenin signaling. These studies identify key mechanistic features responsible for supernumerary tooth formation.

%B Development %V 136 %P 1939-49 %8 2009 Jun %G eng %N 11 %1 http://www.ncbi.nlm.nih.gov/pubmed/19429790?dopt=Abstract %R 10.1242/dev.033803 %0 Journal Article %J Nat Rev Genet %D 2009 %T ChIP-seq: advantages and challenges of a maturing technology. %A Park, Peter J %K Animals %K Chromatin Immunoprecipitation %K Computational Biology %K DNA-Binding Proteins %K Epigenesis, Genetic %K Humans %K Nucleosomes %K Sequence Analysis, DNA %X

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a technique for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes. Owing to the tremendous progress in next-generation sequencing technology, ChIP-seq offers higher resolution, less noise and greater coverage than its array-based predecessor ChIP-chip. With the decreasing cost of sequencing, ChIP-seq has become an indispensable tool for studying gene regulation and epigenetic mechanisms. In this Review, I describe the benefits and challenges in harnessing this technique with an emphasis on issues related to experimental design and data analysis. ChIP-seq experiments generate large quantities of data, and effective computational analysis will be crucial for uncovering biological mechanisms.

%B Nat Rev Genet %V 10 %P 669-80 %8 2009 Oct %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/19736561?dopt=Abstract %R 10.1038/nrg2641 %0 Journal Article %J Genome Res %D 2009 %T Comparative analysis of H2A.Z nucleosome organization in the human and yeast genomes. %A Tolstorukov, Michael Y* %A Kharchenko, Peter V* %A Goldman, Joseph A %A Kingston, Robert E %A Park, Peter J %K Base Composition %K Chromatin Immunoprecipitation %K Genome, Fungal %K Genome, Human %K HeLa Cells %K Histones %K Humans %K Models, Molecular %K Nucleic Acid Conformation %K Nucleosomes %K Protein Conformation %K Saccharomyces cerevisiae Proteins %K Sequence Analysis, DNA %X

Eukaryotic DNA is wrapped around a histone protein core to constitute the fundamental repeating units of chromatin, the nucleosomes. The affinity of the histone core for DNA depends on the nucleotide sequence; however, it is unclear to what extent DNA sequence determines nucleosome positioning in vivo, and if the same rules of sequence-directed positioning apply to genomes of varying complexity. Using the data generated by high-throughput DNA sequencing combined with chromatin immunoprecipitation, we have identified positions of nucleosomes containing the H2A.Z histone variant and histone H3 trimethylated at lysine 4 in human CD4(+) T-cells. We find that the 10-bp periodicity observed in nucleosomal sequences in yeast and other organisms is not pronounced in human nucleosomal sequences. This result was confirmed for a broader set of mononucleosomal fragments that were not selected for any specific histone variant or modification. We also find that human H2A.Z nucleosomes protect only approximately 120 bp of DNA from MNase digestion and exhibit specific sequence preferences, suggesting a novel mechanism of nucleosome organization for the H2A.Z variant.

%B Genome Res %V 19 %P 967-77 %8 2009 Jun %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/19246569?dopt=Abstract %R 10.1101/gr.084830.108 %0 Journal Article %J Int J Radiat Oncol Biol Phys %D 2009 %T Efficacy of sunitinib and radiotherapy in genetically engineered mouse model of soft-tissue sarcoma. %A Yoon, Sam S %A Stangenberg, Lars %A Lee, Yoon-Jin %A Rothrock, Courtney %A Dreyfuss, Jonathan M %A Baek, Kwan-Hyuck %A Waterman, Peter R %A Nielsen, G Petur %A Weissleder, Ralph %A Mahmood, Umar %A Park, Peter J %A Jacks, Tyler %A Dodd, Rebecca D %A Fisher, Carolyn J %A Ryeom, Sandra %A Kirsch, David G %K Angiogenesis Inhibitors %K Animals %K Antineoplastic Agents %K Combined Modality Therapy %K Drug Screening Assays, Antitumor %K Indoles %K Mice %K Mice, Transgenic %K Pyrroles %K Random Allocation %K Receptor, Platelet-Derived Growth Factor beta %K Sarcoma %K Vascular Endothelial Growth Factor Receptor-2 %X

PURPOSE: Sunitinib (SU) is a multitargeted receptor tyrosine kinase inhibitor of the vascular endothelial growth factor and platelet-derived growth factor receptors. The present study examined SU and radiotherapy (RT) in a genetically engineered mouse model of soft tissue sarcoma (STS). METHODS AND MATERIALS: Primary extremity STSs were generated in genetically engineered mice. The mice were randomized to treatment with SU, RT (10 Gy x 2), or both (SU+RT). Changes in the tumor vasculature before and after treatment were assessed in vivo using fluorescence-mediated tomography. The control and treated tumors were harvested and extensively analyzed. RESULTS: The mean fluorescence in the tumors was not decreased by RT but decreased 38-44% in tumors treated with SU or SU+RT. The control tumors grew to a mean of 1378 mm(3) after 12 days. SU alone or RT alone delayed tumor growth by 56% and 41%, respectively, but maximal growth inhibition (71%) was observed with the combination therapy. SU target effects were confirmed by loss of target receptor phosphorylation and alterations in SU-related gene expression. Cancer cell proliferation was decreased and apoptosis increased in the SU and RT groups, with a synergistic effect on apoptosis observed in the SU+RT group. RT had a minimal effect on the tumor microvessel density and endothelial cell-specific apoptosis, but SU alone or SU+RT decreased the microvessel density by >66% and induced significant endothelial cell apoptosis. CONCLUSION: SU inhibited STS growth by effects on both cancer cells and tumor vasculature. SU also augmented the efficacy of RT, suggesting that this combination strategy could improve local control of STS.

%B Int J Radiat Oncol Biol Phys %V 74 %P 1207-16 %8 2009 Jul 15 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/19545786?dopt=Abstract %R 10.1016/j.ijrobp.2009.02.052 %0 Journal Article %J Bioinformatics %D 2009 %T Integration of heterogeneous expression data sets extends the role of the retinol pathway in diabetes and insulin resistance. %A Park, Peter J %A Kong, Sek Won %A Tebaldi, Toma %A Lai, Weil R %A Kasif, Simon %A Kohane, Isaac S %K Computational Biology %K Databases, Genetic %K Diabetes Mellitus, Type 2 %K Gene Expression Profiling %K Humans %K Insulin Resistance %K Vitamin A %X

MOTIVATION: Type 2 diabetes is a chronic metabolic disease that involves both environmental and genetic factors. To understand the genetics of type 2 diabetes and insulin resistance, the Diabetes Genome Anatomy Project (DGAP) was launched to profile gene expression in a variety of related animal models and human subjects. We asked whether these heterogeneous models can be integrated to provide consistent and robust biological insights into the biology of insulin resistance. RESULTS: We perform integrative analysis of the 16 DGAP data sets that span multiple tissues, conditions, array types, laboratories, species, genetic backgrounds and study designs. For each data set, we identify differentially expressed genes compared with control. Then, for the combined data, we rank genes according to the frequency with which they were found to be statistically significant across data sets. This analysis reveals RetSat as a widely shared component of mechanisms involved in insulin resistance and sensitivity and adds to the growing importance of the retinol pathway in diabetes, adipogenesis and insulin resistance. Top candidates obtained from our analysis have been confirmed in recent laboratory studies.

%B Bioinformatics %V 25 %P 3121-7 %8 2009 Dec 1 %G eng %N 23 %1 http://www.ncbi.nlm.nih.gov/pubmed/19786482?dopt=Abstract %R 10.1093/bioinformatics/btp559 %0 Journal Article %J Neurobiol Aging %D 2008 %T Aging elevates metabolic gene expression in brain cholinergic neurons. %A Baskerville, Karen A %A Kent, Caroline %A Personett, David %A Lai, Weil R %A Park, Peter J %A Coleman, Paul %A McKinney, Michael %K Acetylcholine %K Aging %K Animals %K Brain %K Gene Expression Regulation %K Nerve Tissue Proteins %K Neurons %K Prosencephalon %K Rats %K Rats, Inbred F344 %K Up-Regulation %X The basal forebrain (BF) cholinergic system is selectively vulnerable in human brain diseases, while the cholinergic groups in the upper pons of the brainstem (BS) resist neurodegeneration. Cholinergic neurons (200 per region per animal) were laser-microdissected from five young (8 months) and five aged (24 months) F344 rats from the BF and the BS pontine lateral dorsal tegmental/pedunculopontine nuclei (LDTN/PPN) and their expression profiles were obtained. The bioinformatics program SigPathway was used to identify gene groups and pathways that were selectively affected by aging. In the BF cholinergic system, aging most significantly altered genes involved with a variety of metabolic functions. In contrast, BS cholinergic neuronal age effects included gene groupings related to neuronal plasticity and a broad range of normal cellular functions. Transcription factor GA-binding protein alpha (GABPalpha), which controls expression of nuclear genes encoding mitochondrial proteins, was more strongly upregulated in the BF cholinergic neurons (+107%) than in the BS cholinergic population (+40%). The results suggest that aging elicits elevated metabolic activity in cholinergic populations and that this occurs to a much greater degree in the BF group than in the BS group.
%B Neurobiol Aging %V 29 %P 1874-93 %8 2008 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/17560690?dopt=Abstract %R 10.1016/j.neurobiolaging.2007.04.024 %0 Journal Article %J Nature %D 2008 %T Comprehensive genomic characterization defines human glioblastoma genes and core pathways. %A Cancer Genome Atlas Research Network %K Adolescent %K Adult %K Aged %K Aged, 80 and over %K Brain Neoplasms %K DNA Methylation %K DNA Modification Methylases %K DNA Repair %K DNA Repair Enzymes %K Female %K Gene Dosage %K Gene Expression Regulation, Neoplastic %K Genes, erbB-1 %K Genes, Tumor Suppressor %K Genome, Human %K Genomics %K Glioblastoma %K Humans %K Male %K Middle Aged %K Models, Molecular %K Mutation %K Neurofibromin 1 %K Phosphatidylinositol 3-Kinases %K Protein Structure, Tertiary %K Retrospective Studies %K Signal Transduction %K Tumor Suppressor Proteins %X

Human cancer cells typically harbour multiple chromosomal aberrations, nucleotide substitutions and epigenetic modifications that drive malignant transformation. The Cancer Genome Atlas (TCGA) pilot project aims to assess the value of large-scale multi-dimensional analysis of these molecular characteristics in human cancer and to provide the data rapidly to the research community. Here we report the interim integrative analysis of DNA copy number, gene expression and DNA methylation aberrations in 206 glioblastomas--the most common type of adult brain cancer--and nucleotide sequence aberrations in 91 of the 206 glioblastomas. This analysis provides new insights into the roles of ERBB2, NF1 and TP53, uncovers frequent mutations of the phosphatidylinositol-3-OH kinase regulatory subunit gene PIK3R1, and provides a network view of the pathways altered in the development of glioblastoma. Furthermore, integration of mutation, DNA methylation and clinical treatment data reveals a link between MGMT promoter methylation and a hypermutator phenotype consequent to mismatch repair deficiency in treated glioblastomas, an observation with potential clinical implications. Together, these findings establish the feasibility and power of TCGA, demonstrating that it can rapidly expand knowledge of the molecular basis of cancer.

%B Nature %V 455 %P 1061-8 %8 2008 Oct 23 %G eng %N 7216 %1 http://www.ncbi.nlm.nih.gov/pubmed/18772890?dopt=Abstract %R 10.1038/nature07385 %0 Journal Article %J Nat Genet %D 2008 %T The mouse X chromosome is enriched for multicopy testis genes showing postmeiotic expression. %A Mueller, Jacob L %A Mahadevaiah, Shantha K %A Park, Peter J %A Warburton, Peter E %A Page, David C %A Turner, James M A %K Animals %K DNA Probes %K Gene Dosage %K Gene Expression Regulation, Developmental %K Genes, X-Linked %K In Situ Hybridization, Fluorescence %K Male %K Meiosis %K Mice %K Oligonucleotide Array Sequence Analysis %K Reverse Transcriptase Polymerase Chain Reaction %K RNA Probes %K RNA, Messenger %K Spermatogenesis %K Spermatozoa %K Testis %K X Chromosome %X

According to the prevailing view, mammalian X chromosomes are enriched in spermatogenesis genes expressed before meiosis and deficient in spermatogenesis genes expressed after meiosis. The paucity of postmeiotic genes on the X chromosome has been interpreted as a consequence of meiotic sex chromosome inactivation (MSCI)--the complete silencing of genes on the XY bivalent at meiotic prophase. Recent studies have concluded that MSCI-initiated silencing persists beyond meiosis and that most genes on the X chromosome remain repressed in round spermatids. Here, we report that 33 multicopy gene families, representing approximately 273 mouse X-linked genes, are expressed in the testis and that this expression is predominantly in postmeiotic cells. RNA FISH and microarray analysis show that the maintenance of X chromosome postmeiotic repression is incomplete. Furthermore, X-linked multicopy genes exhibit a similar degree of expression as autosomal genes. Thus, not only is the mouse X chromosome enriched for spermatogenesis genes functioning before meiosis, but in addition, approximately 18% of mouse X-linked genes are expressed in postmeiotic cells.

%B Nat Genet %V 40 %P 794-9 %8 2008 Jun %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/18454149?dopt=Abstract %R 10.1038/ng.126 %0 Journal Article %J Blood %D 2008 %T Pathway analysis of primary central nervous system lymphoma. %A Tun, Han W %A Personett, David %A Baskerville, Karen A %A Menke, David M %A Jaeckle, Kurt A %A Kreinest, Pamela %A Edenfield, Brandy %A Zubair, Abba C %A O'Neill, Brian P %A Lai, Weil R %A Park, Peter J %A McKinney, Michael %K Central Nervous System Neoplasms %K Computational Biology %K Gene Expression Regulation, Neoplastic %K Genome, Human %K Humans %K Immunohistochemistry %K Lymphoma, Large B-Cell, Diffuse %K Oligonucleotide Array Sequence Analysis %K Software %X

Primary central nervous system (CNS) lymphoma (PCNSL) is a diffuse large B-cell lymphoma (DLBCL) confined to the CNS. A genome-wide gene expression comparison between PCNSL and non-CNS DLBCL was performed, the latter consisting of both nodal and extranodal DLBCL (nDLBCL and enDLBCL), to identify a "CNS signature." Pathway analysis with the program SigPathway revealed that PCNSL is characterized notably by significant differential expression of multiple extracellular matrix (ECM) and adhesion-related pathways. The most significantly up-regulated gene is the ECM-related osteopontin (SPP1). Expression at the protein level of ECM-related SPP1 and CHI3L1 in PCNSL cells was demonstrated by immunohistochemistry. The alterations in gene expression can be interpreted within several biologic contexts with implications for PCNSL, including CNS tropism (ECM and adhesion-related pathways, SPP1, DDR1), B-cell migration (CXCL13, SPP1), activated B-cell subtype (MUM1), lymphoproliferation (SPP1, TCL1A, CHI3L1), aggressive clinical behavior (SPP1, CHI3L1, MUM1), and aggressive metastatic cancer phenotype (SPP1, CHI3L1). The gene expression signature discovered in our study may represent a true "CNS signature" because we contrasted PCNSL with wide-spectrum non-CNS DLBCL on a genomic scale and performed an in-depth bioinformatic analysis.

%B Blood %V 111 %P 3200-10 %8 2008 Mar 15 %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/18184868?dopt=Abstract %R 10.1182/blood-2007-10-119099 %0 Journal Article %J Cancer Res %D 2008 %T Specific genes expressed in association with progesterone receptors in meningioma. %A Claus, Elizabeth B %A Park, Peter J %A Carroll, Rona %A Chan, Jennifer %A Black, Peter M %K Chromosomes, Human, Pair 22 %K Female %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Male %K Meningeal Neoplasms %K Meningioma %K Middle Aged %K Neoplasms, Hormone-Dependent %K Receptors, Estrogen %K Receptors, Progesterone %X

An association between hormones and meningioma has been postulated. No data exist that examine gene expression in meningioma by hormone receptor status. The data are surgical specimens from 31 meningioma patients undergoing neurosurgical resection at Brigham and Women's Hospital from March 15, 2004 to May 10, 2005. Progesterone and estrogen hormone receptors (PR and ER, respectively) were measured via immunohistochemistry and compared with gene expression profiling results. The sample is 77% female with a mean age of 55.7 years. Eighty percent were grade 1 and the mean MIB was 6.2, whereas 33% and 84% were ER+ and PR+, respectively. Gene expression seemed more strongly associated with PR status than with ER status. Genes on the long arm of chromosome 22 and near the neurofibromatosis type 2 (NF2) gene (22q12) were most frequently noted to have expression variation, with significant up-regulation in PR+ versus PR- lesions, suggesting a higher rate of 22q loss in PR- lesions. Pathway analyses indicated that genes in collagen and extracellular matrix pathways were most likely to be differentially expressed by PR status. These data, although preliminary, are the first to examine gene expression for meningioma cases by hormone receptor status and indicate a stronger association with PR than with ER status. PR status is related to the expression of genes near the NF2 gene, mutations in which have been identified as the initial event in many meningiomas. These findings suggest that PR status may be a clinical marker for genetic subgroups of meningioma and warrant further examination in a larger data set.

%B Cancer Res %V 68 %P 314-22 %8 2008 Jan 1 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/18172325?dopt=Abstract %R 10.1158/0008-5472.CAN-07-1796 %0 Journal Article %J Bioinformatics %D 2008 %T CGHweb: a tool for comparing DNA copy number segmentations from multiple algorithms. %A Lai, Weil %A Choudhary, Vidhu %A Park, Peter J %K Algorithms %K Base Sequence %K Chromosome Mapping %K Computer graphics %K Gene Dosage %K In Situ Hybridization, Fluorescence %K Internet %K Molecular Sequence Data %K Sequence Alignment %K Sequence Analysis, DNA %K Software %K User-Computer Interface %X

Accurate estimation of DNA copy numbers from array comparative genomic hybridization (CGH) data is important for characterizing the cancer genome. An important part of this process is the segmentation of the log-ratios between the sample and control DNA along the chromosome into regions of different copy numbers. However, multiple algorithms are available in the literature for this procedure and the results can vary substantially among these. Thus, a visualization tool that can display the segmented profiles from a number of methods can be helpful to the biologist or the clinician to ascertain that a feature of interest did not arise as an artifact of the algorithm. Such a tool also allows the methodologist to easily contrast his method against others. We developed a web-based tool that applies a number of popular algorithms to a single array CGH profile entered by the user. It generates a heatmap panel of the segmented profiles for each method as well as a consensus profile. The clickable heatmap can be moved along the chromosome and zoomed in or out. It also displays the time that each algorithm took and provides numerical values of the segmented profiles for download. The web interface calls algorithms written in the statistical language R. We encourage developers of new algorithms to submit their routines to be incorporated into the website. AVAILABILITY: http://compbio.med.harvard.edu/CGHweb.

%B Bioinformatics %V 24 %P 1014-5 %8 2008 Apr 1 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/18296463?dopt=Abstract %R 10.1093/bioinformatics/btn067 %0 Journal Article %J Nat Biotechnol %D 2008 %T Design and analysis of ChIP-seq experiments for DNA-binding proteins. %A Kharchenko, Peter V %A Tolstorukov, Michael Y %A Park, Peter J %K Algorithms %K Binding Sites %K Chromatin Immunoprecipitation %K Computational Biology %K DNA-Binding Proteins %K Protein Binding %K Research Design %K Sequence Analysis, DNA %K Software %K Transcription Factors %X

Recent progress in massively parallel sequencing platforms has enabled genome-wide characterization of DNA-associated proteins using the combination of chromatin immunoprecipitation and sequencing (ChIP-seq). Although a variety of methods exist for analysis of the established alternative ChIP microarray (ChIP-chip), few approaches have been described for processing ChIP-seq data. To fill this gap, we propose an analysis pipeline specifically designed to detect protein-binding positions with high accuracy. Using previously reported data sets for three transcription factors, we illustrate methods for improving tag alignment and correcting for background signals. We compare the sensitivity and spatial precision of three peak detection algorithms with published methods, demonstrating gains in spatial precision when an asymmetric distribution of tags on positive and negative strands is considered. We also analyze the relationship between the depth of sequencing and characteristics of the detected binding positions, and provide a method for estimating the sequencing depth necessary for a desired coverage of protein binding sites.

%B Nat Biotechnol %V 26 %P 1351-9 %8 2008 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/19029915?dopt=Abstract %R 10.1038/nbt.1508 %0 Journal Article %J Dev Cell %D 2008 %T Differential H3K4 methylation identifies developmentally poised hematopoietic genes. %A Orford, Keith* %A Kharchenko, Peter* %A Lai, Weil %A Dao, Maria Carlota %A Worhunsky, David J %A Ferro, Adam %A Janzen, Viktor %A Park, Peter J** %A Scadden, David T** %K Animals %K Binding Sites %K Bone Marrow Cells %K Cell Differentiation %K Cell Line %K Cell Lineage %K CpG Islands %K Embryonic Stem Cells %K Gene Expression Regulation, Developmental %K Genes, Developmental %K Genome %K Hematopoietic System %K Histones %K Humans %K Lysine %K Methylation %K Mice %K Models, Genetic %K Promoter Regions, Genetic %K Transcription Factors %K Transcription Initiation Site %K Transcription, Genetic %X

Throughout development, cell fate decisions are converted into epigenetic information that determines cellular identity. Covalent histone modifications are heritable epigenetic marks and are hypothesized to play a central role in this process. In this report, we assess the concordance of histone H3 lysine 4 dimethylation (H3K4me2) and trimethylation (H3K4me3) on a genome-wide scale in erythroid development by analyzing pluripotent, multipotent, and unipotent cell types. Although H3K4me2 and H3K4me3 are concordant at most genes, multipotential hematopoietic cells have a subset of genes that are differentially methylated (H3K4me2+/me3-). These genes are transcriptionally silent, highly enriched in lineage-specific hematopoietic genes, and uniquely susceptible to differentiation-induced H3K4 demethylation. Self-renewing embryonic stem cells, which restrict H3K4 methylation to genes that contain CpG islands (CGIs), lack H3K4me2+/me3- genes. These data reveal distinct epigenetic regulation of CGI and non-CGI genes during development and indicate an interactive relationship between DNA sequence and differential H3K4 methylation in lineage-specific differentiation.

%B Dev Cell %V 14 %P 798-809 %8 2008 May %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/18477461?dopt=Abstract %R 10.1016/j.devcel.2008.04.002 %0 Journal Article %J Epigenetics %D 2008 %T Epigenetics meets next-generation sequencing. %A Park, Peter J %K Animals %K Chromatin Immunoprecipitation %K Cooperative Behavior %K Epigenesis, Genetic %K Humans %K Nucleosomes %K Sequence Analysis, DNA %X

Next-generation sequencing is poised to unleash dramatic changes in every area of molecular biology. In the past few years, chromatin immunoprecipitation (ChIP) on tiled microarrays (ChIP-chip) has been an important tool for genome-wide mapping of DNA-binding proteins or histone modifications. Now, ChIP followed by direct sequencing of DNA fragments (ChIP-seq) offers superior data with less noise and higher resolution and is likely to replace ChIP-chip in the near future. We will describe advantages of this new technology and outline some of the issues in dealing with the data. ChIP-seq generates considerably larger quantities of data and the most challenging aspect for investigators will be computational and statistical analysis necessary to uncover biological insights hidden in the data.

%B Epigenetics %V 3 %P 318-21 %8 2008 Nov %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/19098449?dopt=Abstract %0 Journal Article %J Cancer Invest %D 2008 %T Experimental design and data analysis for array comparative genomic hybridization. %A Park, Peter J %K Chromosome Aberrations %K Comparative Genomic Hybridization %K Humans %K Neoplasms %K Oligonucleotide Array Sequence Analysis %K Research Design %K Statistics as Topic %X

Array comparative genomic hybridization (aCGH) is a technique for measuring chromosomal aberrations in genomic DNA. With the availability of high-resolution microarrays, detailed characterization of the cancer genome has become possible. In this review, we discuss several issues in the generation and interpretation of aCGH data, including array platforms, experimental design, and data analysis. Due to the complexity of the data, application of appropriate statistical methods is crucial for avoiding false positive findings. We also describe integration of copy number data with other types of data to identify functional significance of observed aberrations.

%B Cancer Invest %V 26 %P 923-8 %8 2008 Nov %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/19034774?dopt=Abstract %R 10.1080/07357900801993432 %0 Journal Article %J Bioinformatics %D 2008 %T Integrative analysis reveals the direct and indirect interactions between DNA copy number aberrations and gene expression changes. %A Lee, Hyunju %A Kong, Sek Won %A Park, Peter J %K Computer Simulation %K DNA Mutational Analysis %K DNA, Neoplasm %K Gene Dosage %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Models, Genetic %K Neoplasm Proteins %K Neoplasms %K Systems Integration %X

MOTIVATION: DNA copy number aberrations (CNAs) and gene expression (GE) changes provide valuable information for studying chromosomal instability and its consequences in cancer. While it is clear that the structural aberrations and the transcript levels are intertwined, their relationship is more complex and subtle than initially suspected. Most studies so far have focused on how a CNA affects the expression levels of those genes contained within that CNA. RESULTS: To better understand the impact of CNAs on expression, we investigated the correlation of each CNA to all other genes in the genome. The correlations are computed over multiple patients that have both expression and copy number measurements in brain, bladder and breast cancer data sets. We find that a CNA has a direct impact on the gene amplified or deleted, but it also has a broad, indirect impact elsewhere. To identify a set of CNAs that is coordinately associated with the expression changes of a set of genes, we used a biclustering algorithm on the correlation matrix. For each of the three cancer types examined, the aberrations in several loci are associated with cancer-type specific biological pathways that have been described in the literature: CNAs of chromosome (chr) 7p13 were significantly correlated with epidermal growth factor receptor signaling pathway in glioblastoma multiforme, chr 13q with NF-kappaB cascades in bladder cancer, and chr 11p with Reck pathway in breast cancer. In all three data sets, gene sets related to cell cycle/division such as M phase, DNA replication and cell division were also associated with CNAs. Our results suggest that CNAs are both directly and indirectly correlated with changes in expression and that it is beneficial to examine the indirect effects of CNAs. AVAILABILITY: The code is available upon request.

%B Bioinformatics %V 24 %P 889-96 %8 2008 Apr 1 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/18263644?dopt=Abstract %R 10.1093/bioinformatics/btn034 %0 Journal Article %J Nat Struct Mol Biol %D 2008 %T The MSL3 chromodomain directs a key targeting step for dosage compensation of the Drosophila melanogaster X chromosome. %A Sural, Tuba H %A Peng, Shouyong %A Li, Bing %A Workman, Jerry L %A Park, Peter J %A Kuroda, Mitzi I %K Amino Acid Substitution %K Animals %K Animals, Genetically Modified %K Drosophila melanogaster %K Drosophila Proteins %K Electrophoretic Mobility Shift Assay %K Female %K Histones %K Male %K Microarray Analysis %K Models, Biological %K Mutagenesis, Site-Directed %K Mutant Proteins %K Mutation, Missense %K Nuclear Proteins %K Protein Binding %K Sequence Deletion %K Transcription Factors %K X Chromosome %X

The male-specific lethal (MSL) complex upregulates the single male X chromosome to achieve dosage compensation in Drosophila melanogaster. We have proposed that MSL recognition of specific entry sites on the X is followed by local targeting of active genes marked by histone H3 lysine 36 trimethylation (H3K36me3). Here we analyze the role of the MSL3 chromodomain in the second targeting step. Using ChIP-chip analysis, we find that MSL3 chromodomain mutants retain binding to chromatin entry sites but show a clear disruption in the full pattern of MSL targeting in vivo, consistent with a loss of spreading. Furthermore, when compared to wild type, chromodomain mutants lack preferential affinity for nucleosomes containing H3K36me3 in vitro. Our results support a model in which activating complexes, similarly to their silencing counterparts, use the nucleosomal binding specificity of their respective chromodomains to spread from initiation sites to flanking chromatin.

%B Nat Struct Mol Biol %V 15 %P 1318-25 %8 2008 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/19029895?dopt=Abstract %R 10.1038/nsmb.1520 %0 Journal Article %J Genome Res %D 2008 %T Nucleosome positioning in human HOX gene clusters. %A Kharchenko, Peter V* %A Woo, Caroline J* %A Tolstorukov, Michael Y %A Kingston, Robert E** %A Park, Peter J** %K Chromatin %K Genes, Homeobox %K HeLa Cells %K Homeodomain Proteins %K Humans %K K562 Cells %K Nucleosomes %X

The distribution of nucleosomes along the genome is a significant aspect of chromatin structure and is thought to influence gene regulation through modulation of DNA accessibility. However, properties of nucleosome organization remain poorly understood, particularly in mammalian genomes. Toward this goal we used tiled microarrays to identify stable nucleosome positions along the HOX gene clusters in human cell lines. We show that nucleosome positions exhibit sequence properties and long-range organization that are different from those characterized in other organisms. Despite overall variability of internucleosome distances, specific loci contain regular nucleosomal arrays with 195-bp periodicity. Moreover, such arrays tend to occur preferentially toward the 3' ends of genes. Through comparison of different cell lines, we find that active transcription is correlated with increased positioning of nucleosomes, suggesting an unexpected role for transcription in the establishment of well-positioned nucleosomes.

%B Genome Res %V 18 %P 1554-61 %8 2008 Oct %G eng %N 10 %1 http://www.ncbi.nlm.nih.gov/pubmed/18723689?dopt=Abstract %R 10.1101/gr.075952.107 %0 Journal Article %J Bioinformatics %D 2008 %T nuScore: a web-interface for nucleosome positioning predictions. %A Tolstorukov, Michael Y** %A Choudhary, Vidhu %A Olson, Wilma K %A Zhurkin, Victor B %A Park, Peter J** %K Algorithms %K Base Sequence %K Chromosome Mapping %K Computer Simulation %K Internet %K Models, Genetic %K Molecular Sequence Data %K Nucleosomes %K Sequence Alignment %K Sequence Analysis, DNA %K Software %K User-Computer Interface %X

SUMMARY: Sequence-directed mapping of nucleosome positions is of major biological interest. Here, we present a web-interface for estimation of the affinity of the histone core to DNA and prediction of nucleosome arrangement on a given sequence. Our approach is based on assessment of the energy cost of imposing the deformations required to wrap DNA around the histone surface. The interface allows the user to specify a number of options such as selecting from several structural templates for threading calculations and adding random sequences to the analysis. AVAILABILITY: The nuScore interface is freely available for use at http://compbio.med.harvard.edu/nuScore. CONTACT: peter_park@harvard.edu; tolstorukov@gmail.com SUPPLEMENTARY INFORMATION: The site contains user manual, description of the methodology and examples.

%B Bioinformatics %V 24 %P 1456-8 %8 2008 Jun 15 %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/18445607?dopt=Abstract %R 10.1093/bioinformatics/btn212 %0 Journal Article %J Cell %D 2008 %T A sequence motif within chromatin entry sites directs MSL establishment on the Drosophila X chromosome. %A Alekseyenko, Artyom A %A Peng, Shouyong %A Larschan, Erica %A Gorchakov, Andrey A %A Lee, Ok-Kyung %A Kharchenko, Peter %A McGrath, Sean D %A Wang, Charlotte I %A Mardis, Elaine R %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K Base Sequence %K Chromatin Immunoprecipitation %K DNA-Binding Proteins %K Drosophila melanogaster %K Drosophila Proteins %K Female %K Male %K Nuclear Proteins %K Transcription Factors %K X Chromosome %X

The Drosophila MSL complex associates with active genes specifically on the male X chromosome to acetylate histone H4 at lysine 16 and increase expression approximately 2-fold. To date, no DNA sequence has been discovered to explain the specificity of MSL binding. We hypothesized that sequence-specific targeting occurs at "chromatin entry sites," but the majority of sites are sequence independent. Here we characterize 150 potential entry sites by ChIP-chip and ChIP-seq and discover a GA-rich MSL recognition element (MRE). The motif is only slightly enriched on the X chromosome (approximately 2-fold), but this is doubled when considering its preferential location within or 3' to active genes (>4-fold enrichment). When inserted on an autosome, a newly identified site can direct local MSL spreading to flanking active genes. These results provide strong evidence for both sequence-dependent and -independent steps in MSL targeting of dosage compensation to the male X chromosome.

%B Cell %V 134 %P 599-609 %8 2008 Aug 22 %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/18724933?dopt=Abstract %R 10.1016/j.cell.2008.06.033 %0 Journal Article %J PLoS One %D 2008 %T Unphosphorylated SR-like protein Npl3 stimulates RNA polymerase II elongation. %A Dermody, Jessica L %A Dreyfuss, Jonathan M %A Villén, Judit %A Ogundipe, Babatunde %A Gygi, Steven P %A Park, Peter J %A Ponticelli, Alfred S %A Moore, Claire L %A Buratowski, Stephen %A Bucheli, Miriam E %K Binding, Competitive %K Casein Kinase II %K Catalytic Domain %K Gene Expression Regulation, Fungal %K Models, Biological %K mRNA Cleavage and Polyadenylation Factors %K Nuclear Proteins %K Phosphorylation %K Poly A %K Protein Structure, Tertiary %K RNA Polymerase II %K RNA, Messenger %K RNA-Binding Proteins %K Saccharomyces cerevisiae %K Saccharomyces cerevisiae Proteins %K Transcription, Genetic %X

The production of a functional mRNA is regulated at every step of transcription. An area not well-understood is the transition of RNA polymerase II from elongation to termination. The S. cerevisiae SR-like protein Npl3 functions to negatively regulate transcription termination by antagonizing the binding of polyA/termination proteins to the mRNA. In this study, Npl3 is shown to interact with the CTD and have a direct stimulatory effect on the elongation activity of the polymerase. The interaction is inhibited by phosphorylation of Npl3. In addition, Casein Kinase 2 was found to be required for the phosphorylation of Npl3 and affect its ability to compete against Rna15 (Cleavage Factor I) for binding to polyA signals. Our results suggest that phosphorylation of Npl3 promotes its dissociation from the mRNA/RNAP II, and contributes to the association of the polyA/termination factor Rna15. This work defines a novel role for Npl3 in elongation and its regulation by phosphorylation.

%B PLoS One %V 3 %P e3273 %8 2008 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/18818768?dopt=Abstract %R 10.1371/journal.pone.0003273 %0 Journal Article %J Mol Cell %D 2007 %T MSL complex is attracted to genes marked by H3K36 trimethylation using a sequence-independent mechanism. %A Larschan, Erica %A Alekseyenko, Artyom A %A Gortchakov, Andrey A %A Peng, Shouyong %A Li, Bing %A Yang, Pok %A Workman, Jerry L %A Park, Peter J %A Kuroda, Mitzi I %K Animals %K DNA Methylation %K DNA-Binding Proteins %K Drosophila melanogaster %K Drosophila Proteins %K Female %K Gene Expression Regulation %K Histone-Lysine N-Methyltransferase %K Histones %K Male %K Nuclear Proteins %K Nucleosomes %K Recombinant Proteins %K RNA-Binding Proteins %K Transcription Factors %K Transgenes %K X Chromosome %X

In Drosophila, X chromosome dosage compensation requires the male-specific lethal (MSL) complex, which associates with actively transcribed genes on the single male X chromosome to upregulate transcription approximately 2-fold. We found that on the male X chromosome, or when MSL complex is ectopically localized to an autosome, histone H3K36 trimethylation (H3K36me3) is a strong predictor of MSL binding. We isolated mutants lacking Set2, the H3K36me3 methyltransferase, and found that Set2 is an essential gene in both sexes of Drosophila. In set2 mutant males, MSL complex maintains X specificity but exhibits reduced binding to target genes. Furthermore, recombinant MSL3 protein preferentially binds nucleosomes marked by H3K36me3 in vitro. Our results support a model in which MSL complex uses high-affinity sites to initially recognize the X chromosome and then associates with many of its targets through sequence-independent features of transcribed genes.

%B Mol Cell %V 28 %P 121-33 %8 2007 Oct 12 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/17936709?dopt=Abstract %R 10.1016/j.molcel.2007.08.011 %0 Journal Article %J PLoS Genet %D 2007 %T Network-based analysis of affected biological processes in type 2 diabetes models. %A Liu, Manway %A Liberzon, Arthur %A Kong, Sek Won %A Lai, Weil R %A Park, Peter J %A Kohane, Isaac S %A Kasif, Simon %K Animals %K Diabetes Mellitus, Type 2 %K Disease Models, Animal %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Insulin %K Models, Biological %K Signal Transduction %K Systems Biology %X

Type 2 diabetes mellitus is a complex disorder associated with multiple genetic, epigenetic, developmental, and environmental factors. Animal models of type 2 diabetes differ based on diet, drug treatment, and gene knockouts, and yet all display the clinical hallmarks of hyperglycemia and insulin resistance in peripheral tissue. The recent advances in gene-expression microarray technologies present an unprecedented opportunity to study type 2 diabetes mellitus at a genome-wide scale and across different models. To date, a key challenge has been to identify the biological processes or signaling pathways that play significant roles in the disorder. Here, using a network-based analysis methodology, we identified two sets of genes, associated with insulin signaling and a network of nuclear receptors, which are recurrent in a statistically significant number of diabetes and insulin resistance models and transcriptionally altered across diverse tissue types. We additionally identified a network of protein-protein interactions between members from the two gene sets that may facilitate signaling between them. Taken together, the results illustrate the benefits of integrating high-throughput microarray studies, together with protein-protein interaction networks, in elucidating the underlying biological processes associated with a complex disorder.

%B PLoS Genet %V 3 %P e96 %8 2007 Jun %G eng %N 6 %1 http://www.ncbi.nlm.nih.gov/pubmed/17571924?dopt=Abstract %R 10.1371/journal.pgen.0030096 %0 Journal Article %J BMC Bioinformatics %D 2007 %T Normalization and experimental design for ChIP-chip data. %A Peng, Shouyong %A Alekseyenko, Artyom A %A Larschan, Erica %A Kuroda, Mitzi I %A Park, Peter J %K Algorithms %K Chromatin Immunoprecipitation %K Data Interpretation, Statistical %K Databases, Genetic %K Gene Expression Profiling %K Information Storage and Retrieval %K Oligonucleotide Array Sequence Analysis %K Research Design %X

BACKGROUND: Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been widely used to investigate the DNA binding sites for a variety of proteins on a genome-wide scale. However, several issues in the processing and analysis of ChIP-chip data have not been resolved fully, including the effect of background (mock control) subtraction and normalization within and across arrays. RESULTS: The binding profiles of Drosophila male-specific lethal (MSL) complex on a tiling array provide a unique opportunity for investigating these topics, as it is known to bind on the X chromosome but not on the autosomes. These large bound and control regions on the same array allow clear evaluation of analytical methods. We introduce a novel normalization scheme specifically designed for ChIP-chip data from dual-channel arrays and demonstrate that this step is critical for correcting systematic dye-bias that may exist in the data. Subtraction of the mock (non-specific antibody or no antibody) control data is generally needed to eliminate the bias, but appropriate normalization obviates the need for mock experiments and increases the correlation among replicates. The idea underlying the normalization can be used subsequently to estimate the background noise level in each array for normalization across arrays. We demonstrate the effectiveness of the methods with the MSL complex binding data and other publicly available data. CONCLUSION: Proper normalization is essential for ChIP-chip experiments. The proposed normalization technique can correct systematic errors and compensate for the lack of mock control data, thus reducing the experimental cost and producing more accurate results.

%B BMC Bioinformatics %V 8 %P 219 %8 2007 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/17592629?dopt=Abstract %R 10.1186/1471-2105-8-219 %0 Journal Article %J Nat Biotechnol %D 2006 %T A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. %A Kuo, Winston Patrick %A Liu, Fang %A Trimarchi, Jeff %A Punzo, Claudio %A Lombardi, Michael %A Sarang, Jasjit %A Whipple, Mark E %A Maysuria, Malini %A Serikawa, Kyle %A Lee, Sun Young %A McCrann, Donald %A Kang, Jason %A Shearstone, Jeffrey R %A Burke, Jocelyn %A Park, Daniel J %A Wang, Xiaowei %A Rector, Trent L %A Ricciardi-Castagnoli, Paola %A Perrin, Steven %A Choi, Sangdun %A Bumgarner, Roger %A Kim, Ju Han %A Short, Glenn F %A Freeman, Mason W %A Seed, Brian %A Jensen, Roderick %A Church, George M %A Hovig, Eivind %A Cepko, Connie L %A Park, Peter %A Ohno-Machado, Lucila %A Jenssen, Tor-Kristian %K Chromosome Mapping %K DNA Probes %K Gene Expression Profiling %K Microarray Analysis %K Oligonucleotide Array Sequence Analysis %K Reproducibility of Results %X Over the last decade, gene expression microarrays have had a profound impact on biomedical research. The diversity of platforms and analytical methods available to researchers has made the comparison of data from multiple platforms challenging. In this study, we describe a framework for comparisons across platforms and laboratories. We have attempted to include nearly all the available commercial and 'in-house' platforms. Using probe sequences matched at the exon level improved consistency of measurements across the different microarray platforms compared to annotation-based matches. Generally, consistency was good for highly expressed genes, and variable for genes with lower expression values as confirmed by quantitative real-time (QRT)-PCR. Concordance of measurements was higher between laboratories on the same platform than across platforms.
We demonstrate that, after stringent preprocessing, commercial arrays were more consistent than in-house arrays, and by most measures, one-dye platforms were more consistent than two-dye platforms. %B Nat Biotechnol %V 24 %P 832-40 %8 2006 Jul %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/16823376?dopt=Abstract %R 10.1038/nbt1217 %0 Journal Article %J Genes Dev %D 2006 %T High-resolution ChIP-chip analysis reveals that the Drosophila MSL complex selectively identifies active genes on the male X chromosome. %A Alekseyenko, Artyom A %A Larschan, Erica %A Lai, Weil R %A Park, Peter J** %A Kuroda, Mitzi I** %K Animals %K Animals, Genetically Modified %K Base Sequence %K Binding Sites %K Chromatin Immunoprecipitation %K DNA %K Dosage Compensation, Genetic %K Drosophila %K Drosophila Proteins %K Female %K Gene Expression Profiling %K Genes, Insect %K Male %K Multiprotein Complexes %K Nuclear Proteins %K Oligonucleotide Array Sequence Analysis %K Recombinant Fusion Proteins %K Sex Chromosomes %K Transcription Factors %X

X-chromosome dosage compensation in Drosophila requires the male-specific lethal (MSL) complex, which up-regulates gene expression from the single male X chromosome. Here, we define X-chromosome-specific MSL binding at high resolution in two male cell lines and in late-stage embryos. We find that the MSL complex is highly enriched over most expressed genes, with binding biased toward the 3' end of transcription units. The binding patterns are largely similar in the distinct cell types, with approximately 600 genes clearly bound in all three cases. Genes identified as clearly bound in one cell type and not in another indicate that attraction of MSL complex correlates with expression state. Thus, sequence alone is not sufficient to explain MSL targeting. We propose that the MSL complex recognizes most X-linked genes, but only in the context of chromatin factors or modifications indicative of active transcription. Distinguishing expressed genes from the bulk of the genome is likely to be an important function common to many chromatin organizing and modifying activities.

%B Genes Dev %V 20 %P 848-57 %8 2006 Apr 1 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/16547173?dopt=Abstract %R 10.1101/gad.1400206 %0 Journal Article %J Cold Spring Harb Symp Quant Biol %D 2006 %T MSL complex associates with clusters of actively transcribed genes along the Drosophila male X chromosome. %A Larschan, E %A Alekseyenko, A A %A Lai, W R %A Park, P J %A Kuroda, M I %K Animals %K Binding Sites %K DNA-Binding Proteins %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Genes, Insect %K Male %K Models, Molecular %K Multigene Family %K Multiprotein Complexes %K Nuclear Proteins %K Oligonucleotide Array Sequence Analysis %K Transcription Factors %K Transcription, Genetic %K X Chromosome %X

Dosage compensation in Drosophila serves as a model system for understanding the targeting of chromatin-modifying complexes to their sites of action. The MSL (male-specific lethal) complex up-regulates transcription of the single male X chromosome, thereby equalizing levels of transcription of X-linked genes between the sexes. Recruitment of the MSL complex to its binding sites on the male X chromosome requires each of the MSL proteins and at least one of the two large noncoding roX RNAs. To better understand how the MSL complex specifically targets the X chromosome, we have defined the binding using high-resolution genomic tiling arrays. Our results indicate that the MSL complex largely associates with transcribed genes that are present in clusters along the X chromosome. We hypothesize that after initial recruitment of the MSL complex to the X chromosome by unknown mechanisms, nascent transcripts or chromatin marks associated with active transcription attract the MSL complex to its final targets. Defining MSL-complex-binding sites will provide a tool for understanding functions of large noncoding RNAs that have remained elusive.

%B Cold Spring Harb Symp Quant Biol %V 71 %P 385-94 %8 2006 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/17381321?dopt=Abstract %R 10.1101/sqb.2006.71.026 %0 Journal Article %J Bioinformatics %D 2006 %T A multivariate approach for integrating genome-wide expression data and biological knowledge. %A Kong, Sek Won %A Pu, William T %A Park, Peter J %K Algorithms %K Biology %K Chromosome Mapping %K Computer Simulation %K Databases, Protein %K Gene Expression Profiling %K Information Storage and Retrieval %K Models, Genetic %K Models, Statistical %K Multivariate Analysis %K Oligonucleotide Array Sequence Analysis %K Proteome %K Systems Integration %X

MOTIVATION: Several statistical methods that combine analysis of differential gene expression with biological knowledge databases have been proposed for a more rapid interpretation of expression data. However, most such methods are based on a series of univariate statistical tests and do not properly account for the complex structure of gene interactions. RESULTS: We present a simple yet effective multivariate statistical procedure for assessing the correlation between a subspace defined by a group of genes and a binary phenotype. A subspace is deemed significant if the samples corresponding to different phenotypes are well separated in that subspace. The separation is measured using Hotelling's T² statistic, which captures the covariance structure of the subspace. When the dimension of the subspace is larger than that of the sample space, we project the original data to a smaller orthonormal subspace. We use this method to search through functional pathway subspaces defined by Reactome, KEGG, BioCarta and Gene Ontology. To demonstrate its performance, we apply this method to the data from two published studies, and visualize the results in the principal component space.

%B Bioinformatics %V 22 %P 2373-80 %8 2006 Oct 1 %G eng %N 19 %1 http://www.ncbi.nlm.nih.gov/pubmed/16877751?dopt=Abstract %R 10.1093/bioinformatics/btl401 %0 Journal Article %J Curr Biol %D 2006 %T Postmeiotic sex chromatin in the male germline of mice. %A Namekawa, Satoshi H %A Park, Peter J %A Zhang, Li-Feng %A Shima, James E %A McCarrey, John R %A Griswold, Michael D %A Lee, Jeannie T %K Animals %K Chromatin %K Chromosome Positioning %K Male %K Meiosis %K Mice %K Oligonucleotide Array Sequence Analysis %K Spermatids %K Spermatogenesis %K Spermatozoa %K X Chromosome %K X Chromosome Inactivation %K Y Chromosome %X

In mammals, the X and Y chromosomes are subject to meiotic sex chromosome inactivation (MSCI) during prophase I in the male germline, but their status thereafter is currently unclear. An abundance of X-linked spermatogenesis genes has spawned the view that the X must be active. On the other hand, the idea that the imprinted paternal X of the early embryo may be preinactivated by MSCI suggests that silencing may persist longer. To clarify this issue, we establish a comprehensive X-expression profile during mouse spermatogenesis. Here, we discover that the X and Y occupy a novel compartment in the postmeiotic spermatid and adopt a non-Rabl configuration. We demonstrate that this postmeiotic sex chromatin (PMSC) persists throughout spermiogenesis into mature sperm and exhibits epigenetic similarity to the XY body. In the spermatid, 87% of X-linked genes remain suppressed postmeiotically, while autosomes are largely active. We conclude that chromosome-wide X silencing continues from meiosis to the end of spermiogenesis, and we discuss implications for proposed mechanisms of imprinted X-inactivation.

%B Curr Biol %V 16 %P 660-7 %8 2006 Apr 4 %G eng %N 7 %1 http://www.ncbi.nlm.nih.gov/pubmed/16581510?dopt=Abstract %R 10.1016/j.cub.2006.01.066 %0 Journal Article %J J Surg Res %D 2006 %T Angiogenic profile of soft tissue sarcomas based on analysis of circulating factors and microarray gene expression. %A Yoon, Sam S %A Segal, Neil H %A Park, Peter J %A Detwiller, Kara Y %A Fernando, Namali T %A Ryeom, Sandra W %A Brennan, Murray F %A Singer, Samuel %K Angiopoietin-2 %K Cluster Analysis %K Enzyme-Linked Immunosorbent Assay %K Female %K Fibroblast Growth Factors %K Humans %K Leptin %K Male %K Neovascularization, Pathologic %K Oligonucleotide Array Sequence Analysis %K Sarcoma %K Vascular Endothelial Growth Factor A %X

BACKGROUND: Broader understanding of diverse angiogenic pathways in a particular cancer can lead to better utilization of anti-angiogenic therapies. The aim of this study was to develop profiles of angiogenesis-related gene and protein expression for various histologic subtypes of soft tissue sarcomas (STS) growing in different sites. MATERIALS AND METHODS: Plasma levels of vascular endothelial growth factor (VEGF), basic fibroblast growth factor (bFGF), angiopoietin 2 (Ang2), and leptin were determined in 108 patients with primary STS. Gene expression patterns were analyzed in 38 STS samples and 13 normal tissues using oligonucleotide microarrays. RESULTS: VEGF and bFGF plasma levels were elevated 10-13 fold in STS patients compared to controls. VEGF levels were broadly elevated while bFGF levels were higher in patients with fibrosarcomas and leiomyosarcomas. Ang2 levels correlated with tumor size and were most elevated for tumors located in the trunk, while leptin levels were highest in patients with liposarcomas. Hierarchical clustering of microarray data based on angiogenesis-related gene expression demonstrated that histologic subtypes of STS often shared similar expression patterns, and these patterns were distinctly different from those of normal tissues. Matrix metalloproteinase 2, platelet-derived growth factor receptor, alpha and Notch 4 were among several genes that were up-regulated at least 7-fold in STS. CONCLUSIONS: STS demonstrate significant heterogeneity in their angiogenic profiles based on size, histologic subtype, and location of tumor growth, which may have implications for anti-angiogenic strategies. Comparison of STS to normal tissues reveals a panel of upregulated genes that may be targets for future therapies.

%B J Surg Res %V 135 %P 282-90 %8 2006 Oct %G eng %N 2 %1 http://www.ncbi.nlm.nih.gov/pubmed/16603191?dopt=Abstract %R 10.1016/j.jss.2006.01.023 %0 Journal Article %J Cancer Res %D 2006 %T A genome-wide screen reveals functional gene clusters in the cancer genome and identifies EphA2 as a mitogen in glioblastoma. %A Liu, Fenghua* %A Park, Peter J* %A Lai, Weil %A Maher, Elizabeth %A Chakravarti, Arnab %A Durso, Laura %A Jiang, Xiuli %A Yu, Yi %A Brosius, Amanda %A Thomas, Meredith %A Chin, Lynda %A Brennan, Cameron %A DePinho, Ronald A %A Kohane, Isaac %A Carroll, Rona S %A Black, Peter M %A Johnson, Mark D %K Brain Neoplasms %K Cell Growth Processes %K Genome, Human %K Glioblastoma %K Humans %K Mitogen-Activated Protein Kinases %K Multigene Family %K Nucleic Acid Hybridization %K Receptor, EphA2 %K RNA, Messenger %X

A novel genome-wide screen that combines patient outcome analysis with array comparative genomic hybridization and mRNA expression profiling was developed to identify genes with copy number alterations, aberrant mRNA expression, and relevance to survival in glioblastoma. The method led to the discovery of physical gene clusters within the cancer genome with boundaries defined by physical proximity, correlated mRNA expression patterns, and survival relatedness. These boundaries delineate a novel genomic interval called the functional common region (FCR). Many FCRs contained genes of high biological relevance to cancer and were used to pinpoint functionally significant DNA alterations that were too small or infrequent to be reliably identified using standard algorithms. One such FCR contained the EphA2 receptor tyrosine kinase. Validation experiments showed that EphA2 mRNA overexpression correlated inversely with patient survival in a panel of 21 glioblastomas, and ligand-mediated EphA2 receptor activation increased glioblastoma proliferation and tumor growth via a mitogen-activated protein kinase-dependent pathway. This novel genome-wide approach greatly expanded the list of target genes in glioblastoma and represents a powerful new strategy to identify the upstream determinants of tumor phenotype in a range of human cancers.

%B Cancer Res %V 66 %P 10815-23 %8 2006 Nov 15 %G eng %N 22 %1 http://www.ncbi.nlm.nih.gov/pubmed/17090523?dopt=Abstract %R 10.1158/0008-5472.CAN-06-1408 %0 Journal Article %J Ann Neurol %D 2005 %T Interferon-alpha/beta-mediated innate immune mechanisms in dermatomyositis. %A Greenberg, Steven A %A Pinkus, Jack L %A Pinkus, Geraldine S %A Burleson, Travis %A Sanoudou, Despina %A Tawil, Rabi %A Barohn, Richard J %A Saperstein, David S %A Briemberg, Hannah R %A Ericsson, Maria %A Park, Peter %A Amato, Anthony A %K Adult %K Aged %K Antigens, CD3 %K Antigens, CD4 %K Blood Vessels %K Dendritic Cells %K Dermatomyositis %K Female %K Gene Expression Regulation %K GTP-Binding Proteins %K Humans %K Immunohistochemistry %K Interferon-alpha %K Interferon-beta %K Lectins, C-Type %K Male %K Membrane Glycoproteins %K Microscopy, Immunoelectron %K Middle Aged %K Muscle Fibers, Skeletal %K Muscle, Skeletal %K Myositis, Inclusion Body %K Myxovirus Resistance Proteins %K Nerve Tissue Proteins %K Neuromuscular Diseases %K Oligonucleotide Array Sequence Analysis %K Promoter Regions, Genetic %K Prospective Studies %K Receptors, Immunologic %K Reverse Transcriptase Polymerase Chain Reaction %X Dermatomyositis has been modeled as an autoimmune disease largely mediated by the adaptive immune system, including a local humorally mediated response with B and T helper cell muscle infiltration, antibody and complement-mediated injury of capillaries, and perifascicular atrophy of muscle fibers caused by ischemia. To further understand the pathophysiology of dermatomyositis, we used microarrays, computational methods, immunohistochemistry and electron microscopy to study muscle specimens from 67 patients, 54 with inflammatory myopathies, 14 with dermatomyositis. 
In dermatomyositis, genes induced by interferon-alpha/beta were highly overexpressed, and immunohistochemistry for the interferon-alpha/beta inducible protein MxA showed dense staining of perifascicular and sometimes all myofibers in 8/14 patients and on capillaries in 13/14 patients. Of 36 patients with other inflammatory myopathies, 1 patient had faint MxA staining of myofibers and 3 of capillaries. Plasmacytoid dendritic cells, potent CD4+ cellular sources of interferon-alpha, are present in substantial numbers in dermatomyositis and may account for most of the cells previously identified as T helper cells. In addition to an adaptive immune response, an innate immune response characterized by plasmacytoid dendritic cell infiltration and interferon-alpha/beta inducible gene and protein expression may be an important part of the pathogenesis of dermatomyositis, as it appears to be in systemic lupus erythematosus. %B Ann Neurol %V 57 %P 664-78 %8 2005 May %G eng %N 5 %1 http://www.ncbi.nlm.nih.gov/pubmed/15852401?dopt=Abstract %R 10.1002/ana.20464 %0 Book Section %B Methods of Microarray Data Analysis IV %D 2005 %T Gene Expression Data and Survival Analysis %A Park, Peter J %E Shoemaker, Jennifer S %E Lin, Simon M %B Methods of Microarray Data Analysis IV %7 1 %I Springer US %C New York City, New York %G eng %0 Journal Article %J Proc Natl Acad Sci U S A %D 2005 %T Discovering statistically significant pathways in expression profiling studies. %A Tian, Lu %A Greenberg, Steven A %A Kong, Sek Won %A Altschuler, Josiah %A Kohane, Isaac S %A Park, Peter J %K Algorithms %K Alzheimer Disease %K Animals %K Autoimmunity %K Databases, Genetic %K Dermatomyositis %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Interferon-alpha %K Interferon-beta %K Models, Genetic %K Myositis %K Myositis, Inclusion Body %K Oligonucleotide Array Sequence Analysis %K Predictive Value of Tests %K T-Lymphocytes %K Transcription Factors %X

Accurate and rapid identification of perturbed pathways through the analysis of genome-wide expression profiles facilitates the generation of biological hypotheses. We propose a statistical framework for determining whether a specified group of genes for a pathway has a coordinated association with a phenotype of interest. Several issues on proper hypothesis-testing procedures are clarified. In particular, it is shown that the differences in the correlation structure of each set of genes can lead to a biased comparison among gene sets unless a normalization procedure is applied. We propose statistical tests for two important but different aspects of association for each group of genes. This approach has more statistical power than currently available methods and can result in the discovery of statistically significant pathways that are not detected by other methods. This method is applied to data sets involving diabetes, inflammatory myopathies, and Alzheimer's disease, using gene sets we compiled from various public databases. In the case of inflammatory myopathies, we have correctly identified the known cytotoxic T lymphocyte-mediated autoimmunity in inclusion body myositis. Furthermore, we predicted the presence of dendritic cells in inclusion body myositis and of an IFN-alpha/beta response in dermatomyositis, neither of which was previously described. These predictions have been subsequently corroborated by immunohistochemistry.

%B Proc Natl Acad Sci U S A %V 102 %P 13544-9 %8 2005 Sep 20 %G eng %N 38 %1 http://www.ncbi.nlm.nih.gov/pubmed/16174746?dopt=Abstract %R 10.1073/pnas.0506577102 %0 Journal Article %J Genes Dev %D 2005 %T Global regulation of X chromosomal genes by the MSL complex in Drosophila melanogaster. %A Hamada, Fumika N %A Park, Peter J %A Gordadze, Polina R %A Kuroda, Mitzi I %K Animals %K Cell Line %K Dosage Compensation, Genetic %K Drosophila melanogaster %K Drosophila Proteins %K Male %K Multiprotein Complexes %K Up-Regulation %K X Chromosome %X

A long-standing model postulates that X-chromosome dosage compensation in Drosophila occurs by twofold up-regulation of the single male X, but previous data cannot exclude an alternative model, in which male autosomes are down-regulated to balance gene expression. To distinguish between the two models, we used RNA interference to deplete Male-Specific Lethal (MSL) complexes from male-like tissue culture cells. We found that expression of many genes from the X chromosome decreased, while expression from the autosomes was largely unchanged. We conclude that the primary role of the MSL complex is to up-regulate the male X chromosome.

%B Genes Dev %V 19 %P 2289-94 %8 2005 Oct 1 %G eng %N 19 %1 http://www.ncbi.nlm.nih.gov/pubmed/16204180?dopt=Abstract %R 10.1101/gad.1343705 %0 Journal Article %J Bioinformatics %D 2005 %T Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. %A Lai, Weil R %A Johnson, Mark D %A Kucherlapati, Raju %A Park, Peter J %K Algorithms %K Chromosome Mapping %K Gene Amplification %K Gene Deletion %K Nucleic Acid Hybridization %K Oligonucleotide Array Sequence Analysis %K Reproducibility of Results %K Sensitivity and Specificity %K Software %K Software Validation %X

MOTIVATION: Array Comparative Genomic Hybridization (CGH) can reveal chromosomal aberrations in the genomic DNA. These amplifications and deletions at the DNA level are important in the pathogenesis of cancer and other diseases. While a large number of approaches have been proposed for analyzing the large array CGH datasets, the relative merits of these methods in practice are not clear. RESULTS: We compare 11 different algorithms for analyzing array CGH data. These include both segment detection methods and smoothing methods, based on diverse techniques such as mixture models, Hidden Markov Models, maximum likelihood, regression, wavelets and genetic algorithms. We compute the Receiver Operating Characteristic (ROC) curves using simulated data to quantify sensitivity and specificity for various levels of signal-to-noise ratio and different sizes of abnormalities. We also characterize their performance on chromosomal regions of interest in a real dataset obtained from patients with Glioblastoma Multiforme. While comparisons of this type are difficult due to possibly sub-optimal choice of parameters in the methods, they nevertheless reveal general characteristics that are helpful to the biological investigator.

%B Bioinformatics %V 21 %P 3763-70 %8 2005 Oct 1 %G eng %N 19 %1 http://www.ncbi.nlm.nih.gov/pubmed/16081473?dopt=Abstract %R 10.1093/bioinformatics/bti611 %0 Journal Article %J Bioinformatics %D 2005 %T CrossChip: a system supporting comparative analysis of different generations of Affymetrix arrays. %A Kong, Sek Won %A Hwang, Kyu-Baek %A Kim, Richard D %A Zhang, Byoung-Tak %A Greenberg, Steven A %A Kohane, Isaac S %A Park, Peter J %K Algorithms %K DNA Probes %K Gene Expression Profiling %K Information Storage and Retrieval %K Oligonucleotide Array Sequence Analysis %K Sequence Alignment %K Sequence Analysis, DNA %K Software %X

SUMMARY: To increase compatibility between different generations of Affymetrix GeneChip arrays, we propose a method of filtering probes based on their sequences. Our method is implemented as a web-based service for downloading necessary materials for converting the raw data files (*.CEL) for comparative analysis. The user can specify the appropriate level of filtering by setting the criteria for the minimum overlap length between probe sequences and the minimum number of usable probe pairs per probe set. Our website supports a within-species comparison for human and mouse GeneChip arrays. AVAILABILITY: http://www.crosschip.org

%B Bioinformatics %V 21 %P 2116-7 %8 2005 May 1 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/15684227?dopt=Abstract %R 10.1093/bioinformatics/bti288 %0 Journal Article %J International Journal of Computational Methods %D 2004 %T Multiscale Numerical Methods for Singularly Perturbed Convection-Diffusion Equations %A Park, P J %A Hou, T Y %X

We present an efficient and robust approach in the finite element framework for numerical solutions that exhibit multiscale behavior, with applications to singularly perturbed convection-diffusion problems. The first type of equation we study is the convection-dominated convection-diffusion equation, with periodic or random coefficients; the second type of equation is an elliptic equation with singularities due to discontinuous coefficients and non-smooth boundaries. In both cases, standard methods for purely hyperbolic or elliptic problems perform poorly due to sharp boundary and internal layers in the solution.

We propose a framework in which the finite element basis functions are designed to capture the local small-scale behavior correctly. When the structure of the layers can be determined locally, we apply the multiscale finite element method, in which we solve the corresponding homogeneous equation on each element to capture the small-scale features of the differential operator. We demonstrate the effectiveness of this method by computing the enhanced diffusivity scaling for a passive scalar in the cellular flow. We also carry out the asymptotic error analysis for its convergence rate and perform numerical experiments for verification. For a random flow with nonlocal layer structure, we use a variational principle to gain additional information in our attempt to design asymptotic basis functions. We also apply the same framework to elliptic equations with discontinuous coefficients or non-smooth boundaries. In that case, we construct local basis functions near singularities using the infinite element method in order to resolve extreme singularities. Numerical results on problems with various singularities confirm the efficiency and accuracy of this approach.

%B International Journal of Computational Methods %V 1 %P 17-65 %G eng %N 1 %0 Journal Article %J BMC Bioinformatics %D 2004 %T Combining gene expression data from different generations of oligonucleotide arrays. %A Hwang, Kyu-Baek %A Kong, Sek Won %A Greenberg, Steve A %A Park, Peter J %K Biopsy %K Cluster Analysis %K Databases, Genetic %K Dermatomyositis %K DNA Probes %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Variation %K Humans %K Myositis %K Oligonucleotide Array Sequence Analysis %K Polymyositis %K Reproducibility of Results %X

BACKGROUND: One of the important challenges in microarray analysis is to take full advantage of previously accumulated data, both from one's own laboratory and from public repositories. Through a comparative analysis on a variety of datasets, a more comprehensive view of the underlying mechanism or structure can be obtained. However, as we discover in this work, continual changes in genomic sequence annotations and probe design criteria make it difficult to compare gene expression data even from different generations of the same microarray platform. RESULTS: We first describe the extent of discordance between the results derived from two generations of Affymetrix oligonucleotide arrays, as revealed in cluster analysis and in identification of differentially expressed genes. We then propose a method for increasing comparability. The dataset we use consists of a set of 14 human muscle biopsy samples from patients with inflammatory myopathies that were hybridized on both HG-U95Av2 and HG-U133A human arrays. We find that the use of the probe set matching table for comparative analysis provided by Affymetrix produces better results than matching by UniGene or LocusLink identifiers but still remains inadequate. Rescaling of expression values for each gene across samples and data filtering by expression values enhance comparability but only for a few specific analyses. As a generic method for improving comparability, we select a subset of probes with overlapping sequence segments in the two array types and recalculate expression values based only on the selected probes. We show that this filtering of probes significantly improves the comparability while retaining a sufficient number of probe sets for further analysis. CONCLUSIONS: Compatibility between high-density oligonucleotide arrays is significantly affected by probe-level sequence information.
With a careful filtering of the probes based on their sequence overlaps, data from different generations of microarrays can be combined more effectively.

%B BMC Bioinformatics %V 5 %P 159 %8 2004 Oct 25 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/15504239?dopt=Abstract %R 10.1186/1471-2105-5-159 %0 Journal Article %J J Biotechnol %D 2004 %T Current issues for DNA microarrays: platform comparison, double linear amplification, and universal RNA reference. %A Park, Peter J %A Cao, Yun Anna %A Lee, Sun Young %A Kim, Jong-Woo %A Chang, Mi Sook %A Hart, Rebecca %A Choi, Sangdun %K Benchmarking %K Equipment Failure Analysis %K Nucleic Acid Amplification Techniques %K Oligonucleotide Array Sequence Analysis %K Reproducibility of Results %K Reverse Transcriptase Polymerase Chain Reaction %K RNA %K Sensitivity and Specificity %K Technology Assessment, Biomedical %K United States %X

DNA microarray technology has been widely used to simultaneously determine the expression levels of thousands of genes. A variety of approaches have been used, both in the implementation of this technology and in the analysis of the large amount of expression data. However, several practical issues still have not been resolved in a satisfactory manner, and among the most critical is the lack of agreement in the results obtained in different array platforms. In this study, we present a comparison of several microarray platforms [Affymetrix oligonucleotide arrays, custom complementary DNA (cDNA) arrays, and custom oligo arrays printed with oligonucleotides from three different sources] as well as analysis of various methods used for microarray target preparation and the reference design. The results indicate that the pairwise correlations of expression levels between platforms are relatively low overall but that the log ratios of the highly expressed genes are strongly correlated, especially between Affymetrix and cDNA arrays. The microarray measurements were compared with quantitative real-time-polymerase chain reaction (QRT-PCR) results for 23 genes, and the varying degrees of agreement for each platform were characterized. We have also developed and tested a double amplification method which allows the use of smaller amounts of starting material. The added round of amplification produced reproducible results as compared to the arrays hybridized with single round amplified targets. Finally, the reliability of using a universal RNA reference for two-channel microarrays was tested and the results suggest that comparisons of multiple experimental conditions using the same control can be accurate.

%B J Biotechnol %V 112 %P 225-45 %8 2004 Sep 9 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/15313001?dopt=Abstract %R 10.1016/j.jbiotec.2004.05.006 %0 Journal Article %J Genome Biol %D 2004 %T Improving identification of differentially expressed genes in microarray studies using information from public databases. %A Kim, Richard D %A Park, Peter J %K Databases, Genetic %K Gene Expression Profiling %K Gene Expression Regulation %K Oligonucleotide Array Sequence Analysis %X

We demonstrate that the process of identifying differentially expressed genes in microarray studies with small sample sizes can be substantially improved by extracting information from a large number of datasets accumulated in public databases. The improvement comes from more reliable estimates of gene-specific variances based on other datasets. For a two-group comparison with two arrays in each group, for example, the result of our method was comparable to that of a t-test analysis with five samples in each group or to that of a regularized t-test analysis with three samples in each group. Our results are further improved by weighting the results of our approach with the regularized t-test results in a hybrid method.

%B Genome Biol %V 5 %P R70 %8 2004 %G eng %N 9 %1 http://www.ncbi.nlm.nih.gov/pubmed/15345054?dopt=Abstract %R 10.1186/gb-2004-5-9-r70 %0 Journal Article %J Computational Statistics and Data Analysis %D 2003 %T Power Comparisons for Disease Clustering Tests %A Kulldorff, M %A Tango, T %A Park, Peter J %X

Many different methods have been proposed to test for geographical disease clustering, and more generally, for spatial clustering of any type of observations while adjusting for an inhomogeneous background population generating the observations. Despite the many proposed test statistics, few formal comparisons have been conducted. We present a collection of 1,220,000 simulated benchmark data sets generated under 51 different cluster models and the null hypothesis, to be used for power evaluations. We then use these data sets to compare the power of the spatial scan statistic, the maximized excess events test and the nonparametric M statistic. All have good power, the first having an advantage for localized hot-spot type clusters and the second for global clustering where randomly located cases generate other cases close by. By making the simulated data sets publicly available, new tests can easily be compared with previously evaluated tests by analyzing the same benchmark data.

%B Computational Statistics and Data Analysis %V 42 %P 665-684 %G eng %N 4 %0 Journal Article %J Genomics & Informatics %D 2003 %T Rank-Based Nonlinear Normalization of Oligonucleotide Arrays %A Park, Peter J %A Kohane, Isaac S %A Kim, J H %X

MOTIVATION: Many have observed a nonlinear relationship between the signal intensity and the transcript abundance in microarray data. The first step in analyzing the data is to normalize it properly, and this should include a correction for the nonlinearity. The commonly used linear normalization schemes do not address this problem. RESULTS: Nonlinearity is present in both cDNA and oligonucleotide arrays, but we concentrate on the latter in this paper. Across a set of chips, we identify those genes whose within-chip ranks are relatively constant compared to other genes of similar intensity. For each gene, we compute the sum of the squares of the differences in its within-chip ranks between every pair of chips as our statistic and we select a small fraction of the genes with the minimal changes in ranks at each intensity level. These genes are most likely to be non-differentially expressed and are subsequently used in the normalization procedure. This method is a generalization of the rank-invariant normalization (Li and Wong, 2001), using all available chips rather than two at a time to gather more information, while using the chip that is least likely to be affected by nonlinear effects as the reference chip. The assumption in our method is that there are at least a small number of non-differentially expressed genes across the intensity range. The normalized expression values can be substantially different from the unnormalized values and may result in altered downstream analysis.

%B Genomics & Informatics %V 1 %P 94-100 %G eng %N 2 %0 Journal Article %J AMIA Annu Symp Proc %D 2003 %T Functional relationships between gene pairs in oral squamous cell carcinoma. %A Kuo, Winston Patrick %A Mendez, Eduardo %A Chen, Chu %A Whipple, Mark E %A Farell, Greg %A Agoff, Nicholas %A Park, Peter J %K Carcinoma, Squamous Cell %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Models, Genetic %K Mouth Neoplasms %K Oligonucleotide Array Sequence Analysis %K Statistics, Nonparametric %X

We developed a novel method for the discovery of functional relationships between pairs of genes based on gene expression profiles generated from microarrays. This approach examines all possible pairs of genes and identifies those in which the relationship between the two genes changes in different diseases or conditions. In contrast to previous methods that have focused on differentially expressed genes, this method attempts to find changes in the correlation between genes. These changes may be indicative of the functional relationships related to a disease mechanism. We demonstrate the utility of this approach by applying it to an oral squamous cell carcinoma (OSCC) microarray data set. Our results suggest new directions for future experimental investigations.

%B AMIA Annu Symp Proc %P 371-5 %8 2003 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/14728197?dopt=Abstract %0 Journal Article %J J Exp Med %D 2003 %T Human decidual natural killer cells are a unique NK cell subset with immunomodulatory potential. %A Koopman, Louise A %A Kopcow, Hernan D %A Rybalov, Basya %A Boyson, Jonathan E %A Orange, Jordan S %A Schatz, Frederick %A Masch, Rachel %A Lockwood, Charles J %A Schachter, Asher D %A Park, Peter J %A Strominger, Jack L %K Antigens, CD56 %K Decidua %K Female %K Galectin 1 %K Gene Expression %K Glycoproteins %K Homeodomain Proteins %K Humans %K Killer Cells, Natural %K NK Cell Lectin-Like Receptor Subfamily C %K Oligonucleotide Array Sequence Analysis %K Pregnancy %K Pregnancy Proteins %K Receptors, Immunologic %K Receptors, Natural Killer Cell %K T-Lymphocyte Subsets %X

Natural killer cells constitute 50-90% of lymphocytes in human uterine decidua in early pregnancy. Here, CD56(bright) uterine decidual NK (dNK) cells were compared with the CD56(bright) and CD56(dim) peripheral NK cell subsets by microarray analysis, with verification of results by flow cytometry and RT-PCR. Among the approximately 10,000 genes studied, 278 genes showed at least a threefold change with P ≤ 0.001 when comparing the dNK and peripheral NK cell subsets, most displaying increased expression in dNK cells. The largest number of these encoded surface proteins, including the unusual lectinlike receptors NKG2E and Ly-49L, several killer cell Ig-like receptors, the integrin subunits alpha(D), alpha(X), beta1, and beta5, and multiple tetraspanins (CD9, CD151, CD53, CD63, and TSPAN-5). Additionally, two secreted proteins, galectin-1 and progestagen-associated protein 14, known to have immunomodulatory functions, were selectively expressed in dNK cells.

%B J Exp Med %V 198 %P 1201-12 %8 2003 Oct 20 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/14568979?dopt=Abstract %R 10.1084/jem.20030305 %0 Journal Article %J J Am Dent Assoc %D 2003 %T Microarrays and clinical dentistry. %A Kuo, Winston Patrick %A Whipple, Mark E %A Jenssen, Tor-Kristian %A Todd, Randy %A Epstein, Joel B %A Ohno-Machado, Lucila %A Sonis, Stephen T %A Park, Peter J %K Anti-Bacterial Agents %K Disease Progression %K Gene Expression Profiling %K Genetic Variation %K Genotype %K Human Genome Project %K Humans %K Mouth Diseases %K Mouth Neoplasms %K Oligonucleotide Array Sequence Analysis %K Precancerous Conditions %K Proteins %K RNA, Messenger %K Tooth Diseases %X

BACKGROUND: The Human Genome Project, or HGP, has inspired a great deal of exciting biology recently by enabling the development of new technologies that will be essential for understanding the different types of abnormalities in diseases related to the oral cavity. LITERATURE REVIEWED: The authors review current literature pertaining to the advanced microarray technologies arising from the HGP and how they can contribute to dentistry. This technology has become a standard tool for monitoring activities of genes at both academic and pharmaceutical research institutions. RESULTS: With the availability of the DNA sequences for the entire human genome, attention now is focused on understanding various diseases at the genome level. Deciphering the molecular behavior of genetically encoded proteins is crucial to obtaining a more comprehensive picture of disease processes. Important progress has been made using microarrays, which have been shown to be effective in identifying gene expression patterns and variations that correlate with cellular development, physiology and function. Arrays can be used to classify tissue samples accurately based on molecular profiles and to select candidate genes related to a number of cancers, including oral cancer. This type of oral genetic approach will aid in the understanding of disease progression, thus improving diagnosis and treatment for patients. CLINICAL IMPLICATIONS: Microarrays hold much promise for the analysis of diseases in the oral cavity. As the technology evolves, dentists may see these tools as screening tests for better managing patients' dental care.

%B J Am Dent Assoc %V 134 %P 456-62 %8 2003 Apr %G eng %N 4 %1 http://www.ncbi.nlm.nih.gov/pubmed/12733779?dopt=Abstract %0 Journal Article %J Genome Biol %D 2003 %T MicroSAGE is highly representative and reproducible but reveals major differences in gene expression among samples obtained from similar tissues. %A Blackshaw, Seth %A Kuo, Winston P %A Park, Peter J %A Tsujikawa, Motokazu %A Gunnersen, Jenny M %A Scott, Hamish S %A Boon, Wee-Ming %A Tan, Seong-Seng %A Cepko, Constance L %K 3T3 Cells %K Adult %K Age Factors %K Aged %K Aged, 80 and over %K Animals %K Cell Line %K Computers, Molecular %K Female %K Gene Expression Profiling %K Gene Library %K Genes, cdc %K Genetic Variation %K Humans %K Male %K Mice %K Mice, Inbred C57BL %K Oligonucleotide Array Sequence Analysis %K Organ Specificity %K Retina %K Sex Factors %X

BACKGROUND: Serial analysis of gene expression using small amounts of starting material (microSAGE) has not yet been conclusively shown to be representative, reproducible or accurate. RESULTS: We show that microSAGE is highly representative, reproducible and accurate, but that pronounced differences in gene expression are seen between tissue samples taken from different individuals. CONCLUSIONS: MicroSAGE is a reliable method of comprehensively profiling differences in gene expression among samples, but care should be taken in generalizing results obtained from libraries constructed from tissue obtained from different individuals and/or processed or stored differently.

%B Genome Biol %V 4 %P R17 %8 2003 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/12620102?dopt=Abstract %0 Journal Article %J Bioinformatics %D 2002 %T Comparing expression profiles of genes with similar promoter regions. %A Park, Peter J %A Butte, Atul J %A Kohane, Isaac S %K Databases, Nucleic Acid %K Gene Expression %K Gene Expression Profiling %K Gene Expression Regulation %K Gene Expression Regulation, Fungal %K Promoter Regions, Genetic %K Regulatory Sequences, Nucleic Acid %K Reproducibility of Results %K Saccharomyces cerevisiae %K Sensitivity and Specificity %K Sequence Alignment %K Sequence Analysis, DNA %K Statistics as Topic %K Transcription Factors %K Transcription, Genetic %X

MOTIVATION: Gene regulatory elements are often predicted by seeking common sequences in the promoter regions of genes that are clustered together based on their expression profiles. We consider the problem in the opposite direction: we seek to find the genes that have similar promoter regions and determine the extent to which these genes have similar expression profiles. RESULTS: We use the data sets from experiments on Saccharomyces cerevisiae. Our similarity measure for the promoter regions is based on the set of common mapped or putative transcription factor binding sites and other regulatory elements in the upstream region of the genes, as contained in the Saccharomyces cerevisiae Promoter Database. We pair up the genes with high similarity scores and compare their expression levels in time-course experiment data. We find that genes with similar promoter regions on the average have significantly higher correlation, but it can vary widely depending on the genes. This confirms that the presence of similar regulatory elements often does not correspond to similarity in expression profiles and indicates that finding transcription factor binding sites or other regulatory elements starting with the expression patterns may be limited in many cases. Regardless of the correlation, the degree to which the profiles agree under different experimental conditions can be examined to derive hypotheses concerning the role of common regulatory elements. Overall, we find that considering the relationship between the promoter regions and the expression profiles starting with the regulatory elements is a difficult but useful process that can provide valuable insights.

%B Bioinformatics %V 18 %P 1576-84 %8 2002 Dec %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/12490441?dopt=Abstract %0 Journal Article %J Proc AMIA Symp %D 2002 %T Gene expression levels in different stages of progression in oral squamous cell carcinoma. %A Kuo, Winston Patrick %A Jenssen, Tor-Kristian %A Park, Peter J %A Lingen, Mark W %A Hasina, Rifat %A Ohno-Machado, Lucila %K Carcinoma, Squamous Cell %K Disease Progression %K Gene Expression %K Humans %K Mouth Neoplasms %K Neoplasm Staging %X

Oral squamous cell carcinoma (OSCC) is one of the most common cancer types worldwide. The prognosis for patients with this disease is generally poor and little is known about its progression. Gene expression studies may provide important insights into the molecular mechanisms of this disease. We analyzed gene expression data from a small panel of patients diagnosed with OSCC. Even with only 13 patient samples we were able to find genes with significant differences in expression levels between normal, dysplasia, and cancer samples. The largest differences in expression were generally found between normal and cancer samples, but significant differences were also found for several genes between dysplasia and the other two sample types. We also represent the significance levels of differentially expressed genes on the chromosome domain. The genes and genetic features we examine are potentially important factors at the molecular level in the progression of OSCC.

%B Proc AMIA Symp %P 415-9 %8 2002 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/12474876?dopt=Abstract %0 Journal Article %J Bioinformatics %D 2002 %T Linking gene expression data with patient survival times using partial least squares. %A Park, Peter J %A Tian, Lu %A Kohane, Isaac S %K Algorithms %K Gene Expression Profiling %K Genetic Predisposition to Disease %K Genetic Testing %K Humans %K Least-Squares Analysis %K Lung Neoplasms %K Models, Genetic %K Models, Statistical %K Oligonucleotide Array Sequence Analysis %K Proportional Hazards Models %K Reproducibility of Results %K Risk Assessment %K Risk Factors %K Sensitivity and Specificity %K Survival %K Survival Analysis %K United States %X

There is an increasing need to link the large amount of genotypic data, gathered using microarrays for example, with various phenotypic data from patients. The classification problem in which gene expression data serve as predictors and a class label phenotype as the binary outcome variable has been examined extensively, but there has been less emphasis on dealing with other types of phenotypic data. In particular, patient survival times with censoring are often not used directly as a response variable due to the complications that arise from censoring. We show that the issues involving censored data can be circumvented by reformulating the problem as a standard Poisson regression problem. The procedure for solving the transformed problem is a combination of two approaches: partial least squares, a regression technique that is especially effective when there is severe collinearity due to a large number of predictors, and generalized linear regression, which extends standard linear regression to deal with various types of response variables. The linear combinations of the original variables identified by the method are highly correlated with the patient survival times and at the same time account for the variability in the covariates. The algorithm is fast, as it does not involve any matrix decompositions in the iterations. We apply our method to data sets from lung carcinoma and diffuse large B-cell lymphoma studies to verify its effectiveness.

%B Bioinformatics %V 18 Suppl 1 %P S120-7 %8 2002 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/12169539?dopt=Abstract %0 Journal Article %J Genome Biol %D 2002 %T Vector algebra in the analysis of genome-wide expression data. %A Kuruvilla, Finny G %A Park, Peter J %A Schreiber, Stuart L %K Cluster Analysis %K Computational Biology %K Gene Expression Profiling %K Genes, Fungal %K Genome, Fungal %K Humans %K Immunosuppressive Agents %K Mutation %K Oligonucleotide Array Sequence Analysis %K Saccharomyces cerevisiae %K Sirolimus %X

BACKGROUND: Data from thousands of transcription-profiling experiments in organisms ranging from yeast to humans are now publicly available. How best to analyze these data remains an important challenge. A variety of tools have been used for this purpose, including hierarchical clustering, self-organizing maps and principal components analysis. In particular, concepts from vector algebra have proven useful in the study of genome-wide expression data. RESULTS: Here we present a framework based on vector algebra for the analysis of transcription profiles that is geometrically intuitive and computationally efficient. Concepts in vector algebra such as angles, magnitudes, subspaces, singular value decomposition, bases and projections have natural and powerful interpretations in the analysis of microarray data. Angles in particular offer a rigorous method of defining 'similarity' and are useful in evaluating the claims of a microarray-based study. We present a sample analysis of cells treated with rapamycin, an immunosuppressant whose effects have been extensively studied with microarrays. In addition, the algebraic concept of a basis for a space affords the opportunity to simplify data analysis and uncover a limited number of expression vectors to span the transcriptional range of cell behavior. CONCLUSIONS: This framework represents a compact, powerful and scalable construction for analysis and computation. As the amount of microarray data in the public domain grows, these vector-based methods are relevant in determining statistical significance. These approaches are also well suited to extract biologically meaningful information in the analysis of signaling networks.

%B Genome Biol %V 3 %P RESEARCH0011 %8 2002 %G eng %N 3 %1 http://www.ncbi.nlm.nih.gov/pubmed/11897023?dopt=Abstract %0 Journal Article %J Pac Symp Biocomput %D 2001 %T A nonparametric scoring algorithm for identifying informative genes from microarray data. %A Park, P J %A Pagano, M %A Bonetti, M %K Algorithms %K Biometry %K Data Interpretation, Statistical %K Gene Expression Profiling %K Humans %K Leukemia %K Oligonucleotide Array Sequence Analysis %K Phenotype %X

Microarray data routinely contain gene expression levels of thousands of genes. In the context of medical diagnostics, an important problem is to find the genes that are correlated with given phenotypes. These genes may reveal insights into biological processes and may be used to predict the phenotypes of new samples. In most cases, while the gene expression levels are available for a large number of genes, only a small fraction of these genes may be informative in classification with statistical significance. We introduce a nonparametric scoring algorithm that assigns a score to each gene based on samples with known classes. Based on these scores, we can find a small set of genes which are informative of their class, and subsequent analysis can be carried out with this set. This procedure is robust to outliers and different normalization schemes, and immediately reduces the size of the data with little loss of information. We study the properties of this algorithm and apply it to a data set from cancer patients. We quantify the information in a given set of genes by comparing its distribution of the score statistics to a set of distributions generated by permutations that preserve the correlation structure among the genes.

%B Pac Symp Biocomput %V 6 %P 52-63 %8 2001 %G eng %1 http://www.ncbi.nlm.nih.gov/pubmed/11262969?dopt=Abstract