Publications

2016
Ordulu Z, Nucci MR, Dal Cin P, Hollowell ML, Otis CN, Hornick JL, Park PJ, Kim T-M, Quade BJ, Morton CC. Intravenous leiomyomatosis: an unusual intermediate between benign and malignant uterine smooth muscle tumors. Mod Pathol 2016.

Intravenous leiomyomatosis is an unusual smooth muscle neoplasm with quasi-malignant intravascular growth but a histologically banal appearance. Herein, we report expression and molecular cytogenetic analyses of a series of 12 intravenous leiomyomatosis cases to better understand the pathogenesis of intravenous leiomyomatosis. All cases were analyzed for the expression of HMGA2, MDM2, and CDK4 proteins by immunohistochemistry based on our previous finding of der(14)t(12;14)(q14.3;q24) in intravenous leiomyomatosis. Seven of 12 (58%) intravenous leiomyomatosis cases expressed HMGA2, and none expressed MDM2 or CDK4. Colocalization of hybridization signals for probes from the HMGA2 locus (12q14.3) and from 14q24 by interphase fluorescence in situ hybridization (FISH) was detected in a mean of 89.2% of nuclei in HMGA2-positive cases by immunohistochemistry, but in only 12.4% of nuclei in negative cases, indicating an association of HMGA2 expression and this chromosomal rearrangement (P = 8.24 × 10^-10). Four HMGA2-positive cases had greater than two HMGA2 hybridization signals per cell. No cases showed loss of a hybridization signal by interphase FISH for the frequently deleted region of 7q22 in uterine leiomyomata. One intravenous leiomyomatosis case analyzed by array comparative genomic hybridization revealed complex copy number variations. Finally, expression profiling was performed on three intravenous leiomyomatosis cases. Interestingly, hierarchical cluster analysis of the expression profiles revealed segregation of the intravenous leiomyomatosis cases with leiomyosarcoma rather than with myometrium, uterine leiomyoma of the usual histological type, or plexiform leiomyoma. These findings suggest that intravenous leiomyomatosis cases share some molecular cytogenetic characteristics with uterine leiomyoma, and expression profiles similar to that of leiomyosarcoma cases, further supporting their intermediate, quasi-malignant behavior. Modern Pathology advance online publication, 19 February 2016; doi:10.1038/modpathol.2016.36.

Ceccarelli M, Barthel FP, Malta TM, Sabedot TS, Salama SR, Murray BA, Morozova O, Newton Y, Radenbaugh A, Pagnotta SM, Anjum S, Wang J, Manyam G, Zoppoli P, Ling S, Rao AA, Grifford M, Cherniack AD, Zhang H, Poisson L, Carlotti CG, da Tirapelli DPC, Rao A, Mikkelsen T, Lau CC, Yung AWK, Rabadan R, Huse J, Brat DJ, Lehman NL, Barnholtz-Sloan JS, Zheng S, Hess K, Rao G, Meyerson M, Beroukhim R, Cooper L, Akbani R, Wrensch M, Haussler D, Aldape KD, Laird PW, Gutmann DH, Noushmehr H, Iavarone A, Verhaak RGW. Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell 2016;164(3):550-63.

Therapy development for adult diffuse glioma is hindered by incomplete knowledge of somatic glioma driving alterations and suboptimal disease classification. We defined the complete set of genes associated with 1,122 diffuse grade II-III-IV gliomas from The Cancer Genome Atlas and used molecular profiles to improve disease classification, identify molecular correlations, and provide insights into the progression from low- to high-grade disease. Whole-genome sequencing data analysis determined that ATRX but not TERT promoter mutations are associated with increased telomere length. Recent advances in glioma classification based on IDH mutation and 1p/19q co-deletion status were recapitulated through analysis of DNA methylation profiles, which identified clinically relevant molecular subsets. A subtype of IDH mutant glioma was associated with DNA demethylation and poor outcome; a group of IDH-wild-type diffuse glioma showed molecular similarity to pilocytic astrocytoma and relatively favorable survival. Understanding of cohesive disease groups may aid improved clinical outcomes.

Chen F, Zhang Y, Şenbabaoğlu Y, Ciriello G, Yang L, Reznik E, Shuch B, Micevic G, De Velasco G, Shinbrot E, Noble MS, Lu Y, Covington KR, Xi L, Drummond JA, Muzny D, Kang H, Lee J, Tamboli P, Reuter V, Shelley CS, Kaipparettu BA, Bottaro DP, Godwin AK, Gibbs RA, Getz G, Kucherlapati R, Park PJ, Sander C, Henske EP, Zhou JH, Kwiatkowski DJ, Ho TH, Choueiri TK, Hsieh JJ, Akbani R, Mills GB, Hakimi AA, Wheeler DA, Creighton CJ. Multilevel Genomics-Based Taxonomy of Renal Cell Carcinoma. Cell Rep 2016.

On the basis of multidimensional and comprehensive molecular characterization (including DNA methylation and copy number, RNA, and protein expression), we classified 894 renal cell carcinomas (RCCs) of various histologic types into nine major genomic subtypes. Site of origin within the nephron was one major determinant in the classification, reflecting differences among clear cell, chromophobe, and papillary RCC. Widespread molecular changes associated with TFE3 gene fusion or chromatin modifier genes were present within a specific subtype and spanned multiple subtypes. Differences in patient survival and in alteration of specific pathways (including hypoxia, metabolism, MAP kinase, NRF2-ARE, Hippo, immune checkpoint, and PI3K/AKT/mTOR) could further distinguish the subtypes. Immune checkpoint markers and molecular signatures of T cell infiltrates were both highest in the subtype associated with aggressive clear cell RCC. Differences between the genomic subtypes suggest that therapeutic strategies could be tailored to each RCC disease subset.

2015
Cheloufi S, Elling U, Hopfgartner B, Jung YL, Murn J, Ninova M, Hubmann M, Badeaux AI, Euong Ang C, Tenen D, Wesche DJ, Abazova N, Hogue M, Tasdemir N, Brumbaugh J, Rathert P, Jude J, Ferrari F, Blanco A, Fellner M, Wenzel D, Zinner M, Vidal SE, Bell O, Stadtfeld M, Chang HY, Almouzni G, Lowe SW, Rinn J, Wernig M, Aravin A, Shi Y, Park PJ, Penninger JM, Zuber J, Hochedlinger K. The histone chaperone CAF-1 safeguards somatic cell identity. Nature 2015;528(7581):218-24.

Cellular differentiation involves profound remodelling of chromatin landscapes, yet the mechanisms by which somatic cell identity is subsequently maintained remain incompletely understood. To further elucidate regulatory pathways that safeguard the somatic state, we performed two comprehensive RNA interference (RNAi) screens targeting chromatin factors during transcription-factor-mediated reprogramming of mouse fibroblasts to induced pluripotent stem cells (iPS cells). Subunits of the chromatin assembly factor-1 (CAF-1) complex, including Chaf1a and Chaf1b, emerged as the most prominent hits from both screens, followed by modulators of lysine sumoylation and heterochromatin maintenance. Optimal modulation of both CAF-1 and transcription factor levels increased reprogramming efficiency by several orders of magnitude and facilitated iPS cell formation in as little as 4 days. Mechanistically, CAF-1 suppression led to a more accessible chromatin structure at enhancer elements early during reprogramming. These changes were accompanied by a decrease in somatic heterochromatin domains, increased binding of Sox2 to pluripotency-specific targets and activation of associated genes. Notably, suppression of CAF-1 also enhanced the direct conversion of B cells into macrophages and fibroblasts into neurons. Together, our findings reveal the histone chaperone CAF-1 to be a novel regulator of somatic cell identity during transcription-factor-induced cell-fate transitions and provide a potential strategy to modulate cellular plasticity in a regenerative setting.

Bersani F, Lee E, Kharchenko PV, Xu AW, Liu M, Xega K, MacKenzie OC, Brannigan BW, Wittner BS, Jung H, Ramaswamy S, Park PJ, Maheswaran S, Ting DT, Haber DA. Pericentromeric satellite repeat expansions through RNA-derived DNA intermediates in cancer. Proc Natl Acad Sci U S A 2015;112(49):15148-53.

Aberrant transcription of the pericentromeric human satellite II (HSATII) repeat is present in a wide variety of epithelial cancers. In deriving experimental systems to study its deregulation, we observed that HSATII expression is induced in colon cancer cells cultured as xenografts or under nonadherent conditions in vitro, but it is rapidly lost in standard 2D cultures. Unexpectedly, physiological induction of endogenous HSATII RNA, as well as introduction of synthetic HSATII transcripts, generated cDNA intermediates in the form of DNA/RNA hybrids. Single molecule sequencing of tumor xenografts showed that HSATII RNA-derived DNA (rdDNA) molecules are stably incorporated within pericentromeric loci. Suppression of RT activity using small molecule inhibitors reduced HSATII copy gain. Analysis of whole-genome sequencing data revealed that HSATII copy number gain is a common feature in primary human colon tumors and is associated with a lower overall survival. Together, our observations suggest that cancer-associated derepression of specific repetitive sequences can promote their RNA-driven genomic expansion, with potential implications on pericentromeric architecture.

Cancer Genome Atlas Research Network TCGA. The Molecular Taxonomy of Primary Prostate Cancer. Cell 2015;163(4):1011-25.

There is substantial heterogeneity among primary prostate cancers, evident in the spectrum of molecular abnormalities and its variable clinical course. As part of The Cancer Genome Atlas (TCGA), we present a comprehensive molecular analysis of 333 primary prostate carcinomas. Our results revealed a molecular taxonomy in which 74% of these tumors fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, and FLI1) or mutations (SPOP, FOXA1, and IDH1). Epigenetic profiles showed substantial heterogeneity, including an IDH1 mutant subset with a methylator phenotype. Androgen receptor (AR) activity varied widely and in a subtype-specific manner, with SPOP and FOXA1 mutant tumors having the highest levels of AR-induced transcripts. 25% of the prostate cancers had a presumed actionable lesion in the PI3K or MAPK signaling pathways, and DNA repair genes were inactivated in 19%. Our analysis reveals molecular heterogeneity among primary prostate cancers, as well as potentially actionable molecular defects.

Choi J*, Lee S*, Mallard W, Clement K, Tagliazucchi GM, Lim H, Choi IY, Ferrari F, Tsankov AM, Pop R, Lee G, Rinn JL, Meissner A, Park PJ**, Hochedlinger K**. A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. Nat Biotechnol 2015;33(11):1173-81.

The equivalence of human induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs) remains controversial. Here we use genetically matched hESC and hiPSC lines to assess the contribution of cellular origin (hESC vs. hiPSC), the Sendai virus (SeV) reprogramming method and genetic background to transcriptional and DNA methylation patterns while controlling for cell line clonality and sex. We find that transcriptional and epigenetic variation originating from genetic background dominates over variation due to cellular origin or SeV infection. Moreover, the 49 differentially expressed genes we detect between genetically matched hESCs and hiPSCs neither predict functional outcome nor distinguish an independently derived, larger set of unmatched hESC and hiPSC lines. We conclude that hESCs and hiPSCs are molecularly and functionally equivalent and cannot be distinguished by a consistent gene expression signature. Our data further imply that genetic background variation is a major confounding factor for transcriptional and epigenetic comparisons of pluripotent cell lines, explaining some of the previously observed differences between genetically unmatched hESCs and hiPSCs.

Jung H, Lee D, Lee J, Park D, Kim YJ, Park W-Y, Hong D**, Park PJ**, Lee E**. Intron retention is a widespread mechanism of tumor-suppressor inactivation. Nat Genet 2015;47(11):1242-8.

A substantial fraction of disease-causing mutations are pathogenic through aberrant splicing. Although genome profiling studies have identified somatic single-nucleotide variants (SNVs) in cancer, the extent to which these variants trigger abnormal splicing has not been systematically examined. Here we analyzed RNA sequencing and exome data from 1,812 patients with cancer and identified ∼900 somatic exonic SNVs that disrupt splicing. At least 163 SNVs, including 31 synonymous ones, were shown to cause intron retention or exon skipping in an allele-specific manner, with ∼70% of the SNVs occurring on the last base of exons. Notably, SNVs causing intron retention were enriched in tumor suppressors, and 97% of these SNVs generated a premature termination codon, leading to loss of function through nonsense-mediated decay or truncated protein. We also characterized the genomic features predictive of such splicing defects. Overall, this work demonstrates that intron retention is a common mechanism of tumor-suppressor inactivation.

Lodato MA*, Woodworth MB*, Lee S*, Evrony GD, Mehta BK, Karger A, Lee S, Chittenden TW, D'Gama AM, Cai X, Luquette LJ, Lee E, Park PJ**, Walsh CA**. Somatic mutation in single human neurons tracks developmental and transcriptional history. Science 2015;350(6256):94-8.

Neurons live for decades in a postmitotic state, their genomes susceptible to DNA damage. Here we survey the landscape of somatic single-nucleotide variants (SNVs) in the human brain. We identified thousands of somatic SNVs by single-cell sequencing of 36 neurons from the cerebral cortex of three normal individuals. Unlike germline and cancer SNVs, which are often caused by errors in DNA replication, neuronal mutations appear to reflect damage during active transcription. Somatic mutations create nested lineage trees, allowing them to be dated relative to developmental landmarks and revealing a polyclonal architecture of the human cerebral cortex. Thus, somatic mutations in the brain represent a durable and ongoing record of neuronal life history, from development through postmitotic function.

Cancer Genome Atlas Research Network TCGA. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med 2015;372(26):2481-98.

BACKGROUND: Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas.

METHODS: We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes.

RESULTS: Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma.

CONCLUSIONS: The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q codeletion or carried a TP53 mutation. Most lower-grade gliomas without an IDH mutation were molecularly and clinically similar to glioblastoma. (Funded by the National Institutes of Health.)

Cancer Genome Atlas Network TCGA. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 2015;517(7536):576-82.

The Cancer Genome Atlas profiled 279 head and neck squamous cell carcinomas (HNSCCs) to provide a comprehensive landscape of somatic genomic alterations. Here we show that human-papillomavirus-associated tumours are dominated by helical domain mutations of the oncogene PIK3CA, novel alterations involving loss of TRAF3, and amplification of the cell cycle gene E2F1. Smoking-related HNSCCs demonstrate near universal loss-of-function TP53 mutations and CDKN2A inactivation with frequent copy number alterations including amplification of 3q26/28 and 11q13/22. A subgroup of oral cavity tumours with favourable clinical outcomes displayed infrequent copy number alterations in conjunction with activating mutations of HRAS or PIK3CA, coupled with inactivating mutations of CASP8, NOTCH1 and TP53. Other distinct subgroups contained loss-of-function alterations of the chromatin modifier NSD1, WNT pathway genes AJUBA and FAT1, and activation of oxidative stress factor NFE2L2, mainly in laryngeal tumours. Therapeutic candidate alterations were identified in most HNSCCs.

Cancer Genome Atlas Network TCGA. Genomic Classification of Cutaneous Melanoma. Cell 2015;161(7):1681-96.

We describe the landscape of genomic alterations in cutaneous melanomas through DNA, RNA, and protein-based analysis of 333 primary and/or metastatic melanomas from 331 patients. We establish a framework for genomic classification into one of four subtypes based on the pattern of the most prevalent significantly mutated genes: mutant BRAF, mutant RAS, mutant NF1, and Triple-WT (wild-type). Integrative analysis reveals enrichment of KIT mutations and focal amplifications and complex structural rearrangements as a feature of the Triple-WT subtype. We found no significant outcome correlation with genomic classification, but samples assigned a transcriptomic subclass enriched for immune gene expression associated with lymphocyte infiltrate on pathology review and high LCK protein expression, a T cell marker, were associated with improved patient survival. This clinicopathological and multi-dimensional analysis suggests that the prognosis of melanoma patients with regional metastases is influenced by tumor stroma immunobiology, offering insights to further personalize therapeutic decision-making.

De Los Angeles A, Ferrari F, Xi R, Fujiwara Y, Benvenisty N, Deng H, Hochedlinger K, Jaenisch R, Lee S, Leitch HG, Lensch WM, Lujan E, Pei D, Rossant J, Wernig M, Park PJ, Daley GQ. Hallmarks of pluripotency. Nature 2015;525(7570):469-78.

Stem cells self-renew and generate specialized progeny through differentiation, but vary in the range of cells and tissues they generate, a property called developmental potency. Pluripotent stem cells produce all cells of an organism, while multipotent or unipotent stem cells regenerate only specific lineages or tissues. Defining stem-cell potency relies upon functional assays and diagnostic transcriptional, epigenetic and metabolic states. Here we describe functional and molecular hallmarks of pluripotent stem cells, propose a checklist for their evaluation, and illustrate how forensic genomics can validate their provenance.

Lee S*, Seo CH*, Alver BH, Hyuk Lee S, Park PJ. EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering. BMC Bioinformatics 2015;16(1):278.

BACKGROUND: RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost.

RESULTS: We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods.

CONCLUSIONS: EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost. EMSAR is available at https://github.com/parklab/emsar.

Kang H, McElroy KA, Jung YL, Alekseyenko AA, Zee BM, Park PJ, Kuroda MI. Sex comb on midleg (Scm) is a functional link between PcG-repressive complexes in Drosophila. Genes Dev 2015;29(11):1136-50.

The Polycomb group (PcG) proteins are key regulators of development in Drosophila and are strongly implicated in human health and disease. How PcG complexes form repressive chromatin domains remains unclear. Using cross-linked affinity purifications of BioTAP-Polycomb (Pc) or BioTAP-Enhancer of zeste [E(z)], we captured all PcG-repressive complex 1 (PRC1) or PRC2 core components and Sex comb on midleg (Scm) as the only protein strongly enriched with both complexes. Although previously not linked to PRC2, we confirmed direct binding of Scm and PRC2 using recombinant protein expression and colocalization of Scm with PRC1, PRC2, and H3K27me3 in embryos and cultured cells using ChIP-seq (chromatin immunoprecipitation [ChIP] combined with deep sequencing). Furthermore, we found that RNAi knockdown of Scm and overexpression of the dominant-negative Scm-SAM (sterile α motif) domain both affected the binding pattern of E(z) on polytene chromosomes. Aberrant localization of the Scm-SAM domain in long contiguous regions on polytene chromosomes revealed its independent ability to spread on chromatin, consistent with its previously described ability to oligomerize in vitro. Pull-downs of BioTAP-Scm captured PRC1 and PRC2 and additional repressive complexes, including PhoRC, LINT, and CtBP. We propose that Scm is a key mediator connecting PRC1, PRC2, and transcriptional silencing. Combined with previous structural and genetic analyses, our results strongly suggest that Scm coordinates PcG complexes and polymerizes to produce broad domains of PcG silencing.

Kim J, Lee I-H, Cho HJ, Park C-K, Jung Y-S, Kim Y, Nam SH, Kim BS, Johnson MD, Kong D-S, Seol HJ, Lee J-I, Joo KM, Yoon Y, Park W-Y, Lee J, Park PJ**, Nam D-H**. Spatiotemporal Evolution of the Primary Glioblastoma Genome. Cancer Cell 2015;28(3):318-28.

Tumor recurrence following treatment is the major cause of mortality for glioblastoma multiforme (GBM) patients. Thus, insights on the evolutionary process at recurrence are critical for improved patient care. Here, we describe our genomic analyses of the initial and recurrent tumor specimens from each of 38 GBM patients. A substantial divergence in the landscape of driver alterations was associated with distant appearance of a recurrent tumor from the initial tumor, suggesting that the genomic profile of the initial tumor can mislead targeted therapies for the distally recurred tumor. In addition, in contrast to IDH1-mutated gliomas, IDH1-wild-type primary GBMs rarely developed hypermutation following temozolomide (TMZ) treatment, indicating low risk for TMZ-induced hypermutation for these tumors under the standard regimen.

Evrony GD*, Lee E*, Mehta BK, Benjamini Y, Johnson RM, Cai X, Yang L, Haseley P, Lehmann HS, Park PJ**, Walsh CA**. Cell lineage analysis in human brain using endogenous retroelements. Neuron 2015;85(1):49-59.

Somatic mutations occur during brain development and are increasingly implicated as a cause of neurogenetic disease. However, the patterns in which somatic mutations distribute in the human brain are unknown. We used high-coverage whole-genome sequencing of single neurons from a normal individual to identify spontaneous somatic mutations as clonal marks to track cell lineages in human brain. Somatic mutation analyses in >30 locations throughout the nervous system identified multiple lineages and sublineages of cells marked by different LINE-1 (L1) retrotransposition events and subsequent mutation of poly-A microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, whereas a second distinct clone contained millions of cells distributed over the entire left hemisphere. These patterns mirror known somatic mutation disorders of brain development and suggest that focally distributed mutations are also prevalent in normal brains. Single-cell analysis of somatic mutation enables tracing of cell lineage clones in human brain.

Kann M, Ettou S*, Jung YL*, Lenz MO, Taglienti ME, Park PJ, Schermer B, Benzing T, Kreidberg JA. Genome-Wide Analysis of Wilms' Tumor 1-Controlled Gene Expression in Podocytes Reveals Key Regulatory Mechanisms. J Am Soc Nephrol 2015;26(9):2097-104.

The transcription factor Wilms' tumor suppressor 1 (WT1) is key to podocyte development and viability; however, WT1 transcriptional networks in podocytes remain elusive. We provide a comprehensive analysis of the genome-wide WT1 transcriptional network in podocytes in vivo using chromatin immunoprecipitation followed by sequencing (ChIPseq) and RNA sequencing techniques. Our data show a specific role for WT1 in regulating the podocyte-specific transcriptome through binding to both promoters and enhancers of target genes. Furthermore, we inferred a podocyte transcription factor network consisting of WT1, LMX1B, TCF21, Fox-class and TEAD family transcription factors, and MAFB that uses tissue-specific enhancers to control podocyte gene expression. In addition to previously described WT1-dependent target genes, ChIPseq identified novel WT1-dependent signaling systems. These targets included components of the Hippo signaling system, underscoring the power of genome-wide transcriptional-network analyses. Together, our data elucidate a comprehensive gene regulatory network in podocytes suggesting that WT1 gene regulatory function and podocyte cell-type specification can best be understood in the context of transcription factor-regulatory element network interplay.

Sohn K-A*, Ho JWK*, Djordjevic D, Jeong H-H, Park PJ**, Kim JH**. hiHMM: Bayesian non-parametric joint inference of chromatin state maps. Bioinformatics 2015;31(13):2066-74.

MOTIVATION: Genome-wide mapping of chromatin states is essential for defining regulatory elements and inferring their activities in eukaryotic genomes. A number of hidden Markov model (HMM)-based methods have been developed to infer chromatin state maps from genome-wide histone modification data for an individual genome. To perform a principled comparison of evolutionarily distant epigenomes, we must consider species-specific biases such as differences in genome size, strength of signal enrichment and co-occurrence patterns of histone modifications.

RESULTS: Here, we present a new Bayesian non-parametric method called hierarchically linked infinite HMM (hiHMM) to jointly infer chromatin state maps in multiple genomes (different species, cell types and developmental stages) using genome-wide histone modification data. This flexible framework provides a new way to learn a consistent definition of chromatin states across multiple genomes, thus facilitating a direct comparison among them. We demonstrate the utility of this method using synthetic data as well as multiple modENCODE ChIP-seq datasets.

CONCLUSION: The hierarchical and Bayesian non-parametric formulation in our approach is an important extension to the current set of methodologies for comparative chromatin landscape analysis.

AVAILABILITY AND IMPLEMENTATION: Source codes are available at https://github.com/kasohn/hiHMM. Chromatin data are available at http://encode-x.med.harvard.edu/data_sets/chromatin/.

Biagioli M*, Ferrari F*, Mendenhall EM, Zhang Y, Erdin S, Vijayvargia R, Vallabh SM, Solomos N, Manavalan P, Ragavendran A, Ozsolak F, Lee JM, Talkowski ME, Gusella JF, Macdonald ME, Park PJ, Seong IS. Htt CAG repeat expansion confers pleiotropic gains of mutant huntingtin function in chromatin regulation. Hum Mol Genet 2015.

The CAG repeat expansion in the Huntington's disease gene HTT extends a polyglutamine tract in mutant huntingtin that enhances its ability to facilitate polycomb repressive complex 2 (PRC2). To gain insight into this dominant gain of function, we mapped histone modifications genome-wide across an isogenic panel of mouse embryonic stem cell (ESC) and neuronal progenitor cell (NPC) lines, comparing the effects of Htt null and different size Htt CAG mutations. We found that Htt is required in ESC for the proper deposition of histone H3K27me3 at a subset of 'bivalent' loci but in NPC it is needed at 'bivalent' loci for both the proper maintenance and the appropriate removal of this mark. In contrast, Htt CAG size, though changing histone H3K27me3, is prominently associated with altered histone H3K4me3 at 'active' loci. The sets of ESC and NPC genes with altered histone marks delineated by the lack of huntingtin or the presence of mutant huntingtin, though distinct, are enriched in similar pathways with apoptosis specifically highlighted for the CAG mutation. Thus, the manner by which huntingtin function facilitates PRC2 may afford mutant huntingtin with multiple opportunities to impinge upon the broader machinery that orchestrates developmentally appropriate chromatin status.
