Publications

2011
Yoon SS, Duda DG, Karl DL, Kim T-M, Kambadakone AR, Chen Y-L, Rothrock C, Rosenberg AE, Nielsen PG, Kirsch DG, Choy E, Harmon DC, Hornicek FJ, Dreyfuss JM, Ancukiewicz M, Sahani DV, Park PJ, Jain RK, Delaney TF. Phase II study of neoadjuvant bevacizumab and radiotherapy for resectable soft tissue sarcomas. Int J Radiat Oncol Biol Phys 2011;81(4):1081-90.

PURPOSE: Numerous preclinical studies have demonstrated that angiogenesis inhibitors can increase the efficacy of radiotherapy (RT). We sought to examine the safety and efficacy of bevacizumab (BV) and RT in soft tissue sarcomas and explore biomarkers to help determine the treatment response. METHODS AND MATERIALS: Patients with ≥5 cm, intermediate- or high-grade soft tissue sarcomas at significant risk of local recurrence received neoadjuvant BV alone followed by BV plus RT before surgical resection. Correlative science studies included analysis of the serial blood and tumor samples and serial perfusion computed tomography scans. RESULTS: The 20 patients had a median tumor size of 8.25 cm, with 13 extremity, 1 trunk, and 6 retroperitoneal/pelvis tumors. The neoadjuvant treatment was well tolerated, with only 4 patients having Grade 3 toxicities (hypertension, liver function test elevation). BV plus RT resulted in ≥80% pathologic necrosis in 9 (45%) of 20 tumors, more than double the historical rate seen with RT alone. Three patients had a complete pathologic response. The median microvessel density decreased 53% after BV alone (p <.05). After combination therapy, the median tumor cell proliferation decreased by 73%, apoptosis increased 10.4-fold, and the blood flow, blood volume, and permeability surface area decreased by 62-72% (p <.05). Analysis of gene expression microarrays of untreated tumors identified a 24-gene signature for treatment response. The microvessel density and circulating progenitor cells at baseline and the reduction in microvessel density and plasma soluble c-KIT with BV therapy also correlated with a good pathologic response (p <.05). After a median follow-up of 20 months, only 1 patient had developed local recurrence. CONCLUSIONS: The results from the present exploratory study indicated that BV increases the efficacy of RT against soft tissue sarcomas and might reduce the incidence of local recurrence. Thus, this regimen warrants additional investigation. Gene expression profiles and other tissue and circulating biomarkers showed promising correlations with treatment response.

2010
Gorchakov AA, Alekseenko AA, Kharchenko PV, Park P, Kuroda M. [Dosage compensation in Drosophila: sequence-specific initiation and sequence-independent spreading of MSL complex to the active genes on the male X chromosome]. Genetika 2010;46(10):1430-4.

For dosage compensation to occur, genes on the single male X chromosome in Drosophila must be selectively bound and acetylated by the ribonucleoprotein complex called the MSL complex. It remained unknown how such exquisite specificity is achieved, and whether specific DNA sequences were involved. In the present work we demonstrate that it is transcription of the gene on the X chromosome that is important for MSL targeting, irrespective of gene origin and DNA sequence.

Blackledge NP, Zhou JC, Tolstorukov MY, Farcas AM, Park PJ, Klose RJ. CpG islands recruit a histone H3 lysine 36 demethylase. Molecular Cell 2010;38(2):179-90.

In higher eukaryotes, up to 70% of genes have high levels of nonmethylated cytosine/guanine base pairs (CpGs) surrounding promoters and gene regulatory units. These features, called CpG islands, were identified over 20 years ago, but there remains little mechanistic evidence to suggest how these enigmatic elements contribute to promoter function, except that they are refractory to epigenetic silencing by DNA methylation. Here we show that CpG islands directly recruit the H3K36-specific lysine demethylase enzyme KDM2A. Nucleation of KDM2A at these elements results in removal of H3K36 methylation, creating CpG island chromatin that is uniquely depleted of this modification. KDM2A utilizes a zinc finger CxxC (ZF-CxxC) domain that preferentially recognizes nonmethylated CpG DNA, and binding is blocked when the CpG DNA is methylated, thus constraining KDM2A to nonmethylated CpG islands. These data expose a straightforward mechanism through which KDM2A delineates a unique architecture that differentiates CpG island chromatin from bulk chromatin.

Goutagny S, Yang HW, Zucman-Rossi J, Chan J, Dreyfuss JM, Park PJ, Black PM, Giovannini M, Carroll RS, Kalamarides M. Genomic profiling reveals alternative genetic pathways of meningioma malignant progression dependent on the underlying NF2 status. Clin Cancer Res 2010;16(16):4155-64.

PURPOSE: Meningiomas are the most common central nervous system tumors in the population of age 35 and older. WHO defines three grades predictive of the risk of recurrence. Clinical data supporting histologic malignant progression of meningiomas are sparse and underlying molecular mechanisms are not clearly depicted. EXPERIMENTAL DESIGN: We identified genetic alterations associated with histologic progression of 36 paired meningioma samples in 18 patients using 500K SNP genotyping arrays and NF2 gene sequencing. RESULTS: The most frequent chromosome alterations observed in progressing meningioma samples are early alterations (i.e., present both in lower- and higher-grade samples of a single patient). In our series, NF2 gene inactivation was an early and frequent event in progressing meningioma samples (73%). Chromosome alterations acquired during progression from grade I to grade II meningioma were not recurrent. Progression to grade III was characterized by recurrent genomic alterations, the most frequent being CDKN2A/CDKN2B locus loss on 9p. CONCLUSION: Meningiomas displayed different patterns of genetic alterations during progression according to their NF2 status: NF2-mutated meningiomas showed higher chromosome instability during progression than NF2-nonmutated meningiomas, which had very few imbalanced chromosome segments. This pattern of alterations could thus be used as markers in clinical practice to identify tumors prone to progress among grade I meningiomas.

modENCODE Consortium *, Roy S*, Ernst J*, Kharchenko PV*, Kheradpour P*, Negre N*, Eaton ML*, Landolin JM*, Bristow CA*, Ma L*, Lin MF*, Washietl S*, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, Brown CD, Candeias R, Carlson JW, Carr A, Jungreis I, Marbach D, Sealfon R, Tolstorukov MY, Will S, Alekseyenko AA, Artieri C, Booth BW, Brooks AN, Dai Q, Davis CA, Duff MO, Feng X, Gorchakov AA, Gu T, Henikoff JG, Kapranov P, Li R, MacAlpine HK, Malone J, Minoda A, Nordman J, Okamura K, Perry M, Powell SK, Riddle NC, Sakai A, Samsonova A, Sandler JE, Schwartz YB, Sher N, Spokony R, Sturgill D, van Baren M, Wan KH, Yang L, Yu C, Feingold E, Good P, Guyer M, Lowdon R, Ahmad K, Andrews J, Berger B, Brenner SE, Brent MR, Cherbas L, Elgin SCR, Gingeras TR, Grossman R, Hoskins RA, Kaufman TC, Kent W, Kuroda MI, Orr-Weaver T, Perrimon N, Pirrotta V, Posakony JW, Ren B, Russell S, Cherbas P, Graveley BR, Lewis S, Micklem G, Oliver B, Park PJ, Celniker SE**, Henikoff S**, Karpen GH**, Lai EC**, MacAlpine DM**, Stein LD**, White KP**, Kellis M**. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010;330(6012):1787-97.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Jagani Z, Mora-Blanco LE, Sansam CG, McKenna ES, Wilson B, Chen D, Klekota J, Tamayo P, Nguyen PTL, Tolstorukov M, Park PJ, Cho Y-J, Hsiao K, Buonamici S, Pomeroy SL, Mesirov JP, Ruffner H, Bouwmeester T, Luchansky SJ, Murtie J, Kelleher JF, Warmuth M, Sellers WR, Roberts CWM, Dorsch M. Loss of the tumor suppressor Snf5 leads to aberrant activation of the Hedgehog-Gli pathway. Nat Med 2010;16(12):1429-33.

Aberrant activation of the Hedgehog (Hh) pathway can drive tumorigenesis. To investigate the mechanism by which glioma-associated oncogene family zinc finger-1 (GLI1), a crucial effector of Hh signaling, regulates Hh pathway activation, we searched for GLI1-interacting proteins. We report that the chromatin remodeling protein SNF5 (encoded by SMARCB1, hereafter called SNF5), which is inactivated in human malignant rhabdoid tumors (MRTs), interacts with GLI1. We show that Snf5 localizes to Gli1-regulated promoters and that loss of Snf5 leads to activation of the Hh-Gli pathway. Conversely, re-expression of SNF5 in MRT cells represses GLI1. Consistent with this, we show the presence of a Hh-Gli-activated gene expression profile in primary MRTs and show that GLI1 drives the growth of SNF5-deficient MRT cells in vitro and in vivo. Therefore, our studies reveal that SNF5 is a key mediator of Hh signaling and that aberrant activation of GLI1 is a previously undescribed targetable mechanism contributing to the growth of MRT cells.

Balakrishnan A, Stearns AT, Park PJ, Dreyfuss JM, Ashley SW, Rhoads DB, Tavakkolizadeh A. MicroRNA mir-16 is anti-proliferative in enterocytes and exhibits diurnal rhythmicity in intestinal crypts. Exp Cell Res 2010;316(20):3512-21.

BACKGROUND AND AIMS: The intestine exhibits profound diurnal rhythms in function and morphology, in part due to changes in enterocyte proliferation. The regulatory mechanisms behind these rhythms remain largely unknown. We hypothesized that microRNAs are involved in mediating these rhythms, and studied the role of microRNAs specifically in modulating intestinal proliferation. METHODS: Diurnal rhythmicity of microRNAs in rat jejunum was analyzed by microarrays and validated by qPCR. Temporal expression of diurnally rhythmic mir-16 was further quantified in intestinal crypts, villi, and smooth muscle using laser capture microdissection and qPCR. Morphological changes in rat jejunum were assessed by histology and proliferation by immunostaining for bromodeoxyuridine. In IEC-6 cells stably overexpressing mir-16, proliferation was assessed by cell counting and MTS assay, cell cycle progression and apoptosis by flow cytometry, and cell cycle gene expression by qPCR and immunoblotting. RESULTS: mir-16 peaked 6 hours after light onset (HALO 6) with diurnal changes restricted to crypts. Crypt depth and villus height peaked at HALO 13-14 in antiphase to mir-16. Overexpression of mir-16 in IEC-6 cells suppressed specific G1/S regulators (cyclins D1-3, cyclin E1 and cyclin-dependent kinase 6) and produced G1 arrest. Protein expression of these genes exhibited diurnal rhythmicity in rat jejunum, peaking between HALO 11 and 17 in antiphase to mir-16. CONCLUSIONS: This is the first report of circadian rhythmicity of specific microRNAs in rat jejunum. Our data provide a link between anti-proliferative mir-16 and the intestinal proliferation rhythm and point to mir-16 as an important regulator of proliferation in jejunal crypts. This function may be essential to match proliferation and absorptive capacity with nutrient availability.

Peng S, Kuroda MI, Park PJ. Quantized correlation coefficient for measuring reproducibility of ChIP-chip data. BMC Bioinformatics 2010;11:399.

BACKGROUND: Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome scale. To ensure data quality, these experiments are usually performed in replicates, and a correlation coefficient between replicates is often used to assess reproducibility. However, the correlation coefficient can be misleading because it is affected not only by the reproducibility of the signal but also by the amount of binding signal present in the data. RESULTS: We develop the Quantized correlation coefficient (QCC) that is much less dependent on the amount of signal. This involves discretization of data into a set of quantiles (quantization), a merging procedure to group the background probes, and recalculation of the Pearson correlation coefficient. This procedure reduces the influence of the background noise on the statistic, which then properly focuses more on the reproducibility of the signal. The performance of this procedure is tested in both simulated and real ChIP-chip data. For replicates with different levels of enrichment over background and coverage, we find that QCC reflects reproducibility more accurately and is more robust than the standard Pearson or Spearman correlation coefficients. The quantization and the merging procedure can also suggest a proper quantile threshold for separating signal from background for further analysis. CONCLUSIONS: To measure reproducibility of ChIP-chip data correctly, a correlation coefficient that is robust to the amount of signal present should be used. QCC is one such measure. The QCC statistic can also be applied in a variety of other contexts for measuring reproducibility, including analysis of array CGH data for DNA copy number and gene expression data.
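
The three steps named in this abstract (quantization, merging of background probes, and recalculation of the Pearson coefficient) can be illustrated with a minimal Python sketch. It is not the published QCC implementation: the abstract does not specify the merging rule, so the sketch simply assumes that every quantile bin at or below a fixed cutoff is background. The function names, the bin count `n_bins`, and `background_cutoff` are all hypothetical choices made for illustration.

```python
from math import sqrt

def quantize(values, n_bins):
    """Map each value to a quantile bin index (0 .. n_bins-1) by its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins

def merge_background(bins, background_cutoff):
    """Collapse every bin at or below the cutoff into one background level.

    Assumption: a fixed cutoff stands in for the paper's merging procedure.
    """
    return [max(b, background_cutoff) for b in bins]

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)

def quantized_correlation(rep1, rep2, n_bins=10, background_cutoff=8):
    """Pearson correlation of two replicates after quantization and merging."""
    q1 = merge_background(quantize(rep1, n_bins), background_cutoff)
    q2 = merge_background(quantize(rep2, n_bins), background_cutoff)
    return pearson(q1, q2)
```

Because the background probes share one level after merging, their noise no longer contributes to the coefficient, which is the effect the abstract describes.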

Woo CJ, Kharchenko PV, Daheron L, Park PJ, Kingston RE. A region of the human HOXD cluster that confers polycomb-group responsiveness. Cell 2010;140(1):99-110.

Polycomb group (PcG) proteins are essential for accurate axial body patterning during embryonic development. PcG-mediated repression is conserved in metazoans and is targeted in Drosophila by Polycomb response elements (PREs). However, targeting sequences in humans have not been described. While analyzing chromatin architecture in the context of human embryonic stem cell (hESC) differentiation, we discovered a 1.8kb region between HOXD11 and HOXD12 (D11.12) that is associated with PcG proteins, becomes nuclease hypersensitive, and then shows alteration in nuclease sensitivity as hESCs differentiate. The D11.12 element repressed luciferase expression from a reporter construct and full repression required a highly conserved region and YY1 binding sites. Furthermore, repression was dependent on the PcG proteins BMI1 and EED and a YY1-interacting partner, RYBP. We conclude that D11.12 is a Polycomb-dependent regulatory region with similarities to Drosophila PREs, indicating conservation in the mechanisms that target PcG function in mammals and flies.

Tolstorukov MY, Kharchenko PV, Park PJ. Analysis of primary structure of chromatin with next-generation sequencing. Epigenomics 2010;2(2):187-197.

The recent development of next-generation sequencing technology has enabled significant progress in chromatin structure analysis. Here, we review the experimental and bioinformatic approaches to studying nucleosome positioning and histone modification profiles on a genome scale using this technology. These studies advanced our knowledge of the nucleosome positioning patterns of both epigenetically modified and bulk nucleosomes and elucidated the role of such patterns in regulation of gene expression. The identification and analysis of large sets of nucleosome-bound DNA sequences allowed better understanding of the rules that govern nucleosome positioning in organisms of various complexity. We also discuss the existing challenges and prospects of using next-generation sequencing for nucleosome positioning analysis and outline the importance of such studies for the entire chromatin structure field.

Xi R, Kim T-M, Park PJ. Detecting structural variations in the human genome using next generation sequencing. Brief Funct Genomics 2010;9(5-6):405-15.

Structural variations are widespread in the human genome and can serve as genetic markers in clinical and evolutionary studies. With the advances in the next-generation sequencing technology, recent methods allow for identification of structural variations with unprecedented resolution and accuracy. They also provide opportunities to discover variants that could not be detected on conventional microarray-based platforms, such as dosage-invariant chromosomal translocations and inversions. In this review, we will describe some of the sequencing-based algorithms for detection of structural variations and discuss the key issues in future development.

Day DS, Luquette LJ, Park PJ, Kharchenko PV. Estimating enrichment of repetitive elements from high-throughput sequence data. Genome Biol 2010;11(6):R69.

We describe computational methods for analysis of repetitive elements from short-read sequencing data, and apply them to study histone modifications associated with the repetitive elements in human and mouse cells. Our results demonstrate that while accurate enrichment estimates can be obtained for individual repeat types and small sets of repeat instances, there are distinct combinatorial patterns of chromatin marks associated with major annotated repeat families, including H3K27me3/H3K9me3 differences among the endogenous retroviral element classes.

Kim H, Huang W, Jiang X, Pennicooke B, Park PJ, Johnson MD. Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship. Proc Natl Acad Sci U S A 2010;107(5):2183-8.

Using a multidimensional genomic data set on glioblastoma from The Cancer Genome Atlas, we identified hsa-miR-26a as a cooperating component of a frequently occurring amplicon that also contains CDK4 and CENTG1, two oncogenes that regulate the RB1 and PI3 kinase/AKT pathways, respectively. By integrating DNA copy number, mRNA, microRNA, and DNA methylation data, we identified functionally relevant targets of miR-26a in glioblastoma, including PTEN, RB1, and MAP3K2/MEKK2. We demonstrate that miR-26a alone can transform cells and it promotes glioblastoma cell growth in vitro and in the mouse brain by decreasing PTEN, RB1, and MAP3K2/MEKK2 protein expression, thereby increasing AKT activation, promoting proliferation, and decreasing c-JUN N-terminal kinase-dependent apoptosis. Overexpression of miR-26a in PTEN-competent and PTEN-deficient glioblastoma cells promoted tumor growth in vivo, and it further increased growth in cells overexpressing CDK4 or CENTG1. Importantly, glioblastoma patients harboring this amplification displayed markedly decreased survival. Thus, hsa-miR-26a, CDK4, and CENTG1 comprise a functionally integrated oncomir/oncogene DNA cluster that promotes aggressiveness in human cancers by cooperatively targeting the RB1, PI3K/AKT, and JNK pathways.

Gurumurthy S, Xie SZ, Alagesan B, Kim J, Yusuf RZ, Saez B, Tzatsos A, Ozsolak F, Milos P, Ferrari F, Park PJ, Shirihai OS, Scadden DT, Bardeesy N. The Lkb1 metabolic sensor maintains haematopoietic stem cell survival. Nature 2010;468(7324):659-63.

Haematopoietic stem cells (HSCs) can convert between growth states that have marked differences in bioenergetic needs. Although often quiescent in adults, these cells become proliferative upon physiological demand. Balancing HSC energetics in response to nutrient availability and growth state is poorly understood, yet essential for the dynamism of the haematopoietic system. Here we show that the Lkb1 tumour suppressor is critical for the maintenance of energy homeostasis in haematopoietic cells. Lkb1 inactivation in adult mice causes loss of HSC quiescence followed by rapid depletion of all haematopoietic subpopulations. Lkb1-deficient bone marrow cells exhibit mitochondrial defects, alterations in lipid and nucleotide metabolism, and depletion of cellular ATP. The haematopoietic effects are largely independent of Lkb1 regulation of AMP-activated protein kinase (AMPK) and mammalian target of rapamycin (mTOR) signalling. Instead, these data define a central role for Lkb1 in restricting HSC entry into cell cycle and in broadly maintaining energy homeostasis in haematopoietic cells through a novel metabolic checkpoint.

Kim T-M, Luquette LJ, Xi R, Park PJ. rSW-seq: algorithm for detection of copy number alterations in deep sequencing data. BMC Bioinformatics 2010;11:432.

BACKGROUND: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy. RESULTS: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results. CONCLUSION: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
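
As a rough illustration of the idea in this abstract, the Python sketch below scans reads sorted by genomic position, scores each tumor read positively and each normal read negatively, and reports the highest-scoring contiguous run, which is the one-dimensional analogue of a Smith-Waterman local score. It is a simplified stand-in, not the rSW-seq code: the `"T"`/`"N"` labels, the unit weights, and both function names are assumptions, and the recursive refinement for complex configurations is omitted. The second function reproduces the abstract's arithmetic relating read count, read length, and coverage to region size.

```python
def max_scoring_segment(labels, tumor_weight=1.0, normal_weight=1.0):
    """Find the contiguous run of reads with the highest cumulative score.

    labels: sequence of "T" (tumor read) or "N" (normal read), sorted by
    genomic position. Returns (score, start, end) with end exclusive.
    The running score is reset whenever it drops to zero or below, as in
    a Smith-Waterman local alignment recurrence.
    """
    best_score, best_start, best_end = 0.0, 0, 0
    running, run_start = 0.0, 0
    for i, label in enumerate(labels):
        running += tumor_weight if label == "T" else -normal_weight
        if running <= 0:
            running, run_start = 0.0, i + 1
        elif running > best_score:
            best_score, best_start, best_end = running, run_start, i + 1
    return best_score, best_start, best_end

def region_size_bp(n_reads, read_length_bp, coverage):
    """Approximate genomic span covered by n_reads at a given coverage."""
    return n_reads * read_length_bp / coverage
```

A run enriched for tumor reads suggests a copy number gain; swapping the sign convention would flag losses in the same way. With the figures quoted in the abstract, `region_size_bp(500, 100, 1)` gives 50,000 bp, the ~50 kb region mentioned.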

2009
Gelbart ME, Larschan E, Peng S, Park PJ, Kuroda MI. Drosophila MSL complex globally acetylates H4K16 on the male X chromosome for dosage compensation. Nat Struct Mol Biol 2009;16(8):825-32.

The Drosophila melanogaster male-specific lethal (MSL) complex binds the single male X chromosome to upregulate gene expression to equal that from the two female X chromosomes. However, it has been puzzling that approximately 25% of transcribed genes on the X chromosome do not stably recruit MSL complex. Here we find that almost all active genes on the X chromosome are associated with robust H4 Lys16 acetylation (H4K16ac), the histone modification catalyzed by the MSL complex. The distribution of H4K16ac is much broader than that of the MSL complex, and our results favor the idea that chromosome-wide H4K16ac reflects transient association of the MSL complex, occurring through spreading or chromosomal looping. Our results parallel those of localized Polycomb repressive complex and its more broadly distributed chromatin mark, trimethylated histone H3 Lys27 (H3K27me3), suggesting a common principle for the establishment of active and silenced chromatin domains.

Hodge JC, Park PJ, Dreyfuss JM, Assil-Kishawi I, Somasundaram P, Semere LG, Quade BJ, Lynch AM, Stewart EA, Morton CC. Identifying the molecular signature of the interstitial deletion 7q subgroup of uterine leiomyomata using a paired analysis. Genes Chromosomes Cancer 2009;48(10):865-85.

Uterine leiomyomata (UL), the most common neoplasm in reproductive-age women, have recurrent cytogenetic abnormalities including interstitial deletion of 7q. To develop a molecular signature, matched del(7q) and non-del(7q) tumors identified by FISH or karyotyping from 11 women were profiled with expression arrays. Our analysis using paired t tests demonstrates this matched design is critical to eliminate the confounding effects of genotype and environment that underlie patient variation. A gene list ordered by genome-wide significance showed enrichment for the 7q22 target region. Modification of the gene list by weighting each sample for percent of del(7q) cells to account for the mosaic nature of these tumors further enhanced the frequency of 7q22 genes. Pathway analysis revealed two of the 19 significant functional networks were associated with development and the most represented pathway was protein ubiquitination, which can influence tumor development by stabilizing oncoproteins and destabilizing tumor suppressor proteins. Array CGH (aCGH) studies determined the only consistent genomic imbalance was deletion of 9.5 megabases from 7q22-7q31.1. Combining the aCGH data with the del(7q) UL mosaicism-weighted expression analysis resulted in a list of genes that are commonly deleted and whose copy number is correlated with significantly decreased expression. These genes include the proliferation inhibitor HPB1, the loss of expression of which has been associated with invasive breast cancer, as well as the mitosis integrity-maintenance tumor suppressor RINT1. This study provides a molecular signature of the del(7q) UL subgroup and will serve as a platform for future studies of tumor pathogenesis.

Gorchakov AA, Alekseyenko AA, Kharchenko P, Park PJ, Kuroda MI. Long-range spreading of dosage compensation in Drosophila captures transcribed autosomal genes inserted on X. Genes Dev 2009;23(19):2266-71.

Dosage compensation in Drosophila melanogaster males is achieved via targeting of male-specific lethal (MSL) complex to X-linked genes. This is proposed to involve sequence-specific recognition of the X at approximately 150-300 chromatin entry sites, and subsequent spreading to active genes. Here we ask whether the spreading step requires transcription and is sequence-independent. We find that MSL complex binds, acetylates, and up-regulates autosomal genes inserted on X, but only if transcriptionally active. We conclude that a long-sought specific DNA sequence within X-linked genes is not obligatory for MSL binding. Instead, linkage and transcription play the pivotal roles in MSL targeting irrespective of gene origin and DNA sequence.

Dreyfuss JM, Johnson MD, Park PJ. Meta-analysis of glioblastoma multiforme versus anaplastic astrocytoma identifies robust gene markers. Mol Cancer 2009;8:71.

BACKGROUND: Anaplastic astrocytoma (AA) and its more aggressive counterpart, glioblastoma multiforme (GBM), are the most common intrinsic brain tumors in adults and are almost universally fatal. A deeper understanding of the molecular relationship of these tumor types is necessary to derive insights into the diagnosis, prognosis, and treatment of gliomas. Although genomewide profiling of expression levels with microarrays can be used to identify differentially expressed genes between these tumor types, comparative studies so far have resulted in gene lists that show little overlap. RESULTS: To achieve a more accurate and stable list of the differentially expressed genes and pathways between primary GBM and AA, we performed a meta-analysis using publicly available genome-scale mRNA data sets. There were four data sets with sufficiently large sample sizes of both GBMs and AAs, all of which coincidentally used human U133 platforms from Affymetrix, allowing for easier and more precise integration of data. After scoring genes and pathways within each data set, we combined the statistics across studies using the nonparametric rank sum method to identify the features that differentiate GBMs and AAs. We found >900 statistically significant probe sets after correction for multiple testing from the >22,000 tested. We also used the rank sum approach to select >20 significant Biocarta pathways after correction for multiple testing out of >175 pathways examined. The most significant pathway was the hypoxia-inducible factor (HIF) pathway. Our analysis suggests that many of the most statistically significant genes work together in a HIF1A/VEGF-regulated network to increase angiogenesis and invasion in GBM when compared to AA. CONCLUSION: We have performed a meta-analysis of genome-scale mRNA expression data for 289 human malignant gliomas and have identified a list of >900 probe sets and >20 pathways that are significantly different between GBM and AA. These feature lists could be utilized to aid in diagnosis, prognosis, and grade reduction of high-grade gliomas and to identify genes that were not previously suspected of playing an important role in glioma biology. More generally, this approach suggests that combined analysis of existing data sets can reveal new insights and that the large amount of publicly available cancer data sets should be further utilized in a similar manner.

Park PJ, Manjourides J, Bonetti M, Pagano M. A permutation test for determining significance of clusters with applications to spatial and gene expression data. Comput Stat Data Anal 2009;53(12):4290-4300.

Hierarchical clustering is a common procedure for identifying structure in a data set, and this is frequently used for organizing genomic data. Although more advanced clustering algorithms are available, the simplicity and visual appeal of hierarchical clustering has made it ubiquitous in gene expression data analysis. Hence, even minor improvements in this framework would have significant impact. There is currently no simple and systematic way of assessing and displaying the significance of various clusters in a resulting dendrogram without making certain distributional assumptions or ignoring gene-specific variances. In this work, we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division. Given these values, one can also modify the dendrogram by combining non-significant branches. By adjusting the cut-off level of significance for branches, one can produce dendrograms with a desired level of detail for ease of interpretation. We demonstrate the usefulness of this approach by applying it to illustrative data sets.
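
The general shape of such a test at a single dendrogram node can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper's statistic is derived from a singular value decomposition of variance matrices, whereas the sketch substitutes a simpler within-cluster sum of squares, and the function names, `n_perm`, and `seed` are hypothetical.

```python
import random

def within_ss(points, labels):
    """Total squared distance of each point to the centroid of its cluster."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum((x - m) ** 2 for p in members for x, m in zip(p, centroid))
    return total

def split_p_value(points, labels, n_perm=199, seed=0):
    """Permutation p-value for one cluster split.

    Cluster membership is shuffled n_perm times; the p-value is the
    fraction of shuffles whose within-cluster sum of squares is at
    least as small as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = within_ss(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if within_ss(points, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Applying `split_p_value` at each node and merging branches whose p-value exceeds a chosen cut-off would give the kind of simplified dendrogram the abstract describes.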
