Publications

2013
Gehlenborg N, Noble MS, Getz G, Chin L, Park PJ. Nozzle: a report generation toolkit for data analysis pipelines. Bioinformatics 2013;29(8):1089-91.

SUMMARY: We have developed Nozzle, an R package that provides an Application Programming Interface to generate HTML reports with dynamic user interface elements. Nozzle was designed to facilitate summarization and rapid browsing of complex results in data analysis pipelines where multiple analyses are performed frequently on big datasets. The package can be applied to any project where user-friendly reports need to be created. AVAILABILITY: The R package is available on CRAN at http://cran.r-project.org/package=Nozzle.R1. Examples and additional materials are available at http://gdac.broadinstitute.org/nozzle. The source code is also available at http://www.github.com/parklab/Nozzle. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

DeGennaro CM, Alver BH, Marguerat S, Stepanova E, Davis CP, Bähler J, Park PJ, Winston F. Spt6 regulates intragenic and antisense transcription, nucleosome positioning, and histone modifications genome-wide in fission yeast. Mol Cell Biol 2013;33(24):4779-92.

Spt6 is a highly conserved histone chaperone that interacts directly with both RNA polymerase II and histones to regulate gene expression. To gain a comprehensive understanding of the roles of Spt6, we performed genome-wide analyses of transcription, chromatin structure, and histone modifications in a Schizosaccharomyces pombe spt6 mutant. Our results demonstrate dramatic changes to transcription and chromatin structure in the mutant, including elevated antisense transcripts at >70% of all genes and general loss of the +1 nucleosome. Furthermore, Spt6 is required for marks associated with active transcription, including trimethylation of histone H3 on lysine 4, previously observed in humans but not Saccharomyces cerevisiae, and lysine 36. Taken together, our results indicate that Spt6 is critical for the accuracy of transcription and the integrity of chromatin, likely via its direct interactions with RNA polymerase II and histones.

Tolstorukov MY*, Sansam CG*, Lu P*, Koellhoffer EC, Helming KC, Alver BH, Tillman EJ, Evans JA, Wilson BG, Park PJ**, Roberts CWM**. Swi/Snf chromatin remodeling/tumor suppressor complex establishes nucleosome occupancy at target promoters. Proc Natl Acad Sci U S A 2013;110(25):10165-70.

Precise nucleosome-positioning patterns at promoters are thought to be crucial for faithful transcriptional regulation. However, the mechanisms by which these patterns are established, are dynamically maintained, and subsequently contribute to transcriptional control are poorly understood. The switch/sucrose non-fermentable chromatin remodeling complex, also known as the Brg1 associated factors complex, is a master developmental regulator and tumor suppressor capable of mobilizing nucleosomes in biochemical assays. However, its role in establishing the nucleosome landscape in vivo is unclear. Here we have inactivated Snf5 and Brg1, core subunits of the mammalian Swi/Snf complex, to evaluate their effects on chromatin structure and transcription levels genomewide. We find that inactivation of either subunit leads to disruptions of specific nucleosome patterning combined with a loss of overall nucleosome occupancy at a large number of promoters, regardless of their association with CpG islands. These rearrangements are accompanied by gene expression changes that promote cell proliferation. Collectively, these findings define a direct relationship between chromatin-remodeling complexes, chromatin structure, and transcriptional regulation.

2012
Ho JWK, Alekseyenko AA, Kuroda MI, Park PJ. Genome-wide mapping of protein-DNA interactions by ChIP-seq [Internet]. In: Harbers M, Kahl G, editors. Tag-Based Next Generation Sequencing. Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA; 2012.
Lex A, Streit M, Schulz H-J, Partl C, Schmalstieg D, Park PJ, Gehlenborg N. StratomeX: Visual analysis of large-scale heterogeneous genomics data for cancer subtype characterization [Internet]. Computer Graphics Forum 2012;31(3):1175-1184.
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489(7414):57-74.

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, Chen Y, DeSalvo G, Epstein C, Fisher-Aylor KI, Euskirchen G, Gerstein M, Gertz J, Hartemink AJ, Hoffman MM, Iyer VR, Jung YL, Karmakar S, Kellis M, Kharchenko PV, Li Q, Liu T, Liu SX, Ma L, Milosavljevic A, Myers RM, Park PJ, Pazin MJ, Perry MD, Raha D, Reddy TE, Rozowsky J, Shoresh N, Sidow A, Slattery M, Stamatoyannopoulos JA, Tolstorukov MY, White KP, Xi S, Farnham PJ, Lieb JD, Wold BJ, Snyder M. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res 2012;22(9):1813-31.

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

Cancer Genome Atlas Research Network TCGA. Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012;489(7417):519-25.

Lung squamous cell carcinoma is a common type of lung cancer, causing approximately 400,000 deaths per year worldwide. Genomic alterations in squamous cell lung cancers have not been comprehensively characterized, and no molecularly targeted agents have been specifically developed for its treatment. As part of The Cancer Genome Atlas, here we profile 178 lung squamous cell carcinomas to provide a comprehensive landscape of genomic and epigenomic alterations. We show that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour. We find statistically recurrent mutations in 11 genes, including mutation of TP53 in nearly all specimens. Previously unreported loss-of-function mutations are seen in the HLA-A class I major histocompatibility gene. Significantly altered pathways included NFE2L2 and KEAP1 in 34%, squamous differentiation genes in 44%, phosphatidylinositol-3-OH kinase pathway genes in 47%, and CDKN2A and RB1 in 72% of tumours. We identified a potential therapeutic target in most tumours, offering new avenues of investigation for the treatment of squamous cell lung cancers.

Cancer Genome Atlas Network TCGA. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487(7407):330-7.

To characterize somatic alterations in colorectal carcinoma, we conducted a genome-scale analysis of 276 samples, analysing exome sequence, DNA copy number, promoter methylation and messenger RNA and microRNA expression. A subset of these samples (97) underwent low-depth-of-coverage whole-genome sequencing. In total, 16% of colorectal carcinomas were found to be hypermutated: three-quarters of these had the expected high microsatellite instability, usually with hypermethylation and MLH1 silencing, and one-quarter had somatic mismatch-repair gene and polymerase ε (POLE) mutations. Excluding the hypermutated cancers, colon and rectum cancers were found to have considerably similar patterns of genomic alteration. Twenty-four genes were significantly mutated, and in addition to the expected APC, TP53, SMAD4, PIK3CA and KRAS mutations, we found frequent mutations in ARID1A, SOX9 and FAM123B. Recurrent copy-number alterations include potentially drug-targetable amplifications of ERBB2 and newly discovered amplification of IGF2. Recurrent chromosomal translocations include the fusion of NAV2 and WNT pathway member TCF7L1. Integrative analyses suggest new markers for aggressive colorectal carcinoma and an important role for MYC-directed transcriptional activation and repression.

Cancer Genome Atlas Network TCGA. Comprehensive molecular portraits of human breast tumours. Nature 2012;490(7418):61-70.

We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.

Hodge JC, Kim T-M, Dreyfuss JM, Somasundaram P, Christacos NC, Rousselle M, Quade BJ, Park PJ, Stewart EA, Morton CC. Expression profiling of uterine leiomyomata cytogenetic subgroups reveals distinct signatures in matched myometrium: transcriptional profiling of the t(12;14) and evidence in support of predisposing genetic heterogeneity. Hum Mol Genet 2012;21(10):2312-29.

Uterine leiomyomata (UL), the most common neoplasm in reproductive-age women, are classified into distinct genetic subgroups based on recurrent chromosome abnormalities. To develop a molecular signature of UL with t(12;14)(q14-q15;q23-q24), we took advantage of the multiple UL arising as independent clonal lesions within a single uterus. We compared genome-wide expression levels of t(12;14) UL to non-t(12;14) UL from each of nine women in a paired analysis, with each sample weighted for the percentage of t(12;14) cells to adjust for mosaicism with normal cells. This resulted in a transcriptional profile that confirmed HMGA2, known to be overexpressed in t(12;14) UL, as the most significantly altered gene. Pathway analysis of the differentially expressed genes showed significant association with cell proliferation, particularly G1/S checkpoint regulation. This is consistent with the known larger size of t(12;14) UL relative to karyotypically normal UL or to UL in the deletion 7q22 subgroup. Unsupervised hierarchical clustering demonstrated that patient variability is relatively dominant to the distinction of t(12;14) UL compared with non-t(12;14) UL or of t(12;14) UL compared with del(7q) UL. The paired design we employed is therefore important to produce an accurate t(12;14) UL-specific gene list by removing the confounding effects of genotype and environment. Interestingly, myometrium not only clustered away from the tumors, but generally separated based on associated t(12;14) versus del(7q) status. Nine genes were identified whose expression can distinguish the myometrium origin. This suggests an underlying constitutional genetic predisposition to these somatic changes which could potentially lead to improved personalized management and treatment.

Larschan E, Soruco MML, Lee O-K, Peng S, Bishop EP, Chery J, Goebel K, Feng J, Park PJ, Kuroda MI. Identification of chromatin-associated regulators of MSL complex targeting in Drosophila dosage compensation. PLoS Genet 2012;8(7):e1002830.

Sex chromosome dosage compensation in Drosophila provides a model for understanding how chromatin organization can modulate coordinate gene regulation. Male Drosophila increase the transcript levels of genes on the single male X approximately two-fold to equal the gene expression in females, which have two X-chromosomes. Dosage compensation is mediated by the Male-Specific Lethal (MSL) histone acetyltransferase complex. Five core components of the MSL complex were identified by genetic screens for genes that are specifically required for male viability and are dispensable for females. However, because dosage compensation must interface with the general transcriptional machinery, it is likely that identifying additional regulators that are not strictly male-specific will be key to understanding the process at a mechanistic level. Such regulators would not have been recovered from previous male-specific lethal screening strategies. Therefore, we have performed a cell culture-based, genome-wide RNAi screen to search for factors required for MSL targeting or function. Here we focus on the discovery of proteins that function to promote MSL complex recruitment to "chromatin entry sites," which are proposed to be the initial sites of MSL targeting. We find that components of the NSL (Non-specific lethal) complex, and a previously unstudied zinc-finger protein, facilitate MSL targeting and display a striking enrichment at MSL entry sites. Identification of these factors provides new insight into how MSL complex establishes the specialized hyperactive chromatin required for dosage compensation in Drosophila.

Balakrishnan A, Stearns AT, Park PJ, Dreyfuss JM, Ashley SW, Rhoads DB, Tavakkolizadeh A. Upregulation of proapoptotic microRNA mir-125a after massive small bowel resection in rats. Ann Surg 2012;255(4):747-53.

OBJECTIVE: Short bowel syndrome remains a condition of high morbidity and mortality, and current therapeutic options carry significant side effects. To identify new treatments we focused on postresection changes in microRNAs--short noncoding RNAs, which suppress target genes--and suggest a previously undiscovered role for microRNA-125a (mir-125a) in intestinal adaptation. METHODS: Rats underwent either 80% massive small bowel resection or transection and were harvested after 48 hours. Jejunum was harvested for microRNA microarrays, laser capture microdissection, and RNA and protein analysis. Mir-125a was overexpressed in intestinal epithelium-6 (crypt-derived) cells (IEC-6) and effects on proliferation and apoptosis determined using MTS and flow cytometry. Expression of potential targets of mir-125a in rat jejunum and IEC-6 cells was determined using quantitative real-time polymerase chain reaction (RNA) and Western blotting (protein). RESULTS: Resection upregulated mir-125a and mir-214 by 2.4-fold and 3.2-fold, respectively. Highest levels of expression were noted in the crypt fraction. Mir-125a overexpression induced apoptosis and resultant growth arrest in IEC-6 cells. The expression of the prosurvival Bcl-2 family member Mcl-1 was downregulated in both mir-125a-overexpressing IEC-6 cells and in jejunum of resected rats, confirming Mcl-1 as a previously undiscovered target of mir-125a. CONCLUSIONS: Upregulation of mir-125a suppresses the prosurvival protein Mcl-1, producing the increase in apoptosis known to accompany the proliferative changes characteristic of intestinal adaptation. Our data highlight a potential role for microRNAs as mediators of the adaptive process and may facilitate the development of new therapeutic options for short bowel syndrome.

Riddle NC*, Jung YL*, Gu T*, Alekseyenko AA, Asker D, Gui H, Kharchenko PV, Minoda A, Plachetka A, Schwartz YB, Tolstorukov MY, Kuroda MI, Pirrotta V, Karpen GH, Park PJ**, Elgin SCR**. Enrichment of HP1a on Drosophila chromosome 4 genes creates an alternate chromatin structure critical for regulation in this heterochromatic domain. PLoS Genet 2012;8(9):e1002954.

Chromatin environments differ greatly within a eukaryotic genome, depending on expression state, chromosomal location, and nuclear position. In genomic regions characterized by high repeat content and high gene density, chromatin structure must silence transposable elements but permit expression of embedded genes. We have investigated one such region, chromosome 4 of Drosophila melanogaster. Using chromatin-immunoprecipitation followed by microarray (ChIP-chip) analysis, we examined enrichment patterns of 20 histone modifications and 25 chromosomal proteins in S2 and BG3 cells, as well as the changes in several marks resulting from mutations in key proteins. Active genes on chromosome 4 are distinct from those in euchromatin or pericentric heterochromatin: while there is a depletion of silencing marks at the transcription start sites (TSSs), HP1a and H3K9me3, but not H3K9me2, are enriched strongly over gene bodies. Intriguingly, genes on chromosome 4 are less frequently associated with paused polymerase. However, when the chromatin is altered by depleting HP1a or POF, the RNA pol II enrichment patterns of many chromosome 4 genes shift, showing a significant decrease over gene bodies but not at TSSs, accompanied by lower expression of those genes. Chromosome 4 genes have a low incidence of TRL/GAGA factor binding sites and a low T(m) downstream of the TSS, characteristics that could contribute to a low incidence of RNA polymerase pausing. Our data also indicate that EGG and POF jointly regulate H3K9 methylation and promote HP1a binding over gene bodies, while HP1a targeting and H3K9 methylation are maintained at the repeats by an independent mechanism. The HP1a-enriched, POF-associated chromatin structure over the gene bodies may represent one type of adaptation for genes embedded in repetitive DNA.

Schwartz YB, Linder-Basso D, Kharchenko PV, Tolstorukov MY, Kim M, Li H-B, Gorchakov AA, Minoda A, Shanower G, Alekseyenko AA, Riddle NC, Jung YL, Gu T, Plachetka A, Elgin SCR, Kuroda MI, Park PJ, Savitsky M, Karpen GH, Pirrotta V. Nature and function of insulator protein binding sites in the Drosophila genome. Genome Res 2012;22(11):2188-98.

Chromatin insulator elements and associated proteins have been proposed to partition eukaryotic genomes into sets of independently regulated domains. Here we test this hypothesis by quantitative genome-wide analysis of insulator protein binding to Drosophila chromatin. We find distinct combinatorial binding of insulator proteins to different classes of sites and uncover a novel type of insulator element that binds CP190 but not any other known insulator proteins. Functional characterization of different classes of binding sites indicates that only a small fraction act as robust insulators in standard enhancer-blocking assays. We show that insulators restrict the spreading of the H3K27me3 mark but only at a small number of Polycomb target regions and only to prevent repressive histone methylation within adjacent genes that are already transcriptionally inactive. RNAi knockdown of insulator proteins in cultured cells does not lead to major alterations in genome expression. Taken together, these observations argue against the concept of a genome partitioned by specialized boundary elements and suggest that insulators are reserved for specific regulation of selected genes.

Yang HW, Kim T-M, Song SS, Shrinath N, Park R, Kalamarides M, Park PJ, Black PM, Carroll RS, Johnson MD. Alternative splicing of CHEK2 and codeletion with NF2 promote chromosomal instability in meningioma. Neoplasia 2012;14(1):20-8.

Mutations of the NF2 gene on chromosome 22q are thought to initiate tumorigenesis in nearly 50% of meningiomas, and 22q deletion is the earliest and most frequent large-scale chromosomal abnormality observed in these tumors. In aggressive meningiomas, 22q deletions are generally accompanied by the presence of large-scale segmental abnormalities involving other chromosomes, but the reasons for this association are unknown. We find that large-scale chromosomal alterations accumulate during meningioma progression primarily in tumors harboring 22q deletions, suggesting 22q-associated chromosomal instability. Here we show frequent codeletion of the DNA repair and tumor suppressor gene, CHEK2, in combination with NF2 on chromosome 22q in a majority of aggressive meningiomas. In addition, tumor-specific splicing of CHEK2 in meningioma leads to decreased functional Chk2 protein expression. We show that enforced Chk2 knockdown in meningioma cells decreases DNA repair. Furthermore, Chk2 depletion increases centrosome amplification, thereby promoting chromosomal instability. Taken together, these data indicate that alternative splicing and frequent codeletion of CHEK2 and NF2 contribute to the genomic instability and associated development of aggressive biologic behavior in meningiomas.

Stadtfeld M, Apostolou E, Ferrari F, Choi J, Walsh RM, Chen T, Ooi SSK, Kim SY, Bestor TH, Shioda T, Park PJ, Hochedlinger K. Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. Nat Genet 2012;44(4):398-405, S1-2.

The generation of induced pluripotent stem cells (iPSCs) often results in aberrant epigenetic silencing of the imprinted Dlk1-Dio3 gene cluster, compromising the ability to generate entirely iPSC-derived adult mice ('all-iPSC mice'). Here, we show that reprogramming in the presence of ascorbic acid attenuates hypermethylation of Dlk1-Dio3 by enabling a chromatin configuration that interferes with binding of the de novo DNA methyltransferase Dnmt3a. This approach allowed us to generate all-iPSC mice from mature B cells, which have until now failed to support the development of exclusively iPSC-derived postnatal animals. Our data show that transcription factor-mediated reprogramming can endow a defined, terminally differentiated cell type with a developmental potential equivalent to that of embryonic stem cells. More generally, these findings indicate that culture conditions during cellular reprogramming can strongly influence the epigenetic and biological properties of the resultant iPSCs.

Saikumar J, Hoffmann D, Kim T-M, Gonzalez VR, Zhang Q, Goering PL, Brown RP, Bijol V, Park PJ, Waikar SS, Vaidya VS. Expression, circulation, and excretion profile of microRNA-21, -155, and -18a following acute kidney injury. Toxicol Sci 2012;129(2):256-67.

MicroRNAs (miRNAs) are endogenous noncoding RNA molecules that are involved in post-transcriptional gene silencing. Using global miRNA expression profiling, we found miR-21, -155, and -18a to be highly upregulated in rat kidneys following tubular injury induced by ischemia/reperfusion (I/R) or gentamicin administration. Mir-21 and -155 also showed decreased expression patterns in blood and urinary supernatants in both models of kidney injury. Furthermore, urinary levels of miR-21 increased 1.2-fold in patients with clinical diagnosis of acute kidney injury (AKI) (n = 22) as compared with healthy volunteers (n = 25) (p < 0.05), and miR-155 decreased 1.5-fold in patients with AKI (p < 0.01). We identified 29 messenger RNA core targets of these 3 miRNAs using the context likelihood of relatedness algorithm and found these predicted gene targets to be highly enriched for genes associated with apoptosis or cell proliferation. Taken together, these results suggest that miRNA-21 and -155 could potentially serve as translational biomarkers for detection of AKI and may play a critical role in the pathogenesis of kidney injury and tissue repair process.

Tolstorukov MY*, Goldman JA*, Gilbert C, Ogryzko V, Kingston RE**, Park PJ**. Histone variant H2A.Bbd is associated with active transcription and mRNA processing in human cells. Mol Cell 2012;47(4):596-607.

Variation in chromatin composition and organization often reflects differences in genome function. Histone variants, for example, replace canonical histones to contribute to regulation of numerous nuclear processes including transcription, DNA repair, and chromosome segregation. Here we focus on H2A.Bbd, a rapidly evolving variant found in mammals but not in invertebrates. We report that in human cells, nucleosomes bearing H2A.Bbd form unconventional chromatin structures enriched within actively transcribed genes and characterized by shorter DNA protection and nucleosome spacing. Analysis of transcriptional profiles from cells depleted for H2A.Bbd demonstrated widespread changes in gene expression with a net downregulation of transcription and disruption of normal mRNA splicing patterns. In particular, we observed changes in exon inclusion rates and increased presence of intronic sequences in mRNA products upon H2A.Bbd depletion. Taken together, our results indicate that H2A.Bbd is involved in formation of a specific chromatin structure that facilitates both transcription and initial mRNA processing.

Lachke SA, Ho JWK, Kryukov GV, O'Connell DJ, Aboukhalil A, Bulyk ML, Park PJ, Maas RL. iSyTE: integrated Systems Tool for Eye gene discovery. Invest Ophthalmol Vis Sci 2012;53(3):1617-27.

PURPOSE: To facilitate the identification of genes associated with cataract and other ocular defects, the authors developed and validated a computational tool termed iSyTE (integrated Systems Tool for Eye gene discovery; http://bioinformatics.udel.edu/Research/iSyTE). iSyTE uses a mouse embryonic lens gene expression data set as a bioinformatics filter to select candidate genes from human or mouse genomic regions implicated in disease and to prioritize them for further mutational and functional analyses. METHODS: Microarray gene expression profiles were obtained for microdissected embryonic mouse lens at three key developmental time points in the transition from the embryonic day (E)10.5 stage of lens placode invagination to E12.5 lens primary fiber cell differentiation. Differentially regulated genes were identified by in silico comparison of lens gene expression profiles with those of whole embryo body (WB) lacking ocular tissue. RESULTS: Gene set analysis demonstrated that this strategy effectively removes highly expressed but nonspecific housekeeping genes from lens tissue expression profiles, allowing identification of less highly expressed lens disease-associated genes. Among 24 previously mapped human genomic intervals containing genes associated with isolated congenital cataract, the mutant gene is ranked within the top two iSyTE-selected candidates in approximately 88% of cases. Finally, in situ hybridization confirmed lens expression of several novel iSyTE-identified genes. CONCLUSIONS: iSyTE is a publicly available Web resource that can be used to prioritize candidate genes within mapped genomic intervals associated with congenital cataract for further investigation. Extension of this approach to other ocular tissue components will facilitate eye disease gene discovery.
