Publications

2020
Cortés-Ciriano I, Lee JJK, Xi R, Jain D, Jung YL, Yang L, Gordenin D, Klimczak LJ, Zhang CZ, Pellman DS, PCAWG Structural Variation Working Group, Park PJ, PCAWG Consortium. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing [Internet]. Nature Genetics 2020;52(3):331-341.
Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.
Reim NI*, Chuang J*, Jain D*, Alver BH, Park PJ, Winston F. The conserved elongation factor Spn1 is required for normal transcription, histone modifications, and splicing in Saccharomyces cerevisiae. Nucleic Acids Res 2020.
Spn1/Iws1 is a conserved protein involved in transcription and chromatin dynamics, yet its general in vivo requirement for these functions is unknown. Using a Spn1 depletion system in Saccharomyces cerevisiae, we demonstrate that Spn1 broadly influences several aspects of gene expression on a genome-wide scale. We show that Spn1 is globally required for normal mRNA levels and for normal splicing of ribosomal protein transcripts. Furthermore, Spn1 maintains the localization of H3K36 and H3K4 methylation across the genome and is required for normal histone levels at highly expressed genes. Finally, we show that the association of Spn1 with the transcription machinery is strongly dependent on its binding partner, Spt6, while the association of Spt6 and Set2 with transcribed regions is partially dependent on Spn1. Taken together, our results show that Spn1 affects multiple aspects of gene expression and provide additional evidence that it functions as a histone chaperone in vivo.
Yun JW, Yang L, Park H-Y, Lee C-W, Cha H, Shin H-T, Noh K-W, Choi Y-L, Park W-Y**, Park PJ**. Dysregulation of cancer genes by recurrent intergenic fusions. Genome Biol 2020;21(1):166.
BACKGROUND: Gene fusions have been studied extensively, as frequent drivers of tumorigenesis as well as potential therapeutic targets. In many well-known cases, breakpoints occur at two intragenic positions, leading to in-frame gene-gene fusions that generate chimeric mRNAs. However, fusions often occur with intergenic breakpoints, and the role of such fusions has not been carefully examined. RESULTS: We analyze whole-genome sequencing data from 268 patients to catalog gene-intergenic and intergenic-intergenic fusions and characterize their impact. First, we discover that, in contrast to the common assumption, chimeric oncogenic transcripts, such as those involving ETV4, ERG, RSPO3, and PIK3CA, can be generated by gene-intergenic fusions through splicing of the intervening region. Second, we find that over-expression of an upstream or downstream gene by a fusion-mediated repositioning of a regulatory sequence is much more common than previously suspected, with enhancers sometimes located megabases away. We detect a number of recurrent fusions, such as those involving ANO3, RGS9, FUT5, CHI3L1, OR1D4, and LIPG in breast; IGF2 in colon; ETV1 in prostate; and IGF2BP3 and SIX2 in thyroid cancers. CONCLUSION: Our findings elucidate the potential oncogenic function of intergenic fusions and highlight the wide-ranging consequences of structural rearrangements in cancer genomes.
Ettou S*, Jung YL*, Miyoshi T, Jain D, Hiratsuka K, Schumacher V, Taglienti ME, Morizane R, Park PJ**, Kreidberg JA**. Epigenetic transcriptional reprogramming by WT1 mediates a repair response during podocyte injury. Science Advances 2020;6(30):eabb5460.
In the context of human disease, the mechanisms whereby transcription factors reprogram gene expression in reparative responses to injury are not well understood. We have studied the mechanisms of transcriptional reprogramming in disease using murine kidney podocytes as a model for tissue injury. Podocytes are a crucial component of glomeruli, the filtration units of each nephron. Podocyte injury is the initial event in many processes that lead to end-stage kidney disease. Wilms tumor-1 (WT1) is a master regulator of gene expression in podocytes, binding nearly all genes known to be crucial for maintenance of the glomerular filtration barrier. Using murine models and human kidney organoids, we investigated WT1-mediated transcriptional reprogramming during the course of podocyte injury. Reprogramming the transcriptome involved highly dynamic changes in the binding of WT1 to target genes during a reparative injury response, affecting chromatin state and expression levels of target genes.
Gulhan DC, Garcia E, Lee EK, Lindemann NI, Liu JF, Matulonis UA, Park PJ, Konstantinopoulos PA. Genomic Determinants of De Novo Resistance to Immune Checkpoint Blockade in Mismatch Repair-Deficient Endometrial Cancer. JCO Precis Oncol 2020;4:492-497.
Miller DT, Cortés-Ciriano I, Pillay N, Hirbe AC, Snuderl M, Bui MM, Piculell K, Al-Ibraheemi A, Dickson BC, Hart J, Jones K, Jordan JT, Kim RH, Lindsay D, Nishida Y, Ullrich NJ, Wang X, Park PJ, Flanagan AM. Genomics of MPNST (GeM) Consortium: Rationale and Study Design for Multi-Omic Characterization of NF1-Associated and Sporadic MPNSTs. Genes 2020;11(4).
The Genomics of Malignant Peripheral Nerve Sheath Tumor (GeM) Consortium is an international collaboration focusing on multi-omic analysis of malignant peripheral nerve sheath tumors (MPNSTs), the most aggressive tumor associated with neurofibromatosis type 1 (NF1). Here we present a summary of current knowledge gaps, a description of our consortium and the cohort we have assembled, and an overview of our plans for multi-omic analysis of these tumors. We propose that our analysis will lead to a better understanding of the order and timing of genetic events related to MPNST initiation and progression. Our ten institutions have assembled 96 fresh frozen NF1-related (63%) and sporadic MPNST specimens from 86 subjects with corresponding clinical and pathological data. Clinical data have been collected as part of the International MPNST Registry. We will characterize these tumors with bulk whole genome sequencing, RNAseq, and DNA methylation profiling. In addition, we will perform multiregional analysis and temporal sampling, with the same methodologies, on a subset of nine subjects with NF1-related MPNSTs to assess tumor heterogeneity and cancer evolution. Subsequent multi-omic analyses of additional archival specimens will include deep exome sequencing (500×) and high density copy number arrays for both validation of results based on fresh frozen tumors, and to assess further tumor heterogeneity and evolution. Digital pathology images are being collected in a cloud-based platform for consensus review. The result of these efforts will be the largest MPNST multi-omic dataset with correlated clinical and pathological information ever assembled.
Chu C, Zhao B, Park PJ, Lee EA. Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data. Curr Protoc Hum Genet 2020;107(1):e102.
Transposable element (TE) mobilization is a significant source of genomic variation and has been associated with various human diseases. The exponential growth of population-scale whole-genome sequencing and rapid innovations in long-read sequencing technologies provide unprecedented opportunities to study TE insertions and their functional impact in human health and disease. Identifying TE insertions, however, is challenging due to the repetitive nature of the TE sequences. Here, we review computational approaches to detecting and genotyping TE insertions using short- and long-read sequencing and discuss the strengths and weaknesses of different approaches.
Touat M, Li YY, Boynton AN, Spurr LF, Iorgulescu BJ, Bohrson CL, Cortés-Ciriano I, Birzu C, Geduldig JE, Pelton K, Lim-Fat MJ, Pal S, Ferrer-Luna R, Ramkissoon SH, Dubois F, Bellamy C, Currimjee N, Bonardi J, Qian K, Ho P, Malinowski S, Taquet L, Jones RE, Shetty A, Chow K-H, Sharaf R, Pavlick D, Albacker LA, Younan N, Baldini C, Verreault M, Giry M, Guillerm E, Ammari S, Beuvon F, Mokhtari K, Alentorn A, Dehais C, Houillier C, Laigle-Donadey F, Psimaras D, Lee EQ, Nayak L, McFaline-Figueroa RJ, Carpentier A, Cornu P, Capelle L, Mathon B, Barnholtz-Sloan JS, Chakravarti A, Bi WL, Chiocca AE, Fehnel KP, Alexandrescu S, Chi SN, Haas-Kogan D, Batchelor TT, Frampton GM, Alexander BM, Huang RY, Ligon AH, Coulet F, Delattre J-Y, Hoang-Xuan K, Meredith DM, Santagata S, Duval A, Sanson M, Cherniack AD, Wen PY, Reardon DA, Marabelle A, Park PJ, Idbaih A, Beroukhim R, Bandopadhayay P, Bielle F, Ligon KL. Mechanisms and therapeutic implications of hypermutation in gliomas. Nature 2020;580(7804):517-523.
A high tumour mutational burden (hypermutation) is observed in some gliomas; however, the mechanisms by which hypermutation develops and whether it predicts the response to immunotherapy are poorly understood. Here we comprehensively analyse the molecular determinants of mutational burden and signatures in 10,294 gliomas. We delineate two main pathways to hypermutation: a de novo pathway associated with constitutional defects in DNA polymerase and mismatch repair (MMR) genes, and a more common post-treatment pathway, associated with acquired resistance driven by MMR defects in chemotherapy-sensitive gliomas that recur after treatment with the chemotherapy drug temozolomide. Experimentally, the mutational signature of post-treatment hypermutated gliomas was recapitulated by temozolomide-induced damage in cells with MMR deficiency. MMR-deficient gliomas were characterized by a lack of prominent T cell infiltrates, extensive intratumoral heterogeneity, poor patient survival and a low rate of response to PD-1 blockade. Moreover, although bulk analyses did not detect microsatellite instability in MMR-deficient gliomas, single-cell whole-genome sequencing analysis of post-treatment hypermutated glioma cells identified microsatellite mutations. These results show that chemotherapy can drive the acquisition of hypermutated populations without promoting a response to PD-1 blockade and support the diagnostic use of mutational burden and signatures in cancer.
Huang AY, Li P, Rodin RE, Kim SN, Dou Y, Kenny CJ, Akula SK, Hodge RD, Bakken TE, Miller JA, Lein ES, Park PJ, Lee EA, Walsh CA. Parallel RNA and DNA analysis after deep sequencing (PRDD-seq) reveals cell type-specific lineage patterns in human brain. Proc Natl Acad Sci U S A 2020;117(25):13886-13895.
Elucidating the lineage relationships among different cell types is key to understanding human brain development. Here we developed parallel RNA and DNA analysis after deep sequencing (PRDD-seq), which combines RNA analysis of neuronal cell types with analysis of nested spontaneous DNA somatic mutations as cell lineage markers, identified from joint analysis of single-cell and bulk DNA sequencing by single-cell MosaicHunter (scMH). PRDD-seq enables simultaneous reconstruction of neuronal cell type, cell lineage, and sequential neuronal formation ("birthdate") in postmortem human cerebral cortex. Analysis of two human brains showed remarkable quantitative details that relate mutation mosaic frequency to clonal patterns, confirming an early divergence of precursors for excitatory and inhibitory neurons, and an "inside-out" layer formation of excitatory neurons as seen in other species. In addition, our analysis allows an estimate of excitatory neuron-restricted precursors (about 10) that generate the excitatory neurons within a cortical column. Inhibitory neurons showed complex, subtype-specific patterns of neurogenesis, including some patterns of development conserved relative to mouse, but also some aspects of primate cortical interneuron development not seen in mouse. PRDD-seq can be broadly applied to characterize cell identity and lineage from diverse archival samples with single-cell resolution and in potentially any developmental or disease condition.
Goldman MJ*, Zhang J*, Fonseca NA*, Cortés-Ciriano I*, Xiang Q, Craft B, Piñeiro-Yáñez E, O'Connor BD, Bazant W, Barrera E, Muñoz-Pomer A, Petryszak R, Füllgrabe A, Al-Shahrour F, Keays M, Haussler D, Weinstein JN, Huber W, Valencia A, Park PJ, Papatheodorou I, Zhu J, Ferretti V, Vazquez M. A user guide for the online exploration and visualization of PCAWG data. Nat Commun 2020;11(1):3400.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.
Wang S, Lee S, Chu C, Jain D, Kerpedjiev P, Nelson GM, Walsh JM, Alver BH, Park PJ. HiNT: a computational method for detecting copy number variations and translocations from Hi-C data [Internet]. Genome Biology 2020;21(1):73.
The three-dimensional conformation of a genome can be profiled using Hi-C, a technique that combines chromatin conformation capture with high-throughput sequencing. However, structural variations often yield features that can be mistaken for chromosomal interactions. Here, we describe a computational method, HiNT (Hi-C for copy Number variation and Translocation detection), which detects copy number variations and interchromosomal translocations within Hi-C data with breakpoints at single base-pair resolution. We demonstrate that HiNT outperforms existing methods on both simulated and real data. We also show that Hi-C can supplement whole-genome sequencing in structural variant detection by locating breakpoints in repetitive regions.
Färkkilä A, Gulhan DC, Casado J, Jacobson CA, Nguyen H, Kochupurakkal B, Maliga Z, Yapp C, Chen Y-A, Schapiro D, Zhou Y, Graham JR, Dezube BJ, Munster P, Santagata S, Garcia E, Rodig S, Lako A, Chowdhury D, Shapiro GI, Matulonis UA, Park PJ, Hautaniemi S, Sorger PK, Swisher EM, D'Andrea AD, Konstantinopoulos PA. Immunogenomic profiling determines responses to combined PARP and PD-1 inhibition in ovarian cancer [Internet]. Nature Communications 2020;11(1):1459.
Combined PARP and immune checkpoint inhibition has yielded encouraging results in ovarian cancer, but predictive biomarkers are lacking. We performed immunogenomic profiling and highly multiplexed single-cell imaging on tumor samples from patients enrolled in a Phase I/II trial of niraparib and pembrolizumab in ovarian cancer (NCT02657889). We identify two determinants of response: mutational signature 3, reflecting defective homologous recombination DNA repair, and positive immune score as a surrogate of interferon-primed exhausted CD8+ T-cells in the tumor microenvironment. Presence of one or both features associates with an improved outcome while concurrent absence yields no responses. Single-cell spatial analysis reveals prominent interactions of exhausted CD8+ T-cells and PD-L1+ macrophages and PD-L1+ tumor cells as mechanistic determinants of response. Furthermore, spatial analysis of two extreme responders shows differential clustering of exhausted CD8+ T-cells with PD-L1+ macrophages in the first, and exhausted CD8+ T-cells with cancer cells harboring genomic PD-L1 and PD-L2 amplification in the second.
ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes [Internet]. Nature 2020;578(7793):82-93.
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation; analyses timings and patterns of tumour evolution; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity; and evaluates a range of more-specialized features of cancer genomes.
Li Y, Roberts ND, Wala JA, Shapira O, Schumacher SE, Kumar K, Khurana E, Waszak S, Korbel JO, Haber JE, Imielinski M, PCAWG Structural Variation Working Group, Weischenfeldt J, Beroukhim R, Campbell PJ, PCAWG Consortium. Patterns of somatic structural variation in human cancer genomes [Internet]. Nature 2020;578(7793):112-121.
A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2-7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.
Rodriguez-Martin B, Alvarez EG, Baez-Ortega A, Zamora J, Supek F, Demeulemeester J, Santamarina M, Ju YS, Temes J, Garcia-Souto D, Detering H, Li Y, Rodriguez-Castro J, Dueso-Barroso A, Bruzos AL, Dentro SC, Blanco MG, Contino G, Ardeljan D, Tojo M, Roberts ND, Zumalave S, Edwards PAW, Weischenfeldt J, Puiggròs M, Chong Z, Chen K, Lee EA, Wala JA, Raine K, Butler A, Waszak SM, Navarro FCP, Schumacher SE, Monlong J, Maura F, Bolli N, Bourque G, Gerstein M, Park PJ, Wedge DC, Beroukhim R, Torrents D, Korbel JO, Martincorena I, Fitzgerald RC, Van Loo P, Kazazian HH, Burns KH, PCAWG Structural Variation Working Group, Campbell PJ, Tubio JMC, PCAWG Consortium. Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition [Internet]. Nature Genetics 2020;52(3):306-319.
About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types. Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the first most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage-fusion-bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors.
Sieverling L, Hong C, Koser SD, Ginsbach P, Kleinheinz K, Hutter B, Braun DM, Cortés-Ciriano I, Xi R, Kabbe R, Park PJ, Eils R, Schlesner M, PCAWG Structural Variation Working Group, Brors B, Rippe K, Jones DTW, Feuerbach L, PCAWG Consortium. Genomic footprints of activated telomere maintenance mechanisms in cancer [Internet]. Nature Communications 2020;11(1):733.
Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types aggregated within the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXXtrunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXXtrunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.
Horton CA, Alver B, Park PJ. GiniQC: a measure for quantifying noise in single-cell Hi-C data [Internet]. Bioinformatics 2020.
Single-cell Hi-C (scHi-C) allows the study of cell-to-cell variability in chromatin structure and dynamics. However, the high level of noise inherent in current scHi-C protocols necessitates careful assessment of data quality before biological conclusions can be drawn. Here we present GiniQC, which quantifies unevenness in the distribution of inter-chromosomal reads in the scHi-C contact matrix to measure the level of noise. Our examples show the utility of GiniQC in assessing the quality of scHi-C data as a complement to existing quality control measures. We also demonstrate how GiniQC can help inform the impact of various data processing steps on data quality.
Dou Y, Kwon M, Rodin RE, Cortés-Ciriano I, Doan R, Luquette LJ, Galor A, Bohrson C, Walsh CA, Park PJ. Accurate detection of mosaic variants in sequencing data without matched controls [Internet]. Nature Biotechnology 2020.
Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80–90% of the mosaic single-nucleotide variants and 60–80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.

2019
Kim J, Hu C, Moufawad El Achkar C, Black LE, Douville J, Larson A, Pendergast MK, Goldkind SF, Lee EA, Kuniholm A, Soucy A, Vaze J, Belur NR, Fredriksen K, Stojkovska I, Tsytsykova A, Armant M, DiDonato RL, Choi J, Cornelissen L, Pereira LM, Augustine EF, Genetti CA, Dies K, Barton B, Williams L, Goodlett BD, Riley BL, Pasternak A, Berry ER, Pflock KA, Chu S, Reed C, Tyndall K, Agrawal PB, Beggs AH, Grant EP, Urion DK, Snyder RO, Waisbren SE, Poduri A, Park PJ, Patterson A, Biffi A, Mazzulli JR, Bodamer O, Berde CB, Yu TW. Patient-Customized Oligonucleotide Therapy for a Rare Genetic Disease. N Engl J Med 2019.
Genome sequencing is often pivotal in the diagnosis of rare diseases, but many of these conditions lack specific treatments. We describe how molecular diagnosis of a rare, fatal neurodegenerative condition led to the rational design, testing, and manufacture of milasen, a splice-modulating antisense oligonucleotide drug tailored to a particular patient. Proof-of-concept experiments in cell lines from the patient served as the basis for launching an "N-of-1" study of milasen within 1 year after first contact with the patient. There were no serious adverse events, and treatment was associated with objective reduction in seizures (determined by electroencephalography and parental reporting). This study offers a possible template for the rapid development of patient-customized treatments. (Funded by Mila's Miracle Foundation and others.)
Luquette LJ, Bohrson CL, Sherman M, Park PJ. Identification of somatic mutations in single cell DNA sequencing data using a spatial model of allelic imbalance. Nature Communications 2019;10(1):3908.
Recent advances in single cell technology have enabled dissection of cellular heterogeneity in great detail. However, analysis of single cell DNA sequencing data remains challenging due to bias and artifacts that arise during DNA extraction and whole-genome amplification, including allelic imbalance and dropout. Here, we present a framework for statistical estimation of allele-specific amplification imbalance at any given position in single cell whole-genome sequencing data by utilizing the allele frequencies of heterozygous single nucleotide polymorphisms in the neighborhood. The resulting allelic imbalance profile is critical for determining whether the variant allele fraction of an observed mutation is consistent with the expected fraction for a true variant. This method, implemented in SCAN-SNV (Single Cell ANalysis of SNVs), substantially improves the identification of somatic variants in single cells. Our allele balance framework is broadly applicable to genotype analysis of any variant type in any data that might exhibit allelic imbalance.
