Cook JH*, Melloni GEM*, Gulhan DC, Park PJ**, Haigis KM**. The origins and genetic interactions of KRAS mutations are allele- and tissue-specific. Nature Communications 2021;12(1):1808.
Mutational activation of KRAS promotes the initiation and progression of cancers, especially in the colorectum, pancreas, lung, and blood plasma, with varying prevalence of specific activating missense mutations. Although epidemiological studies connect specific alleles to clinical outcomes, the mechanisms underlying the distinct clinical characteristics of mutant KRAS alleles are unclear. Here, we analyze 13,492 samples from these four tumor types to examine allele- and tissue-specific genetic properties associated with oncogenic KRAS mutations. The prevalence of known mutagenic mechanisms partially explains the observed spectrum of KRAS activating mutations. However, there are substantial differences between the observed and predicted frequencies for many alleles, suggesting that biological selection underlies the tissue-specific frequencies of mutant alleles. Consistent with experimental studies that have identified distinct signaling properties associated with each mutant form of KRAS, our genetic analysis reveals that each KRAS allele is associated with a distinct tissue-specific comutation network. Moreover, we identify tissue-specific genetic dependencies associated with specific mutant KRAS alleles. Overall, this analysis demonstrates that the genetic interactions of oncogenic KRAS mutations are allele- and tissue-specific, underscoring the complexity that drives their clinical consequences.
Bizzotto S*, Dou Y*, Ganz J*, Doan RN, Kwon M, Bohrson CL, Kim SN, Bae T, Abyzov A, NIMH Brain Somatic Mosaicism Network, Park PJ**, Walsh CA**. Landmarks of human embryonic development inscribed in somatic mutations. Science 2021;371(6535):1249-1253.
Although cell lineage information is fundamental to understanding organismal development, very little direct information is available for humans. We performed high-depth (250×) whole-genome sequencing of multiple tissues from three individuals to identify hundreds of somatic single-nucleotide variants (sSNVs). Using these variants as "endogenous barcodes" in single cells, we reconstructed early embryonic cell divisions. Targeted sequencing of clonal sSNVs in different organs (about 25,000×) and in more than 1000 cortical single cells, as well as single-nucleus RNA sequencing and single-nucleus assay for transposase-accessible chromatin sequencing of ~100,000 cortical single cells, demonstrated asymmetric contributions of early progenitors to extraembryonic tissues, distinct germ layers, and organs. Our data suggest onset of gastrulation at an effective progenitor pool of about 170 cells and about 50 to 100 founders for the forebrain. Thus, mosaic mutations provide a permanent record of human embryonic development at very high resolution.
Kwon M, Lee S, Berselli M, Chu C, Park PJ. BamSnap: a lightweight viewer for sequencing reads in BAM files. Bioinformatics 2021.
SUMMARY: Despite the improvement in variant detection algorithms, visual inspection of the read-level data remains an essential step for accurate identification of variants in genome analysis. We developed BamSnap, an efficient BAM file viewer utilizing a graphics library and BAM indexing. In contrast to existing viewers, BamSnap can generate high-quality snapshots rapidly, with customized tracks and layout. As an example, we produced read-level images at 1000 genomic loci for >2500 whole-genomes. AVAILABILITY: BamSnap is freely available. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Yang HW, Lee S, Yang D, Dai H, Zhang Y, Han L, Zhao S, Zhang S, Ma Y, Johnson MF, Rattray AK, Johnson TA, Wang G, Zheng S, Carroll RS, Park PJ, Johnson MD. Deletions in CWH43 cause idiopathic normal pressure hydrocephalus. EMBO Mol Med 2021;:e13249.
Idiopathic normal pressure hydrocephalus (iNPH) is a neurological disorder that occurs in about 1% of individuals over age 60 and is characterized by enlarged cerebral ventricles, gait difficulty, incontinence, and cognitive decline. The cause and pathophysiology of iNPH are largely unknown. We performed whole exome sequencing of DNA obtained from 53 unrelated iNPH patients. Two recurrent heterozygous loss of function deletions in CWH43 were observed in 15% of iNPH patients and were significantly enriched 6.6-fold and 2.7-fold, respectively, when compared to the general population. Cwh43 modifies the lipid anchor of glycosylphosphatidylinositol-anchored proteins. Mice heterozygous for CWH43 deletion appeared grossly normal but displayed hydrocephalus, gait and balance abnormalities, decreased numbers of ependymal cilia, and decreased localization of glycosylphosphatidylinositol-anchored proteins to the apical surfaces of choroid plexus and ependymal cells. Our findings provide novel mechanistic insights into the origins of iNPH and demonstrate that it represents a distinct disease entity.
Färkkilä A, Rodríguez A, Oikkonen J, Gulhan DC, Nguyen H, Domínguez J, Ramos S, Mills CE, Perez-Villatoro F, Lazaro J-B, Zhou J, Clairmont CS, Moreau LA, Park PJ, Sorger PK, Hautaniemi S, Frias S, D'Andrea AD. Heterogeneity and clonal evolution of acquired PARP inhibitor resistance in TP53- and BRCA1-deficient cells. Cancer Research 2021.
Homologous recombination (HR)-deficient cancers are sensitive to inhibitors of Poly-ADP Ribose Polymerase (PARPi), which have shown clinical efficacy in the treatment of high-grade serous cancers (HGSC). However, the majority of patients will relapse, and acquired PARPi resistance is emerging as a pressing clinical problem. Here we generated seven single-cell clones with acquired PARPi resistance derived from a PARPi-sensitive, TP53-/- and BRCA1-/- epithelial cell line generated using CRISPR/Cas9. These clones showed diverse resistance mechanisms, and some clones presented with multiple mechanisms of resistance at the same time. Genomic analysis of the clones revealed unique transcriptional and mutational profiles and increased genomic instability in comparison to a PARPi-sensitive cell line. Clonal evolutionary analyses suggested that acquired PARPi resistance arose via clonal selection from an intrinsically unstable and heterogeneous cell population in the sensitive cell line, which contained pre-existing drug tolerant cells. Similarly, clonal and spatial heterogeneity in tumor biopsies from a clinical BRCA1-mutant HGSC patient with acquired PARPi resistance were observed. In an imaging-based drug screening, the clones showed heterogeneous responses to targeted therapeutic agents, indicating that not all PARPi-resistant clones can be targeted with just one therapy. Furthermore, PARPi-resistant clones showed mechanism-dependent vulnerabilities to the selected agents, demonstrating that a deeper understanding of the mechanisms of resistance could lead to improved targeting and biomarkers for HGSC with acquired PARPi resistance.
Rodin RE*, Dou Y*, Kwon M, Sherman MA, D'Gama AM, Doan RN, Rento LM, Girskis KM, Bohrson CL, Kim SN, Nadig A, Luquette LJ, Gulhan DC, Brain Somatic Mosaicism Network, Park PJ**, Walsh CA**. The landscape of somatic mutation in cerebral cortex of autistic and neurotypical individuals revealed by ultra-deep whole-genome sequencing. Nat Neurosci 2021;24(2):176-185.
We characterize the landscape of somatic mutations (mutations occurring after fertilization) in the human brain using ultra-deep (~250×) whole-genome sequencing of prefrontal cortex from 59 donors with autism spectrum disorder (ASD) and 15 control donors. We observe a mean of 26 somatic single-nucleotide variants per brain present in ≥4% of cells, with enrichment of mutations in coding and putative regulatory regions. Our analysis reveals that the first cell division after fertilization produces ~3.4 mutations, followed by 2-3 mutations in subsequent generations. This suggests that a typical individual possesses ~80 somatic single-nucleotide variants present in ≥2% of cells (comparable to the number of de novo germline mutations per generation), with about half of individuals having at least one potentially function-altering somatic mutation somewhere in the cortex. ASD brains show an excess of somatic mutations in neural enhancer sequences compared with controls, suggesting that mosaic enhancer mutations may contribute to ASD risk.
Sherman MA, Rodin RE, Genovese G, Dias C, Barton AR, Mukamel RE, Berger B, Park PJ**, Walsh CA**, Loh P-R**. Large mosaic copy number variations confer autism risk. Nat Neurosci 2021;24(2):197-203.
Although germline de novo copy number variants (CNVs) are known causes of autism spectrum disorder (ASD), the contribution of mosaic (early-developmental) copy number variants (mCNVs) has not been explored. In this study, we assessed the contribution of mCNVs to ASD by ascertaining mCNVs in genotype array intensity data from 12,077 probands with ASD and 5,500 unaffected siblings. We detected 46 mCNVs in probands and 19 mCNVs in siblings, affecting 2.8-73.8% of cells. Probands carried a significant burden of large (>4-Mb) mCNVs, which were detected in 25 probands but only one sibling (odds ratio = 11.4, 95% confidence interval = 1.5-84.2, P = 7.4 × 10⁻⁴). Event size positively correlated with severity of ASD symptoms (P = 0.016). Surprisingly, we did not observe mosaic analogues of the short de novo CNVs recurrently observed in ASD (e.g., 16p11.2). We further experimentally validated two mCNVs in postmortem brain tissue from 59 additional probands. These results indicate that mCNVs contribute a previously unexplained component of ASD risk.
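The odds ratio quoted in the abstract above can be checked from the counts it gives (25 of 12,077 probands versus 1 of 5,500 siblings carrying a large mCNV). The following is a minimal sketch, not the authors' analysis: the abstract does not say how the confidence interval was computed, so the Woolf (log-based) interval is an assumption here, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: cases with the event      b: cases without the event
    c: controls with the event   d: controls without the event
    z: normal quantile (1.959964 gives a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation (an assumption).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts taken from the abstract: large (>4-Mb) mCNV carriers.
probands_with, probands_total = 25, 12077
siblings_with, siblings_total = 1, 5500

or_, lo, hi = odds_ratio_ci(
    probands_with,
    probands_total - probands_with,
    siblings_with,
    siblings_total - siblings_with,
)
print(f"OR = {or_:.1f}, 95% CI = {lo:.1f}-{hi:.1f}")
# prints: OR = 11.4, 95% CI = 1.5-84.2
```

The interval is wide because only one sibling carries a large mCNV, so the 1/c term dominates the standard error.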
Jung YL, Kirli K, Alver BH, Park PJ. Resources and challenges for integrative analysis of nuclear architecture data. Curr Opin Genet Dev 2021;67:103-110.
A large amount of genomic data for profiling three-dimensional genome architecture have accumulated from large-scale consortium projects as well as from individual laboratories. In this review, we summarize recent landmark datasets and collections in the field. We describe the challenges in collection, annotation, and analysis of these data, particularly for integration of sequencing and microscopy data. We introduce efforts from consortia and independent groups to harmonize diverse datasets. As the resolution and throughput of sequencing and imaging technologies continue to increase, more efficient utilization and integration of collected data will be critical for a better understanding of nuclear architecture.
Cortés-Ciriano I, Lee JJK, Xi R, Jain D, Jung YL, Yang L, Gordenin D, Klimczak LJ, Zhang CZ, Pellman DS, PCAWG Structural Variation Working Group, Park PJ, PCAWG Consortium. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nature Genetics 2020;52(3):331-341.
Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.
Dou Y, Kwon M, Rodin RE, Cortés-Ciriano I, Doan R, Luquette LJ, Galor A, Bohrson C, Walsh CA, Park PJ. Accurate detection of mosaic variants in sequencing data without matched controls. Nature Biotechnology 2020;38(3):314-319.
Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80–90% of the mosaic single-nucleotide variants and 60–80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.
Jain D, Chu C, Alver BH, Lee S, Lee EA, Park PJ. HiTea: a computational pipeline to identify non-reference transposable element insertions in Hi-C data. Bioinformatics 2020.
Hi-C is a common technique for assessing three-dimensional chromatin conformation. Recent studies have shown that long-range interaction information in Hi-C data can be used to generate chromosome-length genome assemblies and identify large-scale structural variations. Here, we demonstrate the use of Hi-C data in detecting mobile transposable element (TE) insertions genome-wide. Our pipeline HiTea (Hi-C based Transposable element analyzer) capitalizes on clipped Hi-C reads and is aided by a high proportion of discordant read pairs in Hi-C data to detect insertions of three major families of active human TEs. Despite the uneven genome coverage in Hi-C data, HiTea is competitive with the existing callers based on whole genome sequencing (WGS) data and can supplement the WGS-based characterization of the TE insertion landscape. We employ the pipeline to identify TE insertions from human cell-line Hi-C samples. HiTea is available online and as a Docker image. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
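To illustrate what "clipped reads" means in the abstract above, the sketch below counts soft-clipped bases at each end of an alignment from its SAM CIGAR string. It is only an illustration of the general idea and does not reproduce HiTea's implementation; the function names and the `min_clip` threshold of 20 bases are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can only sit at the outer ends, so drop them first.
    core = [(n, op) for n, op in ops if op != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


def is_clipped(cigar, min_clip=20):
    """True if either end has at least min_clip soft-clipped bases.

    min_clip is an illustrative threshold, not a value taken from HiTea.
    """
    return max(soft_clips(cigar)) >= min_clip


for cigar in ["100M", "30S70M", "5H20S60M15S", "10S90M"]:
    print(cigar, soft_clips(cigar), is_clipped(cigar))
```

A read whose clipped portion matches a transposable element consensus sequence would be a candidate for supporting an insertion breakpoint; that matching step is outside the scope of this sketch.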
Haider S, Tyekucheva S, Prandi D, Fox NS, Ahn J, Xu AW, Pantazi A, Park PJ, Laird PW, Sander C, Wang W, Demichelis F, Loda M, Boutros PC. Systematic Assessment of Tumor Purity and Its Clinical Implications. JCO Precis Oncol 2020;4.
PURPOSE: The tumor microenvironment is complex, comprising heterogeneous cellular populations. As molecular profiles are frequently generated using bulk tissue sections, they represent an admixture of multiple cell types (including immune, stromal, and cancer cells) interacting with each other. Therefore, these molecular profiles are confounded by signals emanating from many cell types. Accurate assessment of residual cancer cell fraction is crucial for parameterization and interpretation of genomic analyses, as well as for accurately interpreting the clinical properties of the tumor. MATERIALS AND METHODS: To benchmark cancer cell fraction estimation methods, 10 estimators were applied to a clinical cohort of 333 patients with prostate cancer. These methods include gold-standard multiobserver pathology estimates, as well as estimates inferred from genome, epigenome, and transcriptome data. In addition, two methods based on genomic and transcriptomic profiles were used to quantify tumor purity in 4,497 tumors across 12 cancer types. Bulk mRNA and microRNA profiles were subject to in silico deconvolution to estimate cancer cell-specific mRNA and microRNA profiles. RESULTS: We present a systematic comparison of 10 tumor purity estimation methods on a cohort of 333 prostate tumors. We quantify variation among purity estimation methods and demonstrate how this influences interpretation of clinico-genomic analyses. Our data show poor concordance between pathologic and molecular purity estimates, necessitating caution when interpreting molecular results. Limited concordance between DNA- and mRNA-derived purity estimates remained a general pan-cancer phenomenon when tested in an additional 4,497 tumors spanning 12 cancer types. CONCLUSION: The choice of tumor purity estimation method may have a profound impact on the interpretation of genomic assays. Taken together, these data highlight the need for improved assessment of tumor purity and quantitation of its influences on the molecular hallmarks of cancers.
Reim NI*, Chuang J*, Jain D*, Alver BH, Park PJ, Winston F. The conserved elongation factor Spn1 is required for normal transcription, histone modifications, and splicing in Saccharomyces cerevisiae. Nucleic Acids Res 2020.
Spn1/Iws1 is a conserved protein involved in transcription and chromatin dynamics, yet its general in vivo requirement for these functions is unknown. Using a Spn1 depletion system in Saccharomyces cerevisiae, we demonstrate that Spn1 broadly influences several aspects of gene expression on a genome-wide scale. We show that Spn1 is globally required for normal mRNA levels and for normal splicing of ribosomal protein transcripts. Furthermore, Spn1 maintains the localization of H3K36 and H3K4 methylation across the genome and is required for normal histone levels at highly expressed genes. Finally, we show that the association of Spn1 with the transcription machinery is strongly dependent on its binding partner, Spt6, while the association of Spt6 and Set2 with transcribed regions is partially dependent on Spn1. Taken together, our results show that Spn1 affects multiple aspects of gene expression and provide additional evidence that it functions as a histone chaperone in vivo.
Yun JW, Yang L, Park H-Y, Lee C-W, Cha H, Shin H-T, Noh K-W, Choi Y-L, Park W-Y**, Park PJ**. Dysregulation of cancer genes by recurrent intergenic fusions. Genome Biol 2020;21(1):166.
BACKGROUND: Gene fusions have been studied extensively, as frequent drivers of tumorigenesis as well as potential therapeutic targets. In many well-known cases, breakpoints occur at two intragenic positions, leading to in-frame gene-gene fusions that generate chimeric mRNAs. However, fusions often occur with intergenic breakpoints, and the role of such fusions has not been carefully examined. RESULTS: We analyze whole-genome sequencing data from 268 patients to catalog gene-intergenic and intergenic-intergenic fusions and characterize their impact. First, we discover that, in contrast to the common assumption, chimeric oncogenic transcripts (such as those involving ETV4, ERG, RSPO3, and PIK3CA) can be generated by gene-intergenic fusions through splicing of the intervening region. Second, we find that over-expression of an upstream or downstream gene by a fusion-mediated repositioning of a regulatory sequence is much more common than previously suspected, with enhancers sometimes located megabases away. We detect a number of recurrent fusions, such as those involving ANO3, RGS9, FUT5, CHI3L1, OR1D4, and LIPG in breast; IGF2 in colon; ETV1 in prostate; and IGF2BP3 and SIX2 in thyroid cancers. CONCLUSION: Our findings elucidate the potential oncogenic function of intergenic fusions and highlight the wide-ranging consequences of structural rearrangements in cancer genomes.
Ettou S*, Jung YL*, Miyoshi T, Jain D, Hiratsuka K, Schumacher V, Taglienti ME, Morizane R, Park PJ**, Kreidberg JA**. Epigenetic transcriptional reprogramming by WT1 mediates a repair response during podocyte injury. Science Advances 2020;6(30):eabb5460.
In the context of human disease, the mechanisms whereby transcription factors reprogram gene expression in reparative responses to injury are not well understood. We have studied the mechanisms of transcriptional reprogramming in disease using murine kidney podocytes as a model for tissue injury. Podocytes are a crucial component of glomeruli, the filtration units of each nephron. Podocyte injury is the initial event in many processes that lead to end-stage kidney disease. Wilms tumor-1 (WT1) is a master regulator of gene expression in podocytes, binding nearly all genes known to be crucial for maintenance of the glomerular filtration barrier. Using murine models and human kidney organoids, we investigated WT1-mediated transcriptional reprogramming during the course of podocyte injury. Reprogramming the transcriptome involved highly dynamic changes in the binding of WT1 to target genes during a reparative injury response, affecting chromatin state and expression levels of target genes.
Gulhan DC, Garcia E, Lee EK, Lindemann NI, Liu JF, Matulonis UA, Park PJ, Konstantinopoulos PA. Genomic Determinants of De Novo Resistance to Immune Checkpoint Blockade in Mismatch Repair-Deficient Endometrial Cancer. JCO Precis Oncol 2020;4:492-497.
Miller DT, Cortés-Ciriano I, Pillay N, Hirbe AC, Snuderl M, Bui MM, Piculell K, Al-Ibraheemi A, Dickson BC, Hart J, Jones K, Jordan JT, Kim RH, Lindsay D, Nishida Y, Ullrich NJ, Wang X, Park PJ, Flanagan AM. Genomics of MPNST (GeM) Consortium: Rationale and Study Design for Multi-Omic Characterization of NF1-Associated and Sporadic MPNSTs. Genes 2020;11(4).
The Genomics of Malignant Peripheral Nerve Sheath Tumor (GeM) Consortium is an international collaboration focusing on multi-omic analysis of malignant peripheral nerve sheath tumors (MPNSTs), the most aggressive tumor associated with neurofibromatosis type 1 (NF1). Here we present a summary of current knowledge gaps, a description of our consortium and the cohort we have assembled, and an overview of our plans for multi-omic analysis of these tumors. We propose that our analysis will lead to a better understanding of the order and timing of genetic events related to MPNST initiation and progression. Our ten institutions have assembled 96 fresh frozen NF1-related (63%) and sporadic MPNST specimens from 86 subjects with corresponding clinical and pathological data. Clinical data have been collected as part of the International MPNST Registry. We will characterize these tumors with bulk whole genome sequencing, RNAseq, and DNA methylation profiling. In addition, we will perform multiregional analysis and temporal sampling, with the same methodologies, on a subset of nine subjects with NF1-related MPNSTs to assess tumor heterogeneity and cancer evolution. Subsequent multi-omic analyses of additional archival specimens will include deep exome sequencing (500×) and high density copy number arrays for both validation of results based on fresh frozen tumors, and to assess further tumor heterogeneity and evolution. Digital pathology images are being collected in a cloud-based platform for consensus review. The result of these efforts will be the largest MPNST multi-omic dataset with correlated clinical and pathological information ever assembled.
Chu C, Zhao B, Park PJ, Lee EA. Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data. Curr Protoc Hum Genet 2020;107(1):e102.
Transposable element (TE) mobilization is a significant source of genomic variation and has been associated with various human diseases. The exponential growth of population-scale whole-genome sequencing and rapid innovations in long-read sequencing technologies provide unprecedented opportunities to study TE insertions and their functional impact in human health and disease. Identifying TE insertions, however, is challenging due to the repetitive nature of the TE sequences. Here, we review computational approaches to detecting and genotyping TE insertions using short- and long-read sequencing and discuss the strengths and weaknesses of different approaches. © 2020 Wiley Periodicals LLC.
Touat M, Li YY, Boynton AN, Spurr LF, Iorgulescu BJ, Bohrson CL, Cortes-Ciriano I, Birzu C, Geduldig JE, Pelton K, Lim-Fat MJ, Pal S, Ferrer-Luna R, Ramkissoon SH, Dubois F, Bellamy C, Currimjee N, Bonardi J, Qian K, Ho P, Malinowski S, Taquet L, Jones RE, Shetty A, Chow K-H, Sharaf R, Pavlick D, Albacker LA, Younan N, Baldini C, Verreault M, Giry M, Guillerm E, Ammari S, Beuvon F, Mokhtari K, Alentorn A, Dehais C, Houillier C, Laigle-Donadey F, Psimaras D, Lee EQ, Nayak L, McFaline-Figueroa RJ, Carpentier A, Cornu P, Capelle L, Mathon B, Barnholtz-Sloan JS, Chakravarti A, Bi WL, Chiocca AE, Fehnel KP, Alexandrescu S, Chi SN, Haas-Kogan D, Batchelor TT, Frampton GM, Alexander BM, Huang RY, Ligon AH, Coulet F, Delattre J-Y, Hoang-Xuan K, Meredith DM, Santagata S, Duval A, Sanson M, Cherniack AD, Wen PY, Reardon DA, Marabelle A, Park PJ, Idbaih A, Beroukhim R, Bandopadhayay P, Bielle F, Ligon KL. Mechanisms and therapeutic implications of hypermutation in gliomas. Nature 2020;580(7804):517-523.
A high tumour mutational burden (hypermutation) is observed in some gliomas; however, the mechanisms by which hypermutation develops and whether it predicts the response to immunotherapy are poorly understood. Here we comprehensively analyse the molecular determinants of mutational burden and signatures in 10,294 gliomas. We delineate two main pathways to hypermutation: a de novo pathway associated with constitutional defects in DNA polymerase and mismatch repair (MMR) genes, and a more common post-treatment pathway, associated with acquired resistance driven by MMR defects in chemotherapy-sensitive gliomas that recur after treatment with the chemotherapy drug temozolomide. Experimentally, the mutational signature of post-treatment hypermutated gliomas was recapitulated by temozolomide-induced damage in cells with MMR deficiency. MMR-deficient gliomas were characterized by a lack of prominent T cell infiltrates, extensive intratumoral heterogeneity, poor patient survival and a low rate of response to PD-1 blockade. Moreover, although bulk analyses did not detect microsatellite instability in MMR-deficient gliomas, single-cell whole-genome sequencing analysis of post-treatment hypermutated glioma cells identified microsatellite mutations. These results show that chemotherapy can drive the acquisition of hypermutated populations without promoting a response to PD-1 blockade and support the diagnostic use of mutational burden and signatures in cancer.
Huang AY, Li P, Rodin RE, Kim SN, Dou Y, Kenny CJ, Akula SK, Hodge RD, Bakken TE, Miller JA, Lein ES, Park PJ, Lee EA, Walsh CA. Parallel RNA and DNA analysis after deep sequencing (PRDD-seq) reveals cell type-specific lineage patterns in human brain. Proc Natl Acad Sci U S A 2020;117(25):13886-13895.
Elucidating the lineage relationships among different cell types is key to understanding human brain development. Here we developed parallel RNA and DNA analysis after deep sequencing (PRDD-seq), which combines RNA analysis of neuronal cell types with analysis of nested spontaneous DNA somatic mutations as cell lineage markers, identified from joint analysis of single-cell and bulk DNA sequencing by single-cell MosaicHunter (scMH). PRDD-seq enables simultaneous reconstruction of neuronal cell type, cell lineage, and sequential neuronal formation ("birthdate") in postmortem human cerebral cortex. Analysis of two human brains showed remarkable quantitative details that relate mutation mosaic frequency to clonal patterns, confirming an early divergence of precursors for excitatory and inhibitory neurons, and an "inside-out" layer formation of excitatory neurons as seen in other species. In addition, our analysis allows an estimate of excitatory neuron-restricted precursors (about 10) that generate the excitatory neurons within a cortical column. Inhibitory neurons showed complex, subtype-specific patterns of neurogenesis, including some patterns of development conserved relative to mouse, but also some aspects of primate cortical interneuron development not seen in mouse. PRDD-seq can be broadly applied to characterize cell identity and lineage from diverse archival samples with single-cell resolution and in potentially any developmental or disease condition.