Main

It is estimated that 1 in 10 individuals—more than 30 million people in the USA—have a diagnosis of a rare disease4. Although this term encompasses an estimated 7,000 different conditions, with a notable proportion having a known or presumed genetic aetiology, treatments are available for only around 5% of them4,5. The rarity of these conditions often makes it economically infeasible to develop treatments based on traditional modalities. Previous work has shown that, in some cases, splice-switching ASOs may restore functional protein levels and be administered in a safe and timely manner1. Such therapy presents a treatment opportunity; however, identifying individuals with genetic variants suitable for such attempts remains a challenge1,6. Here, we present a framework to systematically discover and develop splice-switching treatments for individuals with rare diseases, using ataxia-telangiectasia (A-T) as a model.

A-T is an autosomal recessive disease caused by biallelic loss of function of ATM, a gene involved in the cellular response to DNA double-strand breaks7. Clinically, A-T is characterized by progressive cerebellar degeneration, immunodeficiency and predisposition to cancer, with early manifestations including ataxia, involuntary movements, neuropathy, oculomotor apraxia, dysphagia, slurred speech and ocular and skin telangiectasias3. The disease has a prevalence of 1 in 40,000 to 100,000 live births worldwide2. In ‘classical’ A-T (the most typical and prevalent clinical presentation), the average life expectancy is 25 years, and early death is most frequently due to lung disease or cancer2. Hypomorphic ATM variants can cause a milder, slower-progressing phenotype, albeit still with severe morbidity8.

A-T is the most common form of inherited childhood progressive ataxia in many countries9. Nevertheless, effective therapies are lacking—particularly for the profound neurological manifestations that have a marked effect on quality of life. The large coding size (9.2 kb) of ATM poses a challenge for gene therapy approaches, as AAV vectors that are at present available have a packaging capacity of around 4.7 kb (ref. 10). The protein product of ATM is an intracellular kinase, which makes enzyme replacement approaches difficult. By contrast, A-T could be an appealing candidate for splice-switching ASO therapy, given that ASOs readily distribute to the cerebellum when injected intrathecally11.

The Global A-T Family Data Platform is an international initiative led by family advocates and primarily funded by A-T Children’s Project (ATCP), a patient-advocacy foundation for A-T. ATCP has systematically collected clinical information and genomic DNA (gDNA) from individuals who have been clinically or genetically diagnosed with A-T. Through this effort, whole-genome sequencing (WGS) was performed, yielding a WGS dataset of 235 individuals with A-T. We analysed this ‘ATCP cohort’ to establish molecular diagnoses, identify pathogenic variants that could be amenable to splice-switching ASO therapy, develop proof-of-concept individualized ASOs and initiate a pilot investigational trial.

Patients and diagnostic yield

The ATCP cohort comprised de-identified phenotypic and genomic data from 235 individuals who were clinically or genetically diagnosed with A-T, from 209 families. Of these individuals, 46% were female (108/235) and the median age of clinical A-T diagnosis was 3.6 years (range 0–32; Extended Data Table 1 and Supplementary Table 1).

The results of molecular genetic testing were available for only a small subset of individuals within the cohort (23%, 55/235), mostly from sequencing of ATM coding regions (Fig. 1a and Extended Data Table 1). We therefore performed an extensive set of WGS analyses on all individuals to generate a comprehensive picture of genetic variation in A-T. We identified single-nucleotide variants (SNVs) and short indels using GATK12 (Supplementary Table 2), VarScan213 and Strelka214; and copy number variants (CNVs), structural variants (SVs) and mobile DNA element insertions using Delly15, Pindel16, MELT17 and xTea18 (Supplementary Table 3). For completeness, we manually inspected sequencing reads across the ATM locus using the Integrative Genomics Viewer to identify variants that might have been missed by the aforementioned methods. Variants were named with respect to the GRCh38/hg38 reference genome and NM_000051.3, the canonical RefSeq transcript for ATM.

Fig. 1: ATM disease candidate variants.
figure 1

a, Diagnostic (left; green) and ASO therapeutic (right; blue) yields of clinical genetic testing, WES simulation and WGS analysis for the 235 patients in the ATCP cohort. The WES simulation assumed complete recovery of all exonic variants ± 20 bp. b, Positions of ATM disease candidate variants. All SNVs and short indels, as well as SVs that are fully contained in a single intron, are depicted. Two variants for which a splice-switching ASO was developed (c.7865C>T and c.5763-1050A>G) in this study are indicated. Deletion variants that span a splice junction were considered to be located at the junction (position 0). Short indels and SVs were considered to be located at the boundary of the variant closest to a nearby splice junction. c, Types and ASO amenability of all disease candidate variants present in the cohort. d, Cumulative fraction and box plot distribution of the age at diagnosis, based on the possession of SDVs and a known hypomorphic variant (c.5763-1050A>G). The P values (shown in the figure) were calculated through the log-rank test using the survminer R package, and were adjusted by the Bonferroni correction. Shaded areas, fraction ± 95% confidence interval. For the box plots: centre, median; lower hinge, 25th percentile; upper hinge, 75th percentile; lower whisker, smallest value greater than or equal to lower hinge − 1.5 × interquartile range (IQR); upper whisker, largest value less than or equal to upper hinge + 1.5 × IQR.

We assessed the pathogenicity of variants using coding and splice impact predictions, population allele frequency (gnomAD and TOPMed) and experimental evidence (Supplementary Table 1; for details, see Methods). Variants that we considered to be disease-causing were referred to as ‘disease candidate’ variants. We noted that some of the disease candidate variants had been formally classified as variants of uncertain significance (VUSs) by American College of Medical Genetics and Genomics (ACMG) guidelines19. These were considered disease-causing for the purpose of this analysis if (1) there was strong support from computational predictions as disrupting gene function (for example, SpliceAI20 or REVEL21 for splicing and coding impact predictions, respectively); and (2) no other variants were identified by WGS that could account for the clinical diagnosis (Supplementary Note 1). Similarly, some coding CNVs were considered disease candidate variants even when officially classified as VUSs by ACMG guidelines. Use of such a broader definition of pathogenicity is not unprecedented in WGS studies22, in which a near-complete landscape of mutational events for a given individual can be characterized. Such VUSs constituted a relatively small proportion of the disease candidate variants (22 of 220 unique variants; 27 of 469 total mutational events).

At least 2 disease candidate ATM variants were identified in 232 out of 235 individuals (98.7%; Fig. 1a and Supplementary Table 1). Two of the other three individuals had one disease candidate variant and a VUS that we considered to have plausible biological impact, but were not confident enough to classify as a disease candidate variant (Extended Data Fig. 1 and Supplementary Note 2). In the remaining case, only one heterozygous disease candidate variant was found (that is, a second pathogenic event was not found). We also noted that in 3 out of 235 cases, our WGS analyses resulted in a revision of previously reported genetic testing results (Supplementary Note 3).

To confirm the fidelity of our variant calls, gDNA samples were obtained from 16 patients. Sanger sequencing validated all tested variant calls in these samples (31/31 mutational events; Supplementary Tables 4 and 5). We were also able to use experimental or in silico approaches to analyse variant phasing in 88 patients (from 78 families): (1) trio Sanger sequencing was performed for 5 cases for which gDNA samples were available (Supplementary Tables 4 and 6); (2) for two cases, read-based phasing was performed by using WhatsHap23; (3) for 47 cases, analyses of variant co-occurrence in gnomAD strongly predicted that the 2 observed variants were in trans; (4) 32 cases involved homozygous ATM variants; and (5) for 5 cases, a variant appeared homozygous because it was in trans to a deletion. In aggregate, these analyses confirmed that 88 cases (from 78 families) have disease candidate variants on both alleles (Supplementary Table 1; for more details on phasing, see Methods).

We simulated the diagnostic yield of an exon-targeted sequencing approach if performed in lieu of WGS. Perfect coverage of all exonic variants ± 20 nucleotides from flanking regions would have established a molecular diagnosis in 207 of 235 individuals (a diagnostic yield of 88.1%, compared to 98.7% achieved with comprehensive WGS analyses; Fig. 1a,b).

Disease candidate variants

We identified 469 disease candidate ATM mutational events in 235 patients. These comprised a total of 220 unique ATM variants, ranging from SNVs to short indels and SVs. A substantial fraction of these events (75%, 353/469) were predicted to confer strong loss of function, including frameshift variants (31%, 145/469), stop-gain or start-loss variants (22%, 104/469), splice-site-destroying variants (16%, 75/469; defined on the basis of SpliceAI20 and MaxEntScan24 predictions; for details, see Methods) and missense variants or short in-frame indels with previous experimental evidence of complete loss of ATM kinase activity (29/469, 6.2%; Fig. 1c and Supplementary Table 7).

SVs that severely disrupt the open reading frame of ATM (out-of-frame inversions or deletions; inversions or deletions involving the start or stop codon; or duplication of more than 10 middle coding exons) made up 5.8% (27/469) of all disease candidate mutational events (Supplementary Table 3). These included 26 copy number changes (23 deletions, 2 duplications and 1 complex rearrangement (combined inversion and deletion); Extended Data Fig. 2 and Supplementary Fig. 2) and 1 copy number-neutral inversion event (involving the first 16 exons of ATM; Supplementary Fig. 3). Of the 23 deletion events, 16 involved the last 2 exons of ATM (a region that includes the stop codon and partially encodes the ATM kinase domain), present in 15 patients (1 patient was homozygous for the deletion, hence 16 events in 15 patients).

The 27 SVs disrupting the open reading frame, together with the 353 additional strong loss-of-function mutational events mentioned above (frameshift, stop-gain, start-loss, splice-site-destroying or missense variants with previous experimental evidence of loss of function), are expected to lead to deficient and/or absent ATM protein expression or function (ATM null). This group, which we refer to as ‘severely disruptive variants’ (SDVs), constitutes 81% (380/469) of the disease candidate mutational events encountered. More than half of the individuals in this cohort (66%, 156/235) had two of these SDV events, which would be expected to lead to the classical A-T phenotype with a more severe prognosis. Indeed, individuals with 2 SDV events leading to ATM null status tended to have an earlier clinical diagnosis compared to individuals with other genotypes (79/235; P <  0.05, log-rank test), even after factoring out 13 patients with a known hypomorphic variant (c.5763-1050A>G, prevalent in the UK) as one of the pathogenic variants (Fig. 1d).

In addition, we detected six insertion events (100 bp–4 kb) fully contained within introns. These included two pseudogene insertions (4 kb; one homozygous patient), one tandem duplication (382 bp; Supplementary Fig. 4), and three Alu retrotransposon insertions (around 280 bp, one n = 1 event; around 131 bp, one n = 2 event). Although these intronic events do not directly disrupt the coding sequence (and therefore were not counted as SDVs), they could still contribute to loss of function by activating cryptic splice sites or disrupting canonical splicing. For instance, the 382-bp tandem duplication was strongly predicted by SpliceAI20 to cause inclusion of an out-of-frame (256-bp) pseudoexon in intron 33, which would lead to premature translational termination and nonsense-mediated decay (Supplementary Fig. 4).

Evidence supporting the remaining 83 disease candidate variants (mostly missense and intronic and synonymous variants) was obtained from previous interpretations in ClinVar, their rarity and/or other predictions of functional impact (for details of the procedure for determining disease candidate variants, see Methods; for the overlap between disease candidate variants and ClinVar variants, see Extended Data Fig. 3a).

Of the variants encountered, 64% (141/220) were singletons (that is, appeared only once in this cohort). The remaining 36% (79/220) were observed at least twice. The most highly recurrent variant was c.5932G>T (p.Glu1978Ter), present in 16 individuals (Extended Data Fig. 3b and Supplementary Table 8).

ASO-amenable variants

We developed a taxonomy to classify disease-causing variants on the basis of their predicted amenability to splice-switching ASO intervention, using categories of ‘probably’, ‘possibly’ or ‘unlikely’ (further details of the taxonomy are provided in Fig. 2 and Methods). In brief, the taxonomy addresses two questions:

(1) Does the variant directly damage the coding function and/or a canonical splice site? If yes, by how much? Damage to the coding function was assessed by using in silico predictor (REVEL21) and experimental evidence from the literature; damage to canonical splicing function was assessed by using SpliceAI20, MaxEntScan24 and LaBranchoR25.

(2) Does the variant lead to a type of mis-splicing that might be rescued by an ASO? Mis-splicing events can be induced when a variant creates a novel splice site or weakens exon definition (through disruption of canonical donor–acceptor sites or splicing enhancers, or creation of splicing silencers). Of the two types, a gain of a splice site is more amenable to rescue than a weakening of exon definition, because repressing a cryptic splice site can often be achieved by simply blocking the site with an ASO, whereas promoting a weakened exon requires identifying and blocking silencer elements within or near the exon. These possibilities were assessed with in silico splicing predictions based on SpliceAI and MaxEntScan, as well as experimental evidence.

Fig. 2: Taxonomy of ASO-amenable variants.
figure 2

Two-step logic for classifying the amenability of genetic variants to splice modulation rescue (probably, possibly or unlikely). First, each variant is evaluated for its damaging impact on either canonical splicing or protein-coding function, on the basis of SpliceAI, MaxEntScan, LaBranchoR and REVEL. Second, each variant is evaluated for potential mis-splicing impact: gain of a splice site, exon skipping or intron retention. See Methods for complete details.

Using this taxonomy, we encountered 36 probably/possibly ASO-amenable mutational events (representing 17 unique variants) in 35 patients from the ATCP cohort (15%, 35/235; Fig. 1a and Supplementary Table 9). Specifically, 22 individuals (9%, 22/235) had at least one probably amenable variant (Fig. 3), and 13 (6%, 13/235) had a possibly amenable variant (Extended Data Fig. 4a). Four of the 17 probably/possibly amenable variants were found in more than a single patient, with the most highly recurrent variant (c.5763-1050A>G) present in 13 of the 235 patients. Only 22% (8/36) of the ASO-amenable variants had been previously classified in ClinVar as pathogenic or likely pathogenic (Extended Data Fig. 3a); 50% (18/36) had been previously classified as likely benign, uncertain significance or conflicting interpretations of pathogenicity; and 28% (10/36) were not reported in ClinVar (as opposed to around 17% for non-ASO-amenable variants). These findings suggest that ASO-amenable variants are often misinterpreted and underrepresented in ClinVar.

Fig. 3: Probably ASO-amenable variants.
figure 3

Cases indicates the number of individuals with the variant in the ATCP cohort of 235 individuals. The homozygous DUSP16 pseudogene insertion event (Supplementary Table 3) is not shown. For the full list of probably ASO-amenable variants, see Supplementary Table 9. SRE, splicing regulatory element; ΔMES, change in MaxEntScan score when a given variant is introduced.

The majority (26/36, 72%) of the probably/possibly ASO-amenable mutational events were deep intronic, defined as being located more than 20 nucleotides away from an exon (Fig. 1b). These included six SV events that were fully contained in an intron (Supplementary Fig. 4 and Supplementary Tables 3 and 9). Restricting the analysis to exons and the flanking intronic regions (20 or more nucleotides from splice junctions), the yield of ASO-amenable variants was only 4% (1%, probably; 3%, possibly). Hence, WGS (9%, probably; 6%, possibly) provides a 3.5-fold (counting all amenable variants; 35 cases versus 10 cases) to 7-fold (counting only probably amenable variants; 22 cases vs. 3 cases) higher ‘ASO therapeutic yield’ than whole-exome sequencing (WES; Fig. 1a).

We used RNA sequencing (RNA-seq) or experimental minigene splicing assays to benchmark the accuracy of our predictions. For two probably ASO-amenable variants (c.7865C>T, c.5763-1050A>G), patient fibroblasts were available for RNA-seq; for nine additional probably/possibly ASO-amenable variants, gDNA samples were available, and experimental minigene splicing assays were successfully established (for more details about minigene assays, see Methods and Supplementary Tables 1012). All tested variants (11/11) yielded the predicted mis-splicing consequences (for RNA-seq, see the following sections; for minigene assays, see Extended Data Fig. 5). Furthermore, we conducted ASO screens for six probably ASO-amenable variants (two based on patient fibroblasts and four based on minigene assays), and successfully identified ASOs capable of correcting mis-splicing for all six (see the following sections for patient-fibroblast-based screens, Fig. 4 for minigene-based screens and Supplementary Table 9 for a summary).

Fig. 4: Validation of ASO amenability.
figure 4

ac, For four probably ASO-amenable variants for which a minigene-based splicing assay is robustly established (a, c.2839-579_2839-576del (GGTAA>G) and c.2839-581G>A; b, c.6348-986G>T; and c, c.3994-159A>G), small-scale ASO screening was performed, which showed that the mis-splicing events caused by all tested variants can be rescued by ASOs. NT, NT-22 (a non-targeting ASO). Blue asterisks, white (or black) arrows and white (or black) ‘x’ marks indicate bands validated by Sanger sequencing (orange asterisk indicates a band not validated by Sanger sequencing). ΔMES, change in MaxEntScan score when a given variant is introduced. For gel source data, see Supplementary Fig. 1.

Overall, four ASO-amenable variants were recurrent within the cohort: c.5763-1050A>G, c.2250G>A, chr11:108243936-108243949insAlu and c.2639-22_2639-20del were found in thirteen, five, two and two individuals, respectively (Supplementary Table 9). In addition, several different ASO-amenable variants had splicing consequences that appeared addressable with a single ASO; for instance, both c.2839-579_2839-576del and c.2839-581G>A result in the inclusion of the same pseudoexon (Fig. 3). We therefore subdivided variants into ‘treatment groups’, each potentially addressable with a single ASO drug. On the basis of these patterns, around 70% (24/35) of amenable individuals were predicted to be treatable with a total of five different splice-switching ASOs, whereas developing splice-switching treatments for all 35 individuals would require 15 distinct therapeutic ASO drugs (Extended Data Fig. 4b and Supplementary Table 9).

Selection of variants for ASO development

Two ASO-amenable ATM variants were selected for further development. The first was c.5763-1050A>G (Fig. 3). The variant, located deep in intron 38, results in the inclusion of a 137-bp pseudoexon, thereby causing a frameshift in the resulting mature product26. This variant is associated with a mild A-T phenotype owing to partial leakiness of its gain-of-splicing effect. It is a founder variant in the UK, with an estimated disease allele frequency of 18% in individuals with A-T in the UK (refs. 27,28). In the ATCP cohort, composed predominantly of individuals from the USA, it was found in a compound heterozygous state in 13 unrelated patients, representing a population frequency of 5.5% (13/235) and a disease allele frequency of 2.8% (13/469).

The second variant was c.7865C>T (Fig. 3). This variant has been previously identified in the homozygous state in patients with classical (severe) A-T. A lymphoblastoid cell line with this variant had no residual protein or enzymatic activity29 (A. M. R. Taylor, personal communication); that is, this variant is a null variant. This variant is predicted to have a benign coding effect (p.Ala2622Val; predicted benign by REVEL), with a pathogenic effect that is mediated by mis-splicing: it creates a strong splice donor site within exon 53 (of 63 exons), causing truncation of the exon by 64 bp, which results in frameshift and subsequent premature translational termination. Inhibition of this splice donor site with a morpholino oligonucleotide has been previously shown to rescue the cellular phenotype of a patient-derived cell line30.

c.7865C>T was encountered in one individual in the ATCP cohort (DDP_ATCP_520, currently 20 years old). Separately, we also identified a second, younger (one year old at referral; currently six years old) child with A-T with this variant as well (Extended Data Fig. 6a and Supplementary Note 4). Whereas most patients with A-T are diagnosed after the initial onset of symptoms (Supplementary Table 1), this child was diagnosed as an infant on the basis of an abnormally low T cell receptor excision circle (TREC) count. TREC assays are used in newborn screening to identify infants at risk for severe combined immune deficiency (SCID), but have also incidentally identified cases of A-T (ref. 31). Exome sequencing in this child showed compound heterozygosity for two ATM variants: c.7865C>T and c.8585-13_8598del (confirmed by trio Sanger sequencing; Supplementary Table 6). The latter is a 27-bp deletion at the intron–exon junction of exon 59, strongly predicted to cause complete loss of function (Supplementary Note 4). This combination of variants predicted a classical, early-onset A-T phenotype (Fig. 1d).

Correcting c.7865C>T and c.5763-1050A>G

A fibroblast cell line was established from the younger patient with c.7865C>T. PCR with reverse transcription (RT–PCR) and RNA-seq analysis of splicing patterns in this cell line showed an abnormal truncation of exon 53 owing to premature splice donor site usage, consistent with previous studies30 (Extended Data Fig. 6b,c). Analysis using allele-specific PCR primers (designed to exclude the non-target c.8585-13_8598del allele) showed that ATM mis-splicing by c.7865C>T is complete, without detectable leakiness (Extended Data Fig. 6d).

To attempt to correct the mis-splicing, we designed an initial set of 12 ASOs, targeting either the novel splice donor site in the exon (AT001-AT011; Fig. 5a and Supplementary Table 13) or a nearby predicted splice regulatory element (AT012; Extended Data Fig. 7a). Several of these ASOs were effective in restoring normal splicing on allele-specific RT–PCR in patient fibroblasts (Supplementary Table 14), with two designs (AT007 and AT008) showing the most promise. Both ASOs were 22 nt long and were offset only by 3 nt, revealing an efficacy hotspot. A set of 20 additional ASOs was designed against this region and tested for the mis-splicing rescue in a secondary screen (Extended Data Fig. 7b), revealing additional ASOs with an efficacy similar to that of AT008.

Fig. 5: Preclinical development of AT008 (atipeksen).
figure 5

a, Design and screening of ASOs targeting c.7865C>T. Biologically independent experiments (independent transfections) were conducted (n = 3, initial; n = 2, fine-tuning). For some ASOs in the fine-tuning screening, the first two letters are omitted. Error bars, mean ± 95% confidence interval (shown only for conditions with n ≥ 3). A one-sample two-tailed t-test was used to assess statistical significance; means were compared to a constant value of 0 because no background normal splicing was observed in cells that were mock-transfected or transfected with non-targeting ASOs. *P < 0.05; **P < 0.01. AT010, *P = 0.0441; AT004, *P = 0.0240; AT001, *P = 0.0453; AT002, **P = 0.0010; AT005, **P = 0.0093; AT006, **P = 0.0060; AT007, **P = 0.0001; AT008, **P = 0.0082. Four top-performing ASOs (blue letters) were selected for further validation (b,c). For RT–PCR gel source data, see Supplementary Fig. 1. b, ASO-mediated restoration of irradiation-induced ATM signalling in patient fibroblasts, measured by immunoblotting. 07, 08, 22 and 26 represent AT007, AT008, AT022 and AT026, respectively; NT, NT-22 (a non-targeting ASO). pP53, phospho-P53; pKAP1, phospho-KAP1. Biologically independent experiments (independent transfections) were conducted: pP53 (n = 2, hypomorphic cases ± irradiation; n = 4, AT022, AT026; n = 5, the other conditions), pKAP1 (n = 4, AT007 and AT022; n = 5, the other conditions). Error bars, mean ± 95% confidence interval (shown only for conditions with n ≥ 3). A two-sample (comparing each condition to NT-22) two-tailed t-test was used for statistical analysis. *P < 0.05; **P < 0.01. For pP53, AT007, **P = 0.0024; AT008, **P = 0.0001; AT022, **P = 0.0471. For pKAP1, AT008, *P = 0.0201; AT022, *P = 0.0175; AT026, **P = 0.0073. Representative blot images are shown in Extended Data Fig. 8a. For blot source data, see Supplementary Fig. 1. c, ASO-mediated restoration of irradiation-induced ATM signalling in patient fibroblasts, measured by immunofluorescence staining. Scale bar, 50 μm. For a quantitative summary of the complete results, see Extended Data Fig. 8b.

Four ASOs (AT008, AT007, AT022 and AT026) were selected for further validation. To determine whether the ASOs could rescue ATM cellular function in addition to splicing, we studied their ability to restore cellular defence responses to radiation, as measured by phosphorylation of P53 and KAP1, two key downstream effectors in the ATM-mediated DNA damage response pathway32. Immunoblotting experiments using antibodies against phospho-P53 and phospho-KAP1 showed ASO-mediated rescue of radiation-induced ATM signalling in patient cell lines (Fig. 5b and Extended Data Fig. 8a). Partial retention of ATM kinase activity has been shown to be associated with increased median age of diagnosis and age of death, and later onset of neurological symptoms33. In particular, after ASO treatment, the levels of phospho-P53 and phospho-KAP1 exceeded those measured in fibroblasts bearing the hypomorphic c.5763-1050A>G variant, supporting the potential clinical relevance of this rescue.

AT008 (22 nt) and AT026 (20 nt) were selected for additional characterization. Both ASOs were capable of rescuing radiation-induced P53 and KAP1 activation by immunocytochemistry (Fig. 5c and Extended Data Fig. 8b), confirming the previous immunoblot results. Dose–response analysis showed that AT008 and AT026 had half-maximum effective concentration (EC50) and half-maximum inhibitory concentration (IC50) values in the nanomolar range (Extended Data Fig. 8c), consistent with previously reported ASOs for clinical use1, but AT008 was three times more potent than AT026. AT008 also showed substantially greater suppression of abnormal splicing compared to AT026 in RNA-seq experiments from treated patient fibroblasts (52% versus 28% correction for AT008 versus AT026, versus 0–6% in controls; Extended Data Fig. 8d). Both ASOs were also found to have the undesirable effect of increasing exon 53 skipping (although we noted that ASOs that target intronic splicing silencers could potentially avoid exon skipping, AT012, designed to explore this strategy, did not show any promising activity; Extended Data Fig. 7). Nevertheless, even adjusting for this undesirable effect, AT008 and AT026 boosted the levels of functional ATM transcript from 0–5% (untreated patient fibroblasts) to 29% (AT008) and 18% (AT026), in addition to restoring P53 and KAP1 phosphorylation as described above. In silico analyses showed that AT008 and AT026 had clean off-target profiles in the human genome (no binding sites within exons, or within introns up to 5 kb from the nearest exon, allowing end-trimming of up to 6 nt and internal mismatches of up to 2 nt; Supplementary Figs. 5 and 6). These off-target profiles compared favourably with nusinersen and milasen, two clinically administered intrathecal ASOs (Supplementary Figs. 7 and 8). In vitro cellular toxicity assays showed that AT008 is as tolerable as a non-targeting control ASO (Supplementary Fig. 9).

Parallel experiments were performed for the c.5763-1050A>G variant. A fibroblast cell line was established from a patient with A-T from the ATCP cohort (DDP_ATCP_42) and used to confirm the mis-splicing consequences of c.5763-1050A>G (Fig. 3 and Extended Data Fig. 9). ASOs were designed to block the pseudoexon usage associated with this allele, and screening in patient fibroblasts successfully identified a lead ASO that was capable of rescuing ATM function (Extended Data Figs. 10 and 11, Supplementary Note 5 and Supplementary Figs. 9 and 11).

Individualized trial for ATM c.7865C>T

The c.7865C>T-targeting ASO AT008 (renamed atipeksen) was selected for further clinical development. It was chosen because of the association of c.7865C>T with severe disease (classical A-T), the robustness of atipeksen-mediated RNA and cellular functional rescue and the opportunity for early therapeutic intervention before the onset of major neurological morbidity in the previously identified young child with this variant. (Note that the other identified individual with this variant, DDP_ATCP_520, was not considered to be a suitable candidate for clinical intervention owing to the advanced stage of the disease).

Following appropriate good manufacturing practice (GMP) and animal safety studies (previously described1), and in accordance with draft guidance from the US Food and Drug Administration (FDA; FDA-2020-D-2199, FDA-2021-D-0320, FDA-2021-D-1139 and FDA-2021-D-1140), the child was enrolled in an expanded access trial to determine whether atipeksen might prevent or slow disease progression. Intrathecal injection was chosen rather than intracerebroventricular injection after considering safety, pharmacokinetics and biodistribution, as outlined in Supplementary Note 8. Dosing was initiated at the age of two, with careful exploratory dose escalation followed by a maintenance regimen of 42 mg every three months (for dosing and assessments schedules, pharmacokinetic measurements, exploratory disease biomarkers and dose justification simulations, see Extended Data Fig. 12 and Supplementary Fig. 12). This is anticipated to be a multi-year trial, given that the most notable neurological decline in classic A-T occurs between the ages of 5 and 10 (refs. 34,35,36; at present, the patient is six years old). Investigational treatment has been well tolerated for three years so far with some encouraging signs (Supplementary Note 6).

Discussion

ASOs exemplify a class of therapeutic agents suitable for developing variant-targeted therapies, including those applicable to as few individuals as a single patient1. Albeit promising, these opportunities also present challenges1,6. Our results begin to shed light on the generalizability and scalability of using ASOs to ameliorate the effect of splice-switching genetic variants.

In our cohort, we found that 9% of individuals with A-T had at least one variant probably amenable to intervention with splice-switching ASOs, and an additional 6% had possibly ASO-amenable variants, totalling 15% of the cohort. Notably, reaching this fraction required the use of WGS, as most splice-switching ASO-amenable variants (72%, 26/36) were located within deep intronic regions that would have been missed by exon-based sequencing. Furthermore, a substantial fraction of the ASO-amenable intronic variants (23%, 6/26) were not SNVs or short indels but rather SVs or mobile DNA element insertions, highlighting the need for comprehensive variant analytics. We note that upfront screening with RNA-seq in addition to WGS could be of additional benefit37 but could present some complexities (for example, logistical challenges collecting high-quality RNA samples) or other limitations (for example, the inability to assess genes not expressed in accessible tissues). Certainly, RNA-seq as a follow-up test is crucial for variants flagged as being potentially ASO-amenable.

These results provide a useful first estimate (9–15%) of the prevalence of this class of variants with potential therapeutic implications in the recessive genetic disorder A-T. Through a meta-analysis of published literature38,39, we also found that individuals with inherited retinal disease caused by recessive ABCA4 deficiency (Stargardt disease or cone rod dystrophy) are likely to have ASO-amenable variants in similar proportions (Supplementary Note 7 and Supplementary Tables 15 and 16), leading us to suggest that around 15% is a reasonable first estimate for other recessive genetic conditions. Of course, conditions with unusual genetic architecture, such as spinal muscular atrophy (SMA), will pose exceptions: although SMA is caused by recessive variants in SMN1, all individuals with SMA can be treated via ASO- or small-molecule-mediated splice modulation40,41 because all patients also have an evolutionarily acquired, pathogenic splice ‘variant’ in SMN2, a gene that can substitute for SMN1. Another example is familial dysautonomia, in which almost all cases are associated with a founder variant that causes a tissue-specific splicing defect42 and may be amenable to splice modulation rescue43. With respect to dominant loss-of-function conditions, because these are associated with one pathogenic variant per individual, not two, one can expect the prevalence of splice-switching ASO-amenable variants to be roughly half (that is, about 5–8%).

Within the cohort of 235 individuals with A-T, we identified 35 individuals with ASO-amenable variants, and successfully conducted small-scale ASO screening for several probably amenable variants. ASOs successfully rescued mis-splicing for each of the tested variants, providing evidence of the utility of our predictive taxonomy. Splice-switching ASOs that are capable of rescuing functional ATM deficits in patient cell lines were successfully developed for two recurrent ATM variants. One of these is being tested in a pilot first-in-human investigational trial, which has been underway for the past 36 months.

We note that it might be possible to treat around 70% of these individuals with as few as 5 splice-switching ASOs, whereas treating all 35 individuals would require 15 distinct drugs (Extended Data Fig. 4b). This illustrates both the opportunities and the challenges of developing individualized splice-switching ASOs under a traditional ‘one-drug-at-a-time’ model. These opportunities and challenges will apply to other variant-specific treatment strategies (for instance, CRISPR editing), underscoring the need for innovative, platform-based scientific and regulatory approaches to unlock their potential.

Overall, our study provides a framework for how individualized genetic therapy might work, using genome sequencing and analysis to efficiently identify variants that might be amenable to splice-switching intervention, in time to design, test and deploy an appropriate ASO. If successful, these approaches could someday encourage a re-evaluation of which genetic findings are considered ‘actionable’. However, it is worth emphasizing that this approach at present remains investigational, and contains uncertainty and risk that have to be weighed carefully in a clinical context. Key logistical, regulatory, economic and ethical considerations remain as well1,6,44,45 (FDA draft guidance: FDA-2020-D-2199, FDA-2021-D-0320, FDA-2021-D-1139 and FDA-2021-D-1140). Continued study of patient-customized ASO therapies in serious genetic illness, facilitated by this framework, will be important to gather the body of evidence necessary to support the viability of this approach.

Methods

Patients

The WGS and clinical data of 235 patients with A-T were provided by the Global A-T Family Data Platform of ATCP. Our access to the data was approved by the Data Access Committee of ATCP.

Selected patients with A-T enrolled at the Manton Center for Orphan Disease Research under the approval of the Institutional Review Board (IRB) at Boston Children’s Hospital (10-02-0053). These included the individual with c.7865C>T who has been treated with AT008 (atipeksen), and individuals in the ATCP cohort, who were enrolled for WGS variant call validation by Sanger sequencing and mis-splicing validation by minigene assay and RNA-seq. gDNA samples extracted from the saliva of patients were provided by the Broad Institute. Whole-blood samples were provided by their physicians through the ATCP foundation, and RNA samples were extracted from these.

Functional studies using the cells derived from patients and their families were performed after obtaining appropriate consent under the auspices of an IRB-approved protocol maintained by the Manton Center for Orphan Disease Research Gene Discovery Core at Boston Children’s Hospital. Skin fibroblasts were derived from a 2-mm punch biopsy taken from the patient’s skin using explant culture. Fibroblasts were maintained and passaged in medium containing DMEM (Fisher Scientific) supplemented with 10% fetal bovine serum (Gibco). Fibroblasts used in experiments were under passage 20.

Variant calling

WGS reads were aligned to GRCh38/hg38 using BWA (v.0.7.17) (ref. 46) and pre-processing and quality control were performed according to GATK Best Practice Workflows12. Multiple computational tools were used to call diverse types of variants, including GATK-HaplotypeCaller (v.3.5) (ref. 12), VarScan2 (v.2.4.4) (ref. 13) and Strelka2 (v.2.9.10) (ref. 14) for SNVs and short indels (less than 50 nt in length; Supplementary Table 2) and Delly (v.0.8.6) (ref. 15), Pindel (v.0.2.5b8) (ref. 16), MELT (v.2.2.2) (ref. 17), and xTea (v.0.1.7)18 for SVs (50 or more nt in length; Supplementary Table 3), with MELT and xTea used specifically for transposons. For large CNVs with imprecise boundaries, we manually inspected read alignments on Integrative Genome Viewer (IGV; v.2.8.9) (ref. 47) to determine rough boundaries of the variants.

Relatedness

To analyse relatedness among the 235 individuals in the ATCP cohort, we used VCFtools (v.0.1.17) with the ‘relatedness2’ option48 (Supplementary Table 1), which is based on the KING software package49. For individuals with the annotated relatedness information in the clinical record, all annotations were consistent with the inferred relatedness.

Variant effect prediction

For variant effect prediction, VEP (release 100)50 was used to annotate all SNVs and short indels. Protein-coding impact was evaluated using REVEL21 and experimental evidence of protein functionality in the literature (Supplementary Table 7); mis-splicing impact was evaluated using MaxEntScan24, SpliceAI20 and LaBranchoR25. LaBranchoR-predicted branchpoint coordinates on GRCh37/hg19 were downloaded. After converting them to the GRCh38/hg38 coordinates (using LiftOver), their potential overlap and distance to variants were examined. For variants shorter than 50 nucleotides, the allele frequencies were looked up in gnomAD (v.3.1) (ref. 51) and TOPMed (freeze 8; ref. 52). For SVs, the allele frequencies were looked up in dbVar53, DGV54 and gnomAD SVs (v.2.1) (ref. 55). The clinical significance of variants was looked up in ClinVar (as of 2 June 2020; ref. 56). All disease candidate SV events were confirmed by manually inspecting the raw sequencing data on IGV.

ACMG classification

Disease candidate variants were classified using a five-tiered system in accordance with the guidelines outlined by ACMG19,57. For predicted loss-of-function variants, we used specialized ACMG recommendations to apply the PVS1 criteria58. For CNVs, we used a specialized scoring framework proposed by ACMG and Clinical Genome Resource (ClinGen)59.

Determination of disease candidate variants

First, we defined SDVs. These include frameshift, stop-gain, start-loss and splice-site-destroying variants, and structural variants affecting one or more exons, as well as missense variants and short in-frame indels with previous experimental evidence of functional loss. Splice-site-destroying variants were defined as: (SpliceAI donor/acceptor loss score ≥ 0.1 at a canonical splice site) AND (MaxEntScan donor/acceptor score with the ALT allele < MaxEntScan donor/acceptor score with the REF allele) AND [(MaxEntScan donor/acceptor score with the ALT allele < 2) OR (MaxEntScan donor/acceptor score with the ALT allele < 0.3 × MaxEntScan donor/acceptor score with the REF allele)]. All SDVs were considered as disease candidate variants. Second, variants that were annotated as pathogenic or likely pathogenic in ClinVar were considered as disease candidate variants.

For the patients in whom fewer than two disease candidate events were identified in the previous two steps, we analysed the remaining variants in each patient on the basis of the population and cohort allele frequencies. We filtered out variant calls whose population or cohort allele frequencies are higher than that of c.5932G>T (p.Glu1978Ter); this variant has the highest allele frequency in this ATCP cohort among the variants annotated as pathogenic in ClinVar. It has gnomAD v.3.1 and ATCP cohort allele frequencies of 0.0000349045 and 0.034 (16/470), respectively. For the variant calls that had passed the allele frequency filter, their protein-coding and splicing impacts were examined on the basis of multiple computational tools: REVEL (for protein-coding impacts) and SpliceAI and MaxEntScan (for splicing impacts). Missense variants that were predicted as pathogenic by REVEL (score ≥ 0.5) were considered as disease candidate variants. Mis-splicing events with a SpliceAI score of 0.1 or higher were considered as likely true events. If the consequence of the mis-splicing is predicted to result in frameshift or loss of a crucial domain of the protein, the variant that caused the mis-splicing was classified as a disease candidate variant. For the patients in whom fewer than two disease candidate events were identified up to this step, we reviewed the remaining variants on a case-by-case basis (Supplementary Note 2).

Sanger sequencing validation of a subset of disease candidate variants was performed using available patient gDNA samples. The PCR protocol comprised 10 ng template DNA, 10 µl KAPA2G Robust HotStart ReadyMix (2X; Kapa Biosystems), 1 µl site-specific primer pairs (10 µM), and PCR-grade water to a final volume of 20 µl. The cycling parameters were 94 °C for 3 min; 30 cycles of 94 °C for 15 s, 60 °C for 15 s, 72 °C for 15 s; 72 °C for 3 min; and held at 4 °C. Validation primers are listed in Supplementary Table 4. All PCR amplicons were visualized on 2% agarose gels. Variants and corresponding genotypes were confirmed by Sanger sequencing (Supplementary Table 5).

Phasing of disease candidate variants

Trio Sanger sequencing

Trio Sanger sequencing was performed on the family of the patient (with c.7865C>T) who has been under treatment with atipeksen, as well as on five individuals in the ATCP cohort (four families; DDP_ATCP_42 (with c.5763-1050A>G), DDP_ATCP_218, DDP_ATCP_38/39, DDP_ATCP_96). In all six cases, we confirmed with Sanger sequencing that the two disease candidate variants in each case are in trans (Supplementary Tables 1 and 6).

Homozygosity

In 32 cases (32 families), disease candidate variants were found to be homozygous. In five other cases (five families), disease candidate variants appeared homozygous owing to being in trans with a deletion at a locus overlapping the variants (Supplementary Table 1).

Read-based phasing

When the distance between two disease candidate variants is shorter than the read length, the two variants can be phased using read-based phasing methods. We used WhatsHap (v.1.0) (ref. 23), a read-based phasing tool, to analyse such cases, and found that in two cases (one family), the two disease candidate variants were in trans. These two variants were only 62 bp apart and were also confirmed by manual inspection of the raw sequencing data on IGV (Supplementary Table 1).

Variant co-occurrence

The gnomAD variant co-occurrence database can be used to predict that the two variants are likely to be in cis or in trans60. If two variants are in the same haplotype (that is, in cis), they tend to appear in the same individual. This analysis could be performed only for individuals whose two disease candidate variants are represented in the gnomAD database (v.2.1.1, in GRCh37/hg19 coordinates) at a global allele frequency of higher than 0% and less than 5%. A total of 47 individuals (38 families) in the ATCP cohort met these criteria. The analysis showed that 2 disease candidate variants are highly likely to be on different haplotypes in all of the 47 individuals (Supplementary Table 1).

ASO amenability taxonomy

General rules

(1) If a variant damages both a canonical splice site and protein-coding function at the same time, more severe damage is considered as the representative damage of the variant. (2) Solid experimental evidence on mis-splicing or coding impact of a variant, if available, can override computational predictions. For a schematic illustration of the taxonomy, see Fig. 2.

Damage to canonical splicing

(1) Severe: (i) SpliceAI donor/acceptor loss score at a canonical splice site ≥ 0.1, (ii) MaxEntScan donor/acceptor score with the ALT allele at the site < MaxEntScan donor/acceptor score with the REF allele at the site, AND (iii) [MaxEntScan donor/acceptor score with the ALT allele at the site < 2] OR [MaxEntScan donor/acceptor score with the ALT allele at the site < 0.3 × MaxEntScan donor/acceptor score with the REF allele at the site].

(2) Moderate: (i) NOT severe (as defined above), (ii) SpliceAI donor/acceptor loss score at a canonical splice site ≥ 0.1, AND (iii) [MaxEntScan donor/acceptor score with the ALT allele at the site < MaxEntScan donor/acceptor score with the REF allele at the site, MaxEntScan donor/acceptor score with the ALT allele at the site ≥ 2, AND MaxEntScan donor/acceptor score with the ALT allele at the site ≥ 0.3 × MaxEntScan donor/acceptor score with the REF allele at the site] OR [The variant is ≤3 nt away from the LaBranchoR-predicted branchpoint OR the distance between the LaBranchoR-predicted branchpoint and the site is changed by >3 nt by the variant].

(3) No to little: NEITHER severe NOR moderate (as defined above).

Damage to protein-coding function

(1) Severe: (i) frameshift, stop-gain, or start-loss variant OR (ii) missense variant predicted as pathogenic by REVEL (score > 0.5).

(2) No to little: (i) NOT severe AND (ii) synonymous variant or missense variant predicted as benign by REVEL (score ≤ 0.5).

Mis-splicing type

(1) Gain of mis-splicing (gain): (i) SpliceAI donor/acceptor gain score at a non-canonical site ≥ 0.1 AND (ii) MaxEntScan donor/acceptor score with the ALT allele at the site ≥ 2.

(2) Exon skipping or intron retention (skipping or retention): SpliceAI donor/acceptor loss score at any canonical site ≥ 0.1 without an accompanying gain of mis-splicing by SpliceAI (donor/acceptor gain score < 0.1 at any non-canonical splice site).

(3) Neither: NEITHER gain, skipping, NOR retention.

Minigene assay

Plasmid construction

To generate a minigene, we used the pSpliceExpress plasmid, which was a gift from S. Stamm (Addgene plasmid 32485; http://n2t.net/addgene:32485; RRID: Addgene_32485; ref. 61). The genomic fragment with a variant of interest was cloned into the pSpliceExpress donor vector using the BP recombination reaction. The inserted fragments for reference and alternative alleles were generated by a two-step PCR procedure. In the first round of PCR, the genomic region of interest was amplified from patient gDNA with attB tagged primers, which added 12 nucleotides of the attB1 and attB2 sites to the ends of amplicons. The second PCR reaction used the first PCR products as templates and extended them to contain complete attB sequences using universal adapter primer pairs. All PCR reactions were performed with Phusion Hot Start II DNA polymerase (Thermo Fisher Scientific) or PrimeSTAR GXL DNA polymerase (Takara Bio). Primer sequences used for minigene constructions were listed in Supplementary Table 10. Full attB PCR products were purified using the PureLink PCR Purification Kit or PureLink Quick Gel Extraction Kit (Invitrogen). Gateway BP Clonase II Enzyme Mix (Invitrogen) was used to recombine attB PCR products into pSpliceExpress. In brief, approximately 25 fmol (1 kb PCR product is 0.65 ng fmol−1) of purified attB PCR product was added to 75 ng of donor vector, TE buffer and 1 µl of BP Clonase Enzyme Mix to a final reaction volume of 5 µl. The reaction was incubated at room temperature for 1 h, after which 0.5 µl Proteinase K was added to stop the reaction. One microlitre of each BP Clonase reaction product was transformed into 25 µl OneShot TOP10 Chemically Competent Escherichia coli (Thermo Fisher Scientific). Transformed E. coli was spread on LB agar plates with ampicillin (1× LB agar with 50 µg ml−1 ampicillin) and incubated overnight at 37 °C. To screen for positive colonies containing the desired plasmids, a dozen colonies for each variant were picked up and diluted in 50 µl sterile water. Subsequently, colony PCRs were performed using Phusion Hot Start II DNA polymerase (Thermo Fisher Scientific), followed with 2% agarose gel inspection. The cycling programme was: bacteria were lysed and DNA was denatured at 98 °C for 10 min, followed by 30 cycles of 98 °C for 10 s, optimal annealing temperature for 20 s and 72 °C for 30 s, and final extension for 5 min at 72 °C. Primer sequences used for colony PCR are listed in Supplementary Table 10. Positive colonies were inoculated in liquid LB with ampicillin (1× LB and 50 µg ml−1 ampicillin) and were cultured in a shaking incubator at 275 rpm at 37 °C for 12–18 h. Plasmid DNA was extracted from overnight cultures using PureLink Quick Plasmid Miniprep Kit (Invitrogen) or ZR plasmid Miniprep Kit (Zymo Research). The genotypes and the sequences of plasmid inserts were confirmed by Sanger sequencing (Supplementary Table 11). At least one wild-type and one mutant plasmid were identified for each variant.

In some variants, full attB PCR products could not be amplified directly from patient gDNA owing to low quality or unavailability of the patient gDNA. In these cases, a wild-type fragment was amplified from human male gDNA (Promega) and used to construct reference plasmids as described above. The Q5 site-directed mutagenesis kit was used to introduce the variants into the reference plasmids (New England Biolabs). Twenty-five-microlitre PCR reactions were set up with mutagenic primers (Supplementary Table 10) and Q5 Hot Start High Fidelity 2X Master Mix to introduce the variant into the reference plasmids and amplify the mutant plasmids. The samples were denatured at 98 °C for 30 s and subjected to 25 cycles of 98 °C for 10 s, 50–72 °C (various annealing temperatures were tested) for 10 to 30 s, 72 °C for 20–30 s per kb, followed by a final extension at 72 °C for 2 min. The linear PCR products were ligated into the plasmid through DpnI restriction digestion and ligation. The mutant plasmids were transformed into competent E. coli. Single colonies were screened and inoculated in liquid LB and ampicillin. Plasmid DNA was collected from overnight cultures.

Splicing assay

Around 1 × 105 HEK293T cells were seeded in 24-well plates. When the cells reached about 90% confluency, they were transfected using Lipofectamine 3000 (Thermo Fisher Scientific). For each transfection, 4 µl of plasmid was added to each well along with 1.5 µl Lipofectamine, 2 µl P3000 and 50 µl Opti-MEM (Thermo Fisher Scientific). For some transfections, ASOs were also added at a final concentration of 200 µM. Twenty-four hours after transfection, total RNA was extracted using the PureLink RNA Mini Kit (Invitrogen). RNA was then reverse-transcribed into cDNA in a 4-μl total reaction consisting of 3 µl RNA and 1 µl of SuperScript IV VILO Master Mix (Thermo Fisher Scientific). The reverse transcription reactions were incubated at 25 °C for 10 min, 50 °C for 10 min and 85 °C for 5 min. To detect transcripts transcribed from the transfected plasmids, 1 µl cDNA was amplified using Phusion Hot Start II DNA polymerase (Thermo Fisher Scientific), 2× KAPA SYBR Fast qPCR Master Mix (Kapa Biosystems) or 2× KAPA HiFi HotStart ReadyMix (Kapa Biosystems). For primers, we used rat insulin primers that bind to the minigene exons flanking the inserted ATM gene region (Supplementary Table 10). The final PCR products were run and visualized on 2% agarose gel. Mis-splicing bands were extracted using the PureLink Quick Gel Extraction (Invitrogen) and confirmed by Sanger sequencing (Supplementary Table 12).

Quality control

If the amount of the canonical splicing isoform represented less than 50% of the total amount of all ATM isoforms, we disqualified and excluded the minigene assay plasmids for further analysis. We found that some of the plasmids bearing the ATM gene region did not express the normally spliced isoform even without any variant, which makes them unsuitable to assess the mis-splicing effects of variants. Therefore, we excluded them from the analysis. The minigene assay plasmids carrying the ATM gene contexts of two variants (c.3489C>T [in DDP_ATCP_138] and c.4801A>G [in DDP_ATCP_302]) did not pass this criterion as they showed predominant skipping of the exon of interest even in the absence of the variant of interest in the ATM gene region of the plasmids.

ASO development

ASOs

For c.7865C>T, a total of 32 ASOs were designed (12 for the initial screening and 20 for the fine-tuning screening). The ASOs were designed to be complementary to either the region encompassing the novel splice donor site in exon 53 created by c.7865C>T or predicted splice silencers surrounding the exon 53 canonical splice donor site. These silencers were predicted on the basis of a previously published hexamer-based model62. For c.5763-1050A>G, a total of 27 ASOs were designed (12 for the initial screening and 15 for the fine-tuning screening) to be complementary to the regions encompassing the novel splice donor site in intron 38 created by c.5763-1050A>G, the cryptic acceptor site of the pseudoexon in intron 38 or predicted splice silencers within the pseudoexon (also predicted on the basis of the hexamer model). For minigene-based validation of ASO amenability, a total of 24 ASOs were designed for 4 ASO-amenable variants (c.2839-579_2839-576del, c.2839-581G>A, c.6348-986G>T and c.3994-159A>G). The ASOs were designed to block either the splice donor/acceptor site or predicted exonic splicing silencers within a pseudoexon of interest. NT-20 and NT-22 (non-targeting oligonucleotides with the same chemistry) were used as negative controls1. For in vitro toxicity testing, ASO-tox, a gapmer with known toxicity, was used. All ASO sequences and detailed chemical modifications of ASOs are provided in Supplementary Table 13. All ASOs were manufactured by Microsynth. The ASO drug substance used in the atipeksen N-of-1 clinical trial was manufactured by ChemGenes in accordance with GMP guidelines.

ASO screening

Fibroblasts were transfected with 200 nM ASOs using Lipofectamine 3000 (Thermo Fisher Scientific). Twenty-four hours after transfection, total RNA was isolated using PureLink RNA Mini (Invitrogen). cDNA synthesis using oligo-dT and random hexamers was performed using the Superscript VILO reverse transcriptase kit (Invitrogen). For allele-specific PCR, primers were designed to specifically exclude the non-target allele in each patient (Extended Data Figs. 6d and 9c and Supplementary Table 14). For c.5763-1050A>G, the distance between the two ATM variants was too far (around 2 kb) to distinguish the two bands representing normally and abnormally spliced products (which differ by 137 bp) on a agarose gel; therefore, a nested PCR was performed. PCR was performed using 1 µl of cDNA and a standard condition (35 cycles; 98 °C for 5 s, 60 °C for 15 s, 72 °C for 45 s). Relative quantities of the normally and abnormally spliced transcripts were measured by 1.5% agarose gel electrophoresis and densitometry analysis using ImageJ.

ASO validation

Immunoblotting

Fibroblasts were transfected with 400 nM ASO as described above. Forty-eight hours after transfection, cells were irradiated with 10 Gy using a caesium-137 source, and then incubated for 30 min at 37 °C. Cell lysates were then collected using RIPA buffer (Boston Bioproducts) supplemented with Roche PhosSTOP (Sigma-Aldrich). Lysates were incubated with 4× Laemmli buffer (BioRad) and loaded onto 4–15% precast gradient protein gels (BioRad) and separated by electrophoresis. Protein samples were then transferred to PVDF membranes, which were subsequently incubated overnight with primary antibodies for phospho-P53 (Cell Signaling Tech, diluted 1:500) and phospho-KAP1 (Bethyl Lab, diluted 1:1,000). GAPDH was used as a loading control and primary antibody for GAPDH (Proteintech) was diluted to 1:250. Following incubation with secondary antibodies that were diluted to 1:5,000 for phospho-P53, phospho-KAP1 and GAPDH (Li-Cor), targets were visualized with the Li-Cor Odyssey system and quantified with densitometry analysis (ImageJ).

Immunocytochemistry

Fibroblasts were transfected with 200 nM of ASOs as described above. Forty-eight hours after transfection, cells were irradiated with 1.5 Gy using a caesium-137 source, and then incubated for 60 min at 37 °C. Cells were washed in PBS, fixed in 4% (w/v) paraformaldehyde and permeabilized with 0.1% (w/v) Triton X-100 in PBS at room temperature. Cells were then incubated overnight in PBS with 3% BSA and antibodies to phospho-P53 (Cell Signaling Tech) and phospho-KAP1 (Bethyl Lab) and were visualized with immunoglobulin G Alexa Fluor conjugates (Life Technologies). DNA was counterstained with Hoechst 33342. Images were collected with the ImageXpress Micro microscope (Molecular Devices) and processed with MetaXpress (Molecular Devices). The abundance of targets expressed in nuclei was quantified.

Dose–response

Fibroblasts were electroporated using the Neon Transfection System (Thermo Fisher Scientific) with varying amounts of ASOs: 0–1,000 nM (0, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1,000 nM; final concentrations). Twenty-four hours after electroporation, total RNA was isolated as described above. cDNA synthesis, RT–PCR, gel electrophoresis and densitometry were performed as described above.

RNA-seq

Fibroblasts were transfected with 200 nM of ASOs as described above. Forty-eight hours after transfection, total RNA was isolated as described above. RNA-seq libraries were prepared using the KAPA Hyper Prep kit (KAPA Biosystems). Sequencing was performed on an Illumina HiSeq 2500 (for sequencing; 2 × 100 bp). For alignment, STAR (v.2.7.5c) (ref. 63) was used to map reads on GRCh38/hg38 in the paired-end, two-pass mode to yield BAM files that were sorted by chromosomal coordinates. Gene annotation was not provided to the alignment program to avoid any biased alignment favouring annotated splice junctions. The sorted BAM files were indexed using SAMtools (v.1.10) (ref. 64). IGV was used to draw sashimi plots, which showed the number of reads supporting splice junctions.

Off-target analysis

The following derivative sequences were computationally generated from the sequences of AT008 (atipeksen), AT026, AT056, nusinersen and milasen: (1) sequences with progressively trimmed ends, starting from the full-length ASO sequences down to 16 nt in length; (2) sequences with up to 2 nt mismatches; and (3) sequences with a 1-nt internal insertion or deletion (Supplementary Figs. 58 and 11). BWA (v.0.7.17) (ref. 46) was used to align the generated sequences on GRCh38/hg38 and the RefSeq transcriptome sequences, downloaded from the UCSC Genome Browser.

In vitro ASO toxicity assay

An FITC Annexin V Apoptosis Detection Kit I (BD 556547, BD Biosciences) was used to quantitatively measure the percentage of cells undergoing apoptosis after transfection with ASOs at different concentrations as described above. Cells were collected, washed with PBS and resuspended in 1× binding buffer four days after transfection. Five hundred microlitres of the resuspended cells was stained with 5 µl of Annexin V-FITC and 5 µl propidium iodide (PI) in the dark at room temperature for 15 min. The cells were analysed using a flow cytometer (BD FACSAria III system) and were quantified by FlowJo software. The Annexin-V-positive and PI-negative fraction was ‘early apoptotic’, and the Annexin-V-positive and PI-positive fraction was ‘late apoptotic or necrotic’.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.