INTRODUCTION

Huntington disease (HD, OMIM 143100) is an autosomal-dominant neurodegenerative disorder caused by CAG-repeat expansion within the HTT gene. HD is characterized by progressive cognitive, psychiatric, and motor symptoms.1 Initial motor signs define clinical disease onset,2,3 usually occurring in midadulthood.

Polymerase chain reaction (PCR)–based fragment sizing of CAG repeats is used for genetic confirmation in symptomatic individuals and for presymptomatic predictive testing in at-risk individuals.1 HTT alleles with ≤35 CAG repeats are usually4 nonpathogenic, but intermediate alleles (27–35 repeats) may undergo germline expansion to pathogenic sizes.5,6 Alleles with ≥40 CAG repeats are pathogenic within normal lifespans.

A recent study of reduced penetrance (RP) allele frequency in the general population estimated penetrance of 36–39 CAG-repeat alleles at <1% by age 65,7 markedly lower than the 63.9% penetrance by age 65 for repeat sizes 36–39 within clinically ascertained families.8

CAG length is the major determinant of age of onset (AOO), accounting for approximately 70% of the variation in AOO.9,10,11 AOO variance increases as CAG length decreases, and most individuals with CAG sizes of 36–39 (the RP range) will not manifest HD within a normal lifespan.10,12

Recently, a known variant13 LRG_763t1:c.118A>G, the loss of the CAA interruption penultimate to the 3′ end of the HTT CAG tract (Fig. 1a), was identified as a cis-acting modifier of AOO.14,15,16 This loss of interruption (LOI) variant is associated with significantly earlier AOO when present on pathogenic range HD alleles,14,15,16 with consistently early onset observed within carrier families.14 However, since previous studies included relatively few patients (i.e., n ≤ 21), the magnitude of this effect on AOO has remained uncertain.

Fig. 1: Structure of the LOI variant and effect on age of onset.

(a) Structure of the loss of repeat interruption (LOI) variant in the HTT CAG tract (Human Genome Variation Society [HGVS] nomenclature LRG_763t1:c.118A>G). Polyglutamine and polyproline-encoding codons are identified by blue and orange backgrounds, respectively. Interrupting sequence is marked by blue text, with variants shown in red. Here we represent the LOI alongside a polyproline variant that is found in linkage with the polyglutamine variant in >90% of cases (i.e., LRG_763t1:c.[118A>G;127A>G]). (b) Schematic representation of the misestimation of uninterrupted CAG tract length in diagnostic fragment sizing of LOI alleles, using an example case of a canonical repeat carrier or LOI carrier with a fragment size indicating 17 CAG repeats. This diagram shows that standard diagnostic fragment sizing primers still anneal to the LOI variant sequence, and produce a fragment whose size is identical to a sample with a canonical repeat interruption. This causes underestimation of uninterrupted CAG tract length by 2 repeats since CAG-repeat counts are inferred from fragment sizes based on the assumption of a canonical interrupting sequence. (c) Age of onset for patients with canonical repeat interruptions or loss of repeat interruption, expressed as years earlier than Langbehn-predicted onset for each patient’s CAG size. Predicted onset for LOI carriers is calculated both from diagnostic CAG-repeat number (i.e., results from fragment sizing assay), and from true uninterrupted CAG-repeat number (i.e., diagnostic repeat number + 2). P values indicate results of Wilcoxon rank-sum tests. (d) Age of onset for patients with canonical repeat interruptions or loss of repeat interruption, expressed as a ratio of actual onset age over Langbehn-predicted onset for each patient’s CAG size. 
Predicted onset for LOI carriers is calculated both from diagnostic CAG-repeat number (i.e., results from fragment sizing assay) and from true uninterrupted CAG-repeat number (i.e., diagnostic repeat number + 2). P values indicate results of Wilcoxon rank-sum tests. (e) Onset ages in a cohort of 49 LOI carriers (red dots, with exponential fit shown using dotted red line) compared with Langbehn-predicted mean onsets across diagnostic CAG sizes (blue line) and corrected uninterrupted CAG sizes (orange line).

Importantly, it remains unclear whether the LOI variant’s effect on AOO is a true biological effect or merely an artifact of diagnostic underestimation of the uninterrupted CAG length. The variant causes the loss of the interrupting CAA codon but does not cause diagnostic sizing failure, as standard primers still recognize the variant sequence. Consequently, although polyglutamine repeats are counted accurately, uninterrupted CAG tracts are underestimated by two trinucleotide repeats for LOI carriers, since CAG-length inference assumes canonical interrupting sequences (Fig. 1b). Since uninterrupted CAG repeats predict AOO better than polyglutamine repeats,14,15 it has been unclear whether this CAG-length underestimation fully accounts for the earlier onset in LOI carriers, or whether the variant has an additional biological effect beyond this underestimate.
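The distinction between polyglutamine length, diagnostic CAG size, and uninterrupted CAG length can be illustrated with a short sketch. This is a simplified model rather than a description of the fragment sizing assay itself: it assumes an in-frame sequence that starts at the first CAG codon, and it treats the diagnostic result as the polyglutamine codon count minus the two codons (CAA CAG) of an assumed canonical interruption. The function names and toy sequences are hypothetical.

```python
def split_codons(seq):
    """Split an in-frame DNA string into codons (a trailing partial codon is dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def polyglutamine_length(codons):
    """Count leading glutamine codons (CAG or CAA)."""
    n = 0
    for codon in codons:
        if codon not in ("CAG", "CAA"):
            break
        n += 1
    return n


def uninterrupted_cag_length(codons):
    """Count leading CAG codons, stopping at the first non-CAG codon."""
    n = 0
    for codon in codons:
        if codon != "CAG":
            break
        n += 1
    return n


def diagnostic_cag_size(codons):
    """CAG size as inferred when a canonical CAA CAG interruption is assumed."""
    return polyglutamine_length(codons) - 2


# Toy alleles (hypothetical): both encode 19 glutamines followed by two prolines.
canonical = split_codons("CAG" * 17 + "CAACAG" + "CCGCCA")
loi = split_codons("CAG" * 17 + "CAGCAG" + "CCGCCG")

for name, allele in (("canonical", canonical), ("LOI", loi)):
    print(name, diagnostic_cag_size(allele), uninterrupted_cag_length(allele))
```

Both toy alleles give the same diagnostic size, but the uninterrupted CAG tract of the LOI allele is two repeats longer, mirroring the example in Fig. 1b.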

We developed a novel PCR assay to identify LOI carriers, and screened DNA from all unrelated HD pedigrees in the University of British Columbia (UBC) HD Biobank, enabling unbiased estimation of variant frequency among affected patients. We also combined clinical data from patients newly identified through our screen with previously published data to better characterize the LOI variant’s effect on AOO. This represents a 2.3-fold increase in LOI carrier sample size over any published cohort.

In this report, we identify a true biological effect of the LOI on AOO beyond increasing uninterrupted CAG length. We demonstrate that the implications of the LOI are particularly relevant to persons with RP range CAG-repeat lengths. We therefore propose that this variant should be tested for and considered when interpreting predictive testing results, particularly for those in the RP range.

MATERIALS AND METHODS

Ethics statement

Samples and clinical data were collected with informed consent and ethical approval from the UBC Children’s and Women’s Health Centre Research Ethics Board (H06-70467, H06-70410) and the Ruhr University Bochum Medical Faculty Ethics Committee (18-6563-BR).

PCR assay for LOI variant detection

We optimized a two-step PCR assay for LOI detection. First, we amplified the region surrounding the CAG tract with Roche Taq polymerase (11146165001) and the following primers: forward 5′-GGGACGCAAGGCGCCGTAG; reverse 5′-GGCTGAGGAAGCTGAGGAG. Cycle conditions were as follows: 7 minutes denaturation (95 °C), 35 cycles (30 seconds at 95 °C, 30 seconds at 63 °C, 30 seconds at 72 °C), 7 minutes extension (72 °C). The forward primer contains a mismatch allowing selective amplification of the rs13102260 major allele (G), which reduces off-target amplification downstream.

We diluted these PCR products 1:10, then performed PCR using AmpliTaq Gold 360 DNA Polymerase (4398833), with 1 µL of diluted product per 25 µL PCR reaction. The forward primer sequence for this step (5′-GCAGCAGCAGCAGCCGCCG) is LOI-specific; reverse primer and cycle conditions were identical to the previous step.

We separated products on 3% agarose and confirmed positive samples by clonal Sanger sequencing as described.14

Cohort information

To determine LOI variant frequency in the symptomatic HD population, we assembled and screened genomic DNA samples from 491 unrelated affected HD probands. This cohort comprises DNA from the first affected family member with known CAG size and AOO from each UBC HD Biobank pedigree.

To estimate LOI frequency among symptomatic individuals in the RP range, we identified all affected individuals with reported allele sizes 36–39 from UBC (n = 37) and Bochum (n = 22) biobanks, and screened these 59 RP subjects for the LOI.

To determine the effect of the LOI on AOO, we compiled data from all available published LOI carriers (n = 16 individuals reported by Wright et al.;14 n = 21 individuals reported by GeM-HD15) and individuals newly identified in this study (n = 12), for a total of 49 individuals. GeM-HD CAG lengths and AOO for LOI carriers were extracted from published information using WebPlotDigitizer (https://automeris.io/WebPlotDigitizer/) based on harmonized results of three independent raters.

For comparison, we identified 473 symptomatic non-LOI carriers with known CAG sizes and AOO; this group excluded individuals with known sequence variants in the interrupting region, and individuals with CAG sizes >52 to match the upper CAG size of the LOI samples.

AOO statistical analyses

We predicted mean AOO for each CAG size according to Langbehn et al.10 and compared it with observed AOO, both in years earlier (predicted onset − observed onset) and as a ratio (observed onset ÷ predicted onset).

Because diagnostic sizing underestimates true uninterrupted CAG size in LOI carriers, we analyzed predicted AOO for LOI carriers using both diagnostic CAG size (from fragment analysis) and true uninterrupted CAG size (diagnostic CAG size + 2).
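The two comparison metrics, and the effect of the +2 correction, can be sketched as follows. The coefficients are those commonly quoted for the mean-onset curve of the Langbehn et al. model (mean AOO = 21.54 + exp(9.556 − 0.1460 × CAG)); they are reproduced here from memory and should be checked against the original publication before use. The function names and the example patient are hypothetical.

```python
import math


def langbehn_mean_aoo(cag):
    """Predicted mean age of onset for a CAG size.

    Coefficients as commonly quoted for Langbehn et al.; verify before use.
    """
    return 21.54 + math.exp(9.556 - 0.1460 * cag)


def years_earlier(observed_aoo, cag):
    """Predicted onset minus observed onset (positive = earlier than predicted)."""
    return langbehn_mean_aoo(cag) - observed_aoo


def aoo_ratio(observed_aoo, cag):
    """Observed onset divided by predicted onset (below 1 = earlier than predicted)."""
    return observed_aoo / langbehn_mean_aoo(cag)


# Hypothetical LOI carrier: diagnostic CAG size 40, observed onset at age 45.
diagnostic_cag = 40
uninterrupted_cag = diagnostic_cag + 2  # correction for the lost CAA interruption
observed = 45.0

for label, cag in (("diagnostic", diagnostic_cag), ("uninterrupted", uninterrupted_cag)):
    print(f"{label}: CAG {cag}, "
          f"years earlier = {years_earlier(observed, cag):.1f}, "
          f"ratio = {aoo_ratio(observed, cag):.2f}")
```

Because predicted onset falls as CAG size rises, the same observed onset always looks less premature against the corrected size than against the diagnostic size.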

We performed statistical analyses using R. We identified significant differences in AOO between genotypes using Wilcoxon rank-sum tests, and fitted a mean predicted AOO curve to the LOI data using an exponential model.
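For readers who want to see what the rank-sum comparison computes, the following is a minimal standard-library sketch. It is not the R code used for the analyses above: it uses the normal approximation with a tie correction and no continuity correction, so p values can differ slightly from an exact or continuity-corrected test. The function names and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())

    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u1, p


# Made-up "years earlier than predicted" values for two genotype groups.
loi_group = [12.0, 25.5, 8.0, 19.0, 30.0]
canonical_group = [1.0, -3.0, 4.5, 0.0, 6.0, -5.5]
u, p = rank_sum_test(loi_group, canonical_group)
print(f"U = {u:.1f}, p = {p:.4f}")
```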

RESULTS

The LOI variant is present in ~1% of symptomatic HD patients

Of 491 unrelated symptomatic patients (CAG 37–88), we identified 5 LOI carriers, putting the variant frequency at 1.02% (95% CI: 0.13–1.91%) within the symptomatic population. In all cases, the LOI was on the expanded allele. This cohort includes all unrelated probands in the UBC Biobank with available CAG size and AOO data.
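The reported frequency and interval are consistent with a normal-approximation (Wald) confidence interval for a proportion. The text does not name the interval method, so treating it as a Wald interval is an assumption; the function name is hypothetical.

```python
import math


def wald_ci(k, n, z=1.96):
    """Proportion k/n with a normal-approximation (Wald) confidence interval."""
    p = k / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


p, lo, hi = wald_ci(5, 491)  # 5 LOI carriers among 491 probands
print(f"{100 * p:.2f}% (95% CI: {100 * lo:.2f}-{100 * hi:.2f}%)")
# prints "1.02% (95% CI: 0.13-1.91%)"
```

With only five carriers, the Wald interval is a rough approximation; a Wilson or exact binomial interval would be asymmetric around the point estimate.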

LOI carriers exhibit significantly earlier onset across all uninterrupted CAG sizes

We compared AOO in 49 individuals carrying the LOI variant to predicted mean AOO using the Langbehn formula.10 Based on predicted onset according to diagnostic CAG sizing, LOI carriers present on average 20.4 years earlier than expected (Fig. 1c). After correcting for CAG size underestimation to reflect true uninterrupted CAG sizes, LOI carriers still showed a mean of 9.5 years earlier onset (Fig. 1c). LOI carriers showed significantly earlier AOO based on either diagnostic (p = 2.2 × 10⁻¹⁶) or uninterrupted CAG size (p = 2.8 × 10⁻¹²).

To eliminate biases caused by averaging across individuals with different CAG sizes, we determined AOO ratios for all individuals (actual onset age ÷ expected onset age). AOO ratio significantly differed between canonical repeat interruption carriers (AOO ratio = 0.96) and LOI carriers, based either on diagnostic CAG size (0.68, p = 2.2 × 10⁻¹⁶) or uninterrupted CAG size (0.81, p = 3.1 × 10⁻¹⁰) (Fig. 1d).

To determine whether CAG size–dependent effects drive hastened onset, we plotted LOI carriers by CAG size (Fig. 1e). Mean AOO is earlier than Langbehn-predicted onset based on either diagnostic or uninterrupted CAG size across all repeat sizes examined (36–50).

The LOI is enriched among early onset and symptomatic RP range cases

Individuals in our screening cohort with AOO within the 10th percentile for their CAG size showed 5.5-fold enrichment for the LOI (Fig. 2a).

Fig. 2: Prevalence of the LOI variant in HD patient sub-populations.

(a) Prevalence of the loss of interruption (LOI) variant among symptomatic Huntington disease (HD) populations as a proportion of total patients carrying the LOI allele. Prevalence is presented for the overall HD symptomatic population, among cases with onset earlier than predicted for diagnostic CAG size (<10th percentile), within the reduced penetrance (RP) allele range (36–39), and among cases in the RP allele range with onset earlier than predicted for CAG size (<10th percentile). (b) Proportion of patients carrying the LOI at each CAG size within the RP range (diagnostic CAG sizes 36–39).

We also assembled a cohort of symptomatic RP allele carriers, comprising all symptomatic individuals with diagnostic CAG-repeat lengths of 36–39 (n = 59). This cohort was markedly enriched for the LOI, which was present in 32.2% of these symptomatic RP allele carriers, and 86% of those with diagnostic CAG sizes of 36 or 37 (Fig. 2a, b). Among symptomatic RP allele carriers with AOO within the 10th percentile for their CAG size, 77.8% had the LOI (Fig. 2a).

DISCUSSION

Diagnostic CAG sizing underestimates true uninterrupted CAG size for LOI carriers

Individuals at risk for HD who request predictive testing frequently cite reasons involving making informed choices about their future, including education or careers.17 These decisions are likely influenced by information these individuals receive regarding their likelihood of presenting symptomatically with HD.

It is crucial to recognize that uninterrupted CAG tract lengths are underestimated by two CAG repeats in LOI carriers (Fig. 1b). For LOI carriers, information on potential AOO will be less accurate for individuals with shorter CAG tracts, in whom two additional uninterrupted repeats constitute a larger percentage of total CAG count. In some cases, underestimation by two CAG repeats means LOI carriers actually have 40 or 41 repeats, but are mistakenly informed they fall within the RP range. In our cohort, 15.6% of symptomatic patients with diagnostic CAG sizes 38–39 were inaccurately informed that they carry an RP allele.
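The reclassification described above can be made concrete with a small sketch. The category boundaries follow the ranges given in the Introduction (27–35 intermediate, 36–39 RP, ≥40 fully penetrant); labeling sizes of 26 or fewer as "normal" and applying the boundaries to the corrected uninterrupted size are assumptions made for illustration, and the function names are hypothetical.

```python
def classify_allele(cag):
    """Assign an allele category from CAG size."""
    if cag <= 26:
        return "normal"
    if cag <= 35:
        return "intermediate"
    if cag <= 39:
        return "reduced penetrance"
    return "full penetrance"


def uninterrupted_cag(diagnostic_cag, has_loi):
    """Add the two repeats missed by diagnostic sizing in LOI carriers."""
    return diagnostic_cag + 2 if has_loi else diagnostic_cag


# Diagnostic RP-range sizes, before and after correction for an LOI carrier.
for diagnostic in range(36, 40):
    corrected = uninterrupted_cag(diagnostic, has_loi=True)
    print(diagnostic, classify_allele(diagnostic), "->",
          corrected, classify_allele(corrected))
```

Under these assumptions, LOI carriers with diagnostic sizes of 38 or 39 move out of the RP range once the uninterrupted length is used, while those with 36 or 37 remain within it.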

These inaccuracies have consequences beyond assessment of onset, including eligibility for clinical trials enrolling individuals within specific CAG size or CAG–age product score ranges.

LOI presence may influence penetrance of RP range alleles

Symptomatic RP range individuals have significant enrichment for the LOI variant (~30-fold more frequent than in fully penetrant alleles). The LOI is found in the vast majority of symptomatic RP range individuals with CAG sizes 36–37 (86%), or with AOO in the earliest 10th percentile for their CAG size (77.8%), suggesting that the LOI may influence whether RP range individuals develop HD during their expected lifetime. We examined all available premanifest RP subjects in the UBC Biobank (n = 59), but could not assess how long RP individuals lacking the LOI remain asymptomatic, because we lacked an adequate cohort of sufficiently old asymptomatic RP allele carriers. We hypothesize that RP range individuals with the LOI more frequently become symptomatic within a normal lifespan. This may explain why general population RP allele penetrance7 is lower than penetrance within clinically ascertained families.8
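The approximately 30-fold figure can be checked from the frequencies reported in the Results. The carrier count of 19 among 59 symptomatic RP allele carriers is inferred from the reported 32.2% and is not stated in the text, and the screening cohort of 491 probands (5 carriers) is used here as a stand-in for fully penetrant alleles; both are assumptions.

```python
def fold_enrichment(k_subgroup, n_subgroup, k_reference, n_reference):
    """Ratio of the carrier frequency in a subgroup to that in a reference group."""
    return (k_subgroup / n_subgroup) / (k_reference / n_reference)


rp_carriers, rp_total = 19, 59          # 19 is inferred from 32.2% of 59
screen_carriers, screen_total = 5, 491  # unrelated symptomatic probands

enrichment = fold_enrichment(rp_carriers, rp_total, screen_carriers, screen_total)
print(f"RP frequency: {100 * rp_carriers / rp_total:.1f}%")
print(f"Fold enrichment: {enrichment:.1f}")
```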

LOI screening during HD predictive testing could improve patient information

Screening for the LOI during predictive testing for HD would improve accuracy of CAG sizing results and relevance of information patients are given, particularly for persons with 36–39 repeats. As shown in Fig. 1c–e, correcting diagnostic sizing of LOI carriers to reflect true uninterrupted CAG sizes does not fully eliminate the discrepancy between actual and predicted onset in these patients. Individuals with a family history of RP alleles or earlier-than-expected onset are a particularly high priority for LOI screening.

Correcting CAG size reduces, but does not eliminate, the LOI variant’s effect on AOO, suggesting a CAG-independent biological effect

Correcting for CAG-repeat underestimation in LOI carriers does not eliminate the effect of the LOI on earlier AOO. Although the mechanism remains to be determined, we have previously proposed that the increased somatic instability observed in LOI alleles compared with alleles with canonical repeat interruptions may produce huntingtin protein with even longer polyglutamine lengths, causing greater damage and earlier onset.14 Repeat instability has been implicated as a modifier of numerous repeat expansion disorders, including HD,16,18 and is observed to increase in the absence of repeat interruptions,19 making this mechanism plausible. However, it is possible that the LOI acts through other mechanisms, such as translation efficiency alteration or increased repeat associated non-AUG (RAN) translation toxicity.20

In conclusion, three independent groups14,15,16 have observed that the LOI hastens AOO in small cohorts. Now, our combined and updated data set has elucidated LOI frequency and refined the magnitude of the effect on AOO. Finally, by developing and optimizing a novel LOI-specific PCR assay, we provide a reliable tool to offer routine screening to patients who could be impacted by inheritance of the LOI.