Introduction

Primary antibody deficiency (PAD) is considered the most frequent form of primary immunodeficiency (PID), with a prevalence of about 1 in 600 in the general population.1,2 The clinical picture is highly variable (ranging from asymptomatic to severe) and includes infection, autoimmunity, allergy, lymphoproliferation, enteropathy, and malignancy. The wide spectrum of immunological presentations of PAD includes B-cell lymphopenia, agammaglobulinemia, hypogammaglobulinemia, immunoglobulin (Ig) isotype deficiencies, hyper-IgM phenotype (HIgM), specific antibody deficiency, and a transient form of humoral immunodeficiency.3,4 Several pathogenic variants in genes that play key roles in B-cell activation, proliferation, differentiation, class-switch recombination, somatic hypermutation, and apoptosis have been identified in PAD patients; however, the etiology remains unknown in a majority of the patients.5,6,7

Positive predictive values of clinical algorithms for identifying PAD range from 19.1% in patients with hypogammaglobulinemia to 33.3% in patients with HIgM in the Medicaid database,8 indicating the necessity of a correct genetic diagnosis. Besides confirming the clinical diagnosis, a molecular diagnosis also plays a pivotal role in the identification of new genetic defects, presymptomatic diagnosis, treatment decisions, prognosis prediction, and family counseling.9 Moreover, the clinical presentation of pathogenic variants in the known PID genes varies with the severity of the pathogenic variant, the protein domain involved, and the presence of modifying genes or environmental factors.10

Advances in next-generation sequencing (NGS) methods allow an unbiased approach to obtain a correct diagnosis in patients with PAD. Targeted NGS panels with several hundred known PID genes may provide a first screening step, thus improving the classical approach of the sequencing of selected candidate genes.11 Nevertheless, this method is not sufficiently efficient due to the heterogeneous nature of these diseases, resulting in a clinical sensitivity of 15–40% in PID patients.12,13,14,15 Moreover, the majority of previous studies have only focused on patients with common variable immunodeficiency (CVID),16,17 and the molecular basis of a considerable proportion of patients with other forms of PAD remains unknown, particularly in high-frequency disorders such as selective IgA deficiency (IgAD) and IgG subclass deficiency.18,19,20 The aim of this study was to sequence the exomes of a substantial number of PAD patients with unidentified genetic defects and to investigate the impact of a molecular diagnosis on the clinical diagnosis and subsequent clinical management.

Materials and methods

Detailed methods, including study design, clinical/immunologic phenotyping, exome sequencing and variant assessment, and statistical approach, are described in the Supplementary Methods. Informed consent (including explanations about the risks and benefits of research-based NGS) for the performed evaluations was obtained from all diagnosed patients (Supplementary Table S1) and their relatives, according to the principles of the ethics committee of the Tehran University of Medical Sciences.

Data availability

Data for various analyses are mentioned throughout the text and derived data supporting the findings of this study are available from the corresponding author upon request. Any other data associated with this study are available in the Supplementary Data.

Results

Demographic features of undefined PAD patients

Among all registered PAD patients (n = 545, 27.2% of the PID registry), molecular defects were found in 49 individuals with agammaglobulinemia and 28 patients with HIgM using conventional genetic methods, whereas 342 patients were deceased or unavailable for the molecular investigation during the study period (Supplementary Table S2 and Fig. S1). The 126 remaining available patients (70 males, 56 females) from 109 unrelated kindreds were classified as undefined PAD and enrolled for exome sequencing (ES, Table 1). Although the national registry encompasses all age ranges of PAD patients, most patients were children and adolescents at the time of the study (52.3% were less than 18 years old) and parental consanguinity was recorded in 82.5%. The median age of the patients at onset of symptoms was 2 years (range 0.5–36 years; early-onset manifestation in 95.2%) and the median diagnostic delay (the gap between onset of the symptoms and diagnosis of PAD) was 4 years (range 0.25–39 years). Based on their immunologic profile, the undefined patients were classified as having CVID (n = 81), HIgM (n = 14), agammaglobulinemia (n = 11), IgAD (n = 11), specific antibody deficiency (n = 5), IgG subclass deficiency (n = 3), or IgM deficiency (n = 1). Of note, 10 patients progressed to a more severe form of PAD during the course of the disease. All patients had undergone clinical and immunologic phenotyping according to a standard classification for manifestations of the diseases. A summary of the results for all 126 patients is provided in Supplementary Table S3.
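The enrollment arithmetic in this paragraph can be checked with a short script. This is a minimal sketch that uses only the counts stated in the text; the variable and function names are illustrative and not part of the study.

```python
# Cohort accounting using only the counts stated in the text.
registered = 545
solved_conventionally = 49 + 28  # agammaglobulinemia + HIgM, conventional methods
unavailable = 342                # deceased or unavailable during the study period
enrolled = registered - solved_conventionally - unavailable

by_sex = {"male": 70, "female": 56}
by_phenotype = {
    "CVID": 81,
    "HIgM": 14,
    "agammaglobulinemia": 11,
    "IgAD": 11,
    "specific antibody deficiency": 5,
    "IgG subclass deficiency": 3,
    "IgM deficiency": 1,
}


def shares(counts):
    """Return each category as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The sex split and the phenotype split must both add up to the enrolled cohort.
assert enrolled == sum(by_sex.values()) == sum(by_phenotype.values())
phenotype_share = shares(by_phenotype)
```

Both breakdowns sum to the enrolled cohort, so the phenotype categories are mutually exclusive at enrollment.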

Table 1 Clinical characteristics of 126 primary antibody deficient patients

Molecular diagnosis outcome

ES analysis and subsequent confirmatory sequencing among the first-degree relatives of our patients resulted in a genetic diagnosis in 86 of the 126 probands (68.2%): 2 patients with variants in a gene newly implicated in disease (CD70), 37 patients with variants in known PID genes with newly identified phenotypes (19 unique genes), and 47 patients with pathogenic variants leading to the expected phenotypes (15 unique genes, Table 2). Experimental data and the results of functional assays on a selected group of this genetically diagnosed cohort have been published previously.21,22,23,24,25,26,27 The remaining 40 patients were classified as a group with nondefinitive pathogenic genetic variants.

Table 2 Diagnostic yield and summary of exome sequencing analysis in 126 primary antibody deficient patients

The majority of our patients were born to consanguineous parents and would thus be expected to carry an autosomal recessive (homozygous) defect. However, the mode of inheritance was judged to be recessive in only 23 of the 35 genes (65.7%), accounting for the genetic diagnosis in 58 patients (67.4%). Three genes were X-linked (8.5%) and 9 genes were assigned as autosomal dominant due to loss of function (6 genes; 17.1%) or gain of function (3 genes; 8.5%, Table 3). All disease-causing variants were pathogenic or likely pathogenic based on the American College of Medical Genetics and Genomics standards and were private or rare (Supplementary Methods). The types of all pathogenic variants are shown in Table 2 and Supplementary Table S4. Of note, large deletions of coding regions were identified in four patients carrying genetic defects within the LRBA and DOCK8 genes. Details of the genetic diagnosis of the different types of PAD patients are summarized in Supplementary Figure S2.
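The inheritance breakdown can be recomputed from the gene counts given in this paragraph. The sketch below uses only those counts; the dictionary keys are illustrative labels. Note that 3 of 35 is 8.57%, which the text reports as 8.5%.

```python
# Gene counts per inheritance mode, as stated in the text.
genes_by_mode = {
    "autosomal recessive": 23,
    "X-linked": 3,
    "autosomal dominant, loss of function": 6,
    "autosomal dominant, gain of function": 3,
}

total_genes = sum(genes_by_mode.values())
gene_share = {mode: 100 * n / total_genes for mode, n in genes_by_mode.items()}

# Both dominant mechanisms combined.
dominant_genes = sum(
    n for mode, n in genes_by_mode.items() if mode.startswith("autosomal dominant")
)

# Patients with a recessive diagnosis among the 86 solved patients.
recessive_patient_share = 100 * 58 / 86
```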

Table 3 Inheritance pattern of the 35 genes identified in the study

Genotypic and phenotypic correlation

We evaluated the genotype–clinical phenotype correlation in our cohort of PAD patients, especially for affected individuals within the same family. However, no significant correlation was observed, suggesting an effect of environmental factors and/or other modifier genes on the medical complications of the patients (Supplementary Table S5).

Given the immunologic heterogeneity of undefined PAD patients, we used the identified underlying gene defect to correlate the potential immunopathologic mechanisms with the point of arrest in B cell development using a B cell subset analysis. The Euro classification is shown for available patients in Supplementary Table S2 and the B cell pattern classification is illustrated in Supplementary Figure S3. An association between the affected gene and the pattern of abnormalities in the size of the B cell subsets was identified, mirroring the respective pathologic mechanism of the damaged molecule. Based on the five distinct B cell patterns observed, we could demonstrate that combined B cell production and germinal center defects (low numbers of transitional B cells and memory B cells) often represent DNA repair/recombination gene defects, being associated with an increased radiosensitivity and a mild form of combined immunodeficiency, involving the RAG1, DCLRE1C, DNMT3B, and ZBTB24 genes. Early peripheral B cell maturation or survival arrest (loss of naive mature, marginal zone–like, and memory B cells) is associated with pathogenic variants in TNFRSF13B and TNFRSF13C, which is in line with impaired baseline constitutive activation and subsequently impaired antiapoptotic signaling. Pathogenic variants in B cell receptor (BCR) associated genes (e.g., BTK, BLNK, and IGHM) show a phenotype of both B cell activation and proliferation defects (combined reduction of marginal zone–like and memory B cells). Isolated germinal center blockage (exclusive decrease in the number of memory B cells associated with a normal or high level of IgM) suggests gene defects in costimulatory molecules for T-dependent immunity (e.g., CD27, CD70, and ICOS), which modify signaling for class-switch recombination and somatic hypermutation. Finally, the pattern of postgerminal center impairment (defects in terminal plasma-cell maturation, survival, or homing), leading to an isolated reduction of long-term plasma cells, might be compatible with LRBA and XIAP deficiencies (Supplementary Table S3).
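The five B cell patterns and their associated genes can be written as a small lookup table. This is a sketch under stated assumptions: the class, field, and function names are hypothetical, the subset labels are shorthand for the descriptions in the paragraph, and the exact-match rule is a simplification. The study's classification rests on quantitative subset measurements (Supplementary Figure S3), which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BCellPattern:
    name: str
    reduced_subsets: tuple   # B cell subsets described as reduced in the text
    candidate_genes: tuple   # genes named in the text for this pattern


PATTERNS = (
    BCellPattern(
        "combined B cell production and germinal center defect",
        ("transitional", "memory"),
        ("RAG1", "DCLRE1C", "DNMT3B", "ZBTB24"),
    ),
    BCellPattern(
        "early peripheral maturation or survival arrest",
        ("naive mature", "marginal zone-like", "memory"),
        ("TNFRSF13B", "TNFRSF13C"),
    ),
    BCellPattern(
        "activation and proliferation defect",
        ("marginal zone-like", "memory"),
        ("BTK", "BLNK", "IGHM"),  # IGHM: mu heavy chain gene
    ),
    BCellPattern(
        "isolated germinal center blockage",
        ("memory",),
        ("CD27", "CD70", "ICOS"),
    ),
    BCellPattern(
        "postgerminal center impairment",
        ("long-term plasma cells",),
        ("LRBA", "XIAP"),
    ),
)


def candidate_genes(reduced):
    """Return the genes for the pattern whose reduced subsets match exactly.

    Exact set matching is an assumption made for illustration only; an
    unmatched combination returns an empty tuple.
    """
    wanted = frozenset(reduced)
    for pattern in PATTERNS:
        if frozenset(pattern.reduced_subsets) == wanted:
            return pattern.candidate_genes
    return ()
```

A table like this only narrows the candidate list; the genes given with "e.g." in the text are examples, not exhaustive sets.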

Clinical and immunological features of solved versus unsolved patients

In the 86 PAD patients with a genetic diagnosis, defects in genes that encode proteins involved in the postgerminal center survival pathway accounted for 19.7% of the total disease-causing etiologies, while defects in proteins involved in DNA repair and recombination pathways accounted for 17.4%. Defects in B cell receptor signaling (in 13.9% of patients) and in the PI3K signaling pathway (in 11.6% of patients) were also frequently observed.

The lowest diagnostic yield was obtained in patients with agammaglobulinemia (2 of 11 tested; 18.1%). We stratified the patients who underwent sequencing to determine the parameters associated with a diagnosis. Of note, consanguinity and the severity of clinical presentation were similar between those who had a molecular defect identified (n = 86) and those who did not (n = 40). However, the clinical diagnosis of agammaglobulinemia (p < 0.001), a late age of presentation (onset of disease >10 y; p = 0.03), and the absence of multiple affected family members (p = 0.01) were significantly more frequent in the patients who had no genetic defect identified. A genetic defect was identified in 90% of patients with a progressive form of PAD, suggesting a higher diagnostic yield in this subgroup than in other patients (p = 0.01). Among the 40 patients with nondefinitive pathogenic genetic variants, immunologic phenotypes were mainly compatible with a pattern of postgerminal center impairment (p = 0.03).
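The agammaglobulinemia comparison can be illustrated with a 2x2 table built from the counts above: 2 solved and 9 unsolved among the 11 agammaglobulinemia patients, against 84 solved and 31 unsolved among the other 115. The statistical test the authors used is described in the Supplementary Methods and is not stated here, so the choice of a two-sided Fisher exact test in this sketch is an assumption, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against float ties
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Rows: agammaglobulinemia, other phenotypes. Columns: solved, unsolved.
p_value = fisher_exact_two_sided(2, 9, 84, 31)
```

Under this assumed test, the result is of the same order as the p < 0.001 reported in the text.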

Clinical implications of molecular diagnosis

Our therapeutic approach was changed from Ig replacement to hematopoietic stem cell transplantation (HSCT) in 26 patients (20.6%), a selected group with atypical combined immunodeficiency (LRBA, DCLRE1C, RAG1, PRKCD, JAK3, PNP, CD27, and CD70 pathogenic variants). Regular screening for cancer and avoidance of malignancy triggers were added to the routine management of 15 patients (11.9%) with defects in their DNA repair system. More mechanistically precise treatment, such as supplementation of rapamycin in patients with PI3KR1, PI3KCD, and LRBA deficiencies, was initiated in 22 patients. The results of ES aided 49 families (38.8%) in family counseling, leading to the performance of prenatal diagnosis in 25 families (19.8%). In total, a correct genetic diagnosis affected the clinical treatment and management of 48.4% of probands in whom a pathogenic or likely pathogenic variant was identified.

Discussion

PAD is a group of clinically and genetically heterogeneous disorders, necessitating a wide molecular approach for a definitive diagnosis.28 Although there is not yet a consensus, recent genetic diagnostic studies on undefined PAD patients have tried to include all known PID genes in this subgroup of PID patients using NGS.

The success rate of this approach was reported to be 23.5% in 34 UK CVID patients,17 30% in a US CVID cohort of 50 patients,16 and 41.6% in 36 patients in a multinational antibody deficiency cohort29 (Supplementary Table S5). The diagnostic yield in these three NGS investigations was close to the results of targeted gene panel testing (15–40%, with a lower time consumption and cost for the latter)12,13,14,15 due to inadequate computational analysis for copy-number variation (CNV) and lack of utilization of the gene discovery power of NGS. Moreover, the rate of consanguinity in these three NGS studies was less than 10%.16,17 Because the majority of PID genes cause autosomal recessive diseases, the high percentage of consanguineous marriages (82.5% in the current study) simplifies the analysis of the NGS data by increasing the likelihood that the disease-causing variants are homozygous. A genetic defect was identified in 68% of the patients in our cohort, which represents the highest published diagnostic yield to date. The pattern of inheritance was also autosomal recessive in two thirds of the patients (mainly due to LRBA, DNMT3B, and ZBTB24 deficiencies), in contrast to less consanguineous, well-known PAD cohorts in Western countries with a high frequency of autosomal dominant diseases, e.g., Germany (Center for Chronic Immunodeficiency, 25% solved mainly due to NFKB2, CTLA4 and NFKB1 deficiencies, personal communication with Dr. Bodo Grimbacher, 2017) or UK (UKPID Registry, 27% solved mainly due to TACI, PI3KCD, and PI3KR1 deficiencies, personal communication with Dr. Sinisa Savic, 2017). The male excess in our baseline population suggests that X-linked disorders account for approximately 7% of the patients. Taken together with the increasing trend toward the discovery of autosomal dominant PID genes in recent years,30 the finding of autosomal dominant defects in 24.4% of our highly consanguineous cohort is a cautionary note to consider different Mendelian patterns of inheritance. Furthermore, five male patients with X-linked disorders had consanguineous parents. Intriguingly, among the 40 undiagnosed patients, 29 (72.5%) had parental consanguinity.

We would not have been able to find the disease-causing variant in 40% of currently solved patients if we had used targeted sequencing alone, utilizing a list of known genes associated with PAD.18 Therefore, a high-throughput genomic approach should be performed as a first screening step for patients with PAD due to the overlap of clinical phenotypes derived from distinct genotypes. In line with this notion, our investigation resulted in an expansion of the clinical spectrum of pathogenic variants in 19 genes. We identified a late age of onset, the absence of affected family members, unsolved agammaglobulinemia, and an immunologic profile of postgerminal center impairment as four factors associated with a lower yield of identified genetic variants. The yield for molecular diagnosis was highest in the progressive forms of PAD.

Of note, BTK, BLNK, and µ heavy chain deficient patients with atypical presentations of hypogammaglobulinemia and normal peripheral B cell counts, mimicking a CVID-like phenotype, indicate the potential problem of using a conventional genetic approach, which is based on the Ig profile and lymphocyte counts. Although several efforts have been made during the past decade to classify PAD based on different clinical phenotypes and Ig profiles, our results complement other studies suggesting an early and comprehensive genetic strategy for all PAD patients. Reduced penetrance or variable expressivity was observed in our patients with LRBA deficiency (presenting as agammaglobulinemia, HIgM and CVID-like phenotypes) and PI3KR1 deficiency (presenting as HIgM and CVID), indicating an extension of the PID phenotype spectrum. Moreover, there is considerable immunologic heterogeneity in individuals with exactly the same gene defect within families. This finding is consistent with progression of different forms of PAD as we identified the causative genetic defect in four cases with progression of IgAD to CVID (P8, P35, P67, P68), two patients with progression of HIgM to agammaglobulinemia (P32, P43), two patients with progression of IgG subclass deficiency to CVID (P84, P85), and one patient with progression of CVID to agammaglobulinemia (P34). Nonetheless, in contrast to the Ig profile, the B cell developmental pattern showed a robust association with the disease associated genes. This general observation emphasizes the importance of filtering variants in collaboration with the treating clinicians and immunologists.

On the other hand, patients with hypomorphic pathogenic variants in severe combined immunodeficiency associated genes (e.g., RAG1, JAK3, PRKDC, and DCLRE1C) and incomplete and atypical presentations of syndromic disorders (e.g., associated with ZBTB24, DNMT3B, DKC1, and TTC7A pathogenic variants) illustrate the need for careful assessment of PAD patients to provide a reliable prognosis and to initiate appropriate treatment. Among the atypical patients, there was an IgA deficiency patient with a PNP pathogenic variant, whose clinical and immunologic profile has been described previously.23 Recently, another study has reported a 13-year-old patient with late-onset PNP deficiency caused by a homozygous missense pathogenic variant (p.A117T) and diagnosed with hypogammaglobulinemia at the age of 10,31 suggesting that residual PNP activity in patients with hypomorphic pathogenic variants can lead to an atypical presentation, as in deficiency of adenosine deaminase, another enzyme important for purine degradation and salvage.32

Of note, a subgroup of patients presented with pathogenic variants in genes associated with the hyper IgE syndrome (HIES). A large deletion in DOCK8 was identified in a 7-year-old female with early-onset upper respiratory tract infection and mild eczema/food allergy and an immunologic phenotype resembling hypo IgM with low marginal zone and low memory B cell pattern. Due to the importance of DOCK8 in controlling both actin cytoskeleton-dependent and -independent immune responses, several humoral immune abnormalities have been reported in these patients including elevated, normal, or decreased levels of IgG and IgA, but nearly always IgE levels are elevated and IgM levels are reduced.33 The serum level of IgE was not compatible with HIES (between 87 and 150 IU/ml) in our patient, similar to another case with large exons 1–2 deletion of DOCK8.34 Three patients with different forms of PAD also carried PGM3 pathogenic variants. Deleterious defects in PGM3 affect glycosylation with a broad spectrum of clinical features. However, hypomorphic pathogenic variants with a residual function can only affect cell–cell recognition and immune signaling partially. Similar to our findings, among approximately 40 reported PGM3 deficient patients, the absence of B cells (p.G340del in 3 patients35,36) and hypo IgM phenotype (p.D502Y, in 1 patient35) have also been reported previously as well as normal levels of IgE in several patients.37 Of note, recent glycoproteomic studies have shown that PGM3 pathogenic variants do not affect the IgE molecule directly. However, tri-/tetra-antennary glycans of B cells of patients show significant biochemical changes. Therefore, elevated IgE, as well as other humoral immune dysregulation may be caused by decreased specific glycans on other glycoproteins that are involved in Ig production or receptor recognition.38

Establishing a correct differential diagnosis list for PAD including a broader panel of PID genes is also suggested by our results, as pathogenic variants in RAC2 (a congenital phagocyte defect gene affecting neutrophil motility), NLRP12 and MVK (autoinflammatory disorder genes affecting the inflammasome), STAT2 (a gene involved in intrinsic and innate immunity), and STAT3 (a known gene involved in immune dysregulation) could manifest with an aberrant Ig profile. Among these patients, P28, with an IL12RB1 pathogenic variant, was a male who received bacillus Calmette–Guérin (BCG) vaccination at birth without any complications and presented with hepatomegaly, autoimmune enteropathy, and transient oral candidiasis at the age of 6 months. At 2 years of age, he was diagnosed with IgG2 subclass deficiency when Klebsiella pneumoniae infection in the upper respiratory tract was detected. Lymphocyte transformation test and IFN-γ production after IL-12 stimulation showed a low normal result compared with age-matched healthy controls. These findings, and previous genetic causes reported to underlie antibody deficiency in other studies, urge us to expand the list of expected genetic candidates for PAD.

As demonstrated in our workup chart (Fig. 1), a stepwise clinical and molecular diagnosis in patients with different types of PAD is suggested. In a mixed group of PAD patients, patients with gene defects that are associated with other forms of PID should be detected by NGS. However, such patients are likely to exhibit several abnormal immune parameters in addition to a perturbed Ig profile (Fig. 2, Supplementary Fig. S4). In our suggested decision tree for the diagnostic workup, probable pathogenesis, and appropriate treatment choices, it is recommended to start with clinical/immune parameters (particularly the B cell subset pattern), whereupon it is possible to distinguish other or additional defects and, in some cases, a genetically defined defect. What is left from the cohort will be a group of idiopathic PAD patients that should be investigated for other probable etiologies, including modifier genes; defects in enhancer, promoter, and intronic regions or other structural abnormalities; low-grade mosaicism; epigenetic markers; and environmental susceptibility factors. Another potential factor that should be considered is inadequate coverage of the gene of interest (variants of low quality or with read support of fewer than 4 are usually filtered during the analysis process).
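The read-support filter mentioned at the end of this paragraph can be sketched as follows. Only the threshold of 4 supporting reads comes from the text. The quality cutoff, the record fields (`gene`, `alt_reads`, `quality`), and the placeholder gene names are assumptions made for illustration and do not describe the study's pipeline.

```python
MIN_ALT_READS = 4    # from the text: read support of fewer than 4 is filtered
MIN_QUALITY = 30.0   # assumed placeholder; the text gives no quality cutoff


def passes_filters(call, min_alt_reads=MIN_ALT_READS, min_quality=MIN_QUALITY):
    """Keep a variant call only if read support and quality reach the cutoffs."""
    return call["alt_reads"] >= min_alt_reads and call["quality"] >= min_quality


def split_calls(calls, **thresholds):
    """Separate calls into (kept, dropped) without discarding the dropped ones."""
    kept, dropped = [], []
    for call in calls:
        (kept if passes_filters(call, **thresholds) else dropped).append(call)
    return kept, dropped


# Hypothetical variant calls with placeholder gene names.
calls = [
    {"gene": "GENE_A", "alt_reads": 12, "quality": 58.0},
    {"gene": "GENE_B", "alt_reads": 3, "quality": 61.0},   # too few reads
    {"gene": "GENE_C", "alt_reads": 9, "quality": 12.0},   # low quality
]
kept, dropped = split_calls(calls)
```

Returning the dropped calls rather than discarding them reflects the point made above: a filtered variant in a gene of interest may indicate inadequate coverage rather than a true absence of a defect.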

Fig. 1: Clinical, immunological and genetic approach for a molecular diagnosis of primary antibody deficiency.

CNV copy-number variant, PID primary immunodeficiency, HLA Human leukocyte antigen.

Fig. 2: Candidate gene defects and pathological mechanisms in patients with primary antibody deficiency based on clinical and immunological phenotyping and their appropriate treatment modalities.

AH50 50% alternative hemolytic complement activity, KREC kappa-deleting recombination excision circle, CARD caspase recruitment domain, EBV Epstein–Barr virus, CH50 50% hemolytic complement activity, FFP fresh frozen plasma, G-CSF granulocyte-colony stimulating factor, HSCT hematopoietic stem cell transplantation, iNKT invariant NK-T cells, IgR immunoglobulin replacement therapy, IFN-γ interferon gamma, LPD lymphoproliferative disorder, TLR Toll-like receptors, TNF tumor necrosis factor, TREC T-cell receptor excision circle. References reviewed for compiling the gene list underlying dysgammaglobulinemia were refs.5,7,16,17,20,23

The main pathways involved in our PAD patients were the postgerminal center survival pathway and DNA repair signaling. We calculated and plotted the network of all 189 known PID-causing genes related to PAD using the human genome connectome-predicted direct biological distance between human genes. We found that PAD-related genes tend to be the central hub of the B cell activation and antigen receptor–mediated signaling pathways (Supplementary Fig. S5 and Table S7). The fitness of the newly identified gene, CD70, was tested and shown functionally close to previously known antibody deficiency related genes, mainly CD27, its unique ligand (Supplementary Fig. S6 and Table S8). The CD70 deficient proband (P11) manifested with an Epstein–Barr virus (EBV)-related lymphoproliferative disorder, a phenotype resembling three patients with CD27 deficiency (P7 with nodular sclerosis classical Hodgkin lymphoma, P8 with Hodgkin lymphoma, and P9 with disseminated infectious mononucleosis). As summarized in Supplementary Table S5, several families showed different types of antibody deficiencies with a similar pathogenic variant, possibly due to genetic or environmental modifying factors. Of note, another CD70 deficient patient (P10) showed only a severe viral infection in childhood and was seropositive for EBV. According to our previous observation that only 60% of patients with defects in CD27-CD70 signaling present hypogammaglobulinemia requiring Ig replacement therapy,22,25 P10 and P7 presented only specific antibody deficiency, indicating that complete humoral immune tests should be investigated in first-degree relatives of a PAD proband.

The additional capability of NGS for in silico analysis of CNVs provides further support for this approach as the first step in the genetic diagnosis of PAD. Lack of identification of disease-causing CNVs, particularly in compound heterozygous forms in cohorts with a large proportion of nonconsanguineous patients, could be a potential explanation for differences in diagnostic yield between our survey (4.6% of the solved cases harbored large homozygous deletions mainly within genes with a high content of transposable elements39) and other studies.16,17 The effect of severe clinical phenotype, use of homozygosity mapping, and even parental consanguinity could not explain the difference between the 40 unsolved and 86 solved patients. However, familial segregation analysis in multiple case families could increase the discovery rate, a fact that seems to be consistent among different studies with different settings.16,17,29

Genome sequencing enables advanced CNV detection (in the case of polymerase chain reaction free library preparation) and the targeting of all exons as well as deep intronic, regulatory, and structural intragenic regions (due to the absence of a capture step and reference biases). For the time being, however, ES seems to be the most cost-effective approach for PAD patients with unknown etiology, given our current understanding of monogenic disorders (enriched in the coding exome, the most conserved region of the genome across metazoans) and its lower analysis complexity and risk of secondary findings.10

Our follow-up on patients with a molecular diagnosis using NGS provided valuable guidance for the treating physicians toward appropriate clinical management, prenatal diagnosis, and targeted therapy (utilization of rapamycin, abatacept, and tocilizumab as new modalities in the treatment of PAD17). Prenatal diagnosis in childhood-onset patients has an important significance apart from decreasing the burden of diseases, because the parents and also the first-degree relatives of patients are still at childbearing age and may thus need genetic counseling for their next pregnancy. Furthermore, a selected group of patients with symptoms suggesting a combined immunodeficiency could potentially benefit from HSCT. Despite obvious advantages of a molecular diagnosis, it should be realized that the identified genetic pathogenic variant needs to be evaluated together with the clinical and immunological phenotypes of affected patients before making a decision on clinical management and medical implication.

We suggest that NGS could replace a conventional multistep genetic approach because it can be expanded to cover all known PID associated genes and potentially detect CNVs and new genes associated with PID. More efforts should be spent on improving the NGS processing timeframe and gene-capture coverage to integrate it in a rapid molecular diagnostic pipeline, including the confirmation of a positive newborn screening test. PAD, particularly CVID, is likely to be a collection of several genetically distinct disorders. Because not all the patients in our cohort could be shown to suffer from a monogenic disorder, the remaining patients should be further investigated for additional, nongenetic susceptibility factors.