Abstract
Background:
Copy number variants (CNVs) have recently been reported to be associated with several autoimmune conditions. Moreover, loci involved in immunity are enriched in CNVs. Therefore, we hypothesized that CNVs in immune genes associated with Graves' disease (GD) may contribute to the etiology of disease.
Methods:
One hundred ninety-one North American Caucasian GD patients and 192 Caucasian controls were analyzed for CNVs in three major immune regulatory genes: CD40, PTPN22, and CTLA-4. Copy number was determined using quantitative-PCR (Q-PCR) assays specifically designed for determining copy numbers in genomic DNA. Additionally, a well-characterized CNV in the amylase gene was typed in a separate dataset of DNA samples that were derived from cell lines or blood.
Results:
No CNVs could be confirmed in the CD40 and CTLA-4 genes, even though a CD40 CNV is cataloged in the Database of Genomic Variants. Only the PTPN22 CNV was confirmed in our cohort, but it was rare and appeared in only two individuals. A key finding was that the source of DNA has a significant effect on CNV typing. There was a statistically significant increase in amylase locus deletions in cell line-derived DNA compared to blood-derived DNA samples.
Conclusions:
We conclude that CNV analysis should be performed only using blood-derived DNA samples. Additionally, the CTLA-4, CD40, and PTPN22 loci do not harbor CNVs that play a role in the etiology of GD.
Introduction
So far, variants that were the focus of genetic studies in GD included single-nucleotide polymorphisms (SNPs) and microsatellites. However, recently it has been shown that copy number variants (CNVs) are also associated with complex diseases, including autoimmune diseases [reviewed in Ref. (10)]. CNVs are large DNA segments, ranging from kilobases to megabases, that are altered within the genome as a result of duplications, deletions, insertions, inversions, or complex combinations of rearrangements [reviewed in Ref. (11,12)]. Immune regulatory genes have been found to be enriched with CNVs, suggesting that CNVs in immune-related genes may predispose to autoimmune diseases [reviewed in Ref. (12,13)]. We hypothesized that CNVs in the three major immune-regulatory genes associated with GD, CTLA-4, CD40, and PTPN22, may play a role in the etiology of GD. Therefore, the aim of this study was to test CNVs in these three genes for association with GD.
Subjects and Methods
Subjects
The project was approved by the Mount Sinai Institutional Review Board. One hundred ninety-one North American Caucasian GD patients were studied. GD was diagnosed by (i) clinical and biochemical primary hyperthyroidism, (ii) diffuse goiter, and (iii) the presence of TSHR antibodies and/or a diffusely increased I-131 uptake in the thyroid. Of the 191 GD patients, 107 (56.0%) had ophthalmopathy. The average age of onset of GD was 40.4 years. Our controls consisted of 192 North American Caucasian individuals without personal or family history of thyroid disease. All controls had normal thyroid function and were negative for thyroid antibodies.
DNA purification
For the comparison of CNVs between blood- and cell line-derived DNA, 218 blood-derived DNA samples and 253 cell line-derived DNA samples were extracted using the Puregene kit (Gentra Systems, Minneapolis, MN). For the CNV analysis, 191 patient and 192 control blood-derived DNA samples were studied.
Analysis of copy number variants
To type CNVs in each DNA sample, we used TaqMan Copy Number Assays (Applied Biosystems, Foster City, CA): for CD40, Hs04041560_cn and Hs99999100_s1; for PTPN22, Hs07226371_cn; and for CTLA-4, Hs04714283_cn. For the amylase 2A CNV, we used TaqMan assay Hs04204136_cn. Quantitative PCR (Q-PCR) reactions were performed using the Applied Biosystems universal genotyping Master Mix in a 20 μL duplex reaction that contained FAM-MGB target gene probes, VIC-TAMRA reference gene probes (RNase P, part number 4403326), and 20 ng of genomic DNA. Amplification was performed on the Applied Biosystems 7300 real-time PCR machine with the following program: 2 minutes of incubation at 50°C, initial DNA polymerase activation at 95°C for 10 minutes, followed by 40 cycles of denaturation at 95°C for 15 seconds and annealing/extension at 60°C for 1 minute. Data were collected at the end of each 60°C step. All samples were run in quadruplicate and standard deviations were calculated. The 7300 software automatically excluded samples with large standard deviations based on a predesigned algorithm. The results were expressed as the threshold cycle (Ct), that is, the cycle number at which the PCR product crossed the threshold of detection.
The Copy Caller program (Applied Biosystems) was used to determine copy numbers of the tested gene in the samples. This program is based on the ΔΔCt method and derives copy number values as follows. Ct values are imported into the program and normalized to the reference gene (RNase P), generating a ΔCt value for each sample. Because all samples had unknown copy numbers, the program identified the ΔCt value obtained in the majority of the population and assigned it as the ΔCt of a sample with two copies; this is the expected value, since most individuals carry two copies at any given CNV locus, one inherited from each parent. This ΔCt value was then subtracted from the ΔCt values of all other samples, giving a ΔΔCt value for each. The ΔΔCt value was then used to calculate the relative quantification of the gene analyzed using the formula 2^(−ΔΔCt). A relative quantification value of 1 is equivalent to two copies, and the program uses this to calculate all other sample copy numbers. This method has been tested and proven to provide reliable results.
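The ΔΔCt arithmetic described above can be illustrated with a minimal Python sketch. It is not the Copy Caller implementation: the function name `copy_numbers` is hypothetical, the inputs are assumed to be per-sample mean Ct values, and the median ΔCt is used here as a simple stand-in for "the ΔCt value obtained in the majority of the population".

```python
from statistics import median


def copy_numbers(target_ct, reference_ct):
    """Estimate per-sample copy numbers from mean Ct values via the ddCt method.

    target_ct and reference_ct are equal-length lists of Ct values for the
    target gene and the reference gene (RNase P in the text).
    """
    # dCt: target Ct normalized to the reference gene, one value per sample.
    delta_ct = [t - r for t, r in zip(target_ct, reference_ct)]

    # Calibrator dCt, treated as the two-copy state. ASSUMPTION: the median
    # stands in for the majority value; the real software may differ.
    calibrator = median(delta_ct)

    estimates = []
    for d in delta_ct:
        ddct = d - calibrator      # ddCt relative to the two-copy calibrator
        rq = 2 ** (-ddct)          # relative quantification, 2^(-ddCt)
        estimates.append(2 * rq)   # RQ of 1 corresponds to two copies
    return estimates


# Illustrative (made-up) Ct values: the third sample amplifies one cycle
# later than the rest, which corresponds to a single copy.
example = copy_numbers([25.0, 25.0, 26.0, 25.0], [25.0, 25.0, 25.0, 25.0])
```

One cycle of delay halves the relative quantity, so a ΔΔCt of +1 maps to one copy, while a ΔΔCt of about −0.585 maps to three copies.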
Statistical analysis
The comparisons of the CNV typing between patients and controls were performed using the χ² test. We used EpiInfo 3.4.2 software (CDC, Atlanta, GA) for the statistical analyses. A p-value of <0.05 was considered statistically significant.
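As an illustration of this kind of comparison, the following Python sketch computes a Pearson χ² statistic (without continuity correction) and its p-value for a 2×2 table. It is a sketch under stated assumptions, not the Epi Info procedure: the function name `chi_square_2x2` is hypothetical, and the p-value relies on the identity that a χ² variable with one degree of freedom is a squared standard normal.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value) with one degree of freedom and no continuity
    correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")

    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: patients, controls. Columns: copy number change, two copies.
# Counts follow the PTPN22 result reported in this article
# (0 of 191 patients and 2 of 192 controls with a copy number change).
chi2, p_value = chi_square_2x2(0, 191, 2, 190)
```

With counts this small the expected cell frequencies fall well below 5, so an exact test would generally be preferred over the χ² approximation; the sketch is meant only to show the arithmetic.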
Power calculations
Power calculations were performed using CDC simulation software (Epi Info, Version 3.3.2; CDC). We assumed the lowest population frequency of the susceptible CNV to be 10% since we were analyzing only common CNVs. Our power calculations indicated that our dataset of 191 patients and 192 controls would give our study 80% power to detect a difference between the patients and the controls resulting in an odds ratio of >2.37 with an α of 0.05. Thus, our dataset gave us enough power to detect biologically significant CNVs.
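The reasoning behind such a power estimate can be sketched in Python with a normal approximation for comparing two proportions. This is an assumption-laden illustration rather than the Epi Info algorithm: the function name `power_two_proportions` is hypothetical, no continuity correction is applied, and the resulting value therefore need not match the 80% figure exactly.

```python
from statistics import NormalDist


def power_two_proportions(p0, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    p0 is the variant frequency in controls; the frequency in cases is
    derived from the odds ratio.
    """
    # Frequency in cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)

    # Standard error under the null (pooled) and under the alternative.
    p_bar = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = (p_bar * (1 - p_bar) * (1 / n_cases + 1 / n_controls)) ** 0.5
    se_alt = (p1 * (1 - p1) / n_cases + p0 * (1 - p0) / n_controls) ** 0.5

    z = (abs(p1 - p0) - z_alpha * se_null) / se_alt
    return std_normal.cdf(z)


# Parameters taken from the text: 10% frequency, odds ratio 2.37,
# 191 patients, 192 controls, alpha of 0.05.
power = power_two_proportions(0.10, 2.37, 191, 192)
```

Under these assumptions the estimated power is in the neighborhood of the stated 80%, and it rises as the assumed odds ratio increases.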
Results
Comparison of copy number variations in blood- versus cell line-derived DNA
Since Epstein-Barr virus (EBV) immortalization of B-cells has been shown to introduce genetic changes in the DNA, we first compared cell line-derived DNA to whole blood-derived DNA. To test for this potential artifact, we analyzed a well-characterized CNV in the amylase gene (chromosome 1, at base positions 103,911,245-104,109,897 [NCBI/hg18 Build 36;

Comparison of copy number variants between blood- and cell line-derived DNA. To compare the differences between blood- and cell line-derived DNA, we analyzed a well-characterized CNV in the amylase gene. Calculated copy numbers of the AMY2A gene in blood-derived (left) and cell line-derived (right) DNA samples are shown. Each dot represents the copy number of an individual sample. The area between the horizontal black bars represents the range of values that constitutes two copies. Gray-shaded regions signify borderline values that cannot be clearly assigned as either one or two copies (lower region) or two or three copies (upper region). Samples that fell in these gray areas were excluded from analysis. DNA samples derived from cell lines showed a significant increase in the number of samples showing deletions compared to those derived from whole blood (p = 0.008).
Of the 218 blood-derived DNA samples, 165 (75.7%) were healthy controls and 53 (24.3%) were patients with type 1 diabetes and thyroiditis.
Of the 253 cell line-derived DNA samples, 199 (78.7%) were healthy controls and 54 (21.3%) were patients with type 1 diabetes and thyroiditis.
NS, not significant.
Analysis of copy number variations in immune regulatory genes associated with GD
CNV assay selection
To identify CNVs in the CD40, PTPN22, and CTLA-4 genes, we searched the Database of Genomic Variants (DGV;
CNV analysis of the CD40 gene
Even though a CD40 CNV is listed in the DGV, the CD40 gene showed no copy number variation in either the controls or GD patients (Fig. 2A). To confirm this surprising result, we used another CNV assay for the CD40 CNV (ABI assay No: Hs99999100_s1). This assay uses primers within exon 3 of the CD40 gene, and again showed 2 copies for all samples tested with no evidence for any deletions or duplications (data not shown).

Copy number variation analysis of CD40, PTPN22, and CTLA-4. TaqMan CNV assays were selected for CD40 and PTPN22 covering CNVs currently deposited in the Database of Genomic Variation. For CTLA-4, however, there was no cataloged CNV in the database, so a CNV assay encompassing the 5′UTR of the CTLA-4 gene was chosen. Each dot represents the copy number of an individual sample. The area between the horizontal black bars represents the range of values that constitutes two copies. Gray-shaded regions signify borderline values that cannot be clearly assigned as either one or two copies (lower region) or two or three copies (upper region). Samples that fell in these gray areas were excluded from analysis. Calculated copy numbers of CD40
CNV analysis of the PTPN22 gene
PTPN22 showed no copy number variation in the GD patients, but in the control samples, there was one duplication and one deletion present (Fig. 2B), demonstrating that this is a rare CNV that is not associated with GD.
CNV analysis of the CTLA-4 gene
Since no CNV was described in the CTLA-4 gene, we first typed a smaller cohort of 56 GD samples and 15 controls to detect a CNV in the CTLA-4 gene. In all 71 samples tested, we found no copy number variation in the CTLA-4 gene (Fig. 2C).
Discussion
Copy number variants (CNVs) are areas in the genome of duplications or deletions of large DNA sequences ranging from kilobases to megabases [reviewed in Ref. (11)]. CNVs have received much attention recently, as they have been suggested to be a major source of human phenotypic variation, specifically susceptibility to complex diseases. Moreover, recent data suggest that genes involved in immunity are particularly enriched in CNVs (13,14). These findings raised the possibility that immune-gene CNVs may predispose to autoimmune diseases, in a similar manner to the well-documented associations between immune-gene SNPs and autoimmunity (1). To date, several autoimmune diseases have been found to be associated with immune-regulatory gene CNVs (15–20). Therefore, we hypothesized that immune-regulatory genes associated with GD may harbor CNVs that influence susceptibility to disease. We tested the 3 major non-HLA immune-regulatory genes associated with GD: CTLA-4, CD40, and PTPN22.
CD40 is a surface receptor with diverse functions, including activation of B-cells and antigen presenting cells, immunoglobulin class switching, and IgG secretion. CD40 is a major susceptibility gene for GD (21); therefore, we tested whether a CD40 CNV, which is cataloged in the DGV, was associated with GD. However, we found no deletions or duplications in the locus of the DGV-listed CD40 CNV when testing 191 GD patients and 192 controls. All 383 individuals had two copies at this locus. This suggests that this CNV is very rare.
The PTPN22 gene, which encodes a lymphoid tyrosine phosphatase, is a powerful inhibitor of T-cell activation. A C/T SNP in the PTPN22 gene causing an arginine to tryptophan change at position 620 was found to be associated with GD (22), as well as other autoimmune diseases (23). When we tested a DGV-deposited CNV in the PTPN22 gene in our large cohort, only two controls showed a copy number change at this locus, one deletion and one duplication. None of the patients showed copy number changes at this locus, again suggesting that it is a rare CNV that does not contribute to the etiology of GD.
CTLA-4 is a negative costimulatory molecule that suppresses the activation of T-cells. CTLA-4 is also expressed on T-reg cells and is important to their function. Several CTLA-4 polymorphisms are associated with GD, as well as with other autoimmune diseases (1). When we tested the CTLA-4 gene locus for copy number changes, we could not find a CNV in this gene, suggesting that CTLA-4 CNVs do not predispose to GD or other autoimmune diseases.
An important and surprising finding from our study is that the source of DNA, whether cell line-derived or blood-derived, significantly altered the CNV analysis. Our data clearly demonstrated that DNA derived from cell lines contained novel CNVs that have been introduced during transformation, but do not exist in blood-derived samples that represent the native unperturbed DNA. As we were concluding this study, the Wellcome Trust group published an article that reported the same phenomenon. They showed an increased presence of variation in cell line-derived samples compared to blood-derived samples (24). Typically, cell line-derived DNA is purified from EBV-transformed B-cells, which are immortalized and can serve as an unlimited source of DNA. However, the immortalization of B-cells can introduce duplications and deletions as a result of defects in cell cycle checkpoints. It has been shown that cells immortalized using viral oncogenes have inactivated p53 and p16INK4a/Rb pathways, which extends the cells' life span and allows them to override the normal cell cycle checkpoints. This can introduce genetic aberrations, including aneuploidy and copy number changes [reviewed in Ref. (25)]. In addition to changes in large chromosome segments, the genomic instability in these cells can also involve subtle base substitutions and deletions or insertions of a few nucleotides [reviewed in Ref. (26)]. These findings may explain the presence of CNVs in the DGV (e.g., CD40) that could not be confirmed in our large cohort of patients and controls, as these CNVs may be cell line derived. Taken together, the findings from the Wellcome Trust study and our study demonstrate that the source of DNA for CNV analyses needs to be taken into serious consideration. Most likely, CNV studies should be limited to blood-derived DNA samples. Therefore, in the current study all GD patient and control DNA samples were derived from whole blood.
In conclusion, our study showed that copy number variation in the immune regulatory genes CD40, PTPN22, and CTLA-4 does not play a role in the etiology of GD. It is possible that previous CNV studies in complex diseases may need to be re-analyzed if the DNA samples were derived from cell lines. For the 3 immune-regulatory genes we tested, the entire genetic risk is likely to be attributed to non-CNV genomic variations.
Footnotes
Disclosure Statement
The authors of this article have nothing to disclose.
