Abstract
Background:
The revised guidelines for the management of medullary thyroid carcinoma recommend that genetic counseling regarding reproductive options, including preimplantation genetic diagnosis (PGD), be considered for all RET mutation carriers of reproductive age to avoid the transmission of multiple endocrine neoplasia type 2 (MEN2). However, the high complexity and cost of PGD have hindered its widespread use. Thus, it is necessary to establish a simple and relatively inexpensive method to facilitate the PGD of MEN2.
Patients and Methods:
A customized Nimblegen EZ sequence capture array was designed to capture the targeted region, comprising the RET gene and a 1 Mb range on each side of it. Targeted, capture-based next-generation sequencing of three members of one family with MEN2A (the couple and the husband's father) was conducted to identify informative markers. The embryos were diagnosed through haplotype analysis based on the informative markers and the causative mutation.
Results:
Based on the sequencing results, 173 informative markers were detected, which was sufficient for subsequent PGD. Seven informative markers and the causative mutation (RET C634Y) were selected and subjected to Sanger sequencing. Through haplotype analysis, four embryos that had not inherited the mutant RET haplotype were diagnosed as unaffected. One unaffected embryo was transferred, and one healthy baby was born at 38 gestational weeks.
Conclusions:
Targeted, capture-based next-generation sequencing for identification of informative markers together with Sanger sequencing is an easy and efficient method for the PGD of monogenic diseases such as MEN2.
Introduction
As individuals with MEN2 have a 50% chance of passing the pathogenic variant to their offspring, reproductive options, including prenatal diagnosis and preimplantation genetic diagnosis (PGD), can be offered to these patients to avoid transmission. PGD is an in vitro fertilization technique with the aim of assisting couples with heritable genetic disorders to avoid the transmission of genetic diseases to their offspring. The basic advantage of PGD is that it avoids the need for invasive prenatal diagnosis, thereby circumventing arbitrary decisions regarding pregnancy termination. Since the first birth after PGD in 1990 (5), PGD has been applied in various single-gene disorders as well as hereditary cancer predisposition syndromes (6,7). It has been recommended in the revised American Thyroid Association (ATA) guidelines for the management of MTC in 2015 that clinicians should make patients aware of these technologies, and genetic counseling about these reproductive options should be considered for all RET mutation carriers of reproductive age, particularly those with RET mutations in codons 634 and 918 (3). However, it remains unclear whether clinicians or patients have used PGD over the last decade to prevent disease transmission (3). In addition to the ethical considerations for this adult-onset disease and alternative management methods involving timely surgical intervention, the high complexity and cost of PGD may have hindered its widespread use. Thus, it is necessary to establish a simple and relatively inexpensive method to facilitate PGD for MEN2.
Generally, PGD is initiated by selecting candidate polymorphic markers from the literature or online databases (8). Next, a few candidate polymorphic markers (short tandem repeats [STRs] or single nucleotide polymorphisms [SNPs]) are tested for usability based on the analysis of the couple, together with relevant affected and unaffected family members, which requires family-specific designs and can be time-consuming and labor intensive (9,10). Moreover, the identification of even two informative linked markers might be challenging for some families. Thus, it is important to develop a simple method to identify informative markers. In recent years, next-generation sequencing (NGS) has rapidly developed and has been applied to numerous clinical tests, such as noninvasive prenatal testing and preimplantation genetic screening for aneuploidy (11,12). However, the high cost of whole genome sequencing (WGS) or whole exome sequencing (WES) restricts the wider clinical application of these methods. Compared to these analyses, customized targeted sequencing of regions of interest through capture assay is economical and clinically pragmatic (13,14). Targeted sequencing can characterize the full spectrum of DNA mutations in the region of interest simultaneously. The informative markers should lie within the disease-causing gene or flank it within ±1 Mb to avoid the misdiagnosis caused by an intervening homologous recombination, making targeted sequencing an ideal method for the identification of these markers. Thus, it was proposed that targeted, capture-based sequencing can facilitate the identification of informative markers used for PGD. Here, the first clinical PGD using informative markers identified by targeted sequencing in a family with MEN2A is reported.
Methods
In vitro fertilization protocol and sample collection
A patient with classical MEN2A and his wife, who was infertile due to a tubal factor, were referred to the authors' center for PGD. Their family pedigree is shown in Figure 1. The proband, his father, and his aunt were affected by MEN2A. Genetic analysis indicated that all patients in this family are RET C634Y (c.1901G>A) carriers. For the identification of informative markers, approximately 5 mL of EDTA-anticoagulated peripheral blood was obtained from the proband (III-1), his wife (III-2), and his father (II-1). After controlled ovarian stimulation using a gonadotropin-releasing hormone agonist (GnRHa) long protocol (15), intracytoplasmic sperm injection (ICSI) was used for the insemination of mature oocytes. On day 3, one or two blastomeres were biopsied from each of six cleavage-stage embryos and collected for subsequent analysis. Genomic DNA was extracted using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany), and whole-genome amplification (WGA) of the blastomere cells was performed using the REPLI-g Single Cell Kit (Qiagen) according to the manufacturer's instructions. Written informed consent was obtained from the patient and his family. The study protocol was approved by the Ethics Committee of the International Peace Maternity and Child Health Hospital of Shanghai Jiao Tong University School of Medicine.

Figure 1. Pedigree of the family affected by multiple endocrine neoplasia type 2A analyzed in this study. III-1: proband.
Informative marker identification
Library construction and targeted region capture
Targeted, capture-based NGS of the proband, his wife, and his father was conducted to identify the informative markers for this family. A customized Nimblegen EZ sequence capture array (Roche, Basel, Switzerland) was designed to capture the targeted regions, including the RET gene, and 1 Mb range on each side of the gene. The final capture region was chr10:42,272,517–44,625,797 (hg19), and the detailed region is shown in Supplementary Table S1 (Supplementary Data are available online at
Targeted sequencing and data analysis
After chip hybridization and elution, the enriched DNA libraries were qualified and then sequenced using Illumina HiSeq2500 Analyzers with paired-end 90 bp reads, following the manufacturer's standard cluster generation and sequencing protocols. The raw reads were aligned to the human reference genome (hg19) using the Burrows-Wheeler Aligner (16), producing binary sequence alignment/map files containing the mapping information. Duplicate reads were then removed using Picard, and the reads were realigned using the Genome Analysis Toolkit (GATK) (17). The variants were called using the SOAPsnp software and the GATK Indel Genotyper (17,18) and annotated using the ANNOVAR software package (
Principles of informative marker identification and embryo diagnosis
The principles of informative marker identification have been described in a previous study (20). Briefly, the genotypes of the proband (Fig. 1, III-1), his wife (Fig. 1, III-2), and his father (Fig. 1, II-1) at each SNP locus were compared. Informative markers were defined as those at which the proband was heterozygous while his wife and his father were both homozygous. Through Mendelian analysis of the informative markers, the paternal or maternal origin of each haplotype of the embryo could be identified, and the inheritance of these haplotypes could be mapped, thereby determining whether the embryo had inherited the mutant haplotype.
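The definition above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name `classify_marker`, the two-letter genotype strings, and the returned dictionary layout are all assumptions made for the example. It treats each SNP as biallelic and relies on the fact that the proband inherited the mutant haplotype from his affected father.

```python
from typing import Optional


def is_homozygous(genotype: str) -> bool:
    """True for two-letter genotypes such as "AA"."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def classify_marker(proband: str, wife: str, father: str) -> Optional[dict]:
    """Classify one biallelic SNP (hypothetical helper, for illustration).

    Returns None if the SNP is not informative, otherwise a dict naming
    the proband haplotype that the marker phases ("F1" = mutant haplotype
    inherited from the affected father, "F2" = normal haplotype) and the
    allele that reveals it in an embryo.
    """
    # The proband must be heterozygous; his wife and father homozygous.
    if len(proband) != 2 or proband[0] == proband[1]:
        return None
    if not (is_homozygous(wife) and is_homozygous(father)):
        return None

    from_father = father[0]          # allele sitting on the mutant haplotype
    if from_father not in proband:
        return None                  # Mendelian inconsistency or genotyping error
    other = proband[1] if proband[0] == from_father else proband[0]

    if wife[0] == from_father:
        # Wife and father share a genotype: an embryo can only obtain
        # `other` from the proband's normal haplotype (F2).
        return {"phases": "F2", "allele": other}
    if wife[0] == other:
        # Wife and father differ: an embryo can only obtain
        # `from_father` from the proband's mutant haplotype (F1).
        return {"phases": "F1", "allele": from_father}
    return None                      # third allele: outside this sketch


print(classify_marker("AG", "GG", "AA"))  # F1-informative, allele A
print(classify_marker("AG", "AA", "AA"))  # F2-informative, allele G
```

Applied to every called variant in the capture region, a filter of this kind would yield the family-specific list of candidate markers from which a small panel is then chosen for Sanger sequencing.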
Polymerase chain reaction and Sanger sequencing
Polymerase chain reaction (PCR) of the seven selected informative SNPs (four upstream and three downstream) plus the causative mutation was performed separately using the WGA DNA products of the embryos. The PCR products were analyzed using the ABI3700 genetic analyzer (Thermo Fisher Scientific, Waltham, MA). To evaluate the accuracy of the method, amniocentesis for prenatal diagnosis was performed under ultrasonographic control at 16 weeks of gestation. Approximately 15 mL of amniotic fluid was collected, and genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen). The presence of the RET C634Y mutation was assessed using PCR amplification followed by Sanger sequencing.
Results
The targeted regions of the father (Fig. 1, III-1), the mother (Fig. 1, III-2), and the paternal grandfather (Fig. 1, II-1) were captured and sequenced to identify the informative markers. The sequencing and alignment statistics are shown in Table 1. An average of 58.59 Mb per sample was obtained. An average of 71.19% of reads from each sample were uniquely aligned to UCSC Human Genome reference build 37, and 59.94% of paired reads were properly mapped onto the targeted regions. The average coverage and depth of the targeted region were 99.39% and 63.35×, respectively. Overall, 1322 variants were detected and further analyzed for the identification of informative markers.
Note to Table 1: naming convention as in Figure 1.
Based on the genotyping of the parents and the paternal grandfather, 173 informative markers were detected, with 107 markers located upstream and 66 markers located downstream of the RET gene (Supplementary Table S2). The four parental haplotypes were identified by Mendelian analysis of the informative markers. Among these markers, 18 were informative for phasing the paternal mutant, disease-causing haplotype 1 (F1): the father was heterozygous, while the paternal grandfather and the mother were both homozygous but for different alleles. The remaining 155 markers were informative for phasing the paternal normal haplotype 2 (F2): the father was heterozygous, while the paternal grandfather and the mother were both homozygous for the same allele.
Seven informative markers (three informative for F1 and four informative for F2) were selected and subjected to Sanger sequencing together with the RET C634Y mutation. By analyzing the genotypes of the informative markers in the embryos, the paternal haplotype inherited by each embryo could be identified and the diagnosis inferred. As shown in Table 2, embryos 1 and 3 were affected, as they inherited two or four informative alleles that phased the mutant haplotype of the father, whereas embryos 2, 4, 5, and 6 were unaffected, as they inherited one to four informative alleles that phased the normal haplotype of the father. The results of direct sequencing of the C634Y mutation agreed with the haplotype results, with the exception of embryo 1, most likely due to allele dropout (ADO; Fig. 2). As all the markers were located within ±1 Mb of the disease-causing gene, the possibility of an intervening homologous recombination was low. ADO was observed in all tested embryos, with an average rate of 31.25% (from 0% for rs12573211 to 66.67% for rs11238609).
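The haplotype-based call described above can be illustrated with a short Python sketch. It is a simplified reading of the logic, not the authors' software: the function `diagnose_embryo`, the marker identifiers `snp_a` to `snp_c`, and the example genotypes are hypothetical. Only the presence of an informative allele is counted as evidence, since its absence could equally result from ADO.

```python
def diagnose_embryo(embryo_genotypes: dict, markers: dict) -> str:
    """Infer the paternal haplotype of an embryo (illustrative sketch).

    embryo_genotypes maps marker id -> two-letter genotype, or None when
    amplification failed. markers maps marker id -> {"phases": "F1" or
    "F2", "allele": informative allele}.
    """
    f1_hits = 0  # informative alleles tagging the mutant haplotype
    f2_hits = 0  # informative alleles tagging the normal haplotype
    for marker_id, info in markers.items():
        genotype = embryo_genotypes.get(marker_id)
        if not genotype:
            continue  # amplification failure: no evidence either way
        if info["allele"] in genotype:
            if info["phases"] == "F1":
                f1_hits += 1
            else:
                f2_hits += 1
    if f1_hits and not f2_hits:
        return "affected"
    if f2_hits and not f1_hits:
        return "unaffected"
    # No informative allele seen, or conflicting evidence
    # (possible recombination or contamination).
    return "inconclusive"


# Hypothetical three-marker panel and two hypothetical embryos.
panel = {
    "snp_a": {"phases": "F1", "allele": "A"},
    "snp_b": {"phases": "F2", "allele": "T"},
    "snp_c": {"phases": "F2", "allele": "G"},
}
print(diagnose_embryo({"snp_a": "AG", "snp_b": "CC", "snp_c": None}, panel))
print(diagnose_embryo({"snp_a": "GG", "snp_b": "CT", "snp_c": "GC"}, panel))
```

Returning "inconclusive" when both haplotypes appear to be present is a conservative design choice for the sketch; in practice such an embryo would call for additional markers or re-analysis.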

Figure 2. Sanger sequencing results of the RET C634Y mutation in the six embryos (E1–E6) and the amniotic fluid (AF).
Notes to Table 2: naming convention as in Figure 1.
Alleles of informative SNPs shown in bold phase the mutant, disease-causing haplotype of the proband (F1), and alleles of informative SNPs shown in italic phase the normal haplotype of the proband (F2). All other alleles are non-informative.
Computed alleles are underlined to indicate that ADO occurred at that locus.
A indicates “the embryo was affected,” and U indicates “the embryo was unaffected.”
Based on the sequencing results and the embryo morphology, one unaffected embryo with a high morphology score (embryo 2) was transferred to the uterus. The prenatal diagnosis using DNA from the amniotic fluid was consistent with the PGD results, and one healthy baby was born at 38 gestational weeks (Fig. 2).
Discussion
Prophylactic thyroidectomy can be considered as a prevention strategy for MEN2 patients. It is recommended in infancy for patients with a codon 918 mutation (highest risk) and at or before five years of age for patients with a codon 634 or 883 mutation (high risk); for patients with all other mutations (moderate risk), it may be delayed until calcitonin is elevated, which requires ongoing surveillance of individuals at risk (3). In contrast to prophylactic thyroidectomy, PGD is a primary prevention strategy that avoids transmission of the disease from an affected individual, and genetic counseling about PGD has been recommended in the revised ATA guidelines. However, the use of PGD for MEN2 is not widespread, partly due to its complexity and high cost. The present study performed PGD in an MEN2A family using targeted NGS in combination with Sanger sequencing, which is an efficient and relatively inexpensive method and can theoretically be used for the PGD of any MEN2 family. To the authors' knowledge, this is the first report of PGD using array-based targeted sequencing in combination with Sanger sequencing of WGA DNA products from single embryonic cells.
During PGD, only one or a few cells biopsied from human preimplantation embryos are analyzed. Single-cell PCR is the classical and widely used method for the PGD of single-gene disorders. The genetic material available for PGD is limited in comparison with that used in other genetic tests. Thus, the accuracy of the PGD diagnosis is restricted by ADO, preferential amplification, and amplification failure (21). The development of multiple displacement amplification (MDA)-based WGA has overcome the difficulty of insufficient DNA, providing sufficient template for numerous genetic analyses with limited sequence representation bias (22). However, ADO remains the major issue for WGA DNA products, with an average ADO rate of approximately 25% (23,24). To guard against misdiagnosis due to ADO, the multiplex amplification of one or more informative markers with or without the causative mutation has become the gold standard in single-cell PCR for PGD (25). As ADO occurs independently at different loci, four fully informative markers are theoretically required to detect both alleles with confidence at an average ADO rate of 25%. However, the identification of informative markers can be time-consuming and labor intensive. As the informative markers should lie within the 1 Mb flanking region of the causative mutation to avoid misdiagnosis, it was hypothesized that targeted sequencing would be an effective method for identifying them. As expected, array-based targeted sequencing and subsequent analysis identified a total of 173 informative markers for this family, which is sufficient for subsequent PGD.
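The reasoning behind using several linked markers can be made concrete with a short calculation. The sketch below rests on a simplifying assumption that is not stated in the text: the ADO rate is treated as the probability, independent at each locus, that the informative paternal allele fails to amplify. Under that assumption, the chance that a haplotype goes undetected at every one of n markers is the rate raised to the power n. The function name is hypothetical.

```python
def prob_haplotype_missed(ado_rate: float, n_markers: int) -> float:
    """Probability that the informative allele drops out at all n markers,
    assuming independent dropout with the same rate at each locus."""
    return ado_rate ** n_markers


# Residual risk for one to four markers at the 25% average ADO rate.
for n in range(1, 5):
    print(n, prob_haplotype_missed(0.25, n))
```

Under this model, four markers at a 25% rate leave a residual probability of 0.25^4, about 0.4%, that the haplotype is missed entirely. Real dropout rates vary by locus, as the range observed in this study shows, so the figure is only indicative.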
Compared to the commonly used method of first selecting candidate markers from the literature or databases and subsequently testing each one for informativeness in the family of interest, the method reported here is fast, automated, and easily performed. Recently, genome-wide SNP array analysis, which can genotype hundreds of thousands of SNPs simultaneously, has been demonstrated as a universal and automated method for the PGD of monogenic disorders (26–28). An elegant approach called karyomapping, based on SNP array analysis, can theoretically be applied to all single-gene disorders without the need for customized family-specific tests (27,28). However, the cost of genome-wide SNP arrays is still high. Another method, preimplantation genetic haplotyping, is associated with the same problems (29). Compared to these methods, the proposed method is cost-effective, as the cost of NGS has decreased dramatically. For one family, the cost of PGD using the proposed method can be divided into three parts: (i) the cost of the capture array and NGS for three samples (the couple and one affected relative), (ii) the cost of WGA of the tested embryos, and (iii) the cost of PCR and Sanger sequencing of the tested embryos. The cost of part (i) is approximately half to two-thirds of that of SNP arrays. The cost of part (ii) is almost the same regardless of which PGD method is adopted. The cost of part (iii) is low, and negligible compared to that of SNP array analysis, which requires one array per embryo. Moreover, as the capture assay enables the capture of more than 10 samples in one customized product, the informative markers can be identified for more than three families with the same single-gene disease in one experiment without family-specific design, lowering the cost of part (i) further.
In addition, if multiple disease-causing gene regions were captured, the proposed method would also be suitable for identifying informative markers for multiple families affected by different diseases in a single experiment. For example, the capture array could be designed to capture three disease regions: a 1 Mb range on each side of the RET gene, a 1 Mb range on each side of the PKD1 gene, and a 1 Mb range on each side of the PKD2 gene. The array could then be used to identify informative markers for the PGD of MEN2, polycystic kidney disease 1, and polycystic kidney disease 2, respectively. However, this method can only be used when additional affected family members can provide a DNA sample, enabling the determination of informative markers. A limitation of the present study is the high ADO rate that was observed. This might be explained by the use of blastomere biopsy rather than blastocyst biopsy, and blastocyst biopsy is suggested for future PGD of MEN2A using the proposed strategy. The high ADO rate also illustrates the necessity of using multiple informative markers to reach a precise diagnosis for each embryo, and it underlines the advantage of the proposed method, which identifies multiple informative markers simultaneously.
In conclusion, the results of the present study indicate that targeted, capture-based NGS for the identification of informative markers together with Sanger sequencing is an easy and efficient method for the PGD of monogenic diseases such as MEN2.
Footnotes
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFC0905103), National Natural Science Foundation of China (no. 81471506; no. 81501231; no. 81401219; no. 81472861), the Shanghai Municipal Commission of Science and Technology Program (no. 14411965100; no. 15411964000; no. 16411963300; no. 15411966700; no. 14DJ1400102), the Research Grant from Shanghai Hospital Development Center (no. SHDC12014131), the Innovation Foundation of Translational Medicine of Shanghai Jiao Tong University School of Medicine (no. 15ZH4011), the Shanghai Municipal Commission of Health and Family Planning Program (no. 15GWZK0701), and the Shanghai Jiao Tong University School of Medicine Program (no. 14XJ10083).
Author Disclosure Statement
The authors have nothing to disclose.
