Abstract
The human malaria vector Anopheles gambiae is becoming increasingly resistant to insecticides, spurring the development of genetic control strategies. CRISPR-Cas9 gene drives can modify a population by creating double-stranded breaks at highly specific targets, triggering copying of the gene drive into the cut site (“homing”) and thereby ensuring its inheritance. The DNA repair mechanism responsible requires homology between the donor and recipient chromosomes, presenting challenges for the invasion of laboratory-developed gene drives into wild populations of the An. gambiae species complex, which show high levels of genome variation. Two gene drives (vas2-5958 and zpg-7280) were introduced into three An. gambiae strains collected across Africa with 5.3–6.6% variation around the target sites, and the effect of this variation on homing was measured. Gene drive homing across different karyotypes of the 2La chromosomal inversion was also assessed. No decrease in gene drive homing was seen despite target site heterology, demonstrating the applicability of gene drives to wild populations.
Introduction
Gene drives in vector control
Global control efforts have averted an estimated 1.5 billion cases of malaria in the last two decades but this progress has begun to slow, with 619,000 deaths reported in 2021 alone. 1 Malaria transmission persistence has been attributed to a combination of stalling or inadequate control programs, insecticide resistance of the mosquito vector, and treatment resistance of the parasite. 2 The World Health Organization has stressed the importance of developing novel control strategies to meet its malaria elimination goals.3,4
Genetic control strategies can achieve population modification or suppression of a target species without collateral damage to nontargets or the environment by the modification of the target genome, making these strategies highly desirable alternatives to widespread insecticide use. One such strategy is the use of selfish genetic elements with super-Mendelian inheritance rates known as gene drives. Gene drives can deliver a genetic payload or disrupt an essential gene while overcoming any subsequent fitness cost by severely biasing their own inheritance, allowing their spread in a population.5–10 Strategies using gene drives are being considered for the control of several pest species,11,12 and have progressed to the successful development of CRISPR-Cas9-based gene drives in the primary vector of malaria in Africa, Anopheles gambiae. 7
The Cas9 protein guided by a single guide RNA (sgRNA) is capable of making highly specific double-stranded breaks (DSBs) in a chromosome, allowing the introduction of an alternate sequence at the cut site using the cell's own DNA repair mechanism. 7 DSB repair by the cell can involve the use of a homologous template strand, usually the paired chromosome, which is copied to accurately repair the break.13,14 When a gene drive element is copied into the broken chromosome alongside the homologous template sequence, the gene drive is inserted at the breakpoint in a process known as homing. Homing from one chromosome to another in germ line cells means the gene drive will be integrated in the majority of gametes, resulting in its super-Mendelian inheritance in the next generation.
This mechanism can be exploited to bias the inheritance of a coupled effector gene through a population, such as antimicrobial peptides to impede malaria development, 10 or to target a gene essential for fertility and therefore reduce the target population size. 7
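As a rough illustration of why germ line homing produces super-Mendelian inheritance, the sketch below relates a homing rate to the expected inheritance rate from a heterozygous parent. It assumes a simplified model that is not taken from this work: every wild-type chromosome that is not converted is transmitted normally, and there is no gamete selection, NHEJ-related loss, or maternal deposition. The function names are hypothetical.

```python
def inheritance_from_homing(homing_rate: float) -> float:
    """Expected fraction of gametes carrying the drive from a heterozygote.

    Half of the gametes carry the drive by Mendelian segregation; of the
    remaining half, a fraction `homing_rate` is converted by homing.
    """
    if not 0.0 <= homing_rate <= 1.0:
        raise ValueError("homing_rate must be between 0 and 1")
    return 0.5 + 0.5 * homing_rate


def homing_from_inheritance(inheritance_rate: float) -> float:
    """Invert the relation above: homing rate implied by an inheritance rate.

    Values below 0.5 give a negative result, which under this simplified
    model can only arise from sampling noise or other unmodelled effects.
    """
    if not 0.0 <= inheritance_rate <= 1.0:
        raise ValueError("inheritance_rate must be between 0 and 1")
    return 2.0 * inheritance_rate - 1.0
```

Under these assumptions, a parent with no homing transmits the drive to 50% of its offspring, and complete homing gives 100% transmission.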
Gene drive resistance
The emergence of resistance to gene drives has been demonstrated empirically in synthetic drive constructs.15,16 CRISPR-based gene drive resistance occurs as small genetic differences at the cut site, reducing guide RNA (gRNA) binding and therefore the ability of the Cas9 enzyme to create a DSB. These cut site mutations can arise during the DSB repair process via alternate repair pathways such as nonhomologous end-joining (NHEJ), which enzymatically repairs the cut without a template but with higher rates of error.15,17–19 If these genetic differences produce a functional allele with a fitness cost less than that of the gene drive, they may be positively selected for in the population. Functional alternate alleles produced by gene drive-induced mutations can reduce CRISPR-gRNA binding enough to confer complete resistance to the gene drive. 15
Strategies to reduce the likelihood of resistance developing have been suggested; modern gene drives will target genes that are haplosufficient (one functional copy is required for survival or fertility) and highly conserved, therefore making any mutations at the target site likely to result in unviability.8,15,20 This reduces the speed of gene drive resistance development but does not entirely prevent it; mutations produced during NHEJ will still eventually lead to resistance. 15 NHEJ, and therefore related mutations conferring resistance, can be reduced by using more efficient germ line-specific promoters with less accidental somatic expression of the Cas9 enzyme. 21 Multiple target sites in different genes can be used in a single gene drive system by multiplexing gRNAs; homing can occur at all target sites, making independently evolved resistance at all target sites necessary to prevent super-Mendelian inheritance of the gene drive.22–24
Any intervention that applies a strong selection pressure will eventually produce a similarly strong and concomitant pressure to evolve resistance. Resistance has historically only been discovered after the implementation of a control strategy, and after the resistance has become a public health issue. 25 By anticipating and investigating potential issues such as resistance during gene drive development, we can reduce the impact on control strategies. Genetic variation at the target site, whether produced by Cas9-mediated NHEJ or naturally present in a target population, could act as a barrier to successful implementation of gene drives.
Genetic variation: a barrier to gene drive success?
Single-nucleotide polymorphisms
Genetic differences around a gene drive target site, or target locus heterology (TLH), may occur naturally in a wild population even in highly functionally constrained genes. Differences within the gRNA target site have the most impact on drive efficiency, 26 but due to the nature of DSB repair, TLH will also potentially reduce the gene drive homing rate. Stringent regions of homology between the donor chromosome (containing the region to be copied) and the recipient chromosome (where the DSB occurs) are required for homology-directed repair (HDR) in mammalian cells, where 1.2% TLH within 1 kb of the DSB causes a 22% reduction in the recombination required for HDR. 27 Similar dependence on homology has been noted in Drosophila melanogaster, where 1.4% TLH suppressed recombination by 32% 28 ; and Aedes aegypti, with 1.2% TLH created by experimental recoding resulting in a 66% reduction in homing. 29
Given the conserved nature of DNA repair mechanisms, it is reasonable to expect that this sequence homology requirement would extend to An. gambiae, which has an incredibly diverse genome, including more than 57 million single-nucleotide polymorphisms (SNPs). 30
Chromosomal inversions
In addition to SNPs, the An. gambiae species complex contains more than 120 chromosomal inversions.31,32 These inversions vary in their geographical and seasonal distribution and have been associated with desiccation resistance, larval habitat, insecticide resistance, and malaria infection rate.31,33–37 The largest inversion in An. gambiae is the 2La/2L+a, which spans roughly half the length of chromosome 2L 38 ; the ancestral 2La form is implicated in anthropophilic behavior, aridity tolerance, and Plasmodium transmission.35,39–41 Recombination of inverted chromosomes in opposite orientations is reduced as the chromosomes are forced to form a loop to align. 42 Reduced recombination between inversion heterokaryotypes has been empirically demonstrated during meiosis in multiple species, including Drosophila (7.7-fold decrease). 43
The effect extends beyond the inversion breakpoints to suppress recombination in regions close to the inversion and increase recombination at distant regions, known as the interchromosomal effect, 44 and can change the recombination landscape enough to suppress recombination in homokaryotypes as well.45,46
A reduced recombination rate between inversion heterokaryotypes could theoretically interfere with HDR in gene drive releases, leading to reduced spread of a gene drive situated within the inversion into heterogenous wild populations. In an allelic drive system in Drosophila, inversion heterokaryotypes had a drive rate a third lower than inversion homokaryotypes. 47 Meiotic recombination between 2La/2L+a heterokaryotypes is at least fivefold less than between 2L+a homokaryotypes. 48 However, multiple gene drive systems have been developed within the 2La inversion site in An. gambiae successfully, with super-Mendelian inheritance (76–98%).49,50 As these were not developed with the 2La inversion karyotype in mind, or tested with different karyotypes, the impact on recombination rate during gene drive homing in An. gambiae is still unknown.
Variation outside of the target region
Genetic variation outside of the target region can also influence gene drive inheritance and resistance development. In a study of gene drive homing and resistance rates in different strains of Drosophila, all with identical target site sequences, inheritance rates ranged from 64.1% to 85.9%; increased inheritance was significantly associated with certain genotypes, but no SNPs were identified as contributing significantly. 51 Moderate variation has been noted in gene drive homing in different genetic backgrounds of Drosophila despite little to no variation within 200 bp of the cut site. 19
Background genetic variation can also influence the development of resistant alleles at the target site; no single gene was found to be significantly responsible for increased resistance development, indicating a combined effect of multiple genes. 51 Differences in homing efficiency may be due to the differences in a combination of genes, such as DNA repair mechanisms, DNA transcription or translation, or germ line expression. In naturally occurring gene drives, suppressors can evolve to reduce the impact of the drive in the population; these are often unlinked to the gene drive, such as small RNAs or alterations in heterochromatin structure. 52 Undoubtedly, the interaction between genetic variation and gene drive homing needs to be explored for their effective use in control strategies.
To assess the impact of TLH and inverted chromosomes on the homing of a gene drive element in An. gambiae, two well-characterized laboratory-created gene drive strains vas2-5958 and zpg-7280 were crossed with three alternate An. gambiae wild-type strains from across East, Central, and West Africa (Kisumu, N'Gousso and Tiassale), all with TLH around the cut sites. The vas2-5958 gene drive element is located within the 2La inversion; gene drive homing rates were compared between 2La heterokaryotypes and homokaryotypes.
Materials and Methods
Mosquito rearing
All mosquitoes were reared under standard conditions of 26 ± 2°C and 70 ± 10% relative humidity, with a 12-h light/dark cycle with 1-h dusks/dawns. Larvae were fed on ground fish food flakes (TetraMin® tropical flakes) and adults were fed 10% sucrose solution ad libitum. Adults were allowed to mate for 5–10 days before blood feeding and egg collection.
Mosquito strains
Two G3 colonies containing gene drive elements were used, both created by Hammond et al. and well-characterized.7,21 The vas2-5958 colony contains a CRISPR-Cas9 endonuclease construct within AGAP005958, an ortholog of the Drosophila yellow-g gene expressed in somatic ovarian follicle cells with an unknown function. 53 The AGAP005958 gene is located within the 2La inversion, 54 with a gRNA cut site within 4 Mb of the distal breakpoint. The zpg-7280 colony contains a similar construct in AGAP007280, an ortholog of the Drosophila nudel gene also expressed in follicle cells with a known role in dorsoventral patterning of the developing embryo. 55 Both genes have a haplosufficient role in female fertility, making them useful targets for population modification or population suppression gene drive strategies.
The inserted construct for both colonies consists of a CRISPR-Cas9 protein under germ line-only promotion (zpg in the zpg-7280 line and vas2 in the vas2-5958 line), a gRNA sequence targeting the cut site for each line, respectively, under U6 (universal) promotion, and a red fluorescent protein marker with a 3×P3 promoter, all flanked by attB recombination sites to allow insertion into previously created docking lines via recombinase-mediated cassette exchange. 56 The full sequence of vector p165 used to produce these two lines, with the only difference between them being the gRNA sequence, is available on GenBank (accession ID: KU189142). 7
The wild-type strains used in crosses were taken from colonies kept at the Liverpool School of Tropical Medicine7,57–59; details can be found in Table 1.
Table 1. Wild-type Anopheles strains used in this work
Crosses of gene drive strains into alternate backgrounds
An outline of the methodology can be seen in Figure 1. The number and sex of mosquitoes used in each cross can be seen in Table 2. All F1 hybrid adults used in crosses were confirmed to be heterozygous for the gene drive element by screening for the red fluorescent protein (RFP) marker via fluorescent microscopy during the larval stage. Females containing the vas2-5958 gene drive are sterile due to unintended somatic promotion of Cas9 under the vas2 promoter 7 ; therefore, in vas2-5958 crosses, only males containing the gene drive were used. For zpg-7280 F1 crosses, female hybrids were used. Females from F1 crosses were forced to lay in single deposition, and up to 50 offspring per female were screened for the presence of the RFP marker to determine the rate of gene drive inheritance from the hybrid parent.
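The per-female screening described above reduces to a simple proportion. The following is a minimal sketch of that calculation, assuming one (RFP-positive, screened) count pair per egg batch; the function name and the example counts are hypothetical illustrations and are not data from this study.

```python
def drive_rate(rfp_positive: int, screened: int) -> float:
    """Fraction of screened offspring that carry the RFP-marked drive."""
    if screened <= 0:
        raise ValueError("at least one offspring must be screened")
    if not 0 <= rfp_positive <= screened:
        raise ValueError("rfp_positive must lie between 0 and screened")
    return rfp_positive / screened


# Illustrative counts only (RFP-positive, total screened), one pair per female.
example_counts = [(48, 50), (50, 50), (41, 45)]
rates = [drive_rate(positive, total) for positive, total in example_counts]
```

Keeping one rate per female, rather than pooling all offspring, preserves the between-female spread that a rank-based comparison relies on.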

Figure 1. Experimental homing of a gene drive into alternate genetic backgrounds.
Table 2. Details on the number, sex, and strain of each F0 and F1 cross
Drive rates were compared with data from Hammond et al.7,21 using a pairwise Wilcoxon test with false discovery rate correction (Supplementary Table S1).
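To make the multiple-testing step concrete, the sketch below adjusts a set of pairwise p-values with the Benjamini-Hochberg procedure. The text does not state which false discovery rate method was used, so the choice of Benjamini-Hochberg is an assumption, as are the function name and the example p-values; the Wilcoxon tests themselves are taken as already computed.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative p-values only, not results from this study.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A comparison would then be called nonsignificant when its adjusted p-value exceeds the chosen threshold, for example 0.05.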
Target-site sequence heterology
To determine the maximum potential TLH in each strain, F1 hybrids of each type were pooled and their wild-type chromosome (representing each wild-type strain) was sequenced. DNA was extracted from pools of 33–37 adults using a Wizard® genomic DNA purification kit (Promega), and a region of ∼690 bp spanning the gRNA cut sites for both vas2-5958 and zpg-7280 gene drives was amplified in two fragments either side of the gene drive insert. Fragments were amplified by polymerase chain reaction (PCR) using Phusion Hot Start II High-Fidelity DNA polymerase (Thermo Scientific™), with forward and reverse primers at a final concentration of 0.5 μM (Supplementary Table S2) and 1 μL of genomic DNA in a 50 μL reaction. NHEJ deletions around the cut site in these F1 hybrids were removed from the TLH analysis. A table of all SNPs can be found in SupportingInfo.xlsx.
PCR conditions were as follows: an initial denaturation step at 98°C for 30 s; followed by 30 cycles of denaturation at 98°C for 30 s, 30 s at the annealing temperature (Supplementary Table S3), and extension at 72°C for 15 s; and a final extension step of 10 min at 72°C.
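For planning purposes, the programmed hold time of this thermocycler protocol can be added up as in the sketch below. It uses only the step durations stated above; the variable names are hypothetical, and ramping time between temperatures is not included, so the real run time will be longer.

```python
# Step durations in seconds, as stated in the protocol above.
INITIAL_DENATURATION_S = 30
CYCLES = 30
CYCLE_STEPS_S = {"denaturation": 30, "annealing": 30, "extension": 15}
FINAL_EXTENSION_S = 10 * 60


def programmed_hold_time_s() -> int:
    """Sum of all programmed holds, excluding temperature ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_seconds = programmed_hold_time_s()
total_minutes = total_seconds / 60
```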
PCR products were sequenced by Illumina MiSeq sequencing; reads were quality filtered and aligned against an amplicon containing all SNP variants present in G3 deep sequencing data 15 using CRISPResso. 60 Alleles present at >1% relative abundance were aligned to G3 sequences in Benchling to determine the percentage of mismatch between G3 and each strain at the homing sites (raw files accession: PRJNA914102). As vas2-5958 is known to produce “leaky” promotion and therefore maternal deposition of the Cas9 enzyme, resulting in NHEJ-induced deletions at the cut site in somatic tissue, any characteristic NHEJ deletions around the cut site in these F1 hybrids were removed from the TLH analysis. Separate analysis of a fourth strain can be found in the Supplementary Data, Supplementary Figure S1 and Supplementary Table S2.
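The allele filtering and mismatch calculation described above can be sketched as follows. This is a simplified illustration and not the CRISPResso or Benchling workflow: it assumes alleles are already aligned to a single reference string of equal length with no indels, whereas the reference used here contained all known G3 SNP variants. The function names and example read counts are hypothetical.

```python
def filter_alleles(read_counts, min_fraction=0.01):
    """Keep alleles whose relative abundance exceeds `min_fraction`.

    `read_counts` maps allele sequence -> number of supporting reads.
    Returns a dict of allele sequence -> relative abundance.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {
        allele: count / total
        for allele, count in read_counts.items()
        if count / total > min_fraction
    }


def percent_mismatch(allele: str, reference: str) -> float:
    """Percentage of positions at which the allele differs from the reference."""
    if not allele or len(allele) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(allele, reference))
    return 100.0 * mismatches / len(allele)


# Illustrative read counts only, not data from this study.
kept = filter_alleles({"ACGT": 950, "ACGA": 45, "TCGT": 5})
```

In this toy example the third allele falls below the 1% threshold and is dropped, and the second differs from a reference of `ACGT` at one position in four.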
Homing rate analysis in alternative 2La karyotypes
Mosquitoes from the vas2-5958 colony were backcrossed to G3 and offspring were screened for the gene drive marker; 65 F1 males were mated individually to three G3 females, with eggs collected from each group and screened for the gene drive element. Each male parent was karyotyped for the 2La inversion as previously described, 38 and drive rates in heterozygotes and homozygotes of both karyotypes were compared using a Wilcoxon test.
Results and Discussion
An. gambiae gene drives are robust to TLH
The vas2-5958 and zpg-7280 An. gambiae gene drive lines, originally made in the G3 background and targeting haplosufficient female fertility genes, were crossed into three different strains to create F1 hybrids, which were backcrossed to wild-type G3 to assess the F1 hybrid homing rate (Fig. 1C). The TLH around the cut sites was 5.3–6.6% between each strain and the gene drive background strain (G3), with significant variation between the left and right sides of both gene drive cut sites (Figs. 2 and 3). No SNPs were observed within the gRNA sequence; however, an SNP was commonly observed in the N base of the zpg-7280 -NGG Protospacer Adjacent Motif (PAM) site (Fig. 3). All F1 hybrids for both gene drive colonies produced super-Mendelian inheritance rates of the gene drive element (vas2-5958: 81.8–100%; zpg-7280: 92.0–100%), with no significant difference from the control (Fig. 4). No reduction in larval production was observed in zpg-7280 hybrids (Supplementary Fig. S3), suggesting no loss of fertility.
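One way to see why an SNP at the N position of the PAM is not expected to abolish recognition is that the canonical 5'-NGG-3' motif accepts any nucleotide at that position. The sketch below checks a three-base PAM against that motif; it is a simplified string check with a hypothetical function name, and it says nothing about any quantitative effect of the N base on cutting efficiency.

```python
def is_ngg_pam(pam: str) -> bool:
    """True if `pam` matches 5'-NGG-3', where N is any of A, C, G, or T."""
    pam = pam.upper()
    return len(pam) == 3 and pam[0] in "ACGT" and pam[1:] == "GG"


# A substitution at the N position leaves the motif intact,
# whereas a substitution in either G does not.
n_position_variants = [is_ngg_pam(base + "GG") for base in "ACGT"]
g_position_variant = is_ngg_pam("AGA")
```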

Figure 2. Target locus heterology in three strains (Kisumu, N'Gousso, and Tiassale) compared with G3, at two gene drive sites (vas2-5958 and zpg-7280), with alleles present at >1% relative abundance. The data represent the maximum potential TLH between each strain and G3, by comparing each allele from the pooled F1 hybrid wild-type chromosomes to a G3 sequence containing all known SNPs found in a deep sequencing data set of 24 G3 individuals. Each point represents an allele from pooled sequencing of adult mosquitoes, with percentage difference to G3. SNPs, single-nucleotide polymorphisms; TLH, target locus heterology.

Figure 3. Position and frequency of SNPs at two gene drive sites (vas2-5958 and zpg-7280) in three strains (Kisumu, N'Gousso, and Tiassale) compared with G3. The position of each SNP is given relative to the gene drive cut site, indicated by a dashed line. SNPs that are not also found in G3 are marked with an asterisk. No SNPs were observed within the guide RNA (gRNA) sequence of either cut site; however, at the zpg-7280 cut site, an SNP commonly occurred in the N of the -NGG Protospacer Adjacent Motif (PAM) site.

Figure 4. The inheritance rate of two gene drive elements vas2-5958 and zpg-7280 in the offspring of F1 hybrids of three different strains, compared with the control rate of inheritance in the gene drive colony (G3 background). Target locus heterology ∼600 bp around the cut site between each strain and the G3 wild type is given in percentages next to strain names. Homing into alternate chromosomes produced drive rates that were not significantly different from the control drive rate (pairwise Wilcoxon test, corrected for false discovery rate). n.s., nonsignificant.
This result varies considerably from previous findings in Ae. aegypti 29 and D. melanogaster, 28 which saw significantly reduced HDR between sequences with 1.2% and 1.4% TLH, respectively. Differences in methodology between the two studies and this work make direct comparison difficult; both used artificially generated silent mutations spaced at regular intervals to generate TLH, which could have a different impact on recombination than the naturally occurring, irregularly spaced TLH in the strains used here. In addition, Ang et al. 28 measured HDR between a donor plasmid and recipient chromosome rather than between chromosomes, and Do et al. used heat-inducible I-Sce1 for DSB formation rather than Cas9. It may be the case that these previous studies, while well suited to describe their respective systems, were not good predictors of the dynamics of Cas9-based gene drive homing.
While we cannot definitively state based on comparison with these studies that An. gambiae HDR is inherently more robust to TLH than Ae. aegypti or D. melanogaster, it appears that Cas9-based gene drive homing is efficient enough in An. gambiae that increased TLH is tolerated without causing enough of a reduction in efficiency to reduce the homing rate. This is supported by previous studies that have found Cas9-based gene drive homing rates are higher in Anopheles (∼97%) 8 than in both Drosophila (∼80%) 19 and Aedes (∼70%). 61
The robustness of An. gambiae gene drive homing to variation has important consequences for its application in real-world vector control strategies. The development of gene drives in laboratory-bred mosquito strains allows for standardization of the genetic background for easier study but has called into question their applicability to heterogenous wild populations. Despite the sensitivity of homing in other organisms to low amounts of TLH, our findings show no significant reduction in homing activity into multiple strains with up to 6.6% TLH in An. gambiae.
The strains used in this experiment were collected from East, Central, and West Africa across a span of 37 years, and are a mixture of An. gambiae, An. coluzzii, and An. gambiae/An. coluzzii hybrids (Fig. 1B). The demonstration of unimpeded gene drive homing into strains of this diversity represents the strong potential for gene drive implementation across members of the An. gambiae species complex that are able to produce fertile progeny.
No impact of 2La karyotype on gene drive homing
The vas2-5958 gene drive is located within the region covered by the 2La inversion (Fig. 1A); homing rates for all three permutations of the 2La inversion were analyzed (Fig. 1D). There was no significant difference in homing rate between 2La inversion karyotypes (Fig. 5).

Figure 5. The inheritance rate of the vas2-5958 gene drive element in the offspring of males either homozygous (2La/2La and 2L+a/2L+a) or heterozygous (2La/2L+a) for the 2La chromosomal inversion. There was no significant difference in gene drive inheritance between the three karyotypes (Wilcoxon test).
Despite previous observations of reduced gene drive conversion across inversions in Drosophila 47 and reduced meiotic recombination within the 2La inversion region in An. gambiae, 48 we saw no evidence of reduced gene drive homing rate in 2La inversion heterokaryotypes. However, reduced recombination is not uniform across an inversion, and adjacent sequences external to the inversion can also show altered recombination rates. Meiotic recombination is slightly higher in the middle of the inversion compared with regions near the breakpoints, due to the increased ease of forming chiasmata between sister chromatids at the center of the inversion loop.47,48 The vas2-5958 gene drive target site is <4 Mb from the distal breakpoint of the 2La inversion (Fig. 1A)7,62; theoretically, recombination at this point should have been low, but this is not reflected in the gene drive rates we observed.
Adjacent to the inversion, the region between the proximal breakpoint of the 2La inversion and the centromere shows strong recombination reduction, with a less strong but still reduced recombination rate in the region distal to the centromere. 48 The zpg-7280 gene drive target site is 2.8 Mb from the 2La distal breakpoint and is therefore located in a region with a known slight reduction in meiotic recombination.7,62 Our results suggest that this is not sufficient to reduce homing, but future work could explore if other targets within the inversion, or closer to the breakpoint, may be affected.
While homing does not appear to be reduced within the inversion in An. gambiae, other impacts of the inversion on long-term control strategies should be considered. Reduced meiotic recombination results in the protection of the inverted regions and their accumulation within populations, a common mechanism of speciation in Anopheles. 54 Linked regions can result in the persistence of deleterious mutations or the spread of adaptive alleles for certain environments. In the case of gene drives, regardless of the impact of recombination on the homing mechanism itself, inversions could impact the penetrance of gene drives into wild populations indirectly, through reproductive isolation. That said, unless this reproductive isolation is total, even rare cases of intrastrain hybridization should lead to the gene drive rapidly introgressing into the new karyotype.
There is good precedent for this in the adaptive introgression of insecticide resistance alleles between An. gambiae and An. coluzzii, two separate species that are not fully isolated reproductively. 63 The idea of “forced” introgression, through which gene drives are backcrossed into wild populations before release, has been suggested to reduce the introduction of novel chromosomal arrangements or variations into wild populations. 64
Application of gene drives to wild populations
At first glance, the high level of variation in the An. gambiae species complex suggests that gene drives developed in laboratory-bred colonies could struggle to spread in wild populations via HDR. Our results suggest that this is not the case; with a highly conserved gRNA, variation in the surrounding sequence or in the chromosomal structure had no impact on the gene drive constructs tested here. The use of highly conserved gRNA sites is an important strategy for reducing the development of gene drive resistance. 8 The availability of deep sequencing data for An. gambiae via the Ag1000G confirms the high variation within the species complex, but also greatly improves our ability to choose gRNAs appropriately. 65
Correspondingly, gRNA target sites must be chosen carefully to confine gene drives to a particular strain; there are a variety of self-limiting strategies currently in development that either combine nonautonomous elements or target alleles private to the target population.66–68
The specificity of the gRNA targeting system produces very low off-target effects in An. gambiae, making CRISPR-Cas9 gene drives resistant to unexpected homing outside of the target sequence. 69 However, there is potential for neighboring sequences flanking the gene drive to be carried over during HDR due to resection of the broken chromosome. 28 This could result in tight allelic linkage of neighboring sequences to the gene drive and introgression of novel alleles into wild strains, suggesting that gRNA target regions need to be chosen with the surrounding sequences in mind. Future work will be able to determine the precise dynamics of genetic exchange between the gene drive donor and recipient chromosome.
Regardless of TLH of up to 6.6%, gene drive strategies for An. gambiae control show promising efficacy for malaria control in wild mosquito populations. The self-sufficiency of gene drives after initial release has meant that extra care is being taken to characterize how gene drives will function in natural settings.70,71 This work offers improved understanding of gene drive dynamics in wild populations and demonstrates their potential for Anopheles control.
Footnotes
Acknowledgment
We would like to thank Molly Margiotta for her contribution to this work in the early characterization of some of the genetic crosses.
Authors' Contributions
P.P.: Methodology, software, validation, formal analysis, investigation, data curation, writing—original draft, writing—review and editing, and visualization. G.B.: Methodology, investigation, and writing—review and editing. A.A.: Methodology, validation, investigation, supervision, and writing—review and editing. R.S. and J.S.: Investigation and writing—review and editing. F.L.: Conceptualization, methodology, supervision, project administration, and funding acquisition. T.N.: Conceptualization, methodology, validation, resources, writing—review and editing, supervision, project administration, and funding acquisition.
Author Disclosure Statement
T.N. has equity in the company Biocentis.
Funding Information
This work was supported by a Springboard fellowship from the Academy of Medical Sciences to Tony Nolan (SBF006\1183) and by start-up funds from the Liverpool School of Tropical Medicine.
References
Supplementary Material
