Abstract
Genome editing provides a new therapeutic strategy to cure genetic diseases. The recently developed CRISPR-Cas9 base editing technology has shown great potential to repair the majority of pathogenic point mutations in the patient's DNA precisely. A base editor is a fusion of a Cas9 nickase with a base-modifying enzyme that can change a nucleotide on a single strand of DNA without generating double-stranded DNA breaks. However, a major limitation in applying such a system is the prerequisite of a protospacer adjacent motif sequence at the desired position relative to the target site. Progress has been made to increase the targeting scope of base editors by engineering SpCas9 protein variants, establishing systems with broadened editing windows, characterizing new SpCas9 orthologs, and developing prime editing technology. In this review, we discuss recent progress in the development of CRISPR base editing, focusing on its targeting scope, and we provide a workflow for selecting a suitable base editor based on the target nucleotide sequences.
Introduction
Gene therapy is a technique to modify or introduce a gene into a patient's cells for therapeutic purposes. This concept was proposed in 1972 by Friedmann and Roblin in their seminal paper, which described the idea of incorporating functional copies of DNA into a patient's cells, now called gene augmentation. 1 Currently, 2022 cellular and gene therapy products have been approved by the U.S. Food and Drug Administration. 2 Novel technologies, such as the CRISPR-Cas9 system, can specifically target and edit disease-causing mutation(s) in genomic DNA, thereby providing a permanent modification to treat the disorder in question.
While Cas9 from Streptococcus pyogenes (SpCas9) has been the most widely used species, research has also been focusing on discovering additional Cas species such as Cas12a and other Cas9 orthologs to expand its targeting capabilities. These Cas9 orthologs have different CRISPR components, and some are smaller in size compared to SpCas9, which allows them to be packaged into single delivery vectors.3,4 They are attractive alternatives for in vivo therapeutic applications. Recently, base editing, a novel application of CRISPR-based technology, has been developed where a specific nucleotide at an exact location in the genome can be precisely changed to a different nucleotide.5,6
Base editors are constructed by fusing a Cas9 nickase (nCas9) with a base-modifying enzyme. Currently, there are three types of base editors: cytosine base editors (CBEs), adenine base editors (ABEs), and C-to-G base editors (CGBEs). CBEs are the fusion of nCas9 with rat cytosine deaminase APOBEC1 (rAPOBEC), producing a protein capable of converting a C•G base pair to a T•A base pair (Fig. 1A).5,7 ABEs convert A•T base pairs to G•C through adenine deamination to inosine, which is read as guanine by DNA polymerase (Fig. 1B).6,8

Two major types of base editing systems.
Together, CBEs and ABEs could potentially correct 68% of pathogenic point mutations registered in the ClinVar database in 2020. 9 More recently, CGBEs were developed which take advantage of the high levels of C-to-G conversions observed in some CBE variants.10–13 While this is undesirable for CBEs, it provides novel mechanisms to achieve C-to-G transversion.
Base editing has been shown to change a single nucleotide precisely, with efficiency as high as 98% in cell culture systems, as measured by high-throughput sequencing. 14 Also, base editors do not require a donor template and are equally effective in quiescent and nonquiescent cells, opening the possibility for delivery to any cell type. Most importantly, base editors do not generate double-stranded DNA breaks (DSBs), which reduces off-target oncogenic potential and results in fewer insertions or deletions (indels) during the genome editing process. Overall, base editing appears to be a better choice than DSB-mediated genome editing when making single nucleotide modifications in the genome.
The potential of base editors extends beyond just correcting pathogenic mutations, as they are powerful tools for generating stop codons and splice site mutations. In the iSTOP and CRISPR-STOP systems, CBEs precisely converted four codons into STOP codons, leading to gene inactivation and allowing modeling of more than 32,000 cancer-associated nonsense mutations.15,16 ABEs have been used in the i-Silence system to silence genes by mutating start codons and can effectively mimic start codon mutation-based diseases. 17
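As a minimal sketch of the idea behind these stop-codon strategies, the Python snippet below scans an in-frame coding sequence for codons that a single C-to-T edit could turn into a stop codon. The codon set used here (CAA, CAG, and CGA edited on the coding strand, and TGG edited on the opposite strand) is our assumption about which four codons are meant, and the function name is hypothetical. PAM availability and editing windows are deliberately ignored.

```python
# Codons that a C>T edit can convert into a stop codon (assumed set of four).
# CAA/CAG/CGA are edited on the coding strand; TGG is edited on the opposite
# strand, which reads as G>A on the coding strand.
STOP_CONVERTIBLE = {
    "CAA": ("TAA", "coding"),
    "CAG": ("TAG", "coding"),
    "CGA": ("TGA", "coding"),
    "TGG": ("TAG/TGA/TAA", "noncoding"),
}


def find_stop_convertible_codons(cds):
    """Return (codon_index, codon, stop_product, edited_strand) tuples.

    `cds` is assumed to start in frame; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    hits = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CONVERTIBLE:
            product, strand = STOP_CONVERTIBLE[codon]
            hits.append((i // 3, codon, product, strand))
    return hits


example_hits = find_stop_convertible_codons("ATGCAGGCTTGGCGATAA")
```

In this toy sequence, the second, fourth, and fifth codons are reported as candidates; whether a real guide can reach them still depends on the PAM and window constraints discussed in the following paragraphs.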
Genes can also be knocked out by targeting splice sites and altering the conserved splice donor or acceptor. This strategy can use either ABEs or CBEs and can theoretically disrupt 95.86% of all protein coding genes. 18 These expand the CRISPR toolkit and offer a more precise way to knock out genes compared to preexisting DSB-based methods.
Despite its promising potential, applications of base editing have been limited by the lack of appropriate protospacer adjacent motif (PAM) sites at the desired positions relative to the target nucleotide. For first-generation SpCas9-based base editors, the target nucleotide is usually within a 5 nt editing window, in positions 4–8 for CBEs and positions 4–7 for both ABEs and CGBEs within the guide RNA (gRNA) protospacer region.5,6,10–12 While base editing could still occur outside of these optimal windows, the efficiency is generally lower. This limits the broader application of base editing systems. Additionally, any nontarget base (bystander) within the editing window can also be modified. The risk of bystander mutations combined with the lack of an appropriate PAM means that only 37% of point mutations (pathogenic or likely pathogenic) can be safely repaired with current base editors. 9
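The interplay between PAM position, editing window, and bystanders can be sketched in a few lines of Python. The function below is illustrative only: its name and parameters are hypothetical, it searches only the given strand, it assumes a 20 nt protospacer immediately 5' of an NGG PAM, and it counts protospacer positions from the PAM-distal end (position 1), which is our assumed convention for the window quoted above.

```python
def find_cbe_guides(seq, target_idx, window=(4, 8), spacer_len=20):
    """List NGG guides on the given strand that place seq[target_idx]
    inside the editing window.

    Positions are 1-based within the protospacer, counted from the
    PAM-distal end (an assumed convention).
    """
    seq = seq.upper()
    guides = []
    # A guide needs spacer_len protospacer bases followed by a 3 nt PAM.
    for start in range(len(seq) - spacer_len - 2):
        pam = seq[start + spacer_len:start + spacer_len + 3]
        if pam[1:] != "GG":  # NGG: any base followed by GG
            continue
        target_pos = target_idx - start + 1
        if not window[0] <= target_pos <= window[1]:
            continue
        # Other bases of the same type inside the window may be co-edited.
        bystanders = [
            i - start + 1
            for i in range(start + window[0] - 1, start + window[1])
            if i != target_idx and seq[i] == seq[target_idx]
        ]
        guides.append({
            "protospacer": seq[start:start + spacer_len],
            "pam": pam,
            "target_pos": target_pos,
            "bystander_pos": bystanders,
        })
    return guides


clean = find_cbe_guides("AAAAC" + "A" * 15 + "TGG" + "AA", target_idx=4)
crowded = find_cbe_guides("AAAACC" + "A" * 14 + "TGG" + "AA", target_idx=4)
```

In the first toy sequence the target C sits at position 5 with no bystander; in the second, a neighboring C at position 6 is flagged. Passing `window=(4, 7)` would mimic the ABE and CGBE windows stated above.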
In this review, we discuss recent progress in increasing the targeting scope of the CRISPR base editing system, and we provide a workflow for determining the most suitable base editor depending on the nucleotide composition.
SpCas9 PAM Recognition Mechanism
The PAM sequence was first described by Mojica et al. 19 Its presence marks the invading sequence to facilitate self versus nonself recognition. SpCas9 requires an NGG PAM, which has an estimated frequency of 5.21%, or every 42 bases in the human reference genome. 20 In addition, SpCas9 can recognize noncanonical NAG and NGA sites by inducing a distortion of the DNA backbone.21–25
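To illustrate how a PAM frequency of this kind could be estimated for any sequence of interest, the sketch below counts NGG sites on both strands. It is not the method used in the cited analysis; the function name is hypothetical, and reverse-strand sites are assumed to appear as CCN on the given strand.

```python
import re


def ngg_pam_density(seq):
    """Count NGG PAMs on both strands and return (count, count per base)."""
    seq = seq.upper()
    # Lookaheads allow overlapping sites (e.g. TGGG holds TGG and GGG).
    forward = len(re.findall(r"(?=[ACGT]GG)", seq))
    # An NGG on the reverse strand reads as CCN on the given strand.
    reverse = len(re.findall(r"(?=CC[ACGT])", seq))
    total = forward + reverse
    return total, (total / len(seq) if seq else 0.0)


count, density = ngg_pam_density("AGGTCCA")
```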
The formation of a ribonucleoprotein (RNP) complex between the Cas9 enzyme and gRNA induces a conformational change in the Cas9 protein to expose the PAM interacting domain (PID).26,27 This complex scans DNA for PAM sites and gRNA complementarity. It will only bind to compatible PAMs and rapidly dissociates from inappropriate ones.28,29 The Cas9 RNP interaction with PAM sites leads to gRNA invasion of adjacent double-stranded DNA and formation of a site-specific R loop, in which the gRNA displaces the noncomplementary strand to produce a DNA:RNA hybrid with the complementary DNA strand.28,30–32 In the absence of an NGG or NGA/NAG PAM sequence, SpCas9 will not recognize the target DNA, even if perfectly complementary to the gRNA.28,32
The R1333 and R1335 residues in the SpCas9 PID form hydrogen bonds directly with the GG dinucleotide on the nontarget strand (Fig. 2). Substitution of these two arginines with either alanine or glutamine nearly abolishes cleavage.30,33 E1219 stabilizes R1335 through formation of salt bridges. 34 A phosphate lock loop at K1107-S1109 in the PID stabilizes a sharp kink at the +1 phosphate position immediately upstream of the PAM (Fig. 2). This promotes local duplex melting and target strand flipping for R-loop formation. 30

SpCas9 recognition of NGG protospacer adjacent motif (PAM). R1333 and R1335 participate directly in hydrogen bond formation with the GG dinucleotide of the PAM sequence. A phosphate lock loop at K1107-S1109 in the PAM interacting domain stabilizes a sharp kink at the +1 phosphate position immediately upstream of the PAM. This promotes local duplex melting and target strand flipping for base pairing between the seed region on the gRNA and the target DNA sequence.
For base editing, a lack of available PAM sequences can render a target nucleotide inaccessible unless an alternative PAM can be recognized. Further, if a target site is not at a preferable distance away from a PAM sequence, base editors are unable to edit it effectively, as the deaminases act on a short gRNA-displaced single-stranded DNA upon the R-loop formation. The relatively strict SpCas9 PAM sites and editing window requirements mean that often only one gRNA can be used to target a specific nucleotide. As a result, expanding available PAM sequences will expand the number of targetable nucleotides.
Approaches to Alter PAM Recognition Requirement
Engineering SpCas9 to overcome PAM limitations mainly relies on structural-guided mutagenesis, directed evolution, or combinations of the two. The crystal structure of the Cas9–gRNA complex bound to target DNA provides guidance on candidate amino acids that could be mutagenized to alter and broaden PAM compatibility and increase protein stability.27,30 Most mutations identified through this approach are concentrated in SpCas9 PID.
Directed evolution gives specific evolutionary pressure to force Cas9 to bind to alternative PAM sites. Simple screening methods use a toxic gene that must be cleaved by Cas9 bound to an alternative PAM to ensure survival. Phage-assisted continuous evolution (PACE) and phage-assisted noncontinuous evolution (PANCE) are more complex systems (Fig. 3).35–37

Overview of phage-assisted continuous evolution (PACE) and phage-assisted noncontinuous evolution (PANCE).
A typical PACE setup consists of a lagoon with constant inflow and outflow of host cells. A selection phage (SP) encoding a catalytically dead Cas9 (dCas9) fused to the ω subunit of RNA polymerase will infect the host cells. A mutagenesis plasmid induces mutations in the SP. If the mutated dCas9 can bind to a candidate in the PAM library in the accessory plasmid, RNA polymerase is recruited, leading to phage propagation. Over time, the lagoon will be filled with Cas9 variants that are evolved to bind a specific PAM sequence. PANCE is a simplified version of PACE involving serial dilution of evolving SP to fresh host cells. Although PANCE is labor intensive, it allows parallel replication of weakly active variants. Since the full-length Cas9 is subject to evolution, directed evolution can lead to mutations outside the PID.
SpCas9 Variants that Accept Alternative PAMs
SpCas9 variants have been developed that accept more promiscuous PAM sequences, allowing for editing at sites not previously targetable. These variants can also position the target nucleotide within a preferable editing window and provide alternative gRNAs to reduce bystander mutations (Table 1).
SpCas9 VQR/EQR/VRER/VRQR variants
Based on structural information and through bacterial positive selection, Kleinstiver et al. engineered four SpCas9 variants: VQR recognizes NGAN/NGNG PAMs, EQR recognizes NGAG PAM, VRER recognizes NGCG PAM, and VRQR exhibits higher activity compared to the VQR variant at NGAH sites.33,38 These variants contain mutations at D1135, R1335, and T1337. The crystal structure of these Cas9 variants in complex with gRNA revealed that the alternative PAMs are recognized through an induced-fit mechanism. 25 A distortion in the PAM-containing nontarget strand is observed, which positions the PAM sequence at favorable positions to form sequence-specific hydrogen bonds. 24
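The PAM preferences listed above use IUPAC degenerate codes (N, H, and so on), which can be matched against a candidate site programmatically. The following is a small sketch under stated assumptions: the table simply transcribes the PAMs named in this section (with VRQR entered as NGAH, the sites at which higher activity is reported), the function names are hypothetical, and a match says nothing about actual editing efficiency.

```python
import re

# IUPAC degenerate nucleotide codes used for PAM notation.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
    "H": "[ACT]", "V": "[ACG]", "D": "[AGT]", "W": "[AT]",
}

# PAMs as listed in the text; an illustrative, non-exhaustive table.
VARIANT_PAMS = {
    "SpCas9": ["NGG"],
    "VQR": ["NGAN", "NGNG"],
    "EQR": ["NGAG"],
    "VRER": ["NGCG"],
    "VRQR": ["NGAH"],
}


def pam_regex(pam):
    """Translate an IUPAC PAM such as 'NGAH' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in pam))


def compatible_variants(site):
    """Return the variants whose listed PAM matches the start of `site`.

    `site` is the sequence immediately 3' of the protospacer
    (at least 4 nt is assumed).
    """
    site = site.upper()
    return [
        name
        for name, pams in VARIANT_PAMS.items()
        if any(pam_regex(p).match(site) for p in pams)
    ]


wt_only = compatible_variants("AGGT")
nga_site = compatible_variants("TGAG")
```

A table of this kind can be extended with the other variants and orthologs described below, provided each PAM is entered as reported for that enzyme.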
Base editor constructs with these four variants have been developed by several groups.39–43 CRISPR-SKIP is a versatile strategy in which CBEs are used to target the highly conserved G in the AG splicing acceptors, leading to exon skipping. 44 Microinjection of VQR-BE3 and corresponding gRNA into mouse zygotes resulted in on-target editing of 21/23 alleles. 45 Further, VQR-BE3 introduced P301L mutation in the zebrafish Tyr gene to mimic human oculocutaneous albinism. 46 CGBE-VRQR exhibited up to 31% editing efficiency at endogenous sites in HEK293T cells. 10
xCas9
xCas9 was engineered through PACE. The most efficient variant evolved is xCas9 3.7. 47 In addition to NGG sites, xCas9 is most compatible with NGN, GAT, and GAA PAMs (lower with NGC). Closer examination of xCas9 PAM preferences suggests higher efficiency when C is the fourth nucleotide of the PAM, potentially limiting editing at NGND PAM sites. 48 Analysis of the crystal structure of xCas9 interactions with DNA has suggested that the loss of interaction between V1219 and R1335 allows for R1335 rotamerization and reduced PAM interaction. 34 A key advantage of xCas9 is dramatically reduced off-target activity compared to SpCas9.34,47,49 However, xCas9 has lower or nondetectable on-target efficiencies at some sites and is outperformed by SpCas9 at NGG sites.50–53
Both xCas9-CBE and xCas9-ABE have been shown to edit different sites in the Tyr and App genes in rabbits, with efficiencies up to 40%. 54 Although only low levels of A > G editing (1.43%) were observed for CFTR W1282X and R553X mutations in the patient-derived intestinal organoid biobank, this still demonstrates the potential for treating genetic diseases, especially since no detectable off-target edits were observed. 55 To increase the efficiency of xCas9 base editors, Liu et al. added a bipartite nuclear localization signal to facilitate nuclear localization, and fused Gam protein to the N terminus of xCas9-base editors to protect against DSBs and indel formation. 56 These modifications led to more than a threefold increase in editing efficiency for both CBEs and ABEs. 56
SpCas9-NG
Based on the SpCas9 crystal structure, Nishimasu et al. designed SpCas9-NG, which recognizes an NG PAM. 51 This broadened PAM recognition comes from mutating R1335 and E1219, along with other mutations to facilitate and stabilize PAM binding. SpCas9-NG has been shown to exhibit higher nuclease activity than xCas9 and wild-type (WT) SpCas9 at non-NGG sites, possibly due to higher binding affinity.57,58
Both NG-BE4max and NG-ABEmax have demonstrated good targeting efficiency at various sites in rabbit embryos. 59 A rabbit model mimicking the human pure hair and nail ectodermal dysplasia was generated using NG-ABEmax, which carries Hoxc13 Q271R missense mutation. 59 Further, NG-ABE has demonstrated 23.4% editing in the MECP2 gene. 42 CGBE1-NG induced edits at six NGT PAMs, with efficiency as high as 27%. 10
SpCas9-NRCH/NRTH/NRRH
Recently, SpCas9-NRRH, SpCas9-NRTH, and SpCas9-NRCH were developed through PACE and PANCE selection. 60 The SpCas9-NRCH variant is important, as it can address the low editing efficiency of xCas9 and SpCas9-NG at NGC PAMs. All three evolved variants showed robust indel formation when tested at endogenous sites in HEK293T cells and primary human fibroblasts. GUIDE-seq was used to detect DSB formation, revealing that the new variants have similar overall off-target activity compared to SpCas9.
The evolved SpCas9 variants also support CBE and ABE constructs with 10–30% editing efficiency. 60 By analyzing ∼3,000 library sequences containing NANN PAM sites, it was found that the CBE construct of the three generated variants together with SpCas9-CBE and SpCas9-NG-CBE all have a strong preference for a G at position 1 of the PAM, with activity being lowest when that position has a T. 60 With this expanded PAM compatibility, ∼95% of all pathogenic single nucleotide polymorphisms recorded in the ClinVar could in theory be corrected by base editors derived from SpCas9-NRRH/NRCH/NRTH or SpCas9-NG/xCas9. 60
PAMless SpCas9
To increase the targeting flexibility of base editors further, near-PAMless SpCas9 variants were engineered through structure-guided engineering. 48 The SpG variant showed robust and even efficiency across NGN sites compared to both xCas9 and SpCas9-NG. A second variant, SpRY, accepts both NRN and NYN PAMs. The SpRY variant contains 11 total residue substitutions, including R1333P and R1335Q, which abolish the recognition of the two guanines in the NGG PAM. SpRY is found to have comparable efficiency to Cas9-NG and approximately 10-fold higher than xCas9. 61
Both SpG-CBE and SpG-ABE exhibited significantly higher editing efficiency across different NGN sites, including NGC sites. The SpRY-CBEs exhibited up to 96% editing in rice, while SpRY-ABE8e achieved 79% editing.61,62 The most exciting advancement with the SpRY variant is its ability to edit sites with NYN PAMs, although the chosen sites were highly active, and the ability to target lower activity sites is expected to be much lower. Nevertheless, the base editing efficiency of WT SpCas9 is negligible at these highly active sites, suggesting the potential of SpRY variants.
The drawback of SpCas9 variants
The variants mentioned above are attractive options for non-NGG PAMs, but it is important to discuss their limitations. The above base editing variants have reduced efficiency compared to WT SpCas9 at NGG sites. At this time, the WT SpCas9 would be the first choice, and these other variants would act as alternatives for nontargetable sites.
Some variants have lower editing efficiency but a higher ratio of on-target versus off-target edits. For example, by directly comparing WT SpCas9, xCas9, and SpCas9-NG in human cells, Kim et al. found that xCas9 was less tolerant of mismatches, and Cas9-NG had the broadest PAM compatibility. 63 This could potentially be attributed to slower cleavage kinetics due to the amino acid modifications in xCas9. 51 Thus, when choosing SpCas9 variants, it is crucial to balance efficiency and specificity.
Another concern with expanded PAM compatibility is whether these will lead to increased off-target effects. The majority of the SpCas9 variants mentioned above exhibit comparable off-target activity to the WT, although more rigorous testing in other contexts and applications will be required. A higher level of off-target editing was observed with SpG or SpRY variants compared to WT SpCas9, and nearly all novel off-target sites were associated with expanded PAM sites. Replacing them with the high-fidelity variant of Cas9 nearly eliminated all off-target events. 48 Therefore, off-targets are highly base editor specific and case specific, and choosing a high fidelity version of Cas9 could reduce such effects.
Base Editing Variants with Shifted Editing Window
Another solution to the lack of PAM sites at proper positions relative to the target nucleotide is to shift and/or broaden the editing window of base editors. This is extremely useful when introducing premature stop codons or creating multiple mutations within a particular gene to study its function (Table 1).
Replacing rAPOBEC
Rat APOBEC has been replaced with homologs to alter or broaden editing windows. For example, when replaced with human APOBEC3A (hA3A), the editing window of the hA3A-BE3 spans positions 2–12 within the protospacer. 64 These hA3A-BEs also have robust editing efficiency at GC regions and at Cs in the CpG dinucleotides in methylated regions, where efficiency is otherwise generally low with other CBEs.64,65 The targeting ability of hA3A-BE3 has been further increased to NGH PAMs by fusing to SpCas9-NG. 66 This base editor could mediate at least 3% editing at a variety of NGH PAM sequences in HEK293T cells and porcine fetal fibroblast cells and was shown to be able to induce targeted mutation in four genes simultaneously. 66
The Target-AID system was developed by fusing dCas9 with Petromyzon marinus cytidine deaminase (PmCDA1) from sea lamprey. 67 The editing window is at positions 1–5 in the protospacer. 67 The ability to edit the PAM-distal region adds to the base editing repertoire. Target-AID showed appreciable editing against the rs1061170 SNP associated with age-related macular degeneration, while no base editing was observable with other BE3 variants. 68 By fusing PmCDA1 with other SpCas9 variants outlined in the previous section, a new set of CBEs were engineered to recognize different PAM sequences. 69 Liu et al. further enhanced the AID system (eAID-BE4max) with an editing window of 1–11 nt. 70
Recruiting multiple copies of deaminase
BE-PLUS is a versatile tool for introducing multiple mutations within a specific gene. It allows simultaneous editing of multiple Cs at positions 4–16 within the protospacer. 71 Multiple copies of rAPOBEC-UGI are recruited to the target site through single chain fragment variable recognition of GCN4 peptides on the C-terminus of nCas9(D10A). BE-PLUS expands BE3-targetable sites from 20.4% to 42.2% and significantly increases the efficiency of stop codon generation up to 90%. 71
CRISPR-X was developed to induce edits not confined to the protospacer region. It consists of a dCas9, an AID variant lacking a nuclear export signal, and MS2-modified gRNAs. 72 It can recruit four deaminases through the MS2 RNA hairpin and MS2 coat protein interaction. The mutation hotspot has been shown to be −50 bp to +50 bp relative to the PAM site, while the most highly mutated bases are located within the −12 to +32 bp region. 72 CRISPR-X is extremely useful for protein engineering, as it can create numerous mutations. This system can be used to generate libraries of point mutations and target multiple genomic locations simultaneously.
Altering base editor protein structure
Circular permutation (CP) can alter protein behavior by linking the N- and C-terminus and then splitting the sequence at another position, creating a topological re-arrangement of the protein primary sequence. The circular permutant CP-SpCas9s are predicted to have the termini lie closer to the ssDNA loop for the deaminase. 73 The CP-CBEmax (especially from CP1012 and CP1028) and CP-ABEmax constructs have slightly broadened editing windows of positions 4–11 and 4–12, respectively. 43
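At the level of primary sequence, circular permutation amounts to a rotation of the residue string, optionally with a linker joining the original termini. The toy sketch below illustrates this; the function name, the optional linker argument, and the reading of a name such as CP1012 as "original residue 1012 becomes the new N terminus" are assumptions made for illustration.

```python
def circular_permutant(protein, new_start, linker=""):
    """Rotate a protein sequence so that residue `new_start` (1-based)
    becomes the new N terminus.

    The original C terminus is joined to the original N terminus through
    `linker`, and the chain is opened just before `new_start`.
    """
    if not 1 <= new_start <= len(protein):
        raise ValueError("new_start must be a valid residue number")
    i = new_start - 1
    return protein[i:] + linker + protein[:i]


toy = circular_permutant("ABCDEFGH", 4)
toy_linked = circular_permutant("ABCDEFGH", 4, linker="gg")
```

The residue content is unchanged; only the positions of the termini move, which is what brings the fused deaminase closer to or farther from the exposed single-stranded DNA.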
Zhang et al. inserted a ssDNA binding domain, RAD51, between the cytidine deaminase and nCas9. The resulting HyBE4max could edit positions 4–14. 74 This system is also available as hyA3A-BE4max and hyeA3A-BE4max, which preferentially edit TCR motif. 75 BE-PIGS is a similar system, where rAPOBEC1 is inserted between the G1246 and S1247 sites in the SpCas9 PID. It is shown to exhibit high editing efficiency at Cs only 6 bp away from the PAM. 76
Inlaid base editors (IBEs) were engineered such that the deaminase domain is internal to the RuvC and PI domains of SpCas9. 77 Therefore, depending on the deaminase location and the linker length, a series of IBEs exhibited shifted editing windows from positions 2–18 within the protospacer. 77 In general, these IBEs retained on-target editing efficiency and reduced DNA and RNA off-target editing compared to ABE7.10.
These base editing systems with shifted and/or broadened editing windows can be used to introduce amino acid substitutions and early stop codons, edit promoters and regulatory elements, and screen functional variants. In addition to base editors with broadened editing windows, many base editors have been engineered to have narrower and more specific windows.39,64,75,78 These narrow-window editors address a significant limitation of base editors with broadened editing windows, which are more prone to nonsynonymous bystander mutations and therefore require a careful balance between targeting scope and the risk of introducing further mutations.
It is also important to mention that due to codon degeneracy, a large fraction of bystander mutations will be silent (synonymous) at the amino acid level. While such synonymous mutations are not expected to have an overall impact on protein function, there are instances where they could have pathogenic effects. To summarize, base editors with shifted and/or broadened editing windows are critical for furthering our understanding of disease causes and can provide foundations for future therapeutics.
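Whether a given bystander edit is silent can be checked directly against the standard genetic code. The following sketch assumes the standard nuclear code (NCBI translation table 1) and a hypothetical helper name; it evaluates a single substitution within one codon and does not consider splicing or regulatory effects.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_bystander(codon, index, new_base):
    """Return (old_aa, new_aa, is_synonymous) for one base substitution.

    `index` is the 0-based position within the codon; '*' denotes a stop.
    """
    codon = codon.upper()
    edited = codon[:index] + new_base.upper() + codon[index + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[edited]
    return old_aa, new_aa, old_aa == new_aa


silent = classify_bystander("CTC", 2, "T")    # CTC -> CTT
missense = classify_bystander("CCT", 0, "T")  # CCT -> TCT
nonsense = classify_bystander("CAG", 0, "T")  # CAG -> TAG
```

A C-to-T bystander at the third position of a leucine codon is silent here, whereas the same chemistry at the first position of a proline or glutamine codon is not.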
Base Editing with SpCas9 Orthologs
Many naturally occurring Cas9 orthologs have been identified that provide potential base editing strategies at genomic sites not currently targetable by SpCas9-based systems. An important note is that these orthologs will have different tracrRNA sequences than SpCas9 (Table 1).
Staphylococcus aureus Cas9
Staphylococcus aureus Cas9 (SaCas9) is 124 kDa—39 kDa smaller than SpCas9. 4 Through site-depletion assays, the PAM site requirement for SaCas9 was determined to be NNGRRT and can be further relaxed to NNNRRT (SaKKH-Cas9 variant).79,80 The nucleotide preference of the third PAM position is G > A = C > T, and the preference for fourth and fifth positions is AG > GG > GA > AA. 79 SaCas9 interacts directly with its PAM sequence through major groove recognition. 81 It demonstrates robust editing activity and higher cleavage activity compared to SpCas9 in human cells, potentially due to higher binding affinity.33,58,82
The editing window of SaCas9-CBE base editors is at positions 4–14, with maximum editing between positions 7–11 within the protospacer. 39 For SaKKH-ABE, the editing window is slightly shifted to positions 3–14 (highest efficiency at positions 8–13). 41 SaKKH-BE3 demonstrated editing efficiencies up to 62% at NNNRRT PAMs and successfully introduced single nucleotide substitution in human tripronuclear zygotes.39,83
In the CRISPR-SKIP system, SaKKH-BE3 could induce >40% G to A conversion for RELA exon 10 skipping. 44 Further, SaKKH-ABE could target exonic splicing silencers in exon 7 of SMN2 and lead to exon inclusion and expression. 84 Intravenous injection of SaKKH-BE3 and gRNA was shown to reduce blood L-Phe levels 20-fold, restore blood phenylalanine enzyme activity to up to 21% of WT activity, and reverse the fur phenotype in adult phenylketonuria mice.85,86
Neisseria meningitidis Cas9
Neisseria meningitidis Cas9 (Nme2Cas9) is a highly compact SpCas9 ortholog that recognizes the NNNNCC PAM, which is the same as NGG PAM but on the reverse strand. 87 It has higher fidelity than SpCas9 in mammalian cells. 87 The editing window of Nme2Cas9-CBE is from positions 2–11, counting from the PAM distal end. 88 Using this base editor, F0 rabbits carrying a premature Q79X stop codon in the Fgf5 gene were successfully generated with no detectable off-target mutations. 88 However, nNme2-ABE did not produce detectable A-to-G conversions, suggesting that the Nme2Cas9-ABE system needs further improvement.
Other SpCas9 orthologs
Streptococcus canis Cas9 (ScCas9) can recognize NNG PAM sequences, and ScCas9-BE3 and ScCas9-ABE(7.10) are available. 89 Sc++ and HiFi-Sc++ are higher fidelity versions that further increase the nuclease activity without increasing off-target activity. 90
Campylobacter jejuni Cas9 (CjCas9) recognizes NNNNRYAC PAMs. 91 CjCas9-ABE was used to block transcription factor binding to the TERT promoter, inhibiting growth of brain tumors in mice. 92
Streptococcus thermophilus Cas9 (StCas9) recognizes NNAGAAW/NNGGAA PAMs.93,94 St1BE4max could convert Cs with varied efficiency, while only moderate levels of editing were observed with St1ABEmax at endogenous sites in K562 cells. 93
Chatterjee et al. switched the PI domain of SpCas9 with that of Streptococcus macacae Cas9 to generate hybrid SpymacCas9 that is capable of recognizing NAAN PAMs. Both SpymacCas9-CBEmax and -ABEmax induced efficient base editing in rabbit embryos, although only at TAAA PAM sites.95,96 Further optimizations by incorporating R221K and N394K mutations could address this issue. 95
Importantly, these orthologs offer alternatives at G-less PAMs, which are generally not targetable by SpCas9. The long length of these PAMs means that some may encompass existing PAM sequences, creating overlap with other SpCas9 variants. One rationale for choosing these SpCas9 orthologs is that their PAM requirements are more stringent, which could decrease off-target activity. These would be extremely useful where very precise edits are required and where off-target/bystander concerns prevent the use of other base editors.
These SpCas9 orthologs generally have a similar editing efficiency compared to SpCas9, supporting their use in gene editing applications. However, further characterization is needed to confirm their efficiency in various applications both in human cells and in vivo. Additionally, many other naturally occurring Cas9 orthologs can accept different PAMs, but base editor versions of these have not yet been developed. It would be greatly beneficial and provide alternative editing methods if these base editors are engineered.
Cas12a
Genome engineering with Cas12a is another promising and well-studied approach. Acidaminococcus sp. BV3L6 (AsCas12a) and Lachnospiraceae bacterium ND2006 (LbCas12a) are the two common orthologs that exhibit robust nuclease activity in human cells.97,98
Cas12a base editors have gained considerable interest due to their unique editing mechanism compared to Cas9. Cas12a recognizes TTTV PAMs and generates a 5 nt staggered cut downstream of the PAM site, facilitating editing at genomic sites lacking G-rich PAMs. Crystal structures of Cas12a in complex with crRNA and target DNA have revealed that the PAM sites are primarily read by two consecutive lysine residues in the PAM-binding channel of Cas12a. 99 Through GUIDE-seq and targeted deep sequencing analysis, Cas12a nucleases were shown to be more specific than SpCas9 in human cells.100,101
The Cas12a-CBE was engineered through the fusion of rAPOBEC1 with catalytically inactivated LbCas12a as well as a UGI. 102 It has an editing window ranging from positions 8–13, counting the base directly downstream to the PAM as position 1. This CBE disfavors GC sequences, similar to what is observed with SpCas9-CBE.5,102 Cas12a-CBE induces fewer indels, and C to T product purity is also shown to be higher compared to SpCas9-CBE. 102 AsCas12a-CBE was also engineered and demonstrated good editing efficiency in human cells. 103
Prime Editing as a Novel Approach
Prime editing is important to mention, as it addresses some of the limitations of base editing. 104 Prime editing can repair transitions, transversions, small indels, and combinations of the above. It relies on the same gRNA-directed Cas9 targeting as other CRISPR-Cas9 approaches.
Here, a Cas9 nickase linked to a reverse transcriptase (RT) enzyme (prime editor [PE]) is delivered with a pegRNA, an extended gRNA that includes a RT primer-binding site and a template for the desired DNA change (Fig. 4). The initial prime editing system has undergone several improvements to enhance efficacy; PE2 uses an engineered RT to improve editing efficiency, while PE3 adds to this an additional gRNA to nick the opposing strand and promote inclusion of the edit. 104 Recently, editing efficiency was further enhanced by including the co-expression of a mismatch repair inhibiting protein with PE2 and PE3, generating PE4 and PE5, respectively. 105

Prime editing. The prime editor consists of nCas9 protein and a reverse transcriptase. The prime editing gRNA (pegRNA) contains regions that are complementary to the target DNA, as well as a primer binding site (PBS) and reverse transcription (RT) template. RT starts after the PBS and uses the RT template to incorporate changes on the nicked nontarget strand. The incorporated changes could be transitions, transversions, and multiple nucleotide long insertions or deletions.
There are several advantages to using prime editing. First, it can correct different types of base changes not currently editable by base editors, such as duplications, transversions, and indels. 104 Second, the range of editing capabilities may relieve some stringency in the PAM requirements, as PEs can edit further from the PAM site than many base editors. Third, much like base editors, PEs can be combined with modified SpCas9 variants to accept more flexible PAMs and increase editing efficiency.
Prime editing and base editing complement each other. A study comparing CBEs to PEs found that BE4max had a 2.2-fold higher C•G-to-T•A conversion than PE3 when the target nucleotide was in the center of the editing window. 104 PE3 had a 2.7-fold higher conversion when the target was outside of the editing window. 104 Thus, when only a single target nucleotide is present within the editing window, or if bystander mutations are permissible, base editors are preferred. 104 PEs, however, are useful if multiples of the same nucleotide are within the target window and editing of only one is desired. They are also useful when PAMs placing the target nucleotide in an optimal position for base editing are unavailable.
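The rule of thumb in this paragraph can be written down as a small decision function. This is only a sketch of the reasoning stated here: the function and parameter names are hypothetical, and the requested change is assumed to be one that a base editor can make at all.

```python
def suggest_editor(target_in_window, same_base_count_in_window,
                   bystanders_permissible):
    """Suggest 'base editor' or 'prime editor' from the criteria above.

    `same_base_count_in_window` counts bases of the target type in the
    editing window, including the target itself.
    """
    if not target_in_window:
        # No PAM places the target at a position suitable for base editing.
        return "prime editor"
    if same_base_count_in_window == 1 or bystanders_permissible:
        return "base editor"
    # Several identical bases in the window, but only one should change.
    return "prime editor"


single_target = suggest_editor(True, 1, False)
crowded_window = suggest_editor(True, 3, False)
```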
However, one study observed a high frequency of undesirable nucleotide conversions when generating a Hoxd13 mouse model with PEs. 106 This indicates that off-target events in prime editing still need to be characterized and reduced.
Flow Chart for Choosing the Appropriate Base Editors
We propose the following workflow for choosing base editors based on the nucleotide composition surrounding the target nucleotide (Fig. 5). If the target nucleotide and an NGG PAM are present at the desirable position relative to each other, the first choice would be SpCas9-CBEs/ABEs/CGBEs, depending on the type of nucleotide change. These types of base editors are well characterized.

Flow chart for determining the most suitable base editor. When an NGG PAM sequence is present at the appropriate distance from the target nucleotide, SpCas9 CBEs and ABEs would outperform other Cas9 variants and orthologs. Otherwise, base editor variants that recognize noncanonical PAMs and/or orthologs (such as SaCas9 and Cas12a base editors) are preferred alternatives if their corresponding PAMs are available. BEs with broad editing windows are particularly useful when multiple nucleotides need to be changed.
On the other hand, if the target sits outside of the preferable window, base editors with a shifted editing window would be desirable. Alternatively, if no NGG PAM is available, one should explore NNNRRT or TTTV PAMs, as both SaCas9 and Cas12a base editors have been examined in various applications for these PAM sites. In the case where none of the canonical PAMs are present, or they are not at the desirable position, it would be appropriate to look for PAMs of SpCas9 variants. If off-target editing is a concern or alternative base editing strategies need to be tested, base editors with other Cas9 orthologs would offer good alternatives.
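The first steps of this workflow can be sketched as a PAM scan around the target nucleotide. The sketch below is illustrative only: the protospacer lengths and editing windows in `EDITORS` are placeholder assumptions that should be replaced with values for the specific editor in use, the function names are hypothetical, and only the supplied strand is scanned (a target on the opposite strand would require passing the reverse complement).

```python
import re

# IUPAC codes needed for the PAMs named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "V": "[ACG]"}

# (label, PAM, PAM side, protospacer length, editing window).
# Lengths and windows are ASSUMED placeholder values for illustration.
# Positions are 1-based from the 5' end of the protospacer.
EDITORS = [
    ("SpCas9 base editor", "NGG", "3prime", 20, (4, 8)),
    ("SaCas9 base editor", "NNNRRT", "3prime", 21, (3, 12)),
    ("Cas12a base editor", "TTTV", "5prime", 23, (8, 13)),
]


def pam_regex(pam):
    """Compile an IUPAC PAM string into a regular expression."""
    return re.compile("".join(IUPAC[b] for b in pam))


def find_candidates(seq, target_idx):
    """List (label, protospacer_start, target_position) for usable PAMs.

    seq        -- DNA strand to scan, 5'->3'
    target_idx -- 0-based index of the target nucleotide in seq
    """
    seq = seq.upper()
    hits = []
    for label, pam, side, length, (lo, hi) in EDITORS:
        rx = pam_regex(pam)
        for pos in range(lo, hi + 1):
            start = target_idx - (pos - 1)      # protospacer start
            end = start + length                # protospacer end (exclusive)
            pam_start = end if side == "3prime" else start - len(pam)
            pam_end = pam_start + len(pam)
            if min(start, pam_start) < 0 or max(end, pam_end) > len(seq):
                continue
            if rx.fullmatch(seq[pam_start:pam_end]):
                hits.append((label, start, pos))
    return hits


def choose_editor(seq, target_idx):
    """Return the first-preference editor, following the order in EDITORS."""
    hits = find_candidates(seq, target_idx)
    if hits:
        return hits[0][0]
    return "no canonical PAM in range: consider SpCas9 PAM variants or prime editing"
```

Because `EDITORS` is ordered with the NGG PAM first, an available NGG site at a suitable distance is always reported before the SaCas9 or Cas12a alternatives, mirroring the preference order in the workflow.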
Concluding Remarks
CRISPR-Cas9 base editing provides precise and efficient single base editing, making it possible to target A → G, C → T, and C → G (or T → C, G → A, and G → C on the reverse strand, respectively) base changes. However, at present, the targeting scope of base editors is often limited by the lack of PAM sites at the optimal position relative to the target nucleotide. In addition to the widely used SpCas9, many of its variants that accept noncanonical (non-NGG) PAMs are being developed. To increase their applications further, CBEs, ABEs, and CGBEs with broader editing windows and prime editing are also available. Base editor constructs built on Cas9 orthologs and on the Cas12a nuclease have also been engineered, with the capability of targeting non-G PAMs or AT-rich regions.
When NGG PAM sites are present and at an optimal distance from the target nucleotide, the WT Cas9 base editors would usually outperform their variant counterparts. The rapid advancements in base editing technology provide researchers with multiple options for editing a given target site in the genome, extending beyond the standard NGG PAM. Since the efficiency of CRISPR-Cas9 base editing is highly site dependent, these Cas9 variants and orthologs still need rigorous testing in various applications, and their off-target editing needs to be carefully examined.
Footnotes
Acknowledgments
The authors thank members of the Ross Lab, Tessa Morin and Jafar Hasbullah, for their helpful discussions and suggestions in the preparation of this article. C.J.D.R. is a Michael Smith Foundation for Health Research Scholar.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This work was supported by Genome British Columbia [project code SIP005], Michael Smith Foundation for Health Research Scholar Award #16458, CIHR New Investigator Award, and the Nanomedicines Innovation Network (NMIN 2019-T2-05).
