Abstract
As research on human applications of CRISPR advances, researchers, advisory bodies, and other stakeholder organizations continue calling for global public discourses and engagement to shape the development of human gene editing (HGE). Research that captures public views and tests ways for engaging across viewpoints is vital for facilitating these discourses. Unfortunately, such research lags behind advances in HGE research and applications. Here, we provide the first review of nationally representative public-opinion surveys focused on HGE to discuss limitations and remaining gaps, illustrating how these gaps hinder interpretation of existing studies. Rigorous research with proper methods for capturing representative public opinion of HGE is limited, especially in countries outside of the United States and on a global scale. The result is severely restricted understanding of even the surface level of public views concerning HGE. We identify broad areas where we need more and better research capturing public views, and describe how future surveys can help collect insights necessary for discourse and decision making on HGE.
Introduction
Reports in late November 2018 that Chinese researcher He Jiankui created genetically edited babies using the gene-editing tool CRISPR not only moved gene editing from hypothetical into living human applications, but also expanded public attention to human gene editing (HGE) and triggered international outcry. Many expressed doubts about the appropriateness of the edits and were deeply concerned by the lack of international, public, and government consultation prior to He making the edits.1,2 The Chinese government responded by placing a temporary ban on gene-editing research. 3 At least two children, however, now appear to be living with CRISPR-edited genes, and some researchers fear that the media and public attention on this apparently rogue researcher could create public backlash against HGE. 2
The controversy surrounding He's edits focused on the fact that he violated tacit ethical agreements within the scientific community not to conduct heritable edits in humans. 4 However, the controversy also highlighted many of the dominant concerns surrounding HGE. Many of these concerns stem from HGE's potential to change the biological and social structures that affect individual and societal well-being, both for better and for worse. HGE could reshape health care, as researchers investigate editing genes responsible for more than 6,000 genetic disorders and diseases 5 and develop more effective treatments for illnesses such as cancer and human immunodeficiency virus (HIV).6,7 In fact, the controversial research conducted by He was reportedly intended to make the children resistant to HIV infection.1,4
Researchers in the field, however, argued that there are more acceptable ways to use CRISPR to treat and prevent HIV, 8 and many other critiques of He's research overlap with broader concerns associated with HGE in general. These include ethical concerns about HGE's potential to cause unintentional physical harm and introduce or perpetuate social inequities and harmful power structures.5,9 As with other biotechnologies, these concerns include discussion of whether editing the human genome is too “unnatural” or gives some humans too much power to shape human development, often described in the language of “playing god.”10–14 Although concerns about naturalness and power associated with HGE are especially prominent in discussion of heritable, or germline, edits, they can shape discussion of nonheritable somatic edits as well.5,9
Because of such concerns about the power and potential of HGE, researchers and governmental and scientific organizations globally have called for a hold on heritable edits and edits for enhancement (non-therapeutic) purposes in particular.5,15–17 Many have also called for greater public engagement on decision making concerning HGE,5,9,18,19 which the U.S. public have also expressed a desire for. 20 These calls and concerns predate news of the edits that He made in China. But the news of gene-edited children did emphasize further the need and urgency of public involvement in shaping the discourse and use of HGE technologies, as hypothetical applications are increasingly becoming reality.
This critical discussion, however, continues to lag behind the quickly advancing scientific developments with CRISPR. Furthermore, deliberation around HGE is often limited to expert committees who do not represent all necessary stakeholders or perspectives, and who often focus on narrow sets of questions and concerns that align with their areas of expertise, as Jasanoff et al. clearly laid out last year in this journal. 21 A necessary precursor to deliberation is developing an understanding of stakeholder viewpoints, concerns, values, and goals so that we can test ways to communicate and engage diverse publics in decision making effectively. Because gene-editing research and applications span political and geographic boundaries, deliberation on HGE and understanding of the public's views also need to occur at the international level.
Unfortunately, there are large gaps in the social science research needed to produce these insights, in terms of both what public opinions are and what mechanisms for communication and engagement can further public deliberation across these opinions. Existing gaps include a lack of insight into even very basic aspects of potential edits that could influence views of research and applications of HGE, such as whether edits are for therapeutic or enhancement purposes, whether they are heritable, and who the recipients of edits would be (e.g., adults vs. embryos). Further, very few studies focus on non-U.S. contexts, and the vast majority are in the Global North and West.
In the rest of this article, we provide an overview of what we know and, even more so, of the many areas where we do not yet know what public opinions of HGE are, in order to highlight where and how additional research can be particularly beneficial. We focus on public-opinion surveys conducted since 2013, the year that marked the emergence of CRISPR as the dominant gene-editing tool. In this review, we highlight the factors that limit our ability to interpret across existing surveys and ways to alleviate these barriers going forward. We end with a discussion of what these insights and gaps from existing surveys mean for gathering and incorporating insights from around the world as we collectively decide how we want to further HGE research and applications.
State of Global Research on Public Opinion of HGE
Table 1 provides an overview of existing nationally representative surveys of public views of HGE globally since 2013. As seen in Table 1, few nationally representative surveys of public views of HGE exist, and most are concentrated in the Global North and West, overwhelmingly the United States, leaving majorities of people and cultures unrepresented. Further, no surveys, to our knowledge, have been published since the news of He's edits to twin babies in China, which could have increased public awareness of the issue and solidified or changed particular views. In an earlier review of U.S. public-opinion surveys concerning HGE, Blendon et al. included surveys since 1985. 22 Because of the difference in speed, access, and accuracy with CRISPR as an editing tool—and therefore the potential to make many edits possible on a larger scale 5 —we focus just on surveys that capture views since 2013. A publication in 2013 of the first research to use CRISPR in human cells23,24 offers a more public marking point of the start of the CRISPR era in HGE.
Table 1. Overview of existing nationally representative surveys of public opinion of human gene editing: sampling methods, country of focus, and edit types of focus (therapeutic or enhancement purpose, heritability, and recipient)
To gather the studies included in this review, we conducted Web of Science and Google searches using different search terms based on combinations of “public opinion [attitudes] [views]” and “gene*[genome] edit*[therapy].” We collected results that had abstracts or introductions that described the study or survey as capturing public opinion of HGE or gene therapy. We then assessed the methods that researchers used in each to determine if the survey's results were nationally representative. This resulted in the sample of 10 surveys listed in Table 1.
We include in Table 1 only those studies that qualified as nationally representative, using rather broad criteria to assess representativeness. Surveys had to be based on either (1) probability sampling (the gold standard for national surveying) or, as is more common today, (2) non-probability sampling designed to approximate the national population. Some methods of non-probability sampling, such as quota sampling, allow researchers to generate a sample that can match the population of interest based on key demographic characteristics, such as age, sex, ethnicity, and education levels. 29
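As a rough illustration of the quota-matching idea described above, the following Python sketch compares a sample's demographic shares with population targets. It is only a sketch under stated assumptions: the age groups, target shares, respondent counts, the 3-point tolerance, and the helper names `quota_gaps` and `meets_quotas` are all hypothetical and are not drawn from any survey in Table 1.

```python
from collections import Counter

# Hypothetical population targets (shares) for one demographic variable.
# These numbers are placeholders, not census figures.
population_targets = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

# Hypothetical respondents, each tagged with an age group.
sample = ["18-34"] * 280 + ["35-54"] * 370 + ["55+"] * 350


def quota_gaps(sample, targets):
    """Return sample share minus target share for each category."""
    counts = Counter(sample)
    n = len(sample)
    return {group: counts[group] / n - share for group, share in targets.items()}


def meets_quotas(sample, targets, tolerance=0.03):
    """True if every category is within `tolerance` of its target share."""
    gaps = quota_gaps(sample, targets)
    return all(abs(gap) <= tolerance for gap in gaps.values())


if __name__ == "__main__":
    for group, gap in quota_gaps(sample, population_targets).items():
        print(f"{group}: {gap:+.3f}")
    print("Within tolerance:", meets_quotas(sample, population_targets))
```

A real quota design would typically check several characteristics at once (for example age, sex, ethnicity, and education), and matching on those margins does not by itself guarantee that a sample is representative on unmeasured characteristics.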
A number of studies of public perceptions of HGE exist that report on convenience samples, some of which purport to capture nationally representative views or do not sufficiently clarify that they are not representative.25–28 We do not include these studies here because, although convenience samples can be appropriate for experiments, they do not allow us to generalize beyond the respondents themselves to any broader, identifiable population.
We emphasize the importance of sampling methods because they matter for the quality of the data and for interpreting survey-based findings. A recent review article, for example, collected all studies capturing opinions on gene editing and gave the studies a rating of high to low quality, but the authors did not seem to account for the sampling methods of the studies in their ratings. 30 Thus, the authors rated many convenience samples as “high” or “medium” quality in their review, when these samples—unless used in an experimental design—unfortunately tell us only what those particular respondents, and not broader, identifiable publics, think.
Because our research team is in the United States, it is possible that we missed international surveys that meet our inclusion criteria. This could be especially likely if surveys are not published in English and/or in English-language academic journals. To address this, we compared our results to other review articles. Although these review articles took a broader scope by including more than just nationally representative surveys,22,30 we did not find any additional surveys or publications that we had missed in our searches. That does not rule out that those reviews also missed surveys, particularly if surveys are not in English or were published through nonacademic outlets, such as government websites. Moving forward, researchers could collaborate across languages and countries to continue identifying and compiling any additional national surveys.
Challenges and Gaps Across Existing Surveys
As the discussion of sampling methods illustrates, there are methodological challenges that limit our ability to synthesize across and draw conclusions from the few surveys that we do have. Many surveys do not explicitly differentiate in their question wording between particular types of edits in terms of their purpose (therapeutic or enhancement), heritability, and recipients (e.g., adults vs. embryos). We discuss below some of the ways that question wording and emphasis are likely contributing to the lack of clarity and the differences we see in responses across surveys. We focus particularly on illustrations from U.S. surveys, as that is the country for which there are the most studies. We then move to discussion of what big questions remain and how we can draw from this review to understand public opinion better moving forward.
Variations by purpose of edit
Surveys do not always clarify the type of edit, but when they do, people generally support edits for therapeutic (or treatment) purposes but not for enhancement purposes. For example, a 2016–2017 national study explicitly distinguishing between the two found that most Americans support HGE for therapy reasons (59%) but not for enhancement (33%). 20 A 2018 Pew Research Center survey distinguishing between treatment and enhancement edits in babies found even sharper variations in perceived appropriateness of treatment (72% find appropriate) versus enhancement edits (19%). 31 In contrast, however, the earlier 2016 Pew survey had found that most Americans (68%) were worried about using HGE to reduce risk of disease in healthy babies. 32 In that study, the researchers described such edits as enhancements in their reporting of the results, describing gene editing in this case as being used to “enhance human abilities.” The questions, however, could also easily be interpreted as asking about gene therapy, 5 as they asked respondents about the potential for gene editing to give “healthy babies a much reduced risk of serious diseases and conditions.” 32
This ambiguity in whether an item is capturing therapeutic or enhancement edits is not necessarily a problem in itself, as the lines between treatment and enhancement are, and will continue to be, fuzzy. For example, some might categorize disease prevention as enhancement rather than therapy, 5 as the 2016 Pew question described above did. But it is common for disease-focused edits to be considered therapy because the purpose of the edits is to move someone toward the norm for a particular health condition, 5 or to reduce the condition's likelihood of occurring or the severity of its effects. Enhancement, in comparison, is often considered as moving someone beyond the norm for a given human characteristic, such as making edits that give one above-average intelligence or strength. 5
Deciding what constitutes treatment versus enhancement, however, raises questions about how much the societal norm or statistical average should be the defining standard for what we consider disorders or disease and of what appropriate “treatment” is. We already see these questions emerging in discussion around conditions such as albinism and autism. Although both albinism and autism can greatly impact people's well-being, many people—including many who have or are related to people with those conditions—question whether prevention through gene editing embryos is the right approach, and to what extent we should consider such conditions as valuable differences in the range of human characteristics. Many who see these conditions as a valuable part of human diversity or experiences argue we should focus on societal-level changes in how we view and accommodate differences in human abilities—including better access to treatment throughout one's lifetime to manage the detrimental side effects of the conditions—rather than on prevention of the condition itself through HGE.33,34 Similarly, for conditions based on traits such as height or intelligence, where the line should be for what level of height or intelligence constitutes a disorder, or a detriment to quality of life, is not a given. How we view and define disorders and human differences will shape whether potential edits to these characteristics are seen as treatment or enhancement, and will be sources of disagreement across different publics. 5
As more surveys emerged following the 2016 Pew study, however, results indicated that enhancement versus therapy in general is a meaningful distinction for understanding views of HGE applications in many cases.20,31,35,36 The results across these existing surveys suggest that future surveys need to continue to distinguish clearly at least between these two broad dimensions of edits. If items account for those broader dimensions of therapy versus enhancement, surveys can then also include questions that capture the nuance of particular types of edits that could span or blur those dimensions.
A related issue, and one harder to disentangle within the small pool of existing surveys, is how other differences in the specificity and examples included in question wording affect responses and how we interpret the results. For example, in the 2016 STAT-Harvard survey, questions about legality found that most Americans believe that changing the genes of unborn babies should be illegal, with only 26% believing that therapeutic edits should be legal and 11% believing that enhancement edits should be legal. 36 In the same survey, however, questions about federally funding such research captured much more positive views for therapeutic edits, with 44% supporting funding for research on therapeutic edits and 14% for enhancement. 36
The difference between the legality and funding items is surprising, as support for a technology usually either evenly maps onto support for funding of that technology or is higher than support for funding, depending on people's views of the role of government in funding research. 37 The difference across the items, however, could be due to additional distinctions in the exact wording between the funding and the general support items. For example, the item on legality of treatment edits asked about “changing the genes of unborn babies to reduce their risk of developing certain serious diseases.” The item on funding for treatment used the same stem focused on unborn babies but added specific disease examples: “…to reduce their risk of developing certain serious diseases such as Huntington's disease, cystic fibrosis, or some types of muscular dystrophy.” 36 The differences in wording across the legality and funding question pairs went beyond the therapy versus enhancement and legality versus funding distinctions of interest. The difference in support, then, could come from the specificity of the latter item making particular disease reduction benefits more salient or less abstract.
As a point of comparison, data collected in the same year by several co-authors of this article also captured views of general support and support for funding across therapeutic and enhancement edits and used more consistent wording across the items. 20 In that study, Scheufele et al. changed only the wording that was relevant to legality versus funding and therapy versus enhancement in the question pairs. The results indicated that support for federal funding mirrors the general support results, with respondents more supportive of funding for treatment purposes than for enhancement purposes. 20 For treatment-focused edits, 59% support using gene editing, and 50% support federal funding for research on it. For enhancement-focused edits, 33% support using gene editing, and 33% support funding research on it. 20
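The contrast between the two surveys can be made concrete with simple arithmetic on the percentages reported in the two preceding paragraphs. The Python sketch below is illustrative only: the dictionary layout and the helper name `funding_gap` are assumptions introduced here, and the first figure in each pair measures legality in the STAT-Harvard survey but general support in the Scheufele et al. study, so the gaps are not strictly like-for-like.

```python
# Percentages as reported in the text above:
# (general support or legality, support for federal funding).
results = {
    "STAT-Harvard 2016": {"therapy": (26, 44), "enhancement": (11, 14)},
    "Scheufele et al.": {"therapy": (59, 50), "enhancement": (33, 33)},
}


def funding_gap(results):
    """Funding support minus general support/legality, in percentage points."""
    return {
        survey: {
            purpose: funding - general
            for purpose, (general, funding) in items.items()
        }
        for survey, items in results.items()
    }


gaps = funding_gap(results)

if __name__ == "__main__":
    for survey, by_purpose in gaps.items():
        print(survey, by_purpose)
```

Under this layout, the STAT-Harvard therapy items show funding support 18 points above the legality figure, whereas the Scheufele et al. therapy items show funding support 9 points below general support, which is the pattern the text describes as more typical.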
Differences in wording can also capture real differences in views of the potential applications of HGE, but they need to be systematic differences if we are going to be able to interpret their impact on results within and across studies. We do not necessarily need every survey to use the same wording that other surveys did, although that could be useful for comparisons across time and countries, as we describe later. But we do need researchers to change wording across items in deliberate ways designed to help us interpret what differences from question to question mean and how they affect responses to the items.
That means that within a survey, the differences in wording across items should, as much as possible, include only differences that are of interest to the research questions. In other words, word choice should reflect what researchers want to test and what they want to control for across the items. Returning to the STAT-Harvard survey example, we should be able to tell if differences in how people responded to the funding support questions versus the general support questions are due to differences in views of funding and general support, rather than other unintentional differences between the wording. That is only possible if researchers limit word changes to those that convey the distinctions of interest from item to item—such as support for funding versus support for research or support for therapeutic edits versus support for enhancement edits—while balancing these choices with the need for writing understandable and believable items for different publics.
Variations by recipient of edits
Another important factor for which we need more systematic study is how views vary depending on who receives the edits, and the terms we use to describe those recipients (e.g., “embryo” or “unborn baby”). These will likely matter for how we interpret responses and public views and could be useful for understanding differences across cultures as well. For example, in the U.S. surveys, items that used different words to refer to the recipient of the edits found different results. The STAT-Harvard survey items always referred to “unborn babies,” while the 2017 Scheufele et al. and the 2018 Pew studies focused on “children” and “babies,” respectively. These differences could explain, in part, why the STAT-Harvard results indicated less support for treatment-focused edits.20,31,32 It is possible that “unborn babies” is associated with concerns about edits to embryos, which in the United States in particular is often a controversial area, as we see in controversies surrounding embryonic stem-cell research and abortion.38–40
Clarifying and testing across different potential recipients of edits is therefore another vital dimension for understanding public views of those edits. It is also one in which word choice, particularly for any prenatal edits, should be strategically tested. Very few studies compare views of edits to children or to embryos with views of edits to adults. None, to our knowledge, test different word choices for recipients, such as to compare the effect of “embryo” versus “unborn baby” on responses, for example.
The nationally representative survey from Australia by Critchley et al. is one of the few that tested views of edits across many potential recipients, and it found that respondents were more supportive of edits made in adults than of “prenatal” edits. 28 The Critchley et al. study is also the only one of the nationally representative surveys that used the term “embryo” to refer to potential recipients (see Table 1). 27 U.S. surveys seem to avoid the term, perhaps because of researchers' concerns that it will be difficult to distinguish between views of gene editing and reactions to the word “embryo,” which, as mentioned above, is tied to the complex societal discourse and divisions around issues such as abortion and stem-cell research. 40 Researchers' concerns about this are likely valid. Such overlap between views of embryos and views of particular gene edits, however, will also likely matter for real-world discourse and views. Work that lets us compare not only responses across items describing edits made at different stages of recipients' lives but also the impact of different words describing prenatal edits will be valuable for understanding public views, especially cross-culturally.
Variations across heritability of edits—germline versus somatic
Finally, the heritability of edits is an additional factor that could likely matter for public perceptions and discourse of HGE. Of the nationally representative surveys, however, only Critchley et al.'s Australian study and Scheufele et al.'s U.S. study explicitly distinguished between edits that are heritable (germline) versus not (somatic).20,41 Scheufele et al. did not find substantial differences in acceptability by heritability, 20 and Critchley et al. found only very small but significant differences across heritability for the context of therapeutic edits. 41 These results are somewhat surprising, as many in the scientific communities seem to be especially concerned about germline, as compared to somatic, edits.5,15,42 As these are each only single studies, however, we need more work to confirm whether these two surveys capture an accurate picture of views on the ground in their respective countries.
Further, because most surveys do not explicitly tell respondents whether edits will be heritable, we do not know how these views change over time or across countries. We also do not know how much reactions to prenatal edits versus edits in adults overlap with concerns about heritability. In other words, are people more likely to assume that prenatal edits would be heritable, or do they consider heritability at all when responding? Research should therefore continue not only to clarify whether edits are for therapeutic or enhancement purposes and who the edits are done to (e.g., adults, embryos, germ cells, and somatic cells), but also to do so in ways that make explicit whether edits are heritable. This will allow us to see how heritability of edits in particular can elicit different views and concerns.
Areas for Future Work: Filling In and Clarifying the Picture of Public Views
Although surveys of national-level public opinions are only one tool for understanding views of HGE, they are especially important and useful for gaining insights across large diverse publics and cultures globally. We have limited survey data on public opinion of HGE, however, with existing work suffering from the methodological challenges described above, as well as from large geographic gaps. No surveys we found focused anywhere within Africa or South America, for example. Additionally, none captured views from after the news of the CRISPR-edited babies in China in late 2018, which could have raised awareness of HGE or affected views of edits among publics. Going forward, we need more data points from around the world and more data points that come from sampling methods that allow us to generalize to these different populations. As we described, convenience samples are unfortunately too common in research on public views of HGE.
We also do not have many surveys that allow us to understand interactions between key factors that evidence suggests matter for views of gene editing. As described in this article, some of these include: (1) the purpose of the edit (i.e., therapy or enhancement); (2) the recipient of the edit, such as adults, children, embryos, germ cells, or somatic cells; and (3) the heritability of the edits. Ideally, we will also have future work that not only captures comparisons among these factors and broad dimensions of different edits, but also allows us to understand views of particular potential edits within these dimensions in greater detail. For example, the recent 2018 AP-NORC poll had items that allow for comparisons between enhancement edits for physical looks versus for skill-focused traits, such as intelligence and athletic abilities. 35 Similarly, the aforementioned examples of how the line between enhancement and treatment will be blurred in many potential applications of HGE illustrate how these gray areas will likely matter for how we perceive and make decisions about those applications. 4 It is likely that views of such edits will also depend on who the edit is made to and whether they are old enough to consent, as well as whether the edits are heritable, which similarly raises concerns about consent. Surveys can help us understand these interconnections if designed with these factors, their distinctions, and their potential overlap in mind. This also depends on designing survey items so that wording choices highlight the areas and distinctions of interest while minimizing or controlling for potential confounds and noise.
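One way to picture a design that keeps these factors distinct is a simple factorial grid, sketched below in Python. This is a hedged illustration rather than a recommended instrument: the factor levels are a simplified subset of those named above, the function names are hypothetical, and some cells of the grid may not correspond to realistic applications and would need to be pruned by researchers.

```python
from itertools import combinations, product

# Simplified, illustrative factor levels (a subset of those discussed above).
factors = {
    "purpose": ["therapy", "enhancement"],
    "recipient": ["adults", "children", "embryos"],
    "heritability": ["heritable", "non-heritable"],
}


def build_conditions(factors):
    """Enumerate every combination of factor levels as a dict."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def differing_factors(a, b):
    """List the factors on which two conditions differ."""
    return [name for name in a if a[name] != b[name]]


def controlled_pairs(conditions):
    """Pairs of conditions that differ on exactly one factor."""
    return [
        (a, b)
        for a, b in combinations(conditions, 2)
        if len(differing_factors(a, b)) == 1
    ]


conditions = build_conditions(factors)
pairs = controlled_pairs(conditions)

if __name__ == "__main__":
    print(len(conditions), "conditions;", len(pairs), "single-factor contrasts")
```

Item pairs drawn from `controlled_pairs` differ on only one factor, which mirrors the point that differences in responses are interpretable only when wording changes are limited to the distinction of interest.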
Finally, we also need more surveys that capture the views of different segments of populations. Many of the surveys in this review already do this to a degree. The Scheufele et al. and the 2016 and 2018 Pew surveys, for example, examined the effects of religiosity, broadly defined, on views in the United States,20,31,32 and the Wang et al. study surveyed patients with HIV in China. 43 We did not focus on survey results of segments of populations here due to lack of space and the need to address large gaps in understanding of broad national-level publics. But work should continue filling in our understanding of views across different publics and stakeholders, both within and across countries and cultures. To do so, however, we will still need sampling of those populations that meets the same probability or quota sampling standards we previously described. We reemphasize this because surveying of segments of populations can be even more prone to the temptation of using convenience samples, as the Wang et al. study did for the HIV patient sample. 43 These samples leave us with data we cannot generalize from or compare to other surveys and populations to understand the views of these important publics better.
Conclusion: Achieving Completeness and Nuance in Survey Research for Global Views of HGE
Surveys do not all need to share the exact question wording to be able to make the important comparisons we have described. More research similar to Gaskell et al.'s multi-country study that did use the same wording across surveys would be especially beneficial for cross-national comparisons, 44 especially if it expands the scope to include non-Western nations. The Eurobarometer surveys also provide an example of how cross-national surveying can use items that work across language and cultural boundaries. 45 But surveys do not need to copy each other's question wording to capture the important underlying concepts effectively in a way that lets us compare across them. In fact, using different wording across surveys can be a strength, as reproducing results with a different approach provides evidence that we are capturing a real phenomenon that exists across contexts. In this sense, different wording can help us compare across surveys to see how much responses are due to the essence of what the question captures (e.g., therapeutic vs. enhancement edits) rather than other details of the particular question wording.
Ideally, we would have both: studies with matching survey instruments that reach diverse nations and populations, and studies that strategically design questions that let us compare dimensions of potential gene edits, albeit using different wording than previous studies did. To do this, we need survey items to distinguish clearly between the essential concepts at the heart of each question's purpose. This is especially true if we then also want to be able to understand the views of edits where those distinctions, such as between therapeutic and enhancement edits, blur. In short, we are arguing for greater clarity and deliberateness in survey design rather than homogeneity.
In fact, the heterogeneity of cases, edits, viewpoints, and possibilities is why public discourse is essential to ethical, safe, and effective development of HGE. It also means that different survey contexts will likely require different wording and explanations. We need research that accommodates this diversity of potential edits and perspectives while helping us understand how these distinctions and details matter, or not, in public views of HGE globally. This includes expanding the scope of existing work to have greater coverage of publics surveyed. We need greater representation of publics in terms of countries, cultures, religious and other world views, experiences with genetic illnesses, differences in access to health care, and so on.
Lack of clarity in the purpose of questions across surveys is a key problem, but it is only one piece of relatively low-hanging fruit that we will need to address as we expand research to understand the diverse perspectives that will and should impact how we advance HGE. If we can develop this base of robust, international insights, in part through survey research, we can then dive deeper into understanding the nuances of how features of particular potential applications and characteristics of different people and cultures shape views of gene-editing technology and possible paths forward. These insights will make it truly possible to incorporate and act on public knowledge, goals, and concerns in scientifically, ethically, and democratically legitimate ways.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Funding Information
The authors acknowledge the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin–Madison (with funding from the Wisconsin Alumni Research Foundation) for its support of this research.
