Abstract
Advances in molecular genetics have led to the exponential growth of the direct-to-consumer genetic testing industry, resulting in the assembly of massive privately owned genetic databases. This article explores the potential impact of this new data type on the field of marketing. Drawing on findings from behavioral genetic research, the authors propose a framework that incorporates genetic influences into existing consumer behavior theory and use it to survey potential marketing uses of genetic data. Applications include business strategies that rely on genetic variants as bases for segmentation and targeting, creative uses that develop consumers’ sense of community and personalization, use of genetically informed study designs to test causal relations, and refinement of consumer theory by uncovering biological mechanisms underlying behavior. The authors further evaluate ethical challenges related to autonomy, privacy, misinformation, and discrimination that are unique to the use of genetic data and are not sufficiently addressed by current regulations. They conclude by proposing an agenda for future research.
In September 2018, the music streaming service Spotify announced that it would allow its 217 million users to upload their genetic data and create playlists that “match their genetic ancestry” (Hassan 2018). A few months later, Mexico’s national air carrier, Aeroméxico, launched a “DNA Discounts” campaign, offering some customers discounted flights to Mexico at rates that matched the traveler’s “Mexican DNA” percentage, as determined by a genetic test (Vora 2019). These actions mark the dawn of a new age, when consumers and firms alike may access information that until recently was rarely accessible: individual-level measures of the human genome.
Such data are now available through the direct-to-consumer genetic testing (DTC-GT) market, whose total sales in 2019 exceeded all previous years combined. Most sales come from personalized DNA testing kits: plastic tubes that consumers spit into and then ship off for genomic analysis. The motives for taking a DNA test vary, ranging from the desire to uncover forgotten family histories to assessing genetic predispositions for diseases. As of 2020, more than 30 million people have already taken such personalized DNA tests (Regalado 2019). A by-product of the growing DTC-GT market is the accumulation of massive genetic data sets. Industry leaders, such as AncestryDNA and 23andMe, encourage consumers to participate in research by answering surveys about anything from dietary habits to personality, generating enormous data sets for investigating genetic associations with numerous outcomes. Because the sales growth of DTC-GT kits might be slowing down (Farr 2020), DTC-GT companies are aiming to monetize their data to maintain growth. For example, Patrick Chung, a 23andMe board member, noted in an interview that “the long game here is not to make money selling kits, although the kits are essential to get the base level data” (Murphy 2013). In line with this notion, 23andMe has already granted access to its data to the pharmaceutical company GlaxoSmithKline in a $300 million deal (Brodwin 2018).
The abundance of privately owned DNA data is concurrent with large-scale data collection efforts of public endeavors such as the UK Biobank, which genotyped nearly half a million U.K. citizens (Bycroft et al. 2018). National genome projects have also taken off in other countries, including Sweden and Singapore (Swede, Stone, and Norwood 2007). The accumulation of genetic data has already fueled the discovery of associations between genes and individual differences in many traits (MacArthur et al. 2017; Mills and Rahal 2019; Visscher et al. 2017), from dietary habits such as coffee and tea intake (Taylor, Smith, and Munafò 2018) to psychological traits such as adventurousness (Karlsson Linnér et al. 2019).
The current research explores the potential impacts of the DNA revolution on the field of marketing and discusses possible uses and abuses of genetic data by marketers. It is organized as follows. First, we introduce key terms and review recent advances in the fields of behavioral genetics and genealogy. Drawing on these findings, we introduce a theoretical framework that incorporates genetic variables into existing consumer behavior theory. We rely on this framework to conceptually explore applications of genetic data for marketing strategy and research and evaluate under what circumstances genetic tools may be of value to marketers. We then raise ethical challenges that are unique to the use of genetic data in marketing, survey how current regulations address them (or not), and suggest potential solutions. Subsequently, we identify gaps in the current state of knowledge that must be filled to further advance the field and draw a research agenda to address them.
A Primer on Human Genetics
This section introduces basic concepts in human genetics and reviews related research that is relevant for the field of marketing (see Table 1). Our review is intended for readers who are not acquainted with the topic, and it focuses on research using DNA measures (the only type of genetic data currently available at scale). We admittedly abstract away from many subtleties and refer interested readers to other publications for more comprehensive reviews (Calladine et al. 2004; Lewis 2017) and surveys of research using other genetic data modalities (for epigenetics, see Lester, Conradt, and Marsit [2016]; for RNA sequencing, see Stark, Grzelak, and Hadfield [2019]; for gene therapy, see Wirth, Parker, and Ylä-Herttuala [2013]).
Table 1. Illustrative Genetics Literature of Marketing-Relevant Outcomes.
The Human Genome and Its Measurement
The human genome is a sequence of about 3 billion base pairs. There are four types of bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The base pairs are packaged into structures called chromosomes and are indexed based on their location on the sequence. Every human has two copies of each chromosome, one inherited from each parent. The base pairs in most genome locations are identical across all humans and are thus not informative about interindividual variability. However, there is a small number of locations (<2%) called polymorphisms where individuals commonly differ. The most common type of polymorphism is the single-nucleotide polymorphism (SNP), which denotes locations where a single base pair differs across individuals. For most SNPs, only two possible base pair types are observed in a given species. The more frequent base pair is called the major allele, and the other is called the minor allele. As all humans inherit one chromosome from each parent, they also inherit two copies of each SNP, and thus have either zero, one, or two minor alleles in every SNP location. This property allows for the storage of an individual’s genetic data in terms of numbers of minor alleles at each SNP location (0, 1, or 2). Certain SNPs are located in subsequences of base pairs called genes. Genes shape the structure and function of every cell in the human body and are involved in many biological processes, most notably the construction of proteins (Ezkurdia et al. 2014). The human genome includes 20,000 to 30,000 genes.
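The minor-allele encoding described above can be illustrated with a minimal sketch. The function name, the three SNPs, and the genotypes are hypothetical and chosen only for illustration; real genotype files use richer formats.

```python
def minor_allele_counts(genotypes, minor_alleles):
    """Encode each genotype (a pair of bases) as its number of minor alleles."""
    return [
        sum(base == minor for base in pair)
        for pair, minor in zip(genotypes, minor_alleles)
    ]

# Hypothetical individual genotyped at three SNP locations.
# Each tuple holds the two inherited bases, one from each parent.
person = [("A", "A"), ("A", "G"), ("T", "T")]

# Hypothetical minor allele at each of the three locations.
minor = ["G", "G", "T"]

encoded = minor_allele_counts(person, minor)
print(encoded)  # [0, 1, 2]
```

Stored this way, a person's genome reduces to a vector of small integers, one entry per SNP, which is the form that the statistical methods discussed later typically take as input.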
Until recently, it was extraordinarily time consuming and expensive to measure genetic variation of individuals. However, technological advances following the sequencing of the human genome by the Human Genome Project (Collins, Morgan, and Patrinos 2003) have enabled cost-effective measurements of the genome across individuals. Common measurement techniques quantify variations in selected genome locations (typically under 1 million SNPs) where humans commonly differ. From there, around 20 million other SNPs are imputed.
Twin Studies and the Three Laws of Behavioral Genetics
Behavioral genetics is a discipline dedicated to studying the relationship between genetic code and behavioral traits (also called phenotypes). Early research in the field mainly consisted of twin studies, which rely on the fact that identical (monozygotic) twins are on average twice as genetically similar to each other as fraternal (dizygotic) twins. Under some strong assumptions (Evans and Martin 2000), twin studies enable us to estimate a trait’s heritability: the part of its interindividual variance that can be attributed to genetics. Surprisingly, twin studies have shown that most human behavioral traits are, to some degree, heritable. This finding is commonly known as “The First Law of Behavioral Genetics” (Turkheimer 2000) and was illustrated for manifold phenotypes, from psychological traits such as personality to real-life outcomes such as marital status (see Table 1). Two other empirical regularities characterize findings from behavioral twin studies. The Second Law of Behavioral Genetics states that the effect of being raised in the same family is typically smaller than that of genetics. The Third Law of Behavioral Genetics denotes that substantial behavioral variations are not accounted for by either genetics or family environment. Nonetheless, the Three Laws are not without exceptions. On the one hand, many biological phenotypes that are highly relevant for marketing of health care, nutrition, and beauty products, such as lactose intolerance (for additional examples, see Table 1), are highly heritable. The downstream behavioral consequences of these traits (e.g., the tendency to buy dairy alternative products) are expected to be more heritable. On the other hand, various culture-related characteristics, such as one’s native language or nationality, are entirely driven by the environment yet can be predicted from genetic ancestry (see the “Genetic Ancestry” subsection).
The Three Laws demonstrate the promises and drawbacks of using measurements of the genome in marketing. Although genomes are informative about a wide range of relevant outcomes, genetic information is usually not informative for making individual-level predictions of most behavioral traits without additional variables (Harden and Koellinger 2020). A unique feature of DNA data is that they are currently immutable across one’s lifespan. Thus, such measures may be informative of one’s future behavior long before any other variables become informative.
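To make the logic of twin-based heritability estimates concrete, the sketch below applies Falconer's formula, a classical approximation that splits trait variance into genetic, shared-family, and residual shares from the trait correlations of identical and fraternal twin pairs. The formula is not named in the text above and is used here only as an illustration; the function name and the two correlations are hypothetical.

```python
def falconer_decomposition(r_mz, r_dz):
    """Split trait variance into genetic (h2), shared-family (c2),
    and residual (e2) shares from twin-pair correlations.

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    """
    h2 = 2 * (r_mz - r_dz)  # heritability
    c2 = 2 * r_dz - r_mz    # effect of being raised in the same family
    e2 = 1 - r_mz           # variance explained by neither
    return h2, c2, e2

# Hypothetical correlations for a behavioral trait.
h2, c2, e2 = falconer_decomposition(r_mz=0.60, r_dz=0.35)
print(round(h2, 2), round(c2, 2), round(e2, 2))  # 0.5 0.1 0.4
```

With these illustrative inputs the pattern mirrors the Three Laws: the trait is heritable, the shared family share is smaller than the genetic share, and a substantial share is explained by neither.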
Genome-Wide Association Studies
Although twin studies produce heritability estimates, they remain silent about contributions of specific genetic variants to a trait’s variability. The first wave of research addressing this gap consisted of candidate-gene studies: theoretically motivated examinations of associations between phenotypes and SNPs located in specific genes that were a priori hypothesized to be related to them (Kwon and Goate 2000). For example, the known role of serotonin in depression motivated studies investigating the association between depression and SNPs located on serotonergic genes (Ogilvie et al. 1996). Although candidate-gene studies have yielded eye-catching findings for some applications, most in the behavioral domain have failed to replicate in subsequent studies. This failure is attributed to low statistical power, a lack of appropriate correction for multiple hypothesis testing, and a lack of control for confounding factors (Chabris et al. 2015). Development of genotyping techniques, together with massive data collection efforts, has led to a paradigm shift from candidate-gene studies to genome-wide association studies (GWAS; Visscher et al. 2017), which are data-driven investigations of the relationships between phenotypes and SNPs across the entire genome. Due to the large number of associations studied, GWAS methodology emphasizes stringent correction for multiple testing, preregistration, and replication in independent samples. Over the past decade, GWAS samples have grown from thousands to millions of participants, and the increase in statistical power has allowed researchers to identify numerous replicable associations between SNPs and behavioral phenotypes (see Table 1). However, a typical behavioral trait is associated with numerous variants, each of them accounting for a very small part (R² < .01%) of its variance, an observation known as the “Fourth Law of Behavioral Genetics” (Chabris et al. 2015).
While the contribution of individual SNPs to the variability of most human behavioral traits is minute, one can obtain greater explanatory power by aggregating their effects into a polygenic risk score (PRS). The PRS is a linear combination of the most significant SNPs identified in its GWAS, and it becomes increasingly accurate as sample sizes increase. For example, a PRS constructed from a recent GWAS in 1 million people was able to predict 13% of the variance in educational attainment in an independent sample (Lee et al. 2018). Polygenic risk scores are similarly informative about many behavioral traits, and firms that possess genetic data can construct them using GWAS summary statistics that either are publicly available (MacArthur et al. 2017) or can be obtained from other organizations. Yet a significant share of current publicly available PRSs of behavioral phenotypes are not accurate enough for making individual-level predictions. Furthermore, the predictive accuracy of PRSs typically decreases when applied to populations different from those used to estimate them (e.g., in ethnicity and socioeconomic status; Duncan et al. 2019; Martin et al. 2019). Nonetheless, publicly available PRSs are typically computed from samples of only up to a million individuals, whereas DTC-GT companies have access to samples that are an order of magnitude larger. Moreover, advanced statistical techniques show a high potential for obtaining more accurate genome-based predictions (see the “Advanced Targeting and Prediction” subsection).
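Because a PRS is a linear combination, its computation can be sketched in a few lines. The example below is a minimal illustration, assuming minor-allele counts (0, 1, or 2) per SNP and per-SNP weights taken from GWAS summary statistics; the function name, the weights, and the genotype are hypothetical.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of minor-allele counts, one weight per SNP."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is required per SNP")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))

# Hypothetical GWAS effect sizes for three SNPs (made-up values).
weights = [0.02, -0.01, 0.03]

# Hypothetical individual: minor-allele counts at the same three SNPs.
person = [0, 1, 2]

score = polygenic_score(person, weights)
print(round(score, 4))  # 0.05
```

A real score sums over thousands or millions of SNPs, but the operation is the same, which is why a firm holding genotypes needs only the published weights, not the original GWAS sample, to construct it.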
Genetic Ancestry
The ability to quantify genetic variations in individuals has also opened up the path for studying genetic variation between populations. A common approach is to perform principal component analysis on the genetic data of a population sample in search of high-order factors that capture its variability (Alexander, Novembre, and Lange 2009). These principal components (PCs) are highly informative about one’s genetic ancestry and location. For example, a study of individuals from 51 populations worldwide found that the first PC distinguished sub-Saharan Africans from non-Africans and the second PC differentiated populations from Eastern and Western Eurasia (Li et al. 2008). These findings were echoed by studies of less diverse samples that used the same method for high-resolution ancestry mapping (e.g., Novembre et al. 2008). The PCs are also commonly used as control variables in GWAS and other population studies to account for environmental factors that vary across ethnic groups (Price et al. 2006; Nave et al. 2018). The relevance of genetic ancestry for marketing stems from its noncausal correlation with environmental factors such as language and culture. For example, individuals of Irish ancestry are more likely to be interested in a cultural heritage trip to Ireland or celebrate Saint Patrick’s Day with a pint of Guinness, and their interests may stem from cultural influences related to their ancestry. Marketers with access to genetic data may be able to infer such behavioral tendencies and the motivations underlying them and use such insight for targeting and positioning.
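The principal component approach described above can be sketched on simulated data. This is a toy illustration only: the two groups, their allele frequencies (deliberately exaggerated at 0.2 versus 0.8), and the sample sizes are assumptions, and real ancestry analyses typically involve additional preprocessing steps.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group, n_snps = 20, 200

# Simulated minor-allele counts (0, 1, or 2) for two hypothetical populations
# whose minor-allele frequencies differ at every SNP.
group_a = rng.binomial(2, 0.2, size=(n_per_group, n_snps))
group_b = rng.binomial(2, 0.8, size=(n_per_group, n_snps))
genotypes = np.vstack([group_a, group_b]).astype(float)

# Center each SNP column, then obtain principal components via SVD.
centered = genotypes - genotypes.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Each individual's coordinate on the first principal component.
pc1 = u[:, 0] * s[0]

print("mean PC1, group A:", pc1[:n_per_group].mean())
print("mean PC1, group B:", pc1[n_per_group:].mean())
```

In this simulation the two groups fall on opposite sides of zero on the first PC, even though group labels were never given to the algorithm. The same property is what makes PCs informative about ancestry in real samples.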
Incorporating Genetics into Marketing Theory
The Four Laws of Behavioral Genetics provide solid empirical grounds from which explorations of genetic effects on consumer behavior can embark. Translating these fundamental insights into applications, however, depends on incorporating them into consumer behavior theory and models. This section proposes such a framework, illustrated in Figure 1. Our theory extends the well-known stimulus-response model, which describes behavior as arising from the interaction between the organism (consumer) and stimulus (Belk 1975). The stimulus is described via object variables, such as the products, prices, and brands offered, and situational variables, such as location, time of day, and context. The organism has traditionally been marked by personal variables denoting characteristics that are “stable over times and places of observation and may therefore be attributed consistently to the individual” (Belk 1975, p. 36). Typical personal variables include demographics, psychographics, and behavioral dispositions.

Figure 1. A model of genetic effects on consumer behavior.
Our framework extends the stimulus-response model by incorporating the elementary factors described in the previous section. We group these factors into three categories: (1) environment, which includes stable cultural, social, and geographical factors, as well as the flow of time influencing development and aging; (2) family factors, such as parenting style; and (3) the individual’s genome, which depends on familial background, except for cases of adoption and recomposed families. Our framework considers familial and environmental factors as external to the organism, whereas the genome is within the organism and constitutes the most stable type of personal variable: it is fixed at conception and remains mostly stable throughout the lifespan. Our framework also extends the description of the organism by incorporating stable biological traits such as physiology (e.g., height), anatomy (e.g., brain structure), and typical brain function (e.g., connectivity between brain areas at rest). These biological traits are more directly influenced by genetics and typically mediate the influence of genetics on nonbiological personal traits. When such mediation occurs, the mediating biological trait is commonly referred to as an endophenotype.
As indicated by the Three Laws of Behavioral Genetics, the environment plays a major role in the development of most personal traits. Nonetheless, genetic influences affect many outcomes of interest, starting from prenatal development and early-life stages. These effects occur via interactions with familial and environmental factors and are mediated via endophenotypes that are more directly susceptible to genetic influences, such as brain anatomy (Thompson et al. 2001). The relative impact of genetics varies by trait. In some cases, few genetic variants have strong direct effects on a biological endophenotype (e.g., lactose tolerance; for other examples, see Table 1), and genetic data will be highly informative of their downstream consequences (e.g., interest in dairy alternatives). Most personal traits, however, are only moderately heritable and are influenced by interactions between numerous genetic and environmental factors. Importantly, the genome is also informative about characteristics that are not influenced by genetics at all, due to the noncausal correlations between genetics and environmental or familial factors (dashed lines in Figure 1). If genetic data are available, such links allow for the inference of consumer characteristics such as cultural heritage and language.
The impact of genetics continues through the lifespan via two main channels. First, genetics affects later-life outcomes through its prior influence on traits that had developed earlier. For example, variants that contribute to early-life intellectual development continue to affect one’s educational attainment and career in adulthood. Second, genetics continues to interact with environmental factors (e.g., time, nutrition) to influence later-life development of personal traits through biological endophenotypes such as brain anatomy and function (Smith et al. 2020). Although the heritability of later-life traits is typically moderate, characteristics that have a strong biological basis are well-explained by interactions between genetics and time. For example, a few SNPs explain 38% of the variance in hair loss (alopecia) in men (Pirastu et al. 2017), whose associated market size is expected to reach $3.9 billion by 2026 (Newswire 2020). These SNPs likely capture behavioral variance in this trait’s downstream consequences.
The final influence of genetics on consumer behaviors, such as information search, purchase decisions, satisfaction, and word-of-mouth activity, occurs through interactions with situation and object. These effects are mediated via biological processes (e.g., changes in neural activity and hormonal levels) that regulate the consumer’s emotional and physiological state, as well as cognitive processes such as attention, valuation, and memory (Plassmann, Ramsøy, and Milosavljevic 2012). For instance, genetics affects one’s tendency to be an early riser or a night owl (Hu et al. 2016), and this disposition affects arousal via interaction with the time of day (situational variable) to influence behavior. Likewise, situational stressors interact with genetics to generate a person-specific stress response, regulated by activation of the hypothalamic–pituitary–adrenal axis (Federenko et al. 2004). This response, in turn, influences decision making (e.g., Margittai et al. 2016, 2018). Genetics also interacts with object variables, as products and marketing messages may affect genetically regulated attention, reward, and valuation processes. For example, the presence of a desirable food item (e.g., in a supermarket tasting counter) elicits an appetitive (or Pavlovian) response that may increase its subjective valuation (Bushong et al. 2010). Animal studies suggest that individual differences in this tendency, which is biologically implemented by the dopaminergic system, are partly accounted for by genetic variation (Flagel et al. 2007). Additional interactions between genetics and object occur via indirect genetic influences on heritable traits such as personality (Matz et al. 2017) and behavioral dispositions such as the tendency to choose the default or compromise option in a choice set (Cesarini et al. 2012; Simonson and Sela 2011). Genetic data may allow for approximating these tendencies without having to rely on large-scale customer surveys.
Applications for Marketing Strategy
Building on the framework introduced in the previous section, the following two sections discuss how the availability of genetic data may advance marketing practice and research (see Figure 2 and Table 2). We highlight that some of these applications, especially when employed by private entities in a for-profit setting, raise legal and ethical challenges (discussed in a subsequent section). It remains to be seen whether their potential benefits outweigh these concerns.

Figure 2. Genetic applications for marketing strategy.
Table 2. Using Genetics to Advance Marketing Research.
Gene-Based Segmentation
When genetic variations correspond with consumer needs, firms may rely on genetic data to divide the market into distinct, stable, and identifiable subsets to be reached with unique marketing mixes (Frank, Massey, and Wind 1972). In some cases, genetic variants are indeed directly associated with consumer needs via known mechanisms. A firm or institution could thus rely on genetic data to identify segments that benefit from its products and services. Prior research has uncovered mechanisms that link genetic variants to phenotypes that closely map onto consumer needs in various domains (see Table 1). Most current knowledge concerns outcomes related to health care, nutrition, and beauty, with applications such as promoting screening or prevention products to individuals who are at increased risk of developing pathologies such as cancer, diabetes, or Alzheimer’s disease. Indeed, leading DTC-GT companies already provide information on such risks to their consumers and aspire to use their data to become the “Google of personalized health care” (Murphy 2013). As genetic databases grow in size, research on causal effects outside the medical domain is expected to increase and yield new discoveries that are relevant for marketing strategy across domains. For example, a brand manager of a product for preventing men’s hair loss could rely on a specific genetic variation linked to male pattern baldness (Pirastu et al. 2017) to identify segments that are genetically disposed to alopecia. The brand manager may even be able to identify future customers long before they show any behavioral indication that they may need the product (e.g., via web searches) and increase their awareness of the brand (e.g., by advertising to males in their late 20s who are genetically disposed to baldness in their mid-30s).
Using Whole Genomes to Infer Other Segmentation Bases
As Figure 1 illustrates, genetic variation correlates with almost every personal characteristic. As a result, genetic data can be used for reaching market segments when nongenetic managerially relevant variables cannot be easily observed at scale. In contrast to the direct use of specific genes as segmentation bases, most SNP associations to behavioral traits occur outside of genes, and marketers can leverage their cumulative information to infer other (nongenetic) segmentation bases. Once genetic data are available, a firm can construct for every individual in a target population PRSs that are predictive (to some degree) of every trait for which a GWAS has ever been performed (Buniello et al. 2019). Similarly, a firm can rely on previous findings of genealogical research for calculating individual-level ancestry estimates to infer various culturally distinct motivations, interests, and behaviors. The usefulness of genomes as proxies for other segmentation variables crucially depends on how predictive they are of the target trait relative to other measures. Although genetic data might not be the most predictive of a target trait, it may be more convenient than other data sources such as surveys, which might be costly and subject to low response rates. Furthermore, adding genes to predictive models that use other variable types may improve their predictive accuracy at the individual level (see the “Are Genes More Predictive Than Other Measures?” subsection).
Advanced Targeting and Prediction
Marketers often aim to predict the probability of a single behavior (purchase, click on an ad, etc.), without necessarily understanding the underlying mechanism. As such, even a simple PRS constitutes a straightforward tool for targeting. Firms that obtain genetic data, but not samples that are large enough to estimate the coefficients used to construct a PRS, could potentially recover them from the public domain (Buniello et al. 2019) and other organizations. More advanced statistical learning methods (Libbrecht and Noble 2015), including deep learning algorithms (Zou et al. 2019; Eraslan et al. 2019), have been adapted to genetic data to generate more accurate predictions. Furthermore, when genetic predictive estimates are available, they can be used in conjunction with other variables for early identification of consumers with high lifetime value. For instance, a coffeehouse chain may want to target consumers with a high genetic potential to enjoy espresso before they show any prior espresso-purchasing patterns in their behavioral data. While counterintuitive, such an approach would potentially allow for reaching consumers who have not yet developed an espresso consumption habit and thus are not “locked in” to a particular brand. This is in contrast to targeting based on more traditional variables (e.g., behavioral measures) that are likely to become predictive only after the person has already tried and developed the habit of consuming a competitor’s brand.
A different approach using genetic data for behavioral prediction is to consider that genomes are representative of family relations and, as such, can be used to compute a comprehensive map of relatedness between individuals. Such a map can then be used for targeting in a similar manner to social network graphs (Van den Bulte and Wuyts 2007; Wind 1994). Another possibility is to compute genetic relatedness (or inversely, genetic distance) between individuals (Queller and Goodnight 1989), either for the whole genome or chromosome-wise, and leverage this metric for behavioral prediction. For instance, a company could target people who are within a small genetic distance from existing clusters of loyal customers. Methods such as collaborative filtering, nearest neighbors, or more advanced machine learning algorithms could be applied to implement such strategies (Lin and Lane 2017). Notably, geneticists are already using similar techniques that do not depend on identifying links between specific genes and a phenotype to estimate the variance in a trait that can be explained by SNP-derived genetic distance (Yang et al. 2010). For example, such methods have shown that 51% of the general population variance in fluid intelligence could be explained by genetic distance, quantified from SNPs, using a sample of a few thousand people (Davies et al. 2011).
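The idea of targeting people who are genetically close to existing loyal customers can be sketched as a nearest-neighbor rule. The distance used below (mean absolute difference in minor-allele counts) is a deliberately simple stand-in chosen for illustration, not the relatedness estimator of Queller and Goodnight (1989); all names, genotypes, and the threshold are hypothetical.

```python
def genetic_distance(a, b):
    """Mean absolute difference in minor-allele counts across SNPs."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical minor-allele counts at four SNPs.
loyal_customers = {"c1": [0, 1, 2, 1], "c2": [0, 1, 2, 2]}
prospects = {"p1": [0, 1, 2, 1], "p2": [2, 0, 0, 0], "p3": [0, 2, 2, 1]}

def nearest_loyal_distance(genotype):
    """Distance from a prospect to the closest loyal customer."""
    return min(genetic_distance(genotype, g) for g in loyal_customers.values())

THRESHOLD = 0.25  # hypothetical cutoff for "a small genetic distance"
targets = [
    name for name, genotype in prospects.items()
    if nearest_loyal_distance(genotype) <= THRESHOLD
]
print(targets)  # ['p1', 'p3']
```

Note that the rule never asks which SNPs matter or why; it relies only on overall similarity, which is the sense in which such methods do not depend on identified gene-phenotype links.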
Creative Uses: Product Development and Positioning
Finally, DNA has a unique status as a “cultural icon” (Nelkin and Lindee 1995), which opens the door for creative uses, including new product development and repositioning of existing products and brands. Genetic data provide a new means of “knowing thyself,” connecting to previously unknown genetic relatives, and building bridges between people and their ancient family histories (Tutton 2004). Leading DTC-GT companies have created several new products and positioning strategies that translate their customers’ fascination with DNA into applications that promote their sense of community and personalization. Notable examples are the aforementioned partnership between Spotify and AncestryDNA and the collaboration between Airbnb and 23andMe, which developed a service that helps travelers organize cultural experiences tailored to their ancestry. Ancestry-based positioning strategies of products and services in other domains, including entertainment (e.g., period dramas such as Downton Abbey and Braveheart), food (e.g., traditional cookbooks), and tourism (e.g., museums, heritage sites), could similarly benefit from such partnerships and creative uses. Similar strategies could employ PRSs or single genetic variants. For example, most elite power athletes have a specific variant of the ACTN3 gene that encodes a protein expressed in muscle fibers (Al-Khelaifi et al. 2019; Lee et al. 2016). A sporting brand may be able to develop positioning strategies that generate a sense of community among amateur athletes who carry this variant and promote their sense of identification with brand ambassadors who also carry it.
Using Genetics to Advance Marketing Research
Genetic data can refine and substantiate existing theories of consumer behavior by illuminating the nature of relationships between traits and revealing the biological mechanisms underlying individual differences in behavior. Some of these applications are similar in nature to uses of genetics in other fields of the social sciences (Benjamin et al. 2012; Harden and Koellinger 2020), whereas others are unique to marketing research.
Estimating Causal Relations
In many domains of consumer research, it is not feasible to study causal relations between variables experimentally. For example, experimentally studying the causal relationship between one’s consumption habits and long-term happiness (Gilovich, Kumar, and Jampol 2015; Schmitt, Brakus, and Zarantonello 2015) would require randomly assigning individuals into groups that differ in their consumption habits or in their well-being. Such assignment could be extremely difficult for some variables and even unethical (e.g., if a group is required to worsen dietary habits, creating a threat to their health). Furthermore, studying such causal relations using observational data is also not straightforward. First, many personal and environmental factors (e.g., socioeconomic status, personality) confound the relationship between the explanatory variable and outcome. Second, there exists a possibility of reverse causality.
Sometimes, it is possible to overcome the aforementioned limitations using instrumental variables (IVs; Angrist, Imbens, and Rubin 1996). Instrumental variables are factors that cause changes in the explanatory variable of interest (e.g., consumption habits) and have no other independent effects on the outcome (e.g., happiness), enabling one to estimate the causal effect without bias due to confounds and reverse causality. Under some circumstances, genetic measures can be used as IVs. This is possible because the transmission of genetic variants from parents to offspring is determined via a “genetic lottery” that is independent from environmental factors (conditioned on the parents’ genomes). Furthermore, because genetic variations are not influenced by one’s environment or habits, reverse causality is not a concern.
The most common method that uses genetic measures as IVs is Mendelian randomization (MR; Smith and Ebrahim 2004), which can be thought of as a natural experiment that occurs at the time of conception. Mendelian randomization uses genetic variants that have well-established causal influences on the explanatory trait as IVs to quantify the trait’s causal effect on an outcome. For example, medical researchers have been using variants that regulate alcohol metabolism as IVs for studying the long-term causal effects of alcohol consumption on outcomes such as cardiovascular disease and cognitive decline (e.g., Chen et al. 2008). When using genetic data to infer causal relations, it is important to keep a careful eye on the assumptions of the methods used to estimate the effects. One crucial issue is that the transmission of genes occurs at random only within a family. Therefore, MR studies should ideally rely on within-family designs that compare genetic variation between related individuals (e.g., sibling pairs, parent–offspring trios). Mendelian randomization studies that do not use such designs are susceptible to biases of various sources (Davies et al. 2019). A second important assumption of MR is that the genes used as IVs affect the outcome only via their effect on the explanatory trait (a criterion called “exclusion restriction”). It is therefore important that the mechanisms linking the genetic IVs and the explanatory variable are well-understood, and that the genes’ prevalence in the population studied does not correlate with unobservable environmental factors that might influence the outcome (Koellinger and De Vlaming 2019).
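As a purely illustrative sketch of the logic of MR (not an analysis from any of the cited studies), the following Python simulation uses a single variant as an instrument and computes the ratio of the variant-outcome slope to the variant-exposure slope, a common IV estimator for a single instrument. The allele frequency, effect sizes, sample size, and function names are all assumptions chosen for illustration, and the simulation deliberately ignores the within-family and exclusion-restriction concerns raised above.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on a single variable x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, true_effect=0.5, seed=1):
    """Simulate a variant g, an exposure x, and an outcome y.

    An unobserved confounder u raises both x and y, so the naive
    regression of y on x overstates the true effect.
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = sum(rng.random() < 0.3 for _ in range(2))  # allele count: 0, 1, or 2
        u = rng.gauss(0, 1)                             # unobserved confounder
        xi = 0.8 * gi + u + rng.gauss(0, 1)             # exposure (e.g., a consumption habit)
        yi = true_effect * xi + u + rng.gauss(0, 1)     # outcome (e.g., well-being)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def wald_ratio(g, x, y):
    """Single-instrument IV estimate: (slope of y on g) / (slope of x on g)."""
    return slope(g, y) / slope(g, x)


if __name__ == "__main__":
    g, x, y = simulate()
    print("naive estimate:", round(slope(x, y), 3))
    print("IV estimate:   ", round(wald_ratio(g, x, y), 3))
```

In this simulated setting, the naive slope is inflated by the confounder, whereas the ratio estimate should lie close to the simulated effect of 0.5, because by construction the variant affects the outcome only through the exposure.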
One limitation of MR is that most genetic variants are relatively weak instruments, because their associations with personal traits of interest are small. Moreover, genetic variants typically correlate with multiple traits that could influence an outcome, a phenomenon called “pleiotropy.” Several statistical techniques that rely on summary statistics from large-scale GWAS (instead of single variants) have been recently proposed to overcome these issues (DiPrete, Burik, and Koellinger 2018; O’Connor and Price 2018; Zhu et al. 2018). Each of these methods relies on a different set of assumptions concerning the relationships between genetics and other variables that are included in (or omitted from) the model, for estimating a causal effect. To mitigate concerns that claims of causality are driven by any specific assumptions, it is crucial to verify that a study’s conclusion is consistent across methods. We anticipate that continuing development of such methods, together with the growing availability of data sets that include genetic measures of related individuals, will provide a fertile ground for investigations of causal relations for a broader range of settings in the near future.
Accounting for Otherwise Unobserved Heterogeneity Using PRSs
Genetic variation between individuals is fixed across the lifespan and can be related to many outcomes of interest to consumer researchers. As such, including genetic variables (most notably PRSs and genetic PCs) in statistical models that quantify any nongenetic effects provides a means to control for unobservable factors that would otherwise be a part of the model’s error. Such reduction of the model’s error would increase the study’s statistical power and allow estimating model parameters of interest with less uncertainty (Benjamin et al. 2012). For illustration, consider a field experiment aiming to test the efficiency of different campaigns for preventing smoking initiation among teenagers. In such settings, PRSs can explain one’s genetic tendency to smoke, as well as variance related to many preexisting personal characteristics that are not contaminated by the treatment and could be related to future smoking (e.g., extraversion). Including such PRSs in the model would therefore allow for quantifying the treatment effect more accurately.
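The precision argument can be sketched with a small simulation. The example below is a hedged illustration rather than an analysis of real data: it assumes a randomized binary treatment, a standardized PRS that explains part of the outcome, and hypothetical effect sizes (the PRS weight of 0.5 is likely larger than what current PRSs achieve for most behavioral traits). It compares the standard error of the treatment effect with and without the PRS as a covariate, using the Frisch-Waugh-Lovell result to obtain the adjusted estimate from one-regressor fits.

```python
import random


def fit(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y):
    """Residuals of y after a least-squares fit on x."""
    a, b = fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def effect_and_se(t, y, n_params):
    """Slope of y on t and its conventional standard error.

    n_params is the number of coefficients in the full model,
    used for the degrees-of-freedom correction.
    """
    n = len(t)
    res = residuals(t, y)
    mt = sum(t) / n
    stt = sum((ti - mt) ** 2 for ti in t)
    sigma2 = sum(r * r for r in res) / (n - n_params)
    return fit(t, y)[1], (sigma2 / stt) ** 0.5


def compare(n=2000, effect=-0.3, prs_weight=0.5, seed=7):
    """Return ((estimate, se) unadjusted, (estimate, se) PRS-adjusted)."""
    rng = random.Random(seed)
    t = [rng.randint(0, 1) for _ in range(n)]      # randomized campaign exposure
    prs = [rng.gauss(0, 1) for _ in range(n)]      # standardized PRS (hypothetical)
    y = [effect * ti + prs_weight * pi + rng.gauss(0, 1) for ti, pi in zip(t, prs)]
    unadjusted = effect_and_se(t, y, n_params=2)
    # Frisch-Waugh-Lovell: partial the PRS out of both treatment and outcome.
    adjusted = effect_and_se(residuals(prs, t), residuals(prs, y), n_params=3)
    return unadjusted, adjusted


if __name__ == "__main__":
    (b0, se0), (b1, se1) = compare()
    print("unadjusted:", round(b0, 3), "se", round(se0, 4))
    print("adjusted:  ", round(b1, 3), "se", round(se1, 4))
```

Under these assumptions, both estimates target the same treatment effect, but the adjusted standard error is smaller because the PRS absorbs outcome variance that would otherwise remain in the error term.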
Studying Person–Object and Person–Situation Interactions
Genetic measures are also useful in studying how consumers differentially respond to marketing stimuli or situational contexts. As noted previously, genetic variants per se are not of great interest to marketers, but they allow for calculation of PRSs (based on any previously published GWAS) to approximate personal characteristics that cannot be easily measured in large samples (e.g., intelligence, personality) or when the participants’ tendencies are not yet expressed behaviorally. Going back to the smoking-prevention field experiment example, PRSs constructed for many unobservable traits in the sample could be used to carry out a post hoc analysis investigating whether certain individuals respond more strongly to one treatment than to another.
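To make the PRS calculation concrete, the following minimal Python sketch computes a score as the sum of a person’s effect-allele counts weighted by published GWAS effect sizes and then standardizes the scores within the sample. The variant identifiers, weights, and genotypes are hypothetical placeholders; scores used in practice typically aggregate far more variants and involve additional processing steps that are omitted here.

```python
def polygenic_score(allele_counts, gwas_weights):
    """Sum of effect-allele counts weighted by published GWAS effect sizes.

    Variants without a published weight are skipped.
    """
    return sum(
        gwas_weights[variant] * count
        for variant, count in allele_counts.items()
        if variant in gwas_weights
    )


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 within the sample."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / (n - 1)) ** 0.5
    return [(s - mean) / sd for s in scores]


# Hypothetical weights and genotypes, for illustration only.
weights = {"variant_a": 0.10, "variant_b": -0.20, "variant_c": 0.05}
sample = [
    {"variant_a": 2, "variant_b": 0, "variant_c": 1},
    {"variant_a": 0, "variant_b": 1, "variant_c": 2},
    {"variant_a": 1, "variant_b": 2, "variant_c": 0},
]
raw_scores = [polygenic_score(person, weights) for person in sample]
z_scores = standardize(raw_scores)
```

A standardized score of this kind could then enter a model as a moderator of the treatment effect, for example, through a treatment-by-PRS interaction term.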
Studying Relationships Between Traits Using Genetic Correlations
Because genetic variation correlates with many personal characteristics, it provides a means for studying the relationships between traits and whether they arise from genetic or environmental causes. A useful method for quantifying the genetic overlap between traits is estimating their genetic correlation (rg), which measures the amount of variance they share due to genetic causes (Lynch and Walsh 1998). A useful feature of genetic correlations is that they can be estimated between any two traits for which a GWAS has been conducted, even for traits that have not been measured in the same sample (Bulik-Sullivan et al. 2015). A recent example of an insight obtained from genetic correlations comes from a GWAS of general risk tolerance in a sample of over 1 million people (Karlsson Linnér et al. 2019). This study found that the genetic correlations between general risk tolerance and many domain-specific risky behaviors (such as substance use, speeding on motorways, and self-employment) were substantially larger than the correlations observed between the behavioral phenotypes. This finding indicates that common genetic causes influence all these phenotypes, whereas the translation of this genetic tendency into each of the domain-specific risky behaviors depends on environmental factors.
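In standard quantitative-genetic notation (the symbols below are shorthand introduced for this illustration and do not appear elsewhere in the article), the genetic correlation between traits X and Y is the genetic covariance scaled by the genetic standard deviations. Under the usual additive decomposition of each phenotype into genetic and environmental components, the phenotypic correlation r_P combines the genetic correlation r_g and the environmental correlation r_e, weighted by the traits’ heritabilities h²:

```latex
r_g = \frac{\operatorname{cov}_g(X, Y)}{\sqrt{\sigma^2_{g,X}\,\sigma^2_{g,Y}}},
\qquad
r_P = \sqrt{h^2_X h^2_Y}\; r_g + \sqrt{(1 - h^2_X)(1 - h^2_Y)}\; r_e
```

Because the first term is scaled down by the heritabilities, a phenotypic correlation can be considerably smaller than the corresponding genetic correlation when heritabilities are modest and the environmental correlation is weak.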
Identifying Biological Mechanisms
Genetic data can enrich marketing theory by illuminating biological mechanisms that underlie behavior, akin to research in the field of consumer neuroscience (Plassmann, Venkatraman, and Huettel 2015). Apart from straightforward genetic effects on traits like lactose intolerance, genetic analyses can provide insight into how different brain systems mediate the influence of genetics on complex behavioral traits, such as economic preferences and consumption patterns. Although brain imaging studies have long ago uncovered multiple systems that are functionally involved in emotional and cognitive processes, linking functional brain measures to differences across individuals is not straightforward, because of their low test-retest reliability (Elliott et al. 2020) and the high cost of obtaining such measures at scale. Genetic variation, in contrast, can be measured reliably and inexpensively in large samples, and once genetic variants are linked to a behavioral trait, they can be tied to neurobiological systems via bioinformatic tools (e.g., Finucane et al. 2015). For example, the recent GWAS of general risk tolerance pointed to multiple brain systems that are genetically associated with the trait, including the prefrontal cortex, the amygdala and mid-brain regions involved in reward processing (Karlsson Linnér et al. 2019). An alternative promising approach is to derive biologically informed PRSs, which reflect aggregate effects of variants related to known biological systems (e.g., the dopaminergic genes) on a target phenotype, and investigate their relationship with biological endophenotypes (Dass et al. 2019). The rapid development of bio-annotation techniques, together with the formation of data sets that include both genetic and brain-imaging measures (Aydogan et al. 2021), will facilitate additional discoveries of gene-brain-behavior pathways in the near future.
Ethical and Legal Challenges
Similar to other data types, some marketing uses of genetic data can improve individuals’ well-being and have a positive impact on society as a whole. For example, focused early interventions based on genetic data may help health care providers reach patients at high risk for conditions such as diabetes and hypertension and provide them strategies that mitigate these risks (e.g., via physical exercise and diet; Whelton et al. 2002). However, genetic data might facilitate manipulation and exploitation of vulnerable individuals (Susser, Roessler, and Nissenbaum 2019). For example, e-cigarette companies could use genetic data to target teenagers who are more genetically prone to develop nicotine addiction (Richtel and Kaplan 2018). Yet the use of human genetic data by marketers raises even further ethical and legal challenges. These issues are the result of several unique features of genetic data, which contain immutable and identifiable information that is predictive of future behavior and disease, both for the individual and their genetic relatives. For this reason, genetic data have been considered particularly sensitive even within the medical field, a view known as “genetic exceptionalism” (Green and Botkin 2003). In this section, we highlight serious ethical issues that emerge from these unique properties, review the current state of legislation in this area, and propose possible solutions.
Identifiability and Informed Consent
Except for monozygotic twins, genetic data can be uniquely attributed to one person: A mere 60 to 300 randomly selected SNPs are sufficient to identify an individual (Zaaijer et al. 2017). Anonymizing genetic data without destroying a large share of the information is not a simple task. Some methods, for instance, try to balance anonymity and information preservation by clustering the data before analysis (Lin and Wei 2009). Even when the data are labeled as anonymized, however, the inherent information they contain could allow for potential reidentification attacks (Wjst 2010). Due to the combination of this unique identifiability property and the rich information content of genetic data, using them for research requires obtaining informed consent from study participants (Beskow et al. 2001). Nonetheless, even in the ethically stricter research setting, acceptable anonymization and consent practices have been subject to heterogeneous standards (Elger and Caplan 2006).
Because most current human genetic research involves analysis of secondary data that have been typically collected long before hypotheses are formed, obtaining consent is challenging. A common solution has been to ask participants to consent for all future research that falls within a broadly defined scope. For example, 23andMe informs customers who volunteer to participate in research that “the topics to be studied span a wide range of traits and conditions” and that “some of these studies may be sponsored by or conducted on behalf of third parties.” 5 Similar consent procedures are used in practice by other DTC-GT firms and biobanks. Advocates of the broad consent approach argue that it provides an ideal trade-off between participants’ autonomy and the public interest to benefit from research outputs (Hansson et al. 2006). However, it is unclear whether genetic research subjects can fully appreciate the potential benefits and risks of any future research at the time of consent. For instance, it is unlikely that 23andMe customers could foresee that access to their data would be sold to a pharmaceutical company under the broad label of “research.” To overcome these issues, scholars have proposed using dynamic or hybrid consenting protocols, where individuals can opt in to studies or withdraw their consent online (Kaye et al. 2015; Ploug and Holm 2015).
To complicate matters further, one’s genetic data are informative not only about oneself but also about one’s nongenotyped relatives. This issue was recently illustrated in the apprehension of Joseph James DeAngelo, the alleged Golden State Killer, who was arrested after a fraction of his genome could be matched to the DNA of distant relatives, who uploaded their genetic data to a searchable public genealogical database (Ram, Guerrini, and McGuire 2018). Although relatives of genetic research participants are potentially identifiable, current guidelines do not require obtaining their consent yet recommend that participants consult relatives when deciding to take part in research (McGuire, Caulfield, and Cho 2008). These guidelines may change in the future, as genetic identification technology advances.
In summary, several unique issues make it difficult to anonymize data and obtain fully informed consent from participants of genetic research. Because this is an active area of study, we recommend that researchers closely monitor the emerging literature on the topic and ensure that their studies comply with the latest ethical guidelines. It is imperative that analyses of publicly available genetic data, collected thanks to public funding, produce discoveries that benefit society as a whole. As for research using privately owned genetic data, it is crucial that informed consent is obtained and that all studies fall beyond a shadow of doubt under the scope of research to which participants had consented.
Privacy and Security
Many of the features that turn genetic data into a marketing opportunity also raise fundamental privacy and security challenges. Genetic data are identifiable, predictive of virtually every aspect of one’s life, and are even informative about one’s relatives—and thus, could enable firms to target individuals who never opted to share any information. Given that major companies have been known to keep “shadow profiles” of individuals who did not register for their services (Garcia 2017), this potential privacy threat is imminent. The assembly of privately owned genetic databases also gives rise to security concerns, as major data breaches become increasingly common (Cheng, Liu, and Yao 2017). In these cases, third parties obtain data against the will of both the consumer and the data holder. Once leaked, data will likely be used regardless of any regulation or ethical norm.
As massive volumes of genetic data reside on the servers of private firms, the question arises as to whether legislation and practice sufficiently protect the privacy of consumers from having their data exploited against their interests. While leading DTC-GT companies argue that their research complies with ethical guidelines, and they have a clear interest to avoid public controversies, it is unclear whether they follow the same principles when using data for marketing. As of July 2020, market leader 23andMe indicates in its (unilaterally modifiable) privacy statement that it would not process genetic data for marketing purposes without explicit consent, implying that it may do so if consent is given. 6 Furthermore, ethical recommendations are likely not a priority for all entities that own genetic data. In the absence of legal regulation and transparency, it becomes difficult to know exactly how private companies use the data.
Surprisingly, current federal laws in the United States concerning the use of genetic data have few implications for the DTC-GT industry, and U.S. lawmakers have mostly remained silent (with some notable exceptions) regarding potential regulations on the use of genetic data. As a consequence, the license to use and share genetic data for marketing purposes depends on the privacy policy of each individual company. Currently, a large number of U.S.-based DTC-GT companies do not provide their customers with any privacy information (on their website or the testing-kit packages) prior to the purchase of DNA kits, and the policies of many of the remaining companies indicate that they may use genetic data for purposes other than delivering ancestry and health reports. Furthermore, companies often reserve the right to share genetic data with third parties in cases of merger, acquisition, or bankruptcy, or to modify their privacy policies without notification (Hazel and Slobogin 2018).
In contrast to U.S. federal law, the recent European General Data Protection Regulation, commonly known as GDPR, explicitly recognizes genetic data as “sensitive” under Article 9 (Shabani and Borry 2018) and provides unique protection against sharing of genetic data (even semianonymized). Under current European regulations, one has to provide “explicit consent to the processing of personal data for one or more specified purposes.” 7 Nonetheless, consumers have been known to easily approve mining of their data without reading the terms of service (Obar and Oeldorf-Hirsch 2020). Once such consent is provided, virtually every marketing application becomes possible, despite the strict sharing restrictions in place. Furthermore, DTC-GT companies can process genetic data and use them for running marketing campaigns on behalf of other companies, without having to directly share them. For example, DTC-GT companies can offer to forward a message to a subsample of their clients satisfying some criteria on behalf of other entities, without disclosing any data, just as Facebook allows advertisers to target its own users without sharing their data (Matz et al. 2017). Thus, regulatory limits to genetic data sharing may end up simply granting DTC-GT companies a monopoly over the data. Finally, it is important to recognize that the power of regulation might be limited. Industry practices typically advance faster than the policies trying to regulate them, with regulations doing too little too late, after malpractice has already been exposed (Isaak and Hanna 2018). Moreover, technology giants have a long history of violating data protection laws and do not appear to be deterred by financial disincentives, as indicated by numerous condemnations and legal battles between regulatory agencies and these entities.
A possible solution to the privacy and security issues—which, in our view, is crucial for the continuing growth of the DTC-GT market—is adoption of industry standards that guarantee acceptable practices. One such framework, previously proposed to address similar challenges in artificial intelligence (AI) research, can be directly applied to genetic data (Thaine and Penn 2020). This framework, namely the “Four Pillars of Perfectly Privacy-Preserving AI,” articulates four principles for maintaining privacy, security, and usability of data: (1) training data privacy: a malicious actor will not be able to recover genetic data from other accessible information (e.g., model output); (2) input privacy: a user’s genetic data should not be observed by other parties, including the model creator; (3) output privacy: the output of a model should not be visible by anyone except for the user whose data are being analyzed; and (4) model privacy: the model (trained or not) should be protected from being stolen by a malicious party.
A specific strength of this framework is that privacy is considered from both sides. From the consumer side, data and the inferences (e.g., genetic reports, ads selected for the consumer) are not visible to the company. From the company’s side, its algorithms and parameters (e.g., GWAS weights) are not visible to the consumer. Importantly, no data have to be kept on the consumer side if a sufficiently strong encryption algorithm is applied to them before delivery to third-party servers. Thus, the consumer would only need to preserve the encryption key.
While algorithms satisfying some or all of the aforementioned criteria are still under active research, several methods to perform privacy-preserving GWAS already exist (Johnson and Shmatikov 2013; Uhler, Slavkovic, and Fienberg 2013; Yu and Ji 2014). With these methods, the GWAS’s summary statistics (e.g., weights, p-values) are known to the analyst, yet the PRSs can only be computed on the consumer side. Concurrently, general methods to allow for privacy-preserving versions of machine learning algorithms are being developed and can be expected to be adapted to genetic data mining, following their non-privacy-preserving counterparts (Li et al. 2017; Shokri and Shmatikov 2015). An additional advantage of such methods is that users can withdraw their data from the pool unilaterally by deleting either the data or the encryption key. However, even though such methods are constantly being developed, it is far from clear whether companies will end up adopting them. Major industry actors might not feel compelled to change their practice without strong incentives. A possible solution would be to enforce the use of these technologies through regulations. We can, for instance, picture a legal framework wherein a DTC-GT kit cannot be sold in a country without adhering to a framework of this type.
Misinformation
In November 2013, the U.S. Food and Drug Administration ordered 23andMe to suspend its genetic health reporting service until the company provided sufficient evidence to support clinical claims made in its reports. The company, which at that time had already sold half a million kits, relaunched the service only two years later, with less elaborate reporting that emphasized the probabilistic nature of genetic diagnosis (Pollack 2015). Although regulation of health-related genetic applications has tightened up, this cautionary tale illustrates how companies might use the scientific image of genetics in their consumers’ minds to oversell the utility of genetic information. When doing so, marketers could rely on genetic data to make pseudo-scientific claims that promote the appeal of products and services, as commonly done in the wellness industry (Baker and Rojek 2020).
In the United States, because nonmedical genetic applications do not pose direct health risks to consumers, the Federal Trade Commission, rather than the Food and Drug Administration, is responsible for regulating potentially deceptive marketing messages that make claims on the utility of genetic data (Kasperbauer and Wright 2020). However, such oversight might be difficult to exercise, for three main reasons. First, although genetic data are only moderately informative of most human behavioral traits, they do indeed contain some information. As a result, it is difficult to argue that genetic-based recommendations are entirely deceitful. Second, consumers’ perceptions of genetics (Zheng and Alba 2021) might make them prone to believe that genetic-based recommendations are always backed by scientific evidence, even when such claims are not made explicitly. Third, people have a poor intuitive sense of probabilities and thus might be prone to overestimate the informativeness of genetic-based recommendations even when their probabilistic nature is communicated (Tversky and Kahneman 1983). In our view, regulation should ensure that companies disclose the science underlying any scientific claims (and its limitations), attempt to communicate probabilistic information intuitively, and avoid the use of deterministic language when appropriate (Williams-Jones and Ozdemir 2008).
Discrimination
Similar to discrimination based on other unchangeable characteristics, negative treatment of individuals based on their actual (or assumed) genetic makeup is a potential source of distress, exclusion, and loss of opportunities (Billings et al. 1992). Furthermore, such discrimination might deter individuals from taking genetic tests that could improve their health care or from participating in genetic research that benefits society as a whole (Joly et al. 2017). To date, most conversations concerning genetic discrimination among ethics and law scholars have focused on potential abuses of genetic data by insurance providers and employers (Joly, Braker, and Le Huynh 2010; Lemmens 2004). Yet marketing applications of genetic data give rise to similar concerns. Aeroméxico’s aforementioned DNA-discounts campaign is a recent prominent example of what is essentially genetic-based price discrimination. While it is unclear whether customers indeed received DNA discounts, the campaign was covered by major popular media outlets and, in general, was received positively by the public.
From a legal standpoint, the 1996 Health Insurance Portability and Accountability Act and 2008 Genetic Information Nondiscrimination Act prohibit insurance companies (for specific types of policies) and employers from discriminating against people based on genetic information, but they do not protect individuals from discrimination in other circumstances. However, some state laws, most notably California’s Unruh Civil Rights Act, explicitly ban businesses from discriminating against consumers based on genetic information. Similarly, Florida statutes have provisions requiring notification of an individual if genetic information was used in any decision to grant or deny any insurance, employment, mortgage, loan, credit, or educational opportunity. In our view, discrimination based on one’s genetic information is a serious issue that should be addressed in the same way as other types of discrimination.
Self-Reinforcing Loops
A final nontrivial concern is that marketing strategies that rely on consumers’ genes for predicting their preferences and behavior might generate self-reinforcing loops (Grafanaki 2017) that perpetuate inequality and deprive consumers of the opportunity to explore options that do not align with their genetic makeup. For example, providers of SAT preparation kits could offer promotions to high school students who are genetically disposed to higher education (Lee et al. 2018) and, by doing so, give preferential treatment to individuals who are already in an advantageous position.
Open Questions
Forthcoming discoveries in the field of behavioral genetics will undoubtedly advance our understanding of how genetics interacts with the environment to influence behavior. However, assessing the utility of genetic tools for the advancement of marketing theory and practice, and accurately evaluating the severity of ethical concerns, would require addressing several gaps of knowledge in the current literature (summarized in the Appendix).
Unveiling the Genetic Underpinnings of Consumer Behavior
Many genetic associations of phenotypes that are of interest to marketers have been identified over the past decade. Nonetheless, the genetic underpinnings of many traits that are more closely tied to consumer behavior and are known to be heritable (see Table 1) have remained elusive. There are two likely reasons for this gap. First, marketing scholars have largely neglected the influence of genetics on consumer behavior (with a few notable exceptions, e.g., Simonson and Sela 2011). Research in related fields, however, points to genetic effects on many traits that are central to consumer behavior theory and practice. Examples include investment decisions (Cesarini et al. 2010; Cronqvist and Siegel 2014), altruism and trust (Cesarini et al. 2009; Pedersen et al. 2015), susceptibility to placebo effects (Hall, Loscalzo, and Kaptchuk 2015), voting turnout (Loewen and Dawes 2012), and mobile phone usage patterns (Miller et al. 2012). Molecular genetic studies of these traits would be a straightforward extension that can enrich marketing theory and support industry applications.
Second, genetic data sets that include fine-grained behavioral measures are scarce. Behavioral geneticists have overcome this limitation by using measures that are more readily available at scale as proxies for traits that are laborious to measure, an approach that was shown to boost statistical power of genetic discovery (Rietveld et al. 2014). Genetic research of consumer behavior can similarly benefit from such an approach. For example, a twin study found that one’s disposition to display decision biases shares a common genetic variance with performance in the cognitive reflection test (Cesarini et al. 2012), suggesting that this brief measure could serve as a proxy for such behavioral dispositions. Another possible solution would be to preselect genetic loci that have already been identified as associated with related phenotypes in large-scale GWAS. This would drastically reduce the number of hypotheses to be tested and, thus, the sample size required to obtain sufficient statistical power.
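The effect of preselecting loci on the multiple-testing burden can be illustrated with simple arithmetic. The sketch below assumes a Bonferroni correction, takes the conventional genome-wide threshold of 5 × 10⁻⁸ as corresponding to roughly one million independent tests at a 5% family-wise error rate, and uses a hypothetical candidate set of 100 loci. It also reports a rough, textbook approximation of relative sample-size requirements for detecting a fixed effect with a two-sided z-test at 80% power; actual requirements depend on the effect sizes and study design.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def relative_sample_size(threshold, power=0.8):
    """Quantity proportional to the sample size needed to detect a fixed
    effect in a two-sided z-test at the given threshold and power."""
    z = NormalDist()
    return (z.inv_cdf(1 - threshold / 2) + z.inv_cdf(power)) ** 2


genome_wide = bonferroni_threshold(0.05, 1_000_000)  # about one million independent tests
preselected = bonferroni_threshold(0.05, 100)        # hypothetical set of 100 candidate loci

# How many times larger the genome-wide sample must be, for the same effect size.
ratio = relative_sample_size(genome_wide) / relative_sample_size(preselected)
```

The ratio computed at the end gives a sense of how much smaller a sample could suffice for the preselected loci under these assumptions, holding the effect size constant.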
On a final note, the capacity (or lack thereof) to obtain detailed phenotypic measures at scale to complement the genetic measures may be less of a concern for behavioral marketing metrics. Companies in the DTC-GT industry possess relationship management data of millions of customers and likely know whether they were early adopters, responded to email advertisements, shared coupons with their friends, and consulted health or ancestry reports and, furthermore, what device was used to access them. Thus, large-scale genetic data sets that contain high-resolution measures of consumer behavior already exist and could be employed to unveil the genetic foundations of many aspects of consumer behavior. Such explorations will generate insights that advance not only the field of marketing, but also the discipline of behavioral genetics.
Are Genes More Predictive Than Other Measures?
Behavioral genetic research typically focuses on identifying variants that have causal effects on a target trait and quantifying the variance they account for. Many marketing applications of genetic data, however, do not depend on whether genetic variants are indeed causally related to a trait but, rather, on whether they are more informative than other readily available measures. These two questions are not interchangeable for two reasons. First, genomes correlate with many personal characteristics that have no genetic basis. In traditional genetic analysis, noncausal correlations are of no interest (Price et al. 2006). For marketing applications, however, noncausal genetic associations carry information that is useful for identifying segments and reaching targets. Second, many behavioral dispositions can be accurately predicted from records of their downstream consequences (e.g., personality can be estimated from digital footprints; Kosinski, Stillwell, and Graepel 2013; Nave et al. 2018). This empirical observation is not of particular interest to geneticists, yet it is crucial for marketers deciding on what data to base their strategy. As noted previously, the degree to which genomes are more predictive than other measures likely varies by trait. To the best of our knowledge, only one study to date (whose outcome measure was longevity) systematically compared the predictive accuracy of models that use different sets of variables (Karlsson Linnér and Koellinger 2020). This constitutes an important gap that should be filled as genetic data sets become more available to marketing researchers.
Extreme Ends of Distributions
Marketing applications, such as segmentation and targeting, often depend on identifying people at the extreme ends of a trait’s distribution as opposed to explaining the variance in the general population. For example, a manager of an eco-friendly luxury car brand would be interested in reaching people who are willing to pay a lot for “green” products (Laroche, Bergeron, and Barbaro-Forleo 2001) rather than accounting for heterogeneity in this tendency in the general population. However, the goal of most behavioral genetic research to date has been to estimate how much of a trait’s variance in the general population can be attributed to genetics (using summary statistics such as estimated heritability or R²). Future studies should shed light on the capacity to use genetic data for identifying segments at the extreme ends of the behavioral distribution, using techniques such as discriminant analysis.
When We Do Not Have Genetic Data for Everyone
As described in the “Applications for Marketing Strategy” section, there are several cases when a marketing researcher can use genetic data to accurately predict a variable of interest, denoted by y (e.g., propensity for pattern baldness). However, genetic information might not be available for the entire population of potential consumers. In such cases, it may be possible to leverage the share of the population with genetic information to predict y for the remaining nongenotyped population. To this end, researchers must first predict the variable of interest in the genotyped subpopulation using genetic information (e.g., using a genetic estimator y′, such as a PRS) and then estimate a model to capture the link between nongenetic variables (e.g., demographics) and the predicted variable of interest y′. Finally, the model can be used to predict the variable of interest in the nongenotyped population, without having to rely on genetic data. The feasibility of this approach crucially depends on the capacity to estimate y′, which is a function of the genome, from other observables. We are not aware of any research relying on this approach to date, and its potential performance remains to be studied. Evaluating this performance is crucial for assessing the utility of genetic variables as segmentation bases.
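The three steps described above can be sketched as follows. This is a minimal illustration on simulated data, not a tested procedure: the genotypes, the PRS weights, the three nongenetic observables, and the linear link between them are all assumptions, and the PRS is computed as a weighted sum of allele counts. In the simulation, allele frequencies are made to depend on one observable so that y′ is partly recoverable from nongenetic data; whether this holds in practice is the open question.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones to a design matrix."""
    return np.column_stack([np.ones(len(X)), X])

rng = np.random.default_rng(2)
n, n_snps, n_genotyped = 2000, 50, 500

# Simulated nongenetic observables (standing in for, e.g., demographics)
observables = rng.normal(size=(n, 3))

# Simulated genotypes (0/1/2 allele counts). Allele frequencies depend on the
# first observable, an assumption that makes the genome partly predictable.
allele_freq = 1.0 / (1.0 + np.exp(-0.8 * observables[:, 0]))
genotypes = rng.binomial(2, allele_freq[:, None], size=(n, n_snps))

# Hypothetical PRS weights (in practice taken from a published association study)
weights = rng.uniform(0.0, 1.0, size=n_snps)

# Only the first n_genotyped individuals are treated as genotyped
genotyped = np.arange(n) < n_genotyped

# Step 1: compute the genetic estimator y' (a PRS) in the genotyped subpopulation
prs = genotypes[genotyped] @ weights

# Step 2: model the link between nongenetic observables and y'
coef, *_ = np.linalg.lstsq(add_intercept(observables[genotyped]), prs, rcond=None)

# Step 3: predict y' for the nongenotyped population from observables alone
predicted = add_intercept(observables[~genotyped]) @ coef

# Possible only in simulation: compare with the PRS that would have been observed
true_prs_holdout = genotypes[~genotyped] @ weights
holdout_corr = np.corrcoef(predicted, true_prs_holdout)[0, 1]
print(f"correlation between predicted and true y' in holdout: {holdout_corr:.2f}")
```

The holdout correlation is high here only because the simulation builds in a strong link between the observables and the genome. With real data, the accuracy of step 2 would bound the usefulness of the whole approach.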
How Will Consumers React?
A final important open question concerns how consumers would feel about the use of their genetic data by marketers. On the one hand, it seems plausible that at least some people would welcome marketing applications of genetic data if it got them discounts or better recommendations that save search costs. On the other hand, such usage is expected to raise privacy concerns that are similar to those invoked in relation to other data types. Yet there are several additional unique matters, related to the image of genetics in the minds of consumers. One major concern relates to historical misconceptions surrounding genetics, which were used in the past to justify racist worldviews and policies responsible for some of the worst crimes against humanity (Kevles 1985). Business strategies that insensitively use consumers’ genetic data might therefore invoke strong negative reactions. Furthermore, although the true causal effects of genetic factors on most human traits are moderate in size and occur via interactions with the environment, genetics is often associated with biological determinism (Condit, Ofulue, and Sheedy 1998). As such, the use of genetics for matching consumers with products, services, and ads might increase beliefs in the existence of potentially deterministic aspects of behavior (Bhattacharjee, Berger, and Menon 2014; Zheng and Alba 2021) and threaten consumers’ sense of autonomy (Wertenbroch et al. 2020). A final concern is that an individual’s genome contains sensitive information that they may not be aware of, for example, about future health risks such as cancer or Parkinson’s disease. Marketers should take care to avoid exposing consumers to information that they might not want to know (Gigerenzer and Garcia-Retamero 2017) or would prefer to receive with the appropriate counseling in a medical setting.
The substantial size of the DTC-GT market, despite poor regulation, suggests that these issues may not be a major concern for many customers. Moreover, many individuals voluntarily share their genetic data with third-party interpretation services (Guerrini et al. 2020) and websites that use them solely for making product recommendations. Indeed, in addition to ancestry-based playlists (Spotify) and cultural experiences (Airbnb), other services have recently emerged, recommending wines, travel destinations, and even romantic matches purportedly tailored to their consumers’ DNA. Yet it is possible that the market trends merely reflect consumer ignorance. A recent survey found that while many DTC-GT customers presumed that they were sufficiently informed about privacy issues, their expectations were often inconsistent with company practices (Christofides and O’Doherty 2016). For example, consumers’ most common expectation, that DTC-GT companies would not share their data with third parties, was often at odds with the firms’ actual privacy policies. Thus, it remains to be seen whether consumers’ attitudes toward the use of genetic data for marketing differ from how they (dis)regard the use of other types of digital records, and whether there are means to mitigate such effects (e.g., by increasing transparency; Kim, Barasz, and John 2019).
Conclusion
This article is a first attempt to assess how the massive amounts of data accumulated in genetic databases will influence the field of marketing. We developed a framework that incorporates genetic variables into consumer behavior theory and used it to explore potential applications of genetic data in marketing. We further evaluated ethical and legal challenges, and we highlighted gaps in knowledge that should be addressed by future research. Despite these gaps in the published literature, we note that DTC-GT firms and governments already have access to the genetic data of millions of individuals. Therefore, business strategies that employ genetic data are likely already being implemented, to some degree, by organizations. With the fast accumulation of genetic data and the rapid advances in methodology for genetic-based inference, the use of genetic data for marketing research and practice is likely to become increasingly common in the future.
Associate Editor
Vikas Mittal
Acknowledgments
The authors thank Dylan Manfredi for research assistance. They thank Eric Bradlow, Ryan Dew, Joshua Eliashberg, Peter Fader, Nadja C. Furtner, Philipp Koellinger, Cait Lamberton, Richard Karlsson Linnér, Leonard Lodish, Bob Meyer, Raj Sethuraman, Christophe Van den Bulte, Kevin Werbach, and Juanjuan Zhang for their comments on previous versions of the article. Gideon Nave thanks Carlos and Rosa de la Cruz for ongoing support.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
