Rational Inattention as an Empirical Framework for Discrete Choice and Consumer-Welfare Evaluation

Abstract

The conventional discrete-choice model of demand assumes that consumers are fully informed about every available product alternative. This assumption is at odds with the large body of literature studying incomplete information and the role of the consumer's “evoked set” or “consideration set.” The author develops a novel empirical discrete-choice demand model derived from an underlying theory of consumers' rational inattention. The model distinguishes between factors that shift demand through the utility function, such as prices and product attributes, and factors that shift demand through the consumer's information “evaluation costs.” The author conducts an empirical case study of the laundry detergent category. Using a set of exclusion restrictions based on retail promotional instruments, specification tests select the rational inattention model over the conventional full-information discrete-choice model. Exploiting the launch of Tide Pods midway through the sample, the author demonstrates the role of evaluation costs for the measured value creation from a new product. A conventional discrete-choice model always assigns positive incremental consumer value from new products. However, the rational inattention model developed herein finds a decrease in overall consumer welfare from the new Tide Pods’ entry, with the increased friction in information associated with the larger choice set offsetting the potential gains from higher match value.

Keywords

rational inattention, consideration, consumer welfare, discrete-choice, consumer demand

Discrete-choice random utility models revolutionized the empirical brand-choice literature by offering a method to estimate demand for differentiated products and to measure value creation to consumers (e.g., Guadagni and Little 1983; McFadden 1974, 1978, 1981). The canonical full-information random utility maximization models (FI-RUM) assume that each consumer considers all of the product variety supplied at the point of sale and has complete information about prices and the objective product benefits and attributes. These assumptions are at odds with a parallel literature dating back to the 1960s that has studied the information frictions that limit consumer choices at the point of sale. Limited product awareness, the cognitive effort required to recall product information, and the costly effort from browsing and deliberating at the point of sale may cause consumers to base their purchase decisions on a more limited “evoked set” or “consideration set” (e.g., Howard and Sheth 1969; Wright and Barbour 1977). In practice, consumers may only consider as few as two to eight product alternatives at the point of sale, often a small fraction of the variety supplied (e.g., Bronnenberg, Kim, and Mela 2016; Hauser and Wernerfelt 1990; Honka 2014; Moorthy, Ratchford, and Talukdar 1997; Newman and Staelin 1972; Punj and Staelin 1983; Ratchford, Talukdar, and Lee 2007). Unfortunately, consumers’ information sets and consideration sets at the point of sale are typically unobserved to the researcher.

I develop a novel formulation of the discrete-choice demand model that is consistent with consumer product/price uncertainty and endogenous information acquisition subject to “evaluation costs.” Unlike the FI-RUM, the framework distinguishes between consumers’ preferences and the costs associated with the gathering of choice-related product/price information at the point of sale.

Formally, I propose the subjective prior rational inattention (SP-RI) empirical model of consumer demand. The SP-RI empirical demand model extends the rational inattention (RI) discrete-choice theory model of Matĕjka and McKay (2015) and Fosgerau et al. (2020) to make it amenable to empirical estimation and welfare measurement. In the RI formulation of discrete choice, the consumer forms a prior belief at the start of a trip and then optimally acquires product/price information before purchase subject to evaluation costs measured by information-theoretic Shannon (1948) entropy. The randomness in consumer choices arises from the posterior incompleteness of the product and price information gathered. The challenge for empirical estimation of the RI model is that the prior belief structure is typically unobserved to the researcher. The first key result of the current research consists of an existence theorem of a latent subjective prior belief that is coherent with the RI theory. I use this result to devise an empirical estimator of the SP-RI model that can be applied to standard brand-choice data sets without assuming a functional form for the consumer's prior belief structure.

I rely on exclusion restrictions to demonstrate the empirical value of the SP-RI model over standard FI-RUM approaches. I exploit the incidence of exogenous retail promotion variables that shift consumers’ consideration by facilitating product and price information at the point of sale without generating direct consumption utility (e.g., Allenby and Ginter 1995; Mehta, Rajiv, and Srinivasan 2003; Terui, Ban, and Allenby 2011). For many consumer packaged goods (CPG) brands, the majority of their sales are associated with some kind of promotion, ranging from in-store displays to feature advertising in a weekly circular (e.g., Blattberg and Neslin 1989). The established wisdom in the literature is that these promotional tools are designed purely to inform consumers and reduce evaluation costs. For instance, point-of-sale displays “are placed near the merchandise they refer to so that customers know its price and other detailed information” (Levy, Weitz, and Grewal 2019, p. 436). Even though some of this information may already be on the label of the product or its packaging, the display “can quickly identify for the customer those aspects likely to be of greater interest” (Levy, Weitz, and Grewal 2019, p. 436). Similarly, free-standing displays are used “primarily to attract customers’ attention and bring them into a department” (Levy, Weitz, and Grewal 2019, p. 437). Feature advertising in local newspapers primarily communicates product/price information, unlike the potential for genuine, utility-shifting branding in higher-engagement media such as television. These exclusion restrictions allow me to test the SP-RI model against the conventional FI-RUM.

To showcase the SP-RI demand model and the estimator, I conduct an empirical case study of the new Tide Pods product entry into the laundry detergent category. I use the Nielsen-Kilts household panel database, comprising laundry detergent category purchases from 2006 to 2016. As in previous work (e.g., Allenby and Ginter 1995; Mehta, Rajiv, and Srinivasan 2003; Terui, Ban, and Allenby 2011), I find that the inclusion of promotional variables in a standard FI-RUM improves fit and produces positive and statistically significant effects, which I interpret as reduced-form evidence for rational inattention. The full SP-RI model is selected over the FI-RUM in a series of model specification tests. The effects of consideration shifters in the SP-RI model are statistically and economically significant. Furthermore, I find substantial differences in the utility-coefficient estimates from the SP-RI model and the FI-RUM model, where the FI-RUM overestimates the magnitudes of almost all the utility parameters. For instance, the FI-RUM model overestimates the price sensitivity by 18%, the Tide brand coefficient by 44%, and the Gain brand coefficient by 65%. I attribute the bias of the utility-coefficient estimates of the FI-RUM to the omission of the promotional variables that are correlated with the corresponding price and brand variables. I find that the implied optimal average margins from the SP-RI model are 24% higher than those of the FI-RUM model, suggesting that the FI-RUM understates pricing power.

Finally, I explore the consumer-welfare implications of the SP-RI's distinction between preferences and evaluation costs. The canonical FI-RUM with unbounded support for the random utility (e.g., logit and probit) mechanically predicts that consumer welfare is strictly increasing in the number of product variants supplied at the point of sale (e.g., Berry and Pakes 2007; Fan and Yang 2020; Petrin 2002). This property is at odds with the commonly observed behavioral finding that too many alternatives can cause consumers to make bad choices (e.g., Bertrand et al. 2010; Broniarczyk, Hoyer, and McAlister 1998; Chernev and Hamilton 2009; Iyengar and Lepper 2000; Iyengar, Huberman, and Jiang 2004). In contrast, increasing product variety in the SP-RI framework increases the costs and complexity of choosing from a larger choice set, potentially leading to ex post inferior choices.

Returning to the empirical application, I measure the value creation to consumers from the launch of Tide Pods. Under the FI-RUM, I find that the launch of Tide Pods increased the nationally projected aggregate consumer surplus by $125 million per year. However, under the SP-RI, I find that the launch of Tide Pods reduced the nationally projected aggregate consumer surplus by $41 million per year, with heterogeneity in the sign of the welfare change across individual consumers. Intuitively, the new Pods offered a lower price-to-value ratio than the average incumbent detergents, and their launch increased the cost of information. The managerial relevance of this result is quite striking and might explain the subsequent decline in Tide Pods’ market share from 6.9% to 3.9% between 2014 and 2016.

The current research contributes to an emerging literature on discrete-choice RI (Caplin, Dean, and Leahy 2019; Fosgerau et al. 2020; Matĕjka and McKay 2015) by establishing a bridge between discrete-choice RI theory and empirical work. This article is the first to derive the empirical analog of RI, the SP-RI demand model, and a corresponding estimator. Key to my approach is the existence proof of a consumer's prior consistent with the demand model without needing to specify a specific functional form for those beliefs. Several recent papers have implemented variations of the SP-RI model and estimator developed in the present research (e.g., Bhattacharya and Howard 2022; Brown and Jeon 2020; DeDad et al. 2021; Natan 2021; Porcher 2020).¹

This article also contributes to the empirical literature on choice models with consideration sets and search (e.g., Honka 2014; Kim, Albuquerque, and Bronnenberg 2010; Mehta, Rajiv, and Srinivasan 2003; Morozov 2021) by offering a less computationally demanding model and estimator. Most approaches specify the consideration set as an additional random variable, imposing the computational burden of deriving the likelihood over all possible consideration sets. Some approaches also require observing the browsing and search process. The SP-RI model also provides a structural microfoundation that rationalizes the reduced-form consideration-and-purchase likelihood of Bronnenberg and Vanhonacker (1996).

This work is also related to consumer choice models with Bayesian learning (e.g., Crawford and Shum 2005; Erdem and Keane 1996). The SP-RI model does not require parametric assumptions about consumer beliefs, and it can be applied to the standard brand-choice data sets. However, it cannot accommodate the choice patterns of strategic experimentation because the consumer does not plan out future choices when making the current choice in the SP-RI model.

The remainder of the article is organized as follows. I first provide an illustrative example of the choice context being modeled. Following this, I develop the SP-RI discrete-choice and consumer-welfare evaluation framework. Next, I apply the framework to a case study of an expansion of consumers’ choice set, namely, Tide Pods’ introduction to the market in 2012. I close with conclusions.

An Illustrative Example of the Subjective Prior Rational Inattention Discrete Choice

This section illustrates the choice context and the SP-RI model being developed using a toy example. A consumer wants to buy one laundry detergent item that would deliver the highest consumption utility among what is available in the aisle. Assume that only three different detergents are available. The consumer instantly perceives the presence of three different options, but they do not know the consumption-utility value that each alternative would deliver.

Suppose first that the consumer does not have a prior consumption experience with any of the detergents, nor does the store have any promotion going on. Suppose the consumer is endowed with a subjective prior belief that each detergent's consumption utility is independent and equally likely to be 3 (good), 2 (neutral), or 1 (bad) with probability 1/3.² If the consumer makes a choice at this stage, it will be completely random, with each detergent chosen with probability 1/3, because the consumer is indifferent among the three detergents at this stage. Such a random choice is not likely to be optimal for the consumer. To distinguish the detergents and obtain information about the consumption-utility value of each detergent, the consumer must engage in product/price research by reading labels, comparing prices, and so on. In the optimal strategy, the consumer would end up with some noisy information from the costly price and attribute research.

To fix ideas, assume further that the true consumption utilities of detergents A, B, and C are 2, 2, and 3, respectively. When the cost of information is given by the Shannon (1948) entropy differences that I formally introduce in Equation 5, the probability of the consumer choosing detergent C after the costly product/price research takes a logit-like form with one additional parameter, the unit information cost:

$$\frac{\exp\left(\frac{3}{\text{Unit Info. Cost}}\right)}{\exp\left(\frac{2}{\text{Unit Info. Cost}}\right) + \exp\left(\frac{2}{\text{Unit Info. Cost}}\right) + \exp\left(\frac{3}{\text{Unit Info. Cost}}\right)}. \tag{1}$$

If the unit information cost is 1, the probability of choosing detergent C is .576.
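As a quick numerical check of Equation 1, the following Python sketch evaluates the choice probabilities in the toy example. The function name `ri_logit` and its arguments are illustrative assumptions rather than part of the article's notation.

```python
import math

def ri_logit(utilities, unit_info_cost, log_shifters=None):
    """Logit-like choice probabilities of the toy example (Equations 1 and 2).

    log_shifters holds optional additive terms such as gamma in Equation 2;
    leaving it as None gives the symmetric case of Equation 1.
    """
    if log_shifters is None:
        log_shifters = [0.0] * len(utilities)
    scores = [s + u / unit_info_cost for s, u in zip(log_shifters, utilities)]
    top = max(scores)  # subtract the maximum for numerical stability
    expd = [math.exp(v - top) for v in scores]
    total = sum(expd)
    return [e / total for e in expd]

# True utilities of detergents A, B, C with a unit information cost of 1.
probs = ri_logit([2.0, 2.0, 3.0], unit_info_cost=1.0)
print([round(p, 3) for p in probs])  # [0.212, 0.212, 0.576]
```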

Thus far, I have assumed that all three detergents are homogeneous before the consumer engages in costly information acquisition. Now suppose detergent B is featured in the store, which would shift the consumer's prior belief but not the true consumption utilities. Say, for example, the consumer's prior belief about detergent B's consumption utility is now adjusted to the following: 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/2. The probability of choosing detergent B then becomes

$$\frac{\exp(\gamma) \exp\left(\frac{2}{\text{Unit Info. Cost}}\right)}{\exp\left(\frac{2}{\text{Unit Info. Cost}}\right) + \exp(\gamma) \exp\left(\frac{2}{\text{Unit Info. Cost}}\right) + \exp\left(\frac{3}{\text{Unit Info. Cost}}\right)}, \quad \gamma > 0. \tag{2}$$

γ represents the shifts in the consideration probability, the exact value of which is determined by a fixed-point equation that will be introduced in Equation 7. Note that the changes in the consumer's subjective prior belief lead to the changes in the conditional choice probability of choosing detergent B without affecting the consumption utilities.

The logit shape of the choice probability is inherited from the shape of Shannon (1948) entropy, but the degree of incomplete information about consumption utilities is also affected by the unit information cost term. The role of the unit information cost can be best understood by considering the two extremes. If the term approaches 0, the product/price research becomes free, and the consumer learns all the detergents’ consumption-utility value precisely. They would then choose detergent C with probability 1 regardless of the γ magnitude. By contrast, if the term approaches infinity, implying that the product/price research becomes extremely expensive, the consumer does not engage in any product/price research. Then, the promotion effect γ would be the only factor that affects the purchase probability.
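The two extremes can be traced numerically with Equation 2. The sketch below holds γ fixed at a hypothetical value of 0.5 purely for illustration; in the model, γ is pinned down by the fixed-point equation and would itself vary with the unit information cost. The function name is likewise an assumption.

```python
import math

def featured_choice_probs(unit_info_cost, gamma):
    """Choice probabilities for detergents A, B, C when B is featured (Equation 2)."""
    utilities = [2.0, 2.0, 3.0]
    shifters = [0.0, gamma, 0.0]  # only the featured detergent B is shifted
    scores = [s + u / unit_info_cost for s, u in zip(shifters, utilities)]
    top = max(scores)  # subtract the maximum for numerical stability
    expd = [math.exp(v - top) for v in scores]
    total = sum(expd)
    return [e / total for e in expd]

gamma = 0.5  # hypothetical value, not an estimate from the article

# Nearly free information: detergent C is chosen almost surely.
nearly_free = featured_choice_probs(0.01, gamma)

# Very costly information: the utilities drop out and only gamma matters.
very_costly = featured_choice_probs(1e6, gamma)

print([round(p, 3) for p in nearly_free])
print([round(p, 3) for p in very_costly])
```

In the costly limit, the probabilities approach exp(γ)/(2 + exp(γ)) for detergent B and 1/(2 + exp(γ)) for each of the other two.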

The SP-RI Empirical Framework of Discrete Choice and Consumer-Welfare Evaluation

In this section, I develop the main empirical framework of subjective prior rational inattention (SP-RI). The section is divided into three parts. The first part develops the SP-RI model of the consumer's purchase decision. The setup and exposition in the subsection mostly follow Matĕjka and McKay’s (2015) Lemma 1 and Corollary 1,³ but with modifications on the interpretation of the prior belief and the definition of the information-cost function, explained in detail subsequently. The second and third parts are entirely novel to the literature. The second part establishes the SP-RI discrete-choice model as an empirical framework of consumers’ discrete choice. The third part then develops the SP-RI discrete-choice demand model as a framework of consumer-welfare evaluation.

Discrete Choice Behavior Under Subjective-Prior Rational Inattention

On each shopping trip, a consumer faces the choice set of products, $\mathcal{J} = \{1, \dots, J\}$, which may include an “outside option” or “no purchase.”⁴ To simplify the notation, I omit the trip-specific index. Let u = (u₁, …, u_J)′ denote the vector of choice-specific consumption utilities for each product. At the start of each trip, the consumer does not observe u, and they are indifferent between the alternatives.

Table 1. Comparison of Welfare Calculations.

Case	Alternative	Utility	Choice Prob.	SP-RI Welfare	FI-RUM Welfare
Baseline	A	2	.5	2	2.693
Baseline	B	2	.5	2	2.693
“Good” added	A	2	.212	2.453	3.551
“Good” added	B	2	.212	2.453	3.551
“Good” added	C	3	.576	2.453	3.551
“Neutral” added	A	2	.333	2	3.099
“Neutral” added	B	2	.333	2	3.099
“Neutral” added	C	2	.333	2	3.099
“Bad” added	A	2	.422	1.763	2.862
“Bad” added	B	2	.422	1.763	2.862
“Bad” added	C	1	.155	1.763	2.862

Notes: The table illustrates the differences in consumer welfare associated with the presence of additional Good, Neutral, and Bad, respectively, in the choice set, compared with the Baseline case with two Neutral alternatives. The proposed SP-RI welfare is calculated using Equation 12, and the FI-RUM welfare is calculated using the usual log-sum formula presented in Equation W5 of Web Appendix B.
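The entries of Table 1 can be reproduced with a few lines of Python. Equation 12 and Equation W5 are not shown in this section, so the sketch rests on assumptions: the FI-RUM welfare is taken to be the plain log-sum of utilities, the SP-RI welfare is taken to be a log-sum weighted by uniform unconditional choice probabilities (π_k = 1/J), and the unit information cost is set to 1. Under these assumptions the computed values match the table to three decimals.

```python
import math

def logsum(values):
    """Numerically stable log of the sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def welfare_row(utilities):
    """Return (choice probabilities, SP-RI welfare, FI-RUM welfare) for one case.

    Assumes a unit information cost of 1 and uniform pi_k = 1/J.
    """
    n = len(utilities)
    fi_rum = logsum(utilities)
    sp_ri = logsum([math.log(1.0 / n) + u for u in utilities])
    probs = [math.exp(u - fi_rum) for u in utilities]
    return probs, sp_ri, fi_rum

cases = {
    "Baseline": [2.0, 2.0],
    "Good added": [2.0, 2.0, 3.0],
    "Neutral added": [2.0, 2.0, 2.0],
    "Bad added": [2.0, 2.0, 1.0],
}

results = {name: welfare_row(u) for name, u in cases.items()}
for name, (probs, sp_ri, fi_rum) in results.items():
    print(name, [round(p, 3) for p in probs], round(sp_ri, 3), round(fi_rum, 3))
```

Under these assumptions the two welfare measures differ by ln J, which is why the FI-RUM column rises when a Neutral alternative is added while the SP-RI column stays at 2.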

Before purchase, the consumer collects product information and deliberates. Let $F$ denote the set of costless information available to the consumer at the point of sale. This information includes the total number of products available, J, and promotional instruments, such as feature advertising and/or displays for a product, d = (d₁, …, d_J): $F = \{J, \{d_{j}\}_{j \in \mathcal{J}}\}$. The consumer uses this costless information to form a subjective prior belief about the consumption utility vector, $Q (\cdot | F)$. $Q (\cdot | F)$ is a multivariate cumulative probability distribution function with support $\mathbb{R}^{J}$ that takes the realized vector of utilities as its argument. Note that this formulation of the RI with a deterministic choice-specific utility and a subjective prior is different from the theoretical RI models (e.g., Matĕjka and McKay 2015) and is critical to my empirical derivation, which I explain in more detail subsequently.

After acquiring the free information, the consumer may also engage in costly information acquisition and product deliberation, such as reading labels and comparing prices, to reduce their subjective uncertainty about the consumption utility vector u. The exact amount of deliberation trades off the costs of information acquisition against the benefits of a more informed purchase decision and could potentially lead to a full-information choice. Let $\Pr(\text{Choose } j \mid \cdot)$ denote the mapping that takes the possible utility vectors as its argument and returns the postdeliberation probability (or the “conditional choice probability”). The conditional choice probability is the probability that the consumer believes product j to be the highest-utility product after costly information acquisition. Once conditioned on the true consumption utility vector u, $\Pr(\text{Choose } j \mid u)$ represents the choice probability after the costly research about the consumption utility vector u.

Prior to engaging in product deliberation, the consumer's ex ante unconditional probability of choosing j, defined by the expectation of the conditional choice probability against the consumer's prior belief $Q (\cdot | F)$ , is

$$\pi_{j} = \int \Pr(\text{Choose } j \mid u) \, dQ(u \mid F). \tag{3}$$

Even though π_j does not condition on a specific realization of the consumption utility vector, it does reflect the free information $F$, that is, $\pi_{j} = \pi_{j}(F)$, because the free information shifts the consumer's prior belief.⁵ π_j in this context can be interpreted as the consideration probability, representing the probability of choosing j absent costly product research about the true consumption utility vector u.

The consumer adjusts the accuracy of the costly information that they obtain to maximize the expected value of {Gross benefit from choice − Information cost}. The accuracy of costly information corresponds to the shape of the conditional choice probability $\Pr(\text{Choose } j \mid \cdot)$. To formalize, the consumer solves

$$\max_{\{\Pr(\text{Choose } j \mid \cdot)\}_{j \in \mathcal{J}}} \int \Big\{ \sum_{j \in \mathcal{J}} u_{j} \Pr(\text{Choose } j \mid u) - c\big(\pi, \{\Pr(\text{Choose } j \mid u)\}_{j \in \mathcal{J}}\big) \Big\} \, dQ(u \mid F), \tag{4}$$

where the conditional choice probabilities $\Pr(\text{Choose } j \mid \cdot)$ sum to 1 over j. The first term inside the expectation in Equation 4 is the “Gross benefit from choice,” naturally defined as the sum of the true consumption utilities u_j weighted by the conditional choice probabilities $\Pr(\text{Choose } j \mid u)$. The second term in Equation 4 is the “Information cost,” which is defined subsequently. The expectation here is taken against the consumer's subjective prior belief Q, so the consumer assigns more weight to the consumption-utility realizations that they consider more likely. At one extreme, the consumer may gather information about the prices and product attributes of all the products in the aisle. In this case, $\Pr(\text{Choose } j \mid u) = 1$ for the j such that $j = \arg\max_{k \in \mathcal{J}} u_{k}$. At the other extreme, the consumer may choose to not collect any costly information and simply make the utility-maximizing choice based entirely on the prior belief. In this case, the conditional choice probability $\Pr(\text{Choose } j \mid \cdot)$ equals the unconditional choice probability π_j.

The information-cost function, $c(\pi, \{\Pr(\text{Choose } j \mid \cdot)\}_{j \in \mathcal{J}})$, balances the consumer's optimal decision between the two extremes. Following information theory, the information cost is defined as proportional to the relative entropy of conditional choice probabilities $\{\Pr(\text{Choose } j \mid \cdot)\}_{j \in \mathcal{J}}$ with respect to unconditional choice probabilities π. Formally,

$$c\big(\pi, \{\Pr(\text{Choose } j \mid \cdot)\}_{j \in \mathcal{J}}\big) = \frac{1}{\mu} \Big[ \sum_{j \in \mathcal{J}} \Pr(\text{Choose } j \mid \cdot) \ln\big(\Pr(\text{Choose } j \mid \cdot)\big) - \sum_{j \in \mathcal{J}} \Pr(\text{Choose } j \mid \cdot) \ln(\pi_{j}) \Big], \tag{5}$$

where the positive constant 1/μ is the unit information cost. The information-cost function, which takes the possible utility-vector realization as its argument, is convex, thereby penalizing more accurate conditional choice probabilities. Suppose the true consumption-utility vector is u. On the one hand, if $\Pr(\text{Choose } j \mid u) = 1$ for some j, implying that the consumer collects every bit of information about u, the information cost becomes very large. On the other hand, if $\Pr(\text{Choose } j \mid u) = \pi_{j}$ for all j, implying that the consumer did not collect any information about u, the cost becomes zero. Note that the definition and interpretation of the information-cost function in Equation 5 are novel in the RI-discrete-choice literature. For more discussion of the definition of the information-cost function, see Appendix A.
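The behavior of Equation 5 at the two extremes can be checked directly. The sketch below is a minimal implementation under the usual convention that 0 · ln(0) = 0; the function name and the three-product uniform π are illustrative assumptions.

```python
import math

def information_cost(cond_probs, uncond_probs, mu):
    """Equation 5: relative entropy of cond_probs with respect to uncond_probs,
    scaled by the unit information cost 1/mu. Terms with zero conditional
    probability contribute nothing (0 * ln 0 is treated as 0)."""
    total = 0.0
    for p, pi in zip(cond_probs, uncond_probs):
        if p > 0.0:
            total += p * (math.log(p) - math.log(pi))
    return total / mu

pi = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

# No costly information: conditional probabilities equal pi, so the cost is zero.
no_info_cost = information_cost(pi, pi, mu=1.0)

# Full information: all mass on one product, so the cost equals -ln(pi_j) / mu.
full_info_cost = information_cost([0.0, 0.0, 1.0], pi, mu=1.0)

print(no_info_cost, full_info_cost)
```

With a degenerate conditional choice probability the cost reduces to −ln(π_j)/μ, so it grows as the chosen product's unconditional probability shrinks or as μ falls.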

The consumer solves the problem stated in Equation 4 by optimizing over the shape of the conditional choice probabilities, $\{\Pr(\text{Choose } j \mid \cdot)\}_{j \in \mathcal{J}}$. Matĕjka and McKay (2015) show that the optimal conditional choice probability is as follows:

$$\Pr(\text{Choose } j \mid u) = \frac{\exp(\ln(\pi_{j}) + \mu u_{j})}{\sum_{k \in \mathcal{J}} \exp(\ln(\pi_{k}) + \mu u_{k})}, \tag{6}$$

which captures the probability of the consumer choosing alternative j after engaging in costly product research if the true consumption utility vector is u.⁶ The optimal conditional choice probability in Equation 6 depends on the unconditional choice probabilities π = (π₁, …, π_J)′.⁷ As discussed previously, the unconditional choice probabilities are determined from Equation 3. Combining Equations 6 and 3 yields a fixed-point equation that implicitly defines the conditional and unconditional choice probabilities, given by

$$\pi_{j} = \int \frac{\exp(\ln(\pi_{j}) + \mu u_{j})}{\sum_{k \in \mathcal{J}} \exp(\ln(\pi_{k}) + \mu u_{k})} \, dQ(u \mid F). \tag{7}$$

The fixed-point equation has been a major challenge in the literature, both in solving the RI-discrete-choice model theoretically and in establishing it as an empirical framework.
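When the prior is known and has finite support, Equation 7 can be solved numerically. The sketch below applies simple successive substitution (similar in spirit to the Blahut-Arimoto algorithm) to the priors of the illustrative example. The independence of the marginals, the finite support, the starting point, and the function name are all assumptions of this sketch, and convergence is assumed rather than proved.

```python
import itertools
import math

def solve_unconditional_probs(marginals, mu, iterations=2000, tol=1e-12):
    """Iterate Equation 7 for a discrete prior with independent marginals.

    marginals[j] maps each possible utility level of product j to its
    prior probability.
    """
    n = len(marginals)
    # Enumerate the joint support and the probability of each utility vector.
    support = []
    for combo in itertools.product(*(m.items() for m in marginals)):
        u = [level for level, _ in combo]
        prob = math.prod(p for _, p in combo)
        support.append((u, prob))

    pi = [1.0 / n] * n  # start from uniform unconditional probabilities
    for _ in range(iterations):
        new_pi = [0.0] * n
        for u, prob in support:
            weights = [pi[k] * math.exp(mu * u[k]) for k in range(n)]
            denom = sum(weights)
            for j in range(n):
                new_pi[j] += prob * weights[j] / denom
        gap = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if gap < tol:
            break
    return pi

uniform = {1.0: 1.0 / 3.0, 2.0: 1.0 / 3.0, 3.0: 1.0 / 3.0}
featured = {1.0: 0.25, 2.0: 0.25, 3.0: 0.5}

pi_sym = solve_unconditional_probs([uniform, uniform, uniform], mu=1.0)
pi_feat = solve_unconditional_probs([uniform, featured, uniform], mu=1.0)

# One reading of gamma in Equation 2: the log ratio of B's to A's pi.
gamma = math.log(pi_feat[1] / pi_feat[0])
print(pi_sym, pi_feat, gamma)
```

With identical priors the solution stays at (1/3, 1/3, 1/3), and the more favorable prior for detergent B raises its unconditional choice probability, which is the channel through which γ > 0 arises in the example.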

The following subsections explain how I address the challenge to establish the RI-discrete-choice model as an empirical framework. Before proceeding further, I provide a brief explanation of how the proposed SP-RI model differs from the extant discrete-choice RI theory models by Matĕjka and McKay (2015) and Fosgerau et al. (2020).

Key Innovation from Matĕjka and McKay (2015): Subjectivity of the Prior-Belief Distribution

The key innovation of the SP-RI framework is to introduce the subjectivity in the prior-belief distribution explicitly. In the extant RI-discrete-choice theory literature, the consumer's prior belief $Q (\cdot | F)$ is interpreted as representing the distribution of the state-dependent realization probabilities of the consumption-utility vector u (e.g., Caplin and Dean 2015, Sections III and IV; Matĕjka and McKay 2015). The state-dependent random realization of utilities is better suited to model choice circumstances such as gambles or lotteries. However, in consumers’ brand-choice context, viewing the actual consumption-utility value as a random draw from the consumer's perspective is not plausible. Furthermore, from the empirical researcher's perspective, the notion of state-dependent random realization of consumption utilities incurs a deeper problem of identification failure.

I first illustrate why the state dependence of the consumption utilities may lead to identification failure from the empirical researcher's point of view. Suppose, in the context of the consumer-choice example presented previously, that there are two equally likely states of the world in which the consumption-utility value realizations are (2, 2, 2) and (2, 2, 4.2), respectively. Let us further assume that the unit information cost is 1. An empirical researcher, however, typically does not observe the underlying state of the world in consumers’ brand-choice context. Without knowing the underlying state in which the data are generated, the empirical researcher will observe a probability of choosing detergent C of about .576 because $.576 \approx \frac{1}{2} \left\{ \frac{\exp(2)}{\exp(2) + \exp(2) + \exp(2)} + \frac{\exp(4.2)}{\exp(2) + \exp(2) + \exp(4.2)} \right\}$, the same as what Equation 1 gives in the example in the previous section. However, concluding that the consumption-utility value realizations are (2, 2, 3) will lead to false model-based predictions and counterfactual analyses.
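The arithmetic behind this observational equivalence can be verified with a short Python sketch. The helper name `softmax` is an illustrative choice; the utilities, state probabilities, and unit information cost of 1 are those of the example.

```python
import math

def softmax(values):
    """Logit choice probabilities for a vector of utilities."""
    top = max(values)  # subtract the maximum for numerical stability
    expd = [math.exp(v - top) for v in values]
    total = sum(expd)
    return [e / total for e in expd]

# Two equally likely states with different utilities for the third detergent.
state_probs = [0.5, 0.5]
states = [[2.0, 2.0, 2.0], [2.0, 2.0, 4.2]]
mixture = sum(w * softmax(u)[2] for w, u in zip(state_probs, states))

# A single deterministic utility vector, as in Equation 1.
single = softmax([2.0, 2.0, 3.0])[2]

print(round(mixture, 3), round(single, 3))  # 0.576 0.576
```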

To establish the RI-discrete-choice model as an empirical framework readily applicable to observational choice data, I depart from such interpretation of the prior belief. I take the consumption-utility vector u as deterministic and fixed, that is, not dependent on the underlying “state” of the world. Effectively, it is equivalent to assuming that the same product with the same price delivers the same consumption utility to the consumer. It is a reasonable and standard assumption in empirical models of a consumer's discrete choice. In turn, the prior belief $Q (\cdot | F)$ can no longer be taken as representing the distribution of the deterministic true consumption utility u; if the prior belief gives a unit mass to a point, the consumer does not have anything to learn about the consumption utilities.

I overcome this challenge by interpreting the prior belief $Q (\cdot | F)$ as representing the consumer's subjective belief over consumption utilities, which can be affected by the free information $F$ . $Q (\cdot | F)$ represents the uncertainty in the consumer's mind about consumption utilities before engaging in the costly product research. The subjectivity of prior belief provides a conceptual advantage to the empirical researcher that the prior belief matters only to the extent that it shifts the optimal conditional choice probability. The interpretation is consistent with the RI-discrete-choice consumer's optimization problem developed previously as long as the true consumption-utility vector u is on the support of the prior belief.

The interpretation of the prior belief is novel in the RI-discrete-choice literature. Thus, I refer to the proposed RI-based empirical framework as the subjective-prior rational inattention (SP-RI), emphasizing the subjectivity of the prior belief.

The SP-RI Empirical Model of Discrete-Choice Demand

In this subsection, I derive an empirically estimable version of the SP-RI-discrete-choice model developed previously. I use the optimal conditional-choice-probability expression in Equation 6, with suitable parameterizations on (π, μ, u), as the likelihood of the consumer purchasing product j. In doing so, I need to ensure that the restriction imposed by the RI model—the fixed-point Equation 9—is satisfied.

Recall that the conditional (on realized consumption utility u) and unconditional choice probability under RI have the following form:

$$\Pr(\text{Choose } j \mid u) = \frac{\exp(\ln(\pi_{j}) + \mu u_{j})}{\sum_{k \in \mathcal{J}} \exp(\ln(\pi_{k}) + \mu u_{k})} \quad \forall j \in \mathcal{J}, \tag{8}$$

$$\pi_{j} = \int \Pr(\text{Choose } j \mid u) \, dQ(u \mid F) \quad \forall j \in \mathcal{J}. \tag{9}$$

The unconditional choice probabilities, π_j, in the conditional choice probability expression in Equation 8 are determined by the fixed-point equation in Equation 9, which integrates u out of $\Pr(\text{Choose } j \mid u)$ with respect to the consumer's prior belief distribution $Q(\cdot \mid F)$. In the extant RI-discrete-choice theory literature, the functional form of consumers’ belief $Q(\cdot \mid F)$ is taken as known to the researcher, and the vector π = (π₁, …, π_J)′ is computed as the fixed point of Equation 9.

In practice, an empirical researcher typically does not observe a consumer's prior belief distribution, $Q (\cdot | F)$. Only the shifters of (π, μ, u) are observed. The following theorem establishes that for any (π, μ, u), there exists a prior-belief structure, $Q (\cdot | F)$, such that (i) u is on the support of $Q (\cdot | F)$ and (ii) $Q (\cdot | F)$ satisfies the fixed-point equation in Equation 7. I will use this theorem as the basis of my empirical formulation of demand under RI.

Theorem 1.

(Existence of Subjective-Prior-Belief Distribution): Suppose the consumer solves an RI-discrete-choice problem described in the previous subsection. Let J ≥ 2 and let $\mathcal{J} = \{1, \dots, J\}$. For each $j \in \mathcal{J}$, let π_j > 0 be given and fixed such that $\sum_{k \in \mathcal{J}} \pi_{k} = 1$. Let μ > 0 be given and fixed. Then, a probability measure $Q : \mathbb{R}^{J} \to [0, 1]$ over possible utility levels exists such that the following (i) and (ii) hold:

(i) $Q (\cdot | F)$ has full support over $\mathbb{R}^{J}$.

(ii) For each $j \in \mathcal{J}$,

$$\pi_{j} = \int \frac{\exp(\ln(\pi_{j}) + \mu u_{j})}{\sum_{k \in \mathcal{J}} \exp(\ln(\pi_{k}) + \mu u_{k})} \, dQ(u \mid F). \tag{10}$$

Proof.

See Appendix B for a more general version of the theorem and the proof.

Condition (i) ensures that any consumption-utility vector $u \in \mathbb{R}^{J}$ identified from data will be consistent with the consumer's subjective prior belief. Condition (ii) is the fixed-point equation, Equation 9. The theorem allows the conditional-choice-probability expression, Equation 8, to be interpreted as the rationally inattentive consumer's likelihood of purchasing product j and establishes Matĕjka and McKay’s (2015) RI-discrete-choice model as an empirical framework rationalizing the consumer's purchase decision under incomplete information.
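To make the likelihood interpretation concrete, the sketch below evaluates Equation 8 under a purely hypothetical parameterization: μu_j is linear in a brand intercept and price, and ln(π_j) is linear in a display indicator up to a constant that cancels in the ratio. The parameter names (`alpha`, `gamma`, `brand_effects`), the function names, and the two made-up shopping trips are assumptions for illustration and are not the specification or data used in the empirical application.

```python
import math

def sp_ri_choice_probs(prices, displays, brand_effects, alpha, gamma):
    """Equation 8 under a hypothetical linear parameterization.

    mu * u_j = brand_effects[j] - alpha * prices[j]   (utility shifters)
    ln(pi_j) = gamma * displays[j] + constant         (consideration shifters)
    The constant cancels in the ratio, so it is omitted.
    """
    scores = [
        b - alpha * p + gamma * d
        for p, d, b in zip(prices, displays, brand_effects)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    expd = [math.exp(s - top) for s in scores]
    total = sum(expd)
    return [e / total for e in expd]

def log_likelihood(trips, alpha, gamma, brand_effects):
    """Sum of log choice probabilities over observed trips."""
    total = 0.0
    for prices, displays, chosen in trips:
        probs = sp_ri_choice_probs(prices, displays, brand_effects, alpha, gamma)
        total += math.log(probs[chosen])
    return total

# Made-up trips: (prices, display indicators, index of the chosen product).
trips = [
    ([4.0, 5.0, 6.0], [0, 1, 0], 1),
    ([4.5, 5.0, 5.5], [0, 0, 0], 0),
]
brand_effects = [0.0, 0.3, 0.6]
ll = log_likelihood(trips, alpha=0.5, gamma=0.8, brand_effects=brand_effects)

with_display = sp_ri_choice_probs([4.0, 5.0, 6.0], [0, 1, 0], brand_effects, 0.5, 0.8)
without_display = sp_ri_choice_probs([4.0, 5.0, 6.0], [0, 0, 0], brand_effects, 0.5, 0.8)
print(ll, with_display[1], without_display[1])
```

In this sketch the display indicator enters only through ln(π_j), which mirrors the exclusion restriction that promotional instruments shift consideration but not consumption utility.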

Note that I assumed that every alternative must have a positive unconditional choice probability (i.e., π_j > 0) for the following two reasons. First, zero unconditional choice probability cannot be distinguished from infinitesimal unconditional choice probability with only a finite number of choice samples in hand. Second, in the aggregate demand-estimation context, market-share-equation inversion does not apply when an alternative has zero choice probability, leading to identification failure. Requiring π_j to be strictly positive is not a restriction necessarily imposed by the extant RI-discrete-choice theory literature. For example, Caplin, Dean, and Leahy (2019) allow for zero unconditional choice probabilities and interpret the alternatives with zero unconditional choice probability as not included in the decision maker's consideration set. By contrast, in the SP-RI model, a higher π_j is interpreted as a higher probability of product j being considered during the choice stage.

In Appendix B, I generalize the results presented thus far by adopting the generalized entropy information-cost function proposed in Fosgerau et al. (2020). Appendix B shows that any functional form of discrete-choice probability resulting from additive random utility maximization (ARUM) models, including but not limited to multilevel nested logit or probit, can be established as a likelihood resulting from a rationally inattentive consumer's choice.

The SP-RI-Based Consumer-Welfare Evaluation

Developing consumer-welfare evaluation frameworks when consumers make choices under incomplete information is an ongoing area of research (e.g., McFadden and Train 2019; Morozov 2021; Train 2015). This subsection derives the Hicksian compensating variation (CV) associated with the SP-RI to demonstrate the applicability of the model to welfare analysis and the measurement of value creation to consumers. To simplify the notation, I omit the consumer subscript; however, I discuss the incorporation of heterogeneity in the empirical application.

The true utility has the usual quasilinear-in-the-numeraire form:

u_{j} = (y - p_{j}) β_{1} + χ_{j},

(11)

where y is the consumer's income, p_j is the price of product j, β₁ is the price-sensitivity parameter, and χ_j is the “quality index” function that can possibly take product attributes as its argument.⁸

I define the consumer-surplus function as follows:

\begin{aligned} W_{SPRI} (u, J) &= E \left[ \sum_{j \in J} 1 (\text{Choose } j) \, u_{j} - c \left( π, \{\Pr (\text{Choose } j | u)\}_{j \in J} \right) \right] \\ &= \sum_{j \in J} \Pr (\text{Choose } j | u) \, u_{j} - c \left( π, \{\Pr (\text{Choose } j | u)\}_{j \in J} \right) . \end{aligned}

(12)

The CV is then defined using the consumer-surplus function, Equation 12. Let superscript 0 denote the state before the changes in prices and/or choice-set composition, and let superscript 1 denote the state after the changes, respectively. CV_SPRI is defined by

CV_{SPRI} = \frac{1}{β_{1}} \left\{ W_{SPRI} (u^{1}, J^{1}) - W_{SPRI} (u^{0}, J^{0}) \right\},

(13)

where β₁ is the marginal utility of money in the quasilinear utility specification in Equation 11, so that −β₁ is the price coefficient in the alternative-specific utility. CV for the target population can be obtained by aggregating CV_SPRI against the distribution of the population. Note that the proposed formula for the consumer surplus and CV does not require specifying the complete shape of the consumer's subjective prior belief $Q (\cdot | F)$.

Appendix A provides a formal derivation of the consumer surplus function, including the timing and aggregation argument that lead to the formula, Equation 12. The following example illustrates the proposed SP-RI-based consumer-welfare calculation and compares it with the FI-RUM-based consumer-welfare calculation, focusing on the possibility that consumer welfare may decrease when more alternatives are added to the choice set.

An Example of SP-RI Consumer-Welfare Evaluation

Using the same example presented previously, I next demonstrate how to calculate consumer welfare, defined as the net expected utility of engaging in the problem of choosing one detergent out of all available detergents. Suppose the unit information cost is $1, the store is offering no promotion, and the consumer has no prior experience with the detergents. Suppose the consumption utility of detergents {A, B, C} is now {2, 2, 1}. Then, the proposed welfare calculation using the formulas in Equation 12 with Equation 5 is as follows:

\begin{aligned} & \text{Net expected benefit of choosing among } \{A, B, C\} \\ & = \underbrace{\{.422 \times 2 + .422 \times 2 + .155 \times 1\}}_{\text{Gross expected benefit of choice}} \\ & \quad - \underbrace{1}_{\text{Unit information cost}} \times \underbrace{\left[ \begin{matrix} \{.422 \times \ln (.422) + .422 \times \ln (.422) + .155 \times \ln (.155)\} \\ - \{.422 \times \ln (.333) + .422 \times \ln (.333) + .155 \times \ln (.333)\} \end{matrix} \right]}_{\text{Actual information cost paid when } \{2, 2, 1\} \text{ is the true utility}} \\ & = 1.845 - 1 \times [.081] = 1.763. \end{aligned}

(14)

“Actual information cost paid when {2, 2, 1} is the true utility” is the relative Shannon (1948) entropy of the post–costly information acquisition choice-probability distribution with respect to the pre–costly information acquisition choice-probability distribution when the true consumption-utility value is {2, 2, 1}.⁹
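The arithmetic of this example can be reproduced with a few lines of code. The sketch below assumes a uniform prior choice probability of 1/3 per detergent (no promotion, no prior experience), the logit-form conditional choice probability with weights π_j exp(u_j / unit cost), and an information cost equal to the unit cost times the relative entropy of the posterior choice probabilities with respect to the prior ones. The function name `spri_welfare` is hypothetical.

```python
import numpy as np

def spri_welfare(u, prior, unit_cost):
    """Return choice probabilities, gross benefit, relative entropy, and net welfare."""
    u = np.asarray(u, dtype=float)
    prior = np.asarray(prior, dtype=float)
    mu = 1.0 / unit_cost                          # inverse of the unit information cost
    weights = prior * np.exp(mu * (u - u.max()))  # shift by max(u) for numerical stability
    probs = weights / weights.sum()               # conditional choice probabilities
    gross = float(probs @ u)                      # gross expected benefit of choice
    rel_entropy = float(np.sum(probs * (np.log(probs) - np.log(prior))))
    return probs, gross, rel_entropy, gross - unit_cost * rel_entropy

probs, gross, rel_entropy, net = spri_welfare([2, 2, 1], [1/3, 1/3, 1/3], unit_cost=1.0)
print(np.round(probs, 3), round(gross, 3), round(rel_entropy, 3), round(net, 3))
```

Evaluated at utilities {2, 2, 1}, the components round to the figures in the worked equation: choice probabilities of about .422, .422, and .155, a gross benefit of 1.845, an information cost of .081, and a net benefit of 1.763.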

Next, I compare the analogous welfare calculation under the conventional FI-RUM model. Recall that the usual FI-RUM discrete-choice model with double-exponential distributed preference shocks generates the identical logit choice probability when the consumption-utility values of the detergents are the same. In comparing the two-goods case with the three-goods case, I assume that the consumers’ subjective prior beliefs are initialized in the same way: all the detergents are ex ante homogeneous before the consumer is affected by any promotions or engages in product/price research.

Table 1 summarizes the welfare calculations for the SP-RI and FI-RUM models. I consider four cases to compare the welfare calculations across different scenarios of available detergents in the aisle. The baseline rows consider the case with only two detergents with utility {2, 2}. The “‘good’ added,” “‘neutral’ added,” and “‘bad’ added” rows then consider the welfare calculations of {2, 2, 3}, {2, 2, 2}, and {2, 2, 1}, respectively.

As expected, the FI-RUM welfare increases with the addition of another choice alternative. In contrast, the change in SP-RI welfare associated with another choice alternative depends on the product. When a “bad” is added to the set of alternatives, the proposed SP-RI consumer welfare decreases from 2 to 1.763. In the calculation of Equation 14, adding the “bad” with the consumption-utility value of 1 lowers the gross benefit from 2 to 1.845 and raises the actual information cost from 0 to .081. Consumer welfare in the SP-RI calculation may therefore strictly decrease as a result of adding more alternatives.

When a neutral alternative is added to the choice set, the SP-RI-based consumer welfare is unchanged from the baseline, whereas the FI-RUM-based welfare increases by more than 15% ((3.099 − 2.693)/2.693 ≈ .151). All else equal, one would not expect consumer welfare to change when a neutral alternative with u_C = 2, identical in utility to the existing ones, is added to the choice set.
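The FI-RUM figures quoted here are consistent with the standard logit log-sum (inclusive value) formula, ln Σ_j exp(u_j), with the additive location constant omitted. That formula is an assumption in this sketch, chosen because it reproduces 2.693 and 3.099; the function name `firum_logsum` and the case labels are hypothetical.

```python
import math

def firum_logsum(u):
    """Logit inclusive value ln(sum_j exp(u_j)), computed stably."""
    m = max(u)
    return m + math.log(sum(math.exp(x - m) for x in u))

cases = {
    "baseline": [2, 2],
    "good added": [2, 2, 3],
    "neutral added": [2, 2, 2],
    "bad added": [2, 2, 1],
}
welfare = {name: firum_logsum(u) for name, u in cases.items()}
pct_neutral = (welfare["neutral added"] - welfare["baseline"]) / welfare["baseline"]

for name, value in welfare.items():
    print(f"{name:14s} {value:.3f}")
print(f"relative increase from the neutral alternative: {pct_neutral:.3f}")
```

Because the log-sum is increasing in every added term, this measure rises whenever any alternative is added, including the "bad" one, which is the property the SP-RI welfare measure does not share.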

Application: Laundry Detergent Demand and Consumer-Welfare Effects of the Tide Pods Introduction

In this section, I estimate the laundry detergent demand using the SP-RI model and the FI-RUM model, run the specification tests for model selection, and compare their differing managerial implications. Then, I measure the consumer-welfare implications of the new Tide Pods laundry detergent products using the welfare formula based on the SP-RI model and the FI-RUM model, respectively.

Data

The data used are a combination of Nielsen-Kilts Homescan and Retail Measurement Services (RMS) data during the sample period of 2006–2016. Homescan data record all the CPG items purchased by the panel households, using the barcode scanner issued to each household, along with the identity of the store where they were purchased, if available. In addition, Homescan provides the panel projection weights that allow household-level choices to be projected to the entire U.S. population. RMS data record one-third to one-half of all U.S. CPG transactions. Homescan does not provide the set of alternatives that the participating households faced. Therefore, I matched Homescan shopping trip data that include any purchased laundry detergent with the RMS weekly sales using the store code and week information to construct the estimation data. Then, the stores that do not provide the display and feature information in RMS are dropped. The matching and cleaning process results in 170,698 choice observations with 17 million alternative observations. On average, each choice incidence had around 100 options. Further details about data matching and the cleaning procedure are relegated to Web Appendix E.

I classified the brands, product attributes, and scents by manually querying different Universal Product Code (UPC) databases for 10,911 laundry detergent UPCs in the database. The cleaning procedure resulted in 7 major brand dummies and 18 functional product attributes. The average per package price of the laundry detergents in the estimation sample is $8.91. There are 163 pod detergent UPCs in the database, of which 73 belong to the Tide brand. About 4% of the UPCs are displayed, and 8% of the UPCs are featured. Table 2 presents the summary statistics for the estimation data for price, in-store promotion, and brand indicators.

Table 2.

Descriptive Statistics of the Estimation Sample.

Variable	Mean	SD
Per pack price	8.906	4.701
Display	.036	.186
Feature	.076	.266
Display × Feature	.016	.124
All	.107	.309
Arm & Hammer	.096	.294
Gain	.092	.289
Purex	.080	.271
Tide	.365	.482
Wisk	.026	.159
Xtra	.026	.158
# different UPCs	10,911
# choices	170,698
Total # alternatives	16,983,566

Notes: This table presents the descriptive statistics of the estimation sample. Except for “per pack price,” all other variables listed are indicator variables.

Empirical Specification of the Demand Models

I introduce the subscript i to denote different consumers with potentially heterogeneous tastes, where the choice set $J_{i}$ can be consumer specific.¹⁰

Consideration shifters and unit-information-cost shifters as exclusion restrictions from consumption-utility value

Conventional wisdom in the extant marketing literature finds that in-store displays and features shift consumers’ consideration (sets) but do not generate direct consumption utility (see, e.g., Allenby and Ginter 1995; Anderson and Simester 1998; Andrews and Srinivasan 1995; Bronnenberg and Vanhonacker 1996; Fader and McAlister 1990; Mehta, Rajiv, and Srinivasan 2003; Terui, Ban, and Allenby 2011; Zhang 2006). Simply plugging features and displays into a conventional FI-RUM as additional shifters in the deterministic utility index is at best a reduced-form approach, because such promotional variables are not expected to generate consumption utility. In contrast, under the SP-RI framework, I can specify these promotional variables as primitives in the unconditional-choice probability component of the model, π_j, shifting only consumers’ consideration as opposed to consumption utility.

I also let a consumer's unit evaluation cost $μ_{i}^{- 1}$ depend on demographics and familiarity with the specific store visited on a given trip. Mathematically, $μ_{i}^{- 1}$ corresponds to the dispersion parameter of the random utility component in the FI-RUM model.¹¹ Taking the dispersion parameter in FI-RUM as a function of demographics or shopping environments would lead to an implausible interpretation of the model primitives, namely, that those variables induce systematically more (or less) idiosyncratic purchase decisions.

Formally, let $\{d_{i, j}\}_{j \in J_{i}}$ denote the promotion status of each product j (feature and/or display) when consumer i considers making a purchase. I then parameterize the consideration probability proportionally to $\exp (d_{i, j}' γ)$:

π_{i, j} \propto \exp (d_{i, j}' γ) .

(15)

¹² Next, I parametrize the inverse of the unit information-cost μ_i as

μ_{i} = \exp (w_{i}' θ),

(16)

where w_i is the vector of demographics and shopping environment, which does not include the constant term, because it is generally not separately identified from the scale of the utility parameters.

The formulations of the model primitives π_i,j and μ_i also generate exclusion restrictions with which I can test the RI framework. Appendix B provides further arguments as to why the parametrizations in Equations 15 and 16 are sensible. Appendix B also discusses the parametrizations in the context of general discrete-choice probabilities beyond simple logit.

Consumption-utility specification

I define the “quality index” function, χ_i,j, introduced in Equation 11, as follows:

χ_{i, j} = β_{0} + x_{j}' β_{2} + η_{i, j},

where x_j is a vector of observed (to the researcher) product attributes and $η_{i, j} \sim \text{i.i.d. } N (0, σ^{2})$ are random-effect terms capturing unobserved (to the researcher) heterogeneity. Assume further that the marginal utility from income is homogeneous across the consumers, so that $β_{1}^{i} = β_{1}$ for all i. The alternative-specific consumption utility of consumer i purchasing item j is then

\begin{aligned} u_{i, j} &= β_{0} - p_{j} β_{1} + x_{j}' β_{2} + η_{i, j} \quad \text{for } j \neq 0, \\ u_{i, 0} &= 0, \end{aligned}

(17)

¹³ where p_j is the price of product j as before.¹⁴

Purchase likelihood

I obtain the following conditional choice probability that consumer i chooses j:

\Pr_{i} (i \text{ chooses } j | u_{i}) = \frac{\exp (d_{i, j}' γ + μ_{i} [β_{0} - p_{j} β_{1} + x_{j}' β_{2} + η_{i, j}])}{1 + \sum_{j' \in J_{i} \setminus \{0\}} \exp (d_{i, j'}' γ + μ_{i} [β_{0} - p_{j'} β_{1} + x_{j'}' β_{2} + η_{i, j'}])} .

(18)

μ_i captures the observed heterogeneity in the unit cost of information acquisition in the choice-probability expression in Equation 18. μ_i moderates the consumer's choice sensitivity to prices and product attributes. It also moderates the relative role of consideration shifters versus utility shifters on consumer choices. For instance, if $μ_{i} \to + \infty$, consumer i's unit information cost is zero, and they would choose the alternative that yields the highest u_i,j among their choice set with probability 1; the consideration shifters d_i,j would not affect demand. However, if consumer i has an infinitely large unit information cost (i.e., μ_i → 0), their choice behavior would be driven entirely by consideration shifters d_i,j. Note that the effective utility coefficients in the purchase likelihood are (μ_iβ₀, μ_iβ₁, μ_iβ₂), thereby accommodating individual heterogeneity.
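The moderating role of μ_i can be seen by evaluating Equation 18 at different values of μ_i. The sketch below is illustrative only: the three products, their utilities, and their promotion status are hypothetical, the Display × Feature interaction is left out, and the two γ values are borrowed from the point estimates reported later in Table 4.

```python
import numpy as np

def choice_probs(d, gamma, mu, u):
    """Equation 18: inside-good choice probabilities and the outside option's probability."""
    v = np.asarray(d, float) @ np.asarray(gamma, float) + mu * np.asarray(u, float)
    m = max(v.max(), 0.0)          # the outside option's index is 0 (u_{i,0} = 0, no promotion)
    inside = np.exp(v - m)         # shifted by m to avoid overflow for large mu
    outside = np.exp(-m)
    denom = outside + inside.sum()
    return inside / denom, outside / denom

# Hypothetical trip: product 0 is displayed, product 1 is unpromoted, product 2 is featured.
d = np.array([[1, 0], [0, 0], [0, 1]])   # columns: display, feature
gamma = np.array([0.755, 0.928])          # display and feature coefficients from Table 4
u = np.array([1.0, 1.5, 0.5])             # hypothetical consumption utilities

for mu in (0.0, 1.0, 50.0):
    p_in, p_out = choice_probs(d, gamma, mu, u)
    print(mu, np.round(p_in, 3), np.round(p_out, 3))
```

At μ_i = 0 the ranking of choice probabilities follows the promotion status alone, so the featured product is most likely to be chosen even though it has the lowest utility. At a large μ_i the unpromoted product with the highest utility is chosen with probability close to 1.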

Identification and estimation

Given the functional form of likelihood in Equation 18, identification of the utility parameters can be achieved in the proposed SP-RI framework because the true utility of a product is fixed and deterministic from the perspective of the researcher in this setup. The utility parameters (−β₁, β₂) are then identified from the cross-product and cross-shopping-instance variations in the prices and product attributes, as well as the variations in the choice set $J_{i}$ in each shopping instance. The consideration-shifter parameter γ is identified from the variation in the promotional activities, under the maintained exclusion restriction that those information shifters do not directly affect the consumption utility.¹⁵ The unit-information-cost parameter θ is identified using the cross-household variations in the demographics and familiarity with the specific store that the household was shopping in. β₀, the mean relative utility of consuming any laundry detergent in comparison to the outside option, is normalized to be 0 because the “outside option” is never chosen in the estimation sample: the matching procedure used to generate the choice sample selects only shopping trips on which a laundry detergent was purchased.¹⁶

Identification of the individual-product-specific random-effect term η_i,j follows the usual identification argument for the random coefficients/effects. In the model, the functional form of the conditional choice probability follows a logit-like form, inherited from the shape of the information-cost function. Deviation of the observed choice and substitution patterns from the logit form is attributed to unobserved preference heterogeneity, which is the source of identification of the distribution of η_i,j.

I estimate the model using maximum simulated likelihood. Web Appendix A presents details of the consumers’ and researcher's information sets, interpretation of random effects, maximum simulated likelihood formulation, numerical optimization, and further discussion on the implementation details.
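As a rough illustration of what a maximum simulated likelihood objective for Equation 18 could look like, the sketch below averages the choice probability of the purchased product over draws of η_i,j and sums the weighted logs across trips. It is a generic sketch under stated assumptions, not the implementation documented in Web Appendix A: the dictionary layout of `trips` and `params`, the number of draws, the fixed seed, and the example values are all hypothetical, and β₀ is set to 0 as in the normalization above.

```python
import numpy as np

def simulated_loglik(params, trips, n_draws=100, seed=0):
    """Simulated log-likelihood: Equation 18 averaged over draws of eta, then logged."""
    rng = np.random.default_rng(seed)   # fixed seed keeps the draws identical across evaluations
    total = 0.0
    for trip in trips:
        mu = np.exp(trip["w"] @ params["theta"])                     # Equation 16
        mean_u = -params["beta1"] * trip["price"] + trip["x"] @ params["beta2"]
        base = trip["d"] @ params["gamma"] + mu * mean_u             # index without eta
        eta = params["sigma"] * rng.standard_normal((n_draws, base.size))
        v = base + mu * eta                                          # shape (n_draws, J)
        m = np.maximum(v.max(axis=1, keepdims=True), 0.0)            # outside index is 0
        ev = np.exp(v - m)
        prob = ev[:, trip["choice"]] / (np.exp(-m[:, 0]) + ev.sum(axis=1))
        total += trip.get("weight", 1.0) * np.log(prob.mean())       # simulate, then take logs
    return total

# Hypothetical single trip with three products; product 1 was purchased.
trip = {
    "w": np.array([1.0, 0.0]),
    "price": np.array([8.0, 9.0, 10.0]),
    "x": np.array([[1.0], [0.0], [1.0]]),
    "d": np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
    "choice": 1,
}
params = {
    "theta": np.array([0.2, -0.1]),
    "beta1": 0.155,
    "beta2": np.array([0.5]),
    "gamma": np.array([0.755, 0.928]),
    "sigma": 0.132,
}
print(simulated_loglik(params, [trip]))
```

A numerical optimizer would maximize this function over the parameters. Holding the random draws fixed across evaluations keeps the objective smooth in the parameters.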

Model Selection and Estimation Results

Model selection: Specification test of SP-RI against FI-RUM

This subsection conducts specification tests between the proposed SP-RI model and the FI-RUM model. I compare three specifications: (1) the FI-RUM, (2) the popular reduced-form FI-RUM that includes promotional variables as linear utility shifters (e.g., Guadagni and Little 1983) but without the evaluation cost shifters, and (3) the SP-RI model with promotional variables specified as consideration shifters and demographics specified as evaluation cost shifters. For each model, I report the log-likelihood as well as two in-sample fit statistics, Akaike information criterion (AIC) and Bayesian information criterion (BIC). I also report the out-of-sample average hit rate for the holdout sample.¹⁷

Table 3 reports the model fit measures for specifications 1–3. Specification 3, the SP-RI model that includes both the consideration shifters and evaluation cost shifters in the purchase probability, is selected by all four model-fit measures. The results of the specification tests presented here reconfirm the importance of incorporating the consideration shifters and evaluation cost shifters in the likelihood, for which the SP-RI framework offers a microfoundation.

Table 3.

Model Specification Test Results.

	(1)	(2)	(3)
	FI-RUM	G-L 1983	SP-RI
Log-likelihood	−699,971	−689,939	−689,105
AIC	1,399,944	1,379,880	1,378,280
BIC	1,399,954	1,379,890	1,378,629
Average hit rate	.01623	.01926	.01928
Display/feature		✓	✓
Demographics as unit info. cost			✓

Notes: ✓ = indicates that the corresponding variables are included in the purchase likelihood; G-L 1983 = Guadagni and Little (1983). AIC, BIC, and log-likelihood are calculated after scaling the panel-projection factor weights so that they sum to the number of choice observations. The estimation sample has 160,698 choices with a total of 15,989,188 alternatives. The holdout sample has 10,000 choices with a total of 994,378 alternatives.

Parameter estimates of the full SP-RI model

Table 4 summarizes the model-parameter estimation results of the full SP-RI demand model estimated using the entire sample of 170,698 choices. The effective magnitude of the utility parameters for predicting the conditional choice probability $\Pr_{i} (i \text{ chooses } j | u_{i})$ is $(- \hat{β}_{1} \hat{μ}_{i}, \hat{β}_{2} \hat{μ}_{i}, \hat{σ} \hat{μ}_{i})$, which is specific to each individual i. The implied mean price elasticity from the model-parameter estimates is −1.217, which is of a reasonable magnitude.¹⁸ All the major brand coefficients are positive and statistically and economically significant, implying that a substantial brand premium exists in this market.

Table 4.

Model-Parameter Estimates.

Mean Price Elasticity			−1.217
Utility Parameter $(- {\hat{β}}_{1} {\hat{β}}_{2} \hat{σ})$				Consideration Shifter Parameter $\hat{γ}$		Unit Information Cost Parameter $\hat{θ}$
Per pack price $(- {\hat{β}}_{1})$	–.155*** (.005)	75 oz. < Pack ≤ 150 oz.	.945*** (.067)	Display	.755*** (.023)
Pack ≤ 75 oz.	.868*** (.063)	150 oz. < Pack ≤ 225 oz.	1.124*** (.074)	Feature ad	.928*** (.014)
All	.750*** (.027)	Tide	.772*** (.026)	Display × Feature	–.200*** (.031)
Arm & Hammer	.708*** (.027)	Wisk	.807*** (.034)
Gain	.241*** (.025)	Xtra	.741*** (.031)
Purex	.645*** (.024)
Powder	–.277*** (.019)	Oxi-Clean/baking soda	–.017 (.023)
Fabric softener	–.126*** (.059)	Colorsafe	.180*** (.023)
Febreze	–.204*** (.025)	Soft	–.243*** (.057)
All temperature	.010 (.031)	Stain remover/deep clean	.227*** (.021)			Visited the same store within 1 year	.212*** (.022)
Bleach	–.132*** (.019)	Unscented/sensitive/baby	–.115*** (.014)			Income ratio to FPL	−.068*** (.005)
Ultra	.107*** (.012)	Low chlorine/sulfur/phosphates	.012 (.018)			Married, living together	–.040** (.018)
n × Concentrated	.032*** (.006)	Pod/tablet/sheet	–.278*** (.028)			Head employment	.106*** (.022)
High efficiency	.054*** (.010)	sd(η_i,j) ( $\hat{σ}$ )	.132 (.086)			Head college degree	–.119*** (.019)

*p < .1.

**p < .05.

***p < .01.

Notes: Choice observations = 170,698; sample size = 17 million; $\bar{μ} = \overline{\exp (w_{i}' \hat{θ})} = .893$; AIC = 1,463,954; BIC = 1,464,305; log-likelihood = −731,942. This table summarizes the model parameter estimation results from the maximum likelihood estimation. Mean elasticity and $\bar{μ} = \overline{\exp (w_{i}' \hat{θ})}$ are calculated after weighting for the panel-projection factor. Due to the utility specification of the model, μ_i should be multiplied by the utility-parameter coefficients $(\hat{β}_{1}, \hat{β}_{2})$ to find the effective magnitudes. Comparing the magnitudes of $(- \hat{β}_{1}, \hat{β}_{2})$ with the promotion parameter $\hat{γ}$ estimates is only sensible after multiplying $(- \hat{β}_{1}, \hat{β}_{2})$ by μ_i. The data used for the estimation are the matched sample of laundry detergent purchases in the Nielsen-Kilts consumer panel data and scanner data for households during 2006–2016. Standard-error estimates are in parentheses. AIC, BIC, and log-likelihood are calculated after adjusting the panel-projection factor weights so they sum to the number of choice observations.
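The fit statistics in the notes can be checked against the textbook definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below assumes those definitions, takes n to be the number of choice observations, and uses k = 35, a tally of the Table 4 entries (27 utility parameters including σ, 3 consideration shifters, and 5 unit-information-cost shifters). The tally and the variable names are assumptions of this sketch rather than figures stated in the article.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * loglik

k = 35                 # assumed parameter count, tallied from Table 4
loglik = -731_942      # log-likelihood reported in the Table 4 notes
n = 170_698            # choice observations reported in the Table 4 notes

aic_value = aic(loglik, k)
bic_value = bic(loglik, k, n)
print(aic_value, round(bic_value, 1))
```

Under these assumptions the AIC reproduces the reported 1,463,954 exactly, and the BIC lands within one unit of the reported 1,464,305, a gap that is consistent with the log-likelihood being reported to the nearest integer.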

The $\hat{γ}$ estimates indicate the consideration shifters have strong effects on purchase. All the $\hat{γ}$ coefficients are statistically significant at the 1% significance level. In-store display has an effect equivalent to a $5.46 price discount per pack, and feature advertisement has an effect equivalent to a $6.70 price discount per pack. Considering that the average price of a detergent per package is $8.91, the consideration-shifting effect of the point-of-sale promotions on consumer demand is economically very significant.¹⁹ The coefficient estimate of the interaction term Display × Feature is negative, implying that the consideration-shifting effects of displays and features exhibit diminishing returns when the two are combined.
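One way to arrive at price-discount equivalents of this kind is to divide each consideration-shifter coefficient by the effective price coefficient, μ̄ β̂₁. The sketch below applies that ratio to the rounded Table 4 estimates. The ratio itself is an assumption about how the equivalents are computed, and the variable names are hypothetical.

```python
beta1_hat = 0.155      # price coefficient magnitude from Table 4
mu_bar = 0.893         # weighted mean of exp(w_i' theta_hat) from the Table 4 notes
gamma_hat = {"display": 0.755, "feature": 0.928}

# Effective marginal effect of a one-dollar price change on the choice index.
effective_price_coef = mu_bar * beta1_hat

# Dollar discount that would shift the choice index as much as each promotion does.
discount_equivalent = {name: g / effective_price_coef for name, g in gamma_hat.items()}

for name, dollars in discount_equivalent.items():
    print(f"{name}: ${dollars:.2f} per pack")
```

With the rounded inputs, the ratios come out at roughly $5.45 for display and $6.70 for feature, within a cent of the two dollar figures quoted in the text.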

The unit-information-cost parameter estimates $\hat{θ}$ relate to the shopping environment and demographics. Because μ_i is the inverse of the unit information cost, a positive $\hat{θ}$ component implies that the corresponding variable lowers the unit information cost. For example, if the household has visited the specific store before and is therefore familiar with it, learning the locations of shelves will require less time and effort, as reflected in a positive coefficient estimate of the “visited the same store within 1 year” variable. Estimates imply that households with higher incomes and a household head with a college degree exhibit a higher unit information cost. The $\hat{θ}$ estimates are broadly consistent with the opportunity-cost-of-time story and, therefore, with the search-cost story. Thus, the estimated unit-information-cost parameters lend face validity to the results.

Comparing the estimates from the SP-RI model and the FI-RUM model, reported in Table W1 of Web Appendix B.2, I find that the FI-RUM model generates much larger magnitudes for almost all the estimated utility parameters. The FI-RUM model overestimates the price sensitivity by around 18%, the Tide brand coefficient by around 44%, and the Gain brand coefficient by around 65%, relative to the SP-RI model.²⁰ In contrast, the popular Guadagni–Little specification that includes the in-store promotions as consumption-utility shifters, reported in Table W2 of Web Appendix B.2, generates utility-coefficient estimates similar to those of the full SP-RI model. The differences in the effective magnitude of the utility coefficients between the SP-RI and FI-RUM models are likely due to omitted variable bias in the FI-RUM specification, which does not include the displays/features, reconfirming the importance of properly including the consideration shifters in the purchase likelihood.

The net effect of the upward bias in the parameter magnitudes on the implied pricing power, on the supply side, is negative: the FI-RUM understates the optimal contribution margins. I use the standard category pricing conduct, or multiproduct monopoly, to assess optimal contribution margins for a profit-maximizing retailer (e.g., Chintagunta, Dubé, and Singh 2003). The SP-RI model suggests the average optimal margins as $9.60 (brand-level joint pricing) and $7.65 (stockkeeping unit [SKU]–level individual pricing). In contrast, the FI-RUM model suggests $7.70 (brand-level joint pricing) and $6.21 (SKU-level individual pricing). The result demonstrates that biases incurred in the misspecified FI-RUM model may lead managers to suboptimal pricing decisions.

In summary, I systematically select the SP-RI model over the standard FI-RUM model using various in- and out-of-sample testing criteria. I also find that the FI-RUM overstates most of the marginal utility parameters, which, on net, understates the pricing power of a retailer that maximizes category profits. Next, I turn to the differing welfare implications of a new product launch under SP-RI versus FI-RUM.

Counterfactual Welfare Simulation Associated with Tide Pods’ Introduction

In 2012, Procter & Gamble successfully launched its Pods laundry detergent product to the market under the Tide brand name. I now measure the welfare implications of Procter & Gamble's launch of Tide Pods laundry detergent, comparing the results using the SP-RI-based welfare formula proposed in the previous section and the conventional FI-RUM-based welfare formula.²¹

The CV is calculated as follows. Recall the consumer-choice model is estimated by pooling the pre– and post–Tide Pods introduction data. The model-parameter estimates $(- \hat{β}_{1}, \hat{β}_{2}, \hat{γ}, \hat{θ}, \hat{σ})$ reported in Table 4 are taken as fixed. Let $\hat{u}_{i, j}$, $\hat{μ}_{i}$, $\hat{π}_{i, j}$, and $\widehat{\Pr}_{i} (i \text{ chooses } j | \hat{u}_{i})$ be their consistent predictions obtained by plugging the estimated parameters back into the respective functions in Equations 15, 16, and 18. Using the formulas in Equation 12 and Equation W3 in Web Appendix B.1, $W_{SPRI}^{i} (\hat{u}_{i}^{1}, J_{i}^{1})$ and $W_{FIRUM}^{i} (\hat{u}_{i}^{1}, J_{i}^{1})$ are evaluated using the post–Tide Pods introduction data. Holding everything else the same, Tide Pods laundry detergent items are removed from the consumers’ choice set in the post–Tide Pods introduction data to evaluate $W_{SPRI}^{i} (\hat{u}_{i}^{0}, J_{i}^{0})$ and $W_{FIRUM}^{i} (\hat{u}_{i}^{0}, J_{i}^{0})$ using the same set of formulas. ${CV}_{SPRI}^{i}$ and ${CV}_{FIRUM}^{i}$ are then calculated following the formulas in Equation 13 and Equation W4 in Web Appendix B.1, respectively. Then, to calculate the average CV_SPRI and CV_FIRUM, ${CV}_{SPRI}^{i}$ and ${CV}_{FIRUM}^{i}$ are integrated out against the estimated distribution of the random effects η_i,j, respectively.
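The counterfactual procedure for a single shopping trip can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the article: the information cost in Equation 12 is written as (1/μ) times the relative entropy of the conditional choice probabilities with respect to π, π is normalized over the products in the choice set, there is no outside option, and the utilities and promotion indicators are hypothetical. The function names are also hypothetical.

```python
import numpy as np

def spri_welfare(u, d, gamma, mu):
    """Equation 12 under the assumed cost c = (1/mu) * KL(P || pi), with pi from Equation 15."""
    u = np.asarray(u, float)
    idx = np.asarray(d, float) @ np.asarray(gamma, float)
    pi = np.exp(idx - idx.max())
    pi /= pi.sum()                               # consideration probabilities over the set
    w = pi * np.exp(mu * (u - u.max()))
    p = w / w.sum()                              # conditional choice probabilities
    return float(p @ u - (1.0 / mu) * np.sum(p * np.log(p / pi)))

def cv_from_entry(u_post, d_post, entrant, gamma, mu, beta1):
    """Equation 13: welfare with the entrant minus welfare without it, in money units."""
    keep = np.ones(len(u_post), dtype=bool)
    keep[entrant] = False                        # drop the new product(s) to form the pre-entry set
    w1 = spri_welfare(u_post, d_post, gamma, mu)
    w0 = spri_welfare(np.asarray(u_post)[keep], np.asarray(d_post)[keep], gamma, mu)
    return (w1 - w0) / beta1

gamma = [0.755, 0.928]                           # display and feature coefficients from Table 4
# A displayed entrant with lower utility than the incumbents.
cv_bad = cv_from_entry([2.0, 2.0, 1.0], [[0, 0], [0, 0], [1, 0]], [2], gamma, mu=1.0, beta1=0.155)
# An unpromoted entrant with higher utility than the incumbents.
cv_good = cv_from_entry([2.0, 2.0, 3.0], [[0, 0], [0, 0], [0, 0]], [2], gamma, mu=1.0, beta1=0.155)
print(round(cv_bad, 3), round(cv_good, 3))
```

In this toy setting the CV is negative for the low-utility, displayed entrant and positive for the high-utility entrant, which mirrors the sign heterogeneity the SP-RI measure allows.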

The “average per shopping trip” row in Table 5 summarizes the average CV per shopping trip. Note that average ${CV}_{SPRI}^{i}$ per shopping trip is slightly negative at −$.099/trip, whereas average ${CV}_{FIRUM}^{i}$ is positive at $.286/trip. Comparing columns 1 and 2, ${CV}_{FIRUM}^{i}$ differs from ${CV}_{SPRI}^{i}$ not just in magnitude but also in sign. The “annually projected” row of Table 5 presents the results after projecting the results to the entire U.S. population using the panel-projection factors. CV_SPRI is –$41 million, CV_FIRUM is $125 million, and the total annual Tide Pods detergent sales equal $346 million. CV_FIRUM amounts to more than one-third of the total annual Tide Pods detergent sales, which is unrealistic because consumers may easily switch to other laundry detergents that are close substitutes.²²

Table 5.

Average and Annually Projected Compensating Variation per Shopping Trip Associated with Tide Pods’ Introduction.

	(1)	(2)
	CV_SPRI	CV_FIRUM
Average per shopping trip	−$.099 (.007)	$.286 (.007)
Annually projected	−$41,319,890	$124,971,951
Pods sales/year	$346,294,884
Detergent sales/year	$6,449,062,852

Notes: This table presents the CV predictions from my SP-RI model and the FI-RUM model, respectively, associated with Tide Pods introduction. Standard errors of average per shopping trip are reported in parentheses, calculated using parametric bootstrap. The parametric bootstrap is conducted by redrawing the estimates from the estimators’ asymptotic distribution, of which the mean and standard error are reported in Table 4 and Table W1 of Web Appendix B.2, respectively, and then predicting CV_SPRI and CV_FIRUM 500 times. The “annually projected” row is calculated using the subsample of the estimation data during 2012–2016, projected to the entire U.S. population using the Homescan panel-projection factors as sampling weights, and then annualizing. “Pods sales/year” and “detergent sales/year” rows are calculated using the total Nielsen RMS laundry detergent sales of 2012–2016, multiplied by 2.5 and then annualized.

Figure 1 illustrates the CV distribution associated with Tide Pods’ entry into the market. Each observation corresponds to one shopping instance in the estimation data. Note that adding Tide Pods laundry detergent to consumers’ choice set does not monotonically increase ${CV}_{SPRI}^{i}$, and only around 19% of the ${CV}_{SPRI}^{i}$ prediction incidences are positive. Notably, ${CV}_{SPRI}^{i}$ is highly heterogeneous, with 2.5th and 97.5th percentiles of −$.316 and $.064, respectively. Even though the proposed SP-RI demand model predicts that Tide Pods’ introduction had negative consumer-welfare effects on average, some consumers benefited from the new product. By contrast, ${CV}_{FIRUM}^{i}$ is uniformly positive and tends to be much higher than ${CV}_{SPRI}^{i}$.

Figure 1.

Compensating Variation Associated with Tide Pods’ Entry.

I also calculate the changes in the average gross benefit separately, which corresponds to the change of the first term in the consumer-surplus formula in Equation 12. The average gross-benefit change is still negative at −$.073/trip, implying that the difference of −$.026/trip can be attributed to the changes in the information cost. Therefore, I conclude that both the gross-benefit and information-cost change contribute to the negative average ${CV}_{SPRI}^{i}$ prediction.

The findings from the counterfactual consumer-welfare simulations presented in this subsection are consistent with popular press articles stating that the introduction of Tide Pods into the market did not unanimously benefit consumers, and that the new Tide Pods products were priced substantially higher even after considering their convenience.²³ Furthermore, just one full year after its launch, the national market shares of Tide Pods began to drop sharply; they declined from 6.9% to 3.9% over the period 2014–2016.²⁴ The consistency of the consumer-welfare analysis results with the popular press and the longer-term market share pattern provides face validity for the SP-RI-based consumer-welfare evaluation and pricing counterfactuals conducted. Numerous mechanisms may explain why the introduction of the new product harmed at least some consumers. Explaining the exact reason would require supplementing the purchase data with, for example, consumer surveys, which exceeds the scope of the present research.

Web Appendix A provides further discussion of implementation details, possible price endogeneity, and the equilibrium responses of competitors. Finally, Web Appendix D reports demand estimates and CV calculations from several additional specifications, including one in which demographics are interacted with prices to shift the consumption utility. I find that the key results are robust across all the specifications I compare: the FI-RUM systematically predicts that launching Tide Pods increases value to consumers, whereas the SP-RI typically predicts negative total value, despite heterogeneity in the sign of the value created for different consumers.
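The systematically positive FI-RUM prediction is mechanical in the simplest case. The sketch below assumes a plain multinomial logit with a constant marginal utility of income, for which expected CV is given by the log-sum formula (Small and Rosen 1981); the FI-RUM specification estimated in the article may be richer. The function name `logsum_cv`, the utilities, and the price coefficient are hypothetical.

```python
import math


def logsum_cv(utilities_before, utilities_after, price_coef):
    """Expected CV in dollars under a plain multinomial logit (log-sum formula).

    price_coef is the (positive) marginal utility of income.
    """
    iv_before = math.log(sum(math.exp(v) for v in utilities_before))
    iv_after = math.log(sum(math.exp(v) for v in utilities_after))
    return (iv_after - iv_before) / price_coef


# Hypothetical mean utilities; the last incumbent is an outside option set to 0.
incumbents = [1.0, 0.5, 0.0]

# Even a very unattractive new product adds a positive term inside the log,
# so the log-sum CV from its entry is strictly positive.
for new_utility in (1.0, -2.0, -10.0):
    cv = logsum_cv(incumbents, incumbents + [new_utility], price_coef=2.0)
    print(f"new-product utility {new_utility:+.1f}: CV = {cv:.6f}")
```

Because the full-information log-sum carries no term for the cost of evaluating a larger choice set, it cannot produce the negative values that the SP-RI model allows.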

Concluding Remarks

I propose a novel empirical framework for brand choice that allows for rationally inattentive consumers who make utility-maximizing choices at the point of sale under imperfect information about the choice set. In a case study of the laundry detergent category, I find that the SP-RI model fits the data better than a conventional random utility model (i.e., FI-RUM). I also find that the FI-RUM systematically overstates the magnitudes of the marginal utility parameters and understates the implied pricing power of a category-profit-maximizing retailer. Finally, I demonstrate that the SP-RI model can predict either welfare increases or decreases from the launch of new products, in contrast with conventional FI-RUM models, which always predict welfare increases. In my case study of the launch of Tide Pods, I indeed find that the consumer-welfare change is heterogeneous in sign and that average consumer welfare decreases slightly because of the higher search and evaluation costs under the larger choice set.

An interesting direction for future research is to apply the SP-RI consumer-welfare-evaluation framework developed herein to contexts including, but not limited to, the composition of individual product attributes during new product development, product-line design, product curation of a category, SKU rationalization, pricing, and store choice.

Supplemental Material

Supplemental material, sj-pdf-1-mrj-10.1177_00222437221110173 for Rational Inattention as an Empirical Framework for Discrete Choice and Consumer-Welfare Evaluation by Joonhwi Joo in Journal of Marketing Research

Footnotes

Difference of the SP-RI Framework from the Extant RI-Discrete-Choice Theory Models

The setup of the proposed SP-RI framework departs from the extant RI theory models in two respects: the subjectivity of the prior-belief distribution and the definition of the information-cost function. These modifications allow an empirical researcher to leave the complete shape of the consumer's prior belief unspecified, both in implementing the SP-RI discrete-choice model and in conducting the SP-RI-based consumer-welfare evaluation. In this appendix, I detail how the SP-RI framework differs from the extant RI theory models in those respects.

Generalization of the SP-RI Empirical Framework to Any Discrete-Choice Probabilities and the Proof of Theorem 1

Acknowledgments

The author is grateful to his dissertation advisors, Jean-Pierre H. Dubé, Ali Hortaçsu, Pradeep K. Chintagunta, and Doron Ravid, for their guidance and support. The author thanks Bart J. Bronnenberg, Inyoung Chae, Tat Chan, Andrew Ching, Khai X. Chiong, Lawrence Y. Chung, Filip Matĕjka, Carl Mela, Sanjog Misra, Michael Grubb, Mingyu Joo, Irene Kim, Jun Hyung Kim, Kyeongbae Kim, Tongil Kim, Nanda Kumar, Nitin Mehta, Ram C. Rao, Brian Ratchford, Linda M. Schilling, Jinyeong Son, Kenneth Wilbur, and seminar participants at the University of Chicago, University of British Columbia Sauder School of Business, The University of Texas at Dallas Naveen Jindal School of Management, University College London School of Management, International Industrial Organization Conference 2017, INFORMS Marketing Science 2018, Econometric Society Asian Meeting 2019, and 2nd DFW Marketing Junior Symposium for suggestions and discussions. This article includes figures calculated or derived based on data from The Nielsen Company (US), LLC, Copyright 2018, provided by the Kilts Center for Marketing at the University of Chicago Booth School of Business. The conclusions drawn from the Nielsen data are mine and do not reflect the views of Nielsen. Nielsen is not responsible for, had no role in, and was not involved in analyzing and preparing the results reported herein.

Associate Editor

Brett Gordon

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Joonhwi Joo

Notes

References

Allenby, Greg M. and James L. Ginter (1995), “The Effects of In-Store Displays and Feature Advertising on Consideration Sets,” International Journal of Research in Marketing, 12 (1), 67–80.

Anderson, Simon P., André de Palma, and Jacques-François Thisse (1992), Discrete Choice Theory of Product Differentiation. Cambridge, MA: MIT Press.

Anderson, Eric T. and Duncan I. Simester (1998), “The Role of Sale Signs,” Marketing Science, 17 (2), 139–55.

Andrews, Rick L. and T.C. Srinivasan (1995), “Studying Consideration Effects in Empirical Choice Models Using Scanner Panel Data,” Journal of Marketing Research, 32 (1), 30–41.

Berry, Steven (1994), “Estimating Discrete-Choice Models of Product Differentiation,” RAND Journal of Economics, 25 (2), 242–62.

Berry, Steven and Ariel Pakes (2007), “The Pure Characteristics Demand Model,” International Economic Review, 48 (4), 1193–1225.

Bertrand, Marianne, Dean Karlan, Sendhil Mullainathan, Eldar Shafir, and Jonathan Zinman (2010), “What’s Advertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment,” Quarterly Journal of Economics, 125 (1), 263–306.

Bhattacharya, Debopam (2015), “Nonparametric Welfare Analysis for Discrete Choice,” Econometrica, 83 (2), 617–49.

Bhattacharya, Vivek and Greg Howard (2022), “Rational Inattention in the Infield,” American Economic Journal: Microeconomics (forthcoming), https://www.aeaweb.org/articles?id=10.1257/mic.20200310&&from=f.

Blattberg, Robert C. and Scott A. Neslin (1989), “Sales Promotion: The Long and the Short of It,” Marketing Letters, 1 (1), 81–97.

Broniarczyk, Susan M., Wayne D. Hoyer, and Leigh McAlister (1998), “Consumers’ Perceptions of the Assortment Offered in a Grocery Category: The Impact of Item Reduction,” Journal of Marketing Research, 35 (2), 166–76.

Bronnenberg, Bart J., Jun B. Kim, and Carl F. Mela (2016), “Zooming In on Choice: How Do Consumers Search for Cameras Online?” Marketing Science, 35 (5), 693–712.

Bronnenberg, Bart J. and Wilfried R. Vanhonacker (1996), “Limited Choice Sets, Local Price Response, and Implied Measures of Price Competition,” Journal of Marketing Research, 33 (2), 163–73.

Brown, Zach Y. and Jihye Jeon (2020), “Endogenous Information and Simplifying Insurance Choice,” working paper.

Caplin, Andrew and Mark Dean (2015), “Revealed Preference, Rational Inattention, and Costly Information Acquisition,” American Economic Review, 105 (7), 2183–2203.

Caplin, Andrew, Mark Dean, and John Leahy (2019), “Rational Inattention, Optimal Consideration Sets and Stochastic Choice,” Review of Economic Studies, 86 (3), 1061–94.

Chernev, Alexander and Ryan Hamilton (2009), “Assortment Size and Option Attractiveness in Consumer Choice Among Retailers,” Journal of Marketing Research, 46 (3), 410–20.

Chintagunta, Pradeep K., Jean-Pierre Dubé, and Vishal Singh (2003), “Balancing Profitability and Customer Welfare in a Supermarket Chain,” Quantitative Marketing and Economics, 1 (1), 111–47.

Chiong, Khai Xiang, Alfred Galichon, and Matt Shum (2016), “Duality in Dynamic Discrete-Choice Models,” Quantitative Economics, 7 (1), 83–115.

Consumer Reports (2020), “Consumer Reports: Best Laundry Detergent,” ABC News (September 16), https://abc7chicago.com/best-laundry-detergent-tide-liquid-cheapest/6425468.

Crawford, Gregory S. and Matthew Shum (2005), “Uncertainty and Learning in Pharmaceutical Demand,” Econometrica, 73 (4), 1137–73.

Dagsvik, John K. and Anders Karlström (2005), “Compensating Variation and Hicksian Choice Probabilities in Random Utility Models That Are Nonlinear in Income,” Review of Economic Studies, 72 (1), 57–76.

Daly, Andrew and Stan Zachary (1978), “Improved Multiple Choice Models,” in Determinants of Travel Choice, David A. Hensher and M. Quasim Dalvi, eds. Saxon House, 321–62.

DeDad, Michael, Volodymyr Lugovskyy, Emerson Melo, and Alexandre Skiba (2021), “Information Processing and Quality Choice: The Case of Organic Milk,” working paper.

Erdem, Tülin and Michael P. Keane (1996), “Decision-Making Under Uncertainty: Capturing Dynamic Brand Choice Process in Turbulent Consumer Goods Markets,” Marketing Science, 15 (1), 1–20.

Fader, Peter S. and Leigh McAlister (1990), “An Elimination by Aspects Model of Consumer Response to Promotion Calibrated on UPC Scanner Data,” Journal of Marketing Research, 27 (3), 322–32.

Fan, Ying and Chenyu Yang (2020), “Competition, Product Proliferation, and Welfare: A Study of the US Smartphone Market,” American Economic Journal: Microeconomics, 12 (2), 99–134.

Fosgerau, Mogens, Emerson Melo, André de Palma, and Matthew Shum (2020), “Discrete Choice and Rational Inattention: A General Equivalence Result,” International Economic Review, 61 (4), 1569–89.

Guadagni, Peter M. and John D.C. Little (1983), “A Logit Model of Brand Choice Calibrated on Scanner Data,” Marketing Science, 2 (3), 203–38.

Hauser, John R. and Birger Wernerfelt (1990), “An Evaluation Cost Model of Consideration Sets,” Journal of Consumer Research, 16 (4), 393–407.

Herriges, Joseph A. and Catherine L. Kling (1999), “Nonlinear Income Effects in Random Utility Models,” Review of Economics and Statistics, 81 (1), 62–72.

Honka, Elisabeth (2014), “Quantifying Search and Switching Costs in the US Auto Insurance Industry,” RAND Journal of Economics, 45 (4), 847–84.

Howard, John A. and Jagdish N. Sheth (1969), The Theory of Buyer Behavior. John Wiley & Sons.

Iyengar, Sheena S., Gur Huberman, and Wei Jiang (2004), “How Much Choice Is Too Much? Contributions to 401(k) Retirement Plans,” in Pension Design and Structure: New Lessons from Behavioral Finance, Olivia S. Mitchell and Stephen P. Utkus, eds. Oxford University Press, 83–96.

Iyengar, Sheena S. and Mark Lepper (2000), “When Choice Is Demotivating: Can One Desire Too Much of a Good Thing?” Journal of Personality and Social Psychology, 79 (6), 995–1006.

Joo, Joonhwi (2018), “Essays in Consumer Choice and Consumer Demand,” doctoral thesis, Department of Economics, University of Chicago.

Kim, Jun B., Paulo Albuquerque, and Bart J. Bronnenberg (2010), “Online Demand Under Limited Consumer Search,” Marketing Science, 29 (6), 1001–23.

Levy, Michael, Barton A. Weitz, and Dhruv Grewal (2019), Retailing Management. McGraw Hill.

Mas-Colell, Andreu, Michael D. Whinston, and Jerry R. Green (1995), Microeconomic Theory. Oxford University Press.

Matarese, John (2020), “Is P&G’s Tide Detergent Still Worth the High Cost?” ABC News (August 18), https://www.wcpo.com/money/consumer/dont-waste-your-money/is-p-gs-tide-detergent-still-worth-the-high-cost.

Matĕjka, Filip and Alisdair McKay (2015), “Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model,” American Economic Review, 105 (1), 272–98.

McFadden, Daniel (1974), “Conditional Logit Analysis of Qualitative Choice Behavior,” in Frontiers in Econometrics, Paul Zarembka, ed. New York: Academic Press, 105–42.

McFadden, Daniel (1978), “Modelling the Choice of Residential Location,” in Spatial Interaction Theory and Planning Models, Vol. 3, Anders Karlqvist, ed. North-Holland, 75–96.

McFadden, Daniel (1981), “Econometric Models of Probabilistic Choice,” in Structural Analysis of Discrete Data with Econometric Applications, Charles F. Manski and Daniel L. McFadden, eds. MIT Press, 198–272.

McFadden, Daniel and Kenneth Train (2019), “Welfare Economics in Product Markets,” working paper.

Mehta, Nitin, Surendra Rajiv, and Kannan Srinivasan (2003), “Price Uncertainty and Consumer Search: A Structural Model of Consideration Set Formation,” Marketing Science, 22 (1), 58–84.

Moorthy, Sridhar, Brian T. Ratchford, and Debabrata Talukdar (1997), “Consumer Information Search Revisited: Theory and Empirical Analysis,” Journal of Consumer Research, 23 (4), 263–77.

Morozov, Ilya (2021), “Measuring Benefits from New Products in Markets with Information Frictions,” working paper.

Natan, Olivia (2021), “Choice Frictions in Large Assortments,” working paper.

Newman, Joseph W. and Richard Staelin (1972), “Prepurchase Information Seeking for New Cars and Major Household Appliances,” Journal of Marketing Research, 9 (3), 249–57.

Norets, Andriy and Satoru Takahashi (2013), “On the Surjectivity of the Mapping Between Utilities and Choice Probabilities,” Quantitative Economics, 4 (1), 149–55.

Petrin, Amil (2002), “Quantifying the Benefits of New Products: The Case of the Minivan,” Journal of Political Economy, 110 (4), 705–29.

Porcher, Charly (2020), “Migration with Costly Information,” working paper.

Punj, Girish N. and Richard Staelin (1983), “A Model of Consumer Information Search Behavior for New Automobiles,” Journal of Consumer Research, 9 (4), 366–80.

Ratchford, Brian T., Debabrata Talukdar, and Myung-Soo Lee (2007), “The Impact of the Internet on Consumers’ Use of Information Sources for Automobiles: A Re-Inquiry,” Journal of Consumer Research, 34 (1), 111–19.

Shannon, C.E. (1948), “A Mathematical Theory of Communication,” The Bell System Technical Journal, 27 (3), 379–423.

Sims, Christopher A. (2003), “Implications of Rational Inattention,” Journal of Monetary Economics, 50 (3), 665–90.

Small, Kenneth A. and Harvey S. Rosen (1981), “Applied Welfare Economics with Discrete Choice Models,” Econometrica, 49 (1), 105–30.

Terui, Nobuhiko, Masataka Ban, and Greg M. Allenby (2011), “The Effect of Media Advertising on Brand Consideration and Choice,” Marketing Science, 30 (1), 74–91.

Train, Kenneth (2015), “Welfare Calculations in Discrete Choice Models When Anticipated and Experienced Attributes Differ: A Guide with Examples,” Journal of Choice Modelling, 16, 15–22.

Williams (1977), “On the Formation of Travel Demand Models and Economic Evaluation Measures of User Benefit,” Environment and Planning A: Economy and Space, 9 (3), 285–344.

Wright, Peter and Fredrick Barbour (1977), “Phased Decision Strategies: Sequels to an Initial Screening,” working paper.

Zhang, Jie (2006), “An Integrated Choice Model Incorporating Alternative Mechanisms for Consumers’ Reactions to In-Store Display and Feature Advertising,” Marketing Science, 25 (3), 278–90.
