Effects of Payment on User Engagement in Online Courses

Abstract

Massive open online courses (MOOCs) have the potential to democratize education by improving access. Although retention and completion rates for nonpaying users have not been promising, these statistics are much brighter for users who pay to receive a certificate upon completing the course. We investigate whether paying for the certificate option can increase engagement with course content. In particular, we consider two effects: (1) the certificate effect, which is the boost in motivation to stay engaged to receive the certificate; and (2) the sunk-cost effect, which arises solely because the user paid for the course. We use data from over 70 courses offered on the Coursera platform and study the engagement of individual participants at different milestones within each course. The panel nature of the data enables us to include controls for intrinsic differences between nonpaying and paying users in terms of their desire to stay engaged. We find evidence that the certificate and sunk-cost effects increase user engagement by approximately 8%–9% and 17%–20%, respectively. Whereas the sunk-cost effect is transient and lasts for only a few weeks after payment, the certificate effect lasts until the participant reaches the grade required to be eligible to receive the certificate. We discuss the implications of our findings for how platforms and content creators may design course milestones and schedule payment of course fees. Given that greater engagement tends to improve learning outcomes, our study serves as an important first step in understanding the role of prices and payment in enabling MOOCs to realize their full potential.

Keywords

difference-in-differences, user engagement, online education, sunk-cost fallacy, causal inference

Massive open online courses (MOOCs) have the potential to democratize education by improving access (see Christensen et al. 2013; Dillahunt, Wang, and Teasley 2014; Glass, Shiokawa-Baklan, and Saltarelli 2016), especially because many MOOCs allow users to take courses for free, thereby enabling participants from lower socioeconomic strata to access their content. Over the past decade, platforms such as Coursera, edX, XuetangX, Udacity, and FutureLearn have partnered with hundreds of universities, offered thousands of courses, and attracted millions of users. As of November 2018, Coursera alone had attracted 30 million users, offered more than 3,000 courses, and had 177 university partners. Despite this potential to democratize education, low retention and completion rates in MOOCs have raised doubts about their prospects; for example, see Onah, Sinclair, and Boyatt (2014) and Khalil and Ebner (2014).

Although retention and completion rates for nonpaying users have not been promising, these statistics are much brighter for those who pay to receive a certificate upon course completion (Koller, Ng, and Chen 2013). Two factors may contribute to this disparity in performance between paying users and those who take these courses for free: (1) higher-ability participants self-select into signing up for a certificate, and (2) users who pay may stay more engaged with the course content than nonpaying users. Whereas the former is usually viewed as an intrinsic characteristic of users, the latter can potentially be altered via appropriate incentives and course design.

In this article, we investigate whether signing up for the certificate option is associated with greater engagement with course content. Our research is motivated by the premise that increasing user engagement can yield several benefits to the platform, content creators, and users. In a similar vein, research in other contexts has documented that greater engagement can be a good predictor of customer retention, lifetime value (Zhang, Bradlow, and Small 2014), and advocacy and referrals (Pansari and Kumar 2017). Therefore, exploring avenues to improve student engagement is likely to be of interest to all parties involved in online education.

We conceptualize that the relationship between signing up for the certificate option and engagement might be driven by factors that are (1) intrinsic to the student, such as their intrinsic ability and motivation, and (2) related to having paid for the course in order to obtain a certificate. The first of these two factors is consistent with the idea that selection drives the difference between both types of users for a given course. Regarding the second of these factors, we consider that the difference between paying users and their counterparts taking the course for free could stem from factors that influence the engagement of the former group over time. One such feature is that a paying user can obtain a verified certificate after accumulating the minimum points necessary to pass the course. As a result, a paying user is likely to experience a different level of motivation than a user who is taking the course for free. This gap in motivation level can change over time once the user achieves the minimum point threshold. We call this the “certificate effect.” A further difference between the two types of users is that paying users, as the name implies, pay for the course. Thus, they might demonstrate higher engagement merely because they made the payment, plausibly as a result of falling prey to the sunk-cost effect (e.g., Gourville and Soman 1998). Together, the certificate effect and the sunk-cost effect constitute temporal effects that occur in response to paying for the signature track.

From the perspective of the platform, why is identifying which of these two reasons—intrinsic or temporal—drives the differential engagement of paying and nonpaying users important? Evidence in support of the latter reason would suggest that the level of engagement is potentially malleable and can be influenced by modifying the design of the courses and the payment schedule. Therefore, findings from our research investigating these factors can shed light on actions that MOOC platforms can take to increase engagement among their users and, by extension, the learning and course completion rates on the platform. However, as we describe subsequently, empirically separating the various effects can be challenging.

We perform our empirical analysis using data from over 70 courses offered by a large public university on the Coursera platform. Although courses on the platform are generally offered for free, users can choose to pay a fee to obtain a certificate upon successful completion of the course. When our data were collected, the paid service at Coursera was called “signature track.” The signature track offered three services: identity verification, verified certificates, and shareable course records. The certificate is awarded to paying users if they achieve the minimum number of points required to pass the course. The data are granular and contain information on the time spent by individual users in accessing course content, users’ participation in the discussion forums, and users’ performance (grades) on individual quizzes and assignments. We perform our analysis by considering each user’s level of engagement (as measured in terms of time spent accessing course content and participating in discussion forums) with the material corresponding to each quiz/assignment (hereinafter, quiz) within a course. To separate the effects of the intrinsic factors (i.e., ability and intrinsic motivation—the two key drivers of selection that we discussed previously), we leverage the panel nature of our data on engagement by including strict controls in the form of user-course fixed effects (FEs). The panel data also allow us to include quiz FEs to account for differences in the extent to which individual assessments in each of these courses (i.e., quizzes and assignments) demand time commitments from participants.
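The logic of the user-course FEs can be illustrated with a minimal sketch of the within transformation, which subtracts each user-course's own average engagement from its observations. The record layout, identifiers, and values below are hypothetical and serve only as an illustration; the sketch covers the user-course FEs alone (quiz FEs would require an analogous step) and should not be read as the estimation procedure itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical panel rows: (user_course_id, quiz_id, log_minutes).
# Identifiers and values are illustrative, not taken from the Coursera data.
panel = [
    ("u1-c1", "q1", 4.0), ("u1-c1", "q2", 3.0),
    ("u2-c1", "q1", 2.0), ("u2-c1", "q2", 1.0),
]

def demean_within(rows):
    """Subtract each user-course's mean, removing any time-invariant level."""
    by_unit = defaultdict(list)
    for unit, _, y in rows:
        by_unit[unit].append(y)
    unit_mean = {unit: mean(ys) for unit, ys in by_unit.items()}
    return [(unit, quiz, y - unit_mean[unit]) for unit, quiz, y in rows]

demeaned = demean_within(panel)
for row in demeaned:
    print(row)
```

In this toy panel the two user-courses differ by a constant level of engagement, yet their demeaned series coincide. This is the sense in which time-invariant traits such as ability and intrinsic motivation drop out, leaving only within-course variation.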

To parse out the certificate effect, which is the consequence of offering a verified certificate and any benefits that may accrue from it through shareable course records, we exploit the idea that users who pay for the signature track will receive a certificate only if they obtain a passing grade in the course.¹ Although all users are likely to prefer higher grades (to lower ones) in general, users who have signed up for the signature track are likely to derive disproportionately more utility from passing the course. Therefore, we expect to see a shift in their motivation to stay engaged with the material around the threshold grade for passing the course. To infer the presence of the sunk-cost effect, we exploit the fact that the platform allows users to sign up for the signature track as early as a few weeks before the course commences and as late as a few weeks after the start of the course. We observe that users exhibit considerable heterogeneity in terms of when they pay for the service. Therefore, although early and late payers are both motivated by the certificate (i.e., the certificate effect), at each milestone (quiz), they differ in terms of how recently they paid for the course. Note that the certificate effect is common across paying users. Therefore, the systematic relationship between engagement and the recency of payment among these users helps us detect the presence of any sunk-cost effect.

Our results suggest that both paying and nonpaying users were more engaged with the course before reaching the passing grade. However, this elevated level of engagement before reaching the passing grade was 8%–9% higher among paying users than among their peers taking the course for free. This finding shows that the certificate effect, a consequence of signing up for the signature track, altered the motivation to pass the course and drove some of the difference in engagement between both kinds of users. This result is robust when we include alternative controls and approaches to matching nonpaying and paying users. We further consider the idea that paying for the signature option drove higher engagement as a result of the sunk-cost effect. We find that in the weeks immediately following the payment, users spent 17%–20% more time on the platform. However, this effect depreciated rapidly within four weeks after payment, a pattern consistent with the presence of the sunk-cost effect. Therefore, although the sunk-cost effect cannot fully explain all of the higher engagement among paying users until they cross the passing threshold, it seems to have played a role in driving some of the differential engagement during the period immediately after making the payment.² In addition to the main effects, we examine heterogeneity as a function of how early in the course a participant commits to paying for the certificate. Our results suggest that the certificate effect is higher among early payers than among those who pay later in the course. Because users are always weakly better off by delaying the payment, we can view early payment as a potential commitment device that restricts their future actions (see, e.g., Kaur, Kremer, and Mullainathan 2015).

Our results imply that paying for the signature track can increase engagement through the sunk-cost and certificate effects. This finding is likely to be of interest to education platforms and creators of course content, who may be interested in increasing engagement either to improve learning outcomes or to foster future enrollment. Importantly, these benefits are also aligned with the objective of these agents to monetize content by charging users. Some of this malleability in engagement comes from the motivation to receive a certificate. Although the idea that issuing a certificate can motivate participants to stay more engaged might be intuitive, the coexistence of free and paid options in our context enables us to document this effect empirically. Further, the finding that the mere act of paying will induce participants to increase engagement, albeit immediately following the payment, is indeed informative. We discuss the implications of our findings for how platforms and content creators may want to design course milestones and schedule the payment of course fees.

The primary contribution of this paper is to demonstrate the roles of payment and the presence of incentives in the form of certificates on user engagement. This engagement, in turn, is likely to be a key antecedent of educational outcomes; see, for example, Romer (1993), Hughes and Pace (2003), and Carini, Kuh, and Klein (2006). The certificate effect is reminiscent of studies that document the impact of incentive schemes on sales force effectiveness in the marketing literature (Chung, Steenburgh, and Sudhir 2014; Misra and Nair 2011). In contrast to the sales force literature where effort is not observed, a unique feature of our data is that we observe longitudinal variation in student engagement within courses. Furthermore, we are able to leverage the quasi-experimental nature of variation across nonpaying and paying users in different courses. Together, a continuous measure of effort across these different types of users allows us to isolate the temporal effects from cross-sectional ones. The latter has traditionally been the focus of the education literature (e.g., Heckman and Kautz 2012; Heckman and Rubinstein 2001; Romer 1993). Our study therefore highlights the importance of additional temporal factors, namely, the certificate and sunk-cost effects, and how they are influenced by incentive schemes and payments. The findings from this study demonstrate that marketing plays a vital role in influencing online education in a manner different from how educators might approach this issue.

Conceptual Framework

Learning is typically viewed as an outcome of two factors: (1) the innate ability of the participants and (2) their engagement with the course content (a proxy for effort). Of these two, ability can be viewed as a time-invariant trait of a participant, at least within the context of a course. Participants might have higher ability either because they have the requisite background knowledge that enables them to perform well in the course or because of their aptitude to grasp new material. Because ability is unlikely to change within the span of a course, platforms need to consider enhancing engagement as a means to achieving better learning outcomes. Whereas the education literature has had to rely on cross-sectional data and proxies for user engagement and time investment, online courses record information regarding the amount of time participants spend consuming course content. Consistent with prior research (e.g., Brodie et al. 2011; Kumar et al. 2010; Venkatesan, Petersen, and Guissoni 2018; Vivek, Beatty, and Morgan 2012), we can use this direct time spent on the course portal as a proxy for engagement. We can thus investigate plausible drivers of engagement and identify avenues to improve it.

Typically, modifying the teaching style, content, and design of courses is viewed as an avenue for improving engagement. For example, adopting a more active teaching style by moving the exercises and homework to the classroom can improve student engagement and has been the focus of studies in the education literature (Clark 2015; Weiss and Pasley 2004). Changes in course design can also make learning easier and thus enable participants to achieve greater returns for the time that they invest. In a similar vein, Lu, Bradlow, and Hutchinson (2017) study how sequential versus simultaneous release of course content can influence binge consumption of such content. Because such clumpiness in consumption is an important predictor of churn (Zhang, Bradlow, and Small 2014), the timing of release can be used to boost student engagement and retention in online courses. In this section, we consider the role of payment as a driver of engagement. As a starting point, we elaborate on the idea that two broad drivers of engagement exist.

Intrinsic Traits as Drivers of Engagement

We conjecture that two intrinsic traits are relevant. First, a participant’s ability would dictate how their engagement translates into learning outcomes. These differences in ability can have implications for a participant’s engagement. For example, the total factor productivity literature, which has studied the impact of productivity on resource allocation across firms (e.g., see Foster, Haltiwanger, and Syverson 2008; Hsieh and Klenow 2009), shows that more productive firms should produce more output and take in more input factors. Borrowing this analogy, students with greater ability should stay more engaged (i.e., more input) and obtain better grades (i.e., more output). However, participants might sometimes view the final grade as the learning outcome. Because the final grade has an upper bound, the resulting concavity can potentially result in high-ability participants expending less time than their lower-ability counterparts. Therefore, the relationship between the ability of a participant in an online course and their engagement is somewhat ambiguous.

The second intrinsic trait is the time-invariant (within the temporal span of a course) motivation level of a participant. For example, participants might differ, in a cross-sectional sense, in the benefit they derive from learning as well as the cost of time investment. Cost of time would depend on other activities that might compete for a participant’s attention, such as working, commuting, and spending time with their family. As a result, participants with a higher cost of time are likely to stay less engaged with the course material. Similarly, participants who derive greater benefit from learning (i.e., place more value on better learning outcomes) are likely to stay more engaged with the course.

How would paying for the certificate alter these intrinsic traits? We conjecture that payment should not have a direct effect on ability. On the other hand, payment can increase the perceived benefits that participants derive from a course. This perception, in turn, can increase their time-invariant motivation level and lead to greater engagement with the contents of the course. At the same time, we can envision the reverse scenario wherein highly motivated participants choose to pay for the course. The presence of this alternative explanation makes it difficult to parse out the effect of paying for the certificate on the intrinsic motivation of a participant with nonexperimental data. Therefore, we embark on the less ambitious objective of inferring the effect of paying for the certificate on the temporal changes in engagement within the span of a course.

Temporal Effects

We consider two temporal effects that are related to paying for the signature track service: the certificate effect and the mere-payment effect, plausibly driven by the sunk-cost fallacy.

The Certificate Effect

The certificate effect is motivated by the idea that during our period of analysis, participants who signed up for the signature track would receive a certificate upon achieving a minimum passing grade in the course. Therefore, paying users are more likely to be motivated to reach the passing grade than are their peers taking the course for free. These dynamics are similar to those that arise when firms institute tiered customer loyalty programs wherein incentives are offered to customers based on cumulative purchasing behavior (Breugelmans et al. 2015; Kopalle et al. 2012).

In our context, Coursera issued a certificate of completion to paying users who achieved the passing grade for the course. Therefore, per our conceptualization, paying participants are likely to experience a perceptible shift in motivation after they reach the passing grade. Free users are unlikely to experience any such change in motivation.³ In addition, the certificate effect reflects any incremental benefits, such as the ability to share course records.⁴ As a result of these changes, we predict that paying customers are likely to be more engaged with the course content than free users until they reach the passing threshold. Once this threshold is reached, the difference between these two groups of participants should shrink.⁵ The presence of such a pattern can be viewed as evidence that payment altered the motivation structure of participants and thus increased engagement. Because this shift in engagement is expected to occur later in the course, after the deadline for signing up for the certificate, it cannot be explained by the idea that highly motivated participants chose the payment option.

The Sunk-Cost Effect

In addition to the certificate effect, the mere act of paying for a course can have an effect on how participants view a course and motivate them to stay more engaged with its content. The sunk-cost effect is one such phenomenon tied to payment that could potentially influence engagement. The sunk-cost fallacy is a behavioral bias wherein users keep investing time and money in projects merely because of their sunk investment.⁶ If the sunk-cost effect exists among participants in MOOCs, they would increase their engagement if they end up paying for a course. Research by Gourville and Soman (1998) and Ho, Png, and Reza (2018) has also shown that the sunk-cost effect is transient and depreciates over time after the payment.⁷ Such depreciation can have implications for how a MOOC platform should schedule the payment of course fees to keep participants engaged with content. As a result, the sunk-cost fallacy, which is often viewed as throwing good money after bad, might prove to be beneficial in this context by keeping users engaged in online learning platforms.

As noted in the introduction, participants differ in terms of how early in the course they commit to the signature track. If the sunk-cost effect is transient as noted in the literature, the boost in engagement as a result of paying for the course should recede over time. Notably, users might get excited about the course when the knowledge about their payment is salient. This excitement, which is tied to the salience of the payment, is likely to alter engagement as a part of the sunk-cost effect. Therefore, to infer the existence of the sunk-cost effect, we can study how engagement of a paid participant changes depending on the recency of their payment. However, as we discuss subsequently, the timing of payment is an endogenous decision made by a participant. Through a variety of analyses, we present evidence that the mere act of payment alters engagement. However, we exercise caution in interpreting the quantification of the sunk-cost effect as being conclusive.

Data Description and Background

We use data from over 70 courses offered by a large public university on the Coursera platform. The data pertain to courses offered on the platform between 2012 and 2016. An average course is approximately 10 weeks long. The data are highly granular and contain detailed information on the consumption of course material through clickstream data, quiz outcomes, and forum activity. During this period, Coursera employed a freemium model; users could access course material, submit assignments, and get a final grade free of charge. At the same time, interested participants had the option to subscribe to the signature track for a one-time payment. The signature track allowed participants to receive a certificate from the institution upon successful completion of the course.⁸

Our data set consists of three components. First, we have information on the time when each participant enrolled in a course. Enrollment is free and enables users to access the material. In addition, we also have information on the time when each participant registered for the signature track by paying the course fees, henceforth referred to as the payment time. Note that participants can choose to register for the signature track at the time of course registration or make the decision subsequently. In our data, 23,674 participants (2% of users) chose to sign up for the signature-track service. We present the distribution of when participants registered for the signature track relative to the first day of the class (represented by 0) in Figure 1. We observe considerable heterogeneity in terms of when participants registered for the signature track, with a significant fraction making the decision a few days after the first day of the course. Nevertheless, almost all participants who registered for the signature track did so within 24 days after the first day of the course, which appears to be the deadline for making this decision.⁹

Figure 1.
Distribution of time of enrollment in signature track with respect to the first day of courses.

The second component of our data set is the information on consumption and course outcomes. In particular, we have information on the number of occasions (sessions) when a participant accessed course content, the duration of each session, their activities on the course forum (both visits and posts), their performance in the various assessment milestones, their overall course grade, and whether they successfully graduated from the course.¹⁰ Students who are enrolled in the signature track are awarded a verified certificate provided they achieve a minimum passing grade in the course. The final grade is a weighted sum of the grades on individual quizzes (Coursera n.d.).

In addition to these data, we have information from survey data independently collected by Coursera. We find that about 25% of the users in our data completed these surveys. The surveys are used to obtain information about each participant and were not tied to their registration in any particular course. The data contain information on demographic characteristics of participants such as their age, gender, and education.

During the course, each participant is required to complete a series of quizzes and assignments to advance to the next stage. We refer to each unique quiz (or assignment) within a course as a quiz block. A participant might attempt the same quiz multiple times. Therefore, a quiz block might include multiple attempts on the same quiz by the participant. Each observation in the data set corresponds to a quiz/homework attempt by a user, along with corresponding information on the amount of time they spent on the course portal. This information enables us to infer the user’s total time investment between successive attempts on a quiz. Extant research has defined engagement as the intensity of consumption and interaction with the products and services; for example, see Kumar et al. (2010), Brodie et al. (2011), Vivek, Beatty, and Morgan (2012), and Venkatesan, Petersen, and Guissoni (2018) for reviews. In this spirit, we use the user’s time investment (e.g., accessing course content, spending time on forum activity) during the span of a quiz block as a metric of engagement.

We present a visual representation of the data structure for a representative user in Figure 2. The arrow length represents the amount of time this user invested before attempting the quiz. Apart from the quiz attempts, we highlight three other events in Figure 2, namely, the registration, the payment, and the point at which the user crosses the passing threshold (if the user passes the course). We use the variation in the relative timing of these events, which are specific to each user’s calendar, in some of our analyses in the subsequent sections. We find that fewer than 10% of users attempted a quiz multiple times. Therefore, we perform most of our analyses at the quiz-block level by aggregating the time the participant spent on all the attempts within a quiz block. As a result, our empirical analysis has one observation per quiz for each participant.
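The aggregation from attempts to quiz blocks can be sketched as follows. This is a minimal illustration under assumed field names (`user_id`, `course_id`, `quiz_id`, and minutes invested before each attempt); the actual layout of the Coursera data is not described at this level of detail, and the values are made up.

```python
from collections import defaultdict

# Hypothetical attempt-level records:
# (user_id, course_id, quiz_id, minutes_invested_before_attempt).
attempts = [
    ("u1", "c1", "q1", 30.0),
    ("u1", "c1", "q1", 12.5),  # second attempt on the same quiz
    ("u1", "c1", "q2", 45.0),
    ("u2", "c1", "q1", 20.0),
]

def aggregate_to_quiz_blocks(rows):
    """Sum time invested over all attempts within each (user, course, quiz) block."""
    totals = defaultdict(float)
    for user_id, course_id, quiz_id, minutes in rows:
        totals[(user_id, course_id, quiz_id)] += minutes
    return dict(totals)

quiz_blocks = aggregate_to_quiz_blocks(attempts)
for key, minutes in quiz_blocks.items():
    print(key, minutes)
```

Each key in the result corresponds to one quiz block, so a participant contributes exactly one observation per quiz regardless of how many times the quiz was attempted.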

Figure 2.
Timeline of events. In our studies, each quiz attempt is referred to as an observation. At each step, the user decides which quiz to attempt and how much to study for it, which is indicated as the length of the brown arrows (labeled “q”). After each attempt, the user gets a score and may decide to reattempt that quiz or move on to a new one. Each block consists of a set of attempts on the same quiz.

Descriptive Statistics and Model-Free Evidence

Our main objective is to study the relationship between signing up for the certificate and engagement. As a first step, we consider how free and signature-track (paying) participants differed in terms of the various engagement and outcome metrics. We report these descriptive statistics in Table 1. Overall, we find considerable differences between the two groups of participants. In particular, signature-track participants were more engaged in terms of the time spent in accessing course content and being active on the course forum. They also appear to have better outcomes in terms of graduation rate and final grades. For example, the average final score among signature-track users is 63.7% versus an average of 5.9% for the rest of the users. However, a large number of participants, especially among those who took the course for free, did not complete a sufficient number of quizzes and assignments to receive a nonzero final grade. Therefore, we considered whether the gap between the two groups vanishes when we look only at participants who received a nonzero final grade. After excluding zero grades, the average final scores are 73.7% and 33.5% for paying and nonpaying users, respectively. The completion rate, that is, the rate of achieving a grade higher than the passing threshold, is 56.7% for signature-track users and 3.69% for free users. These numbers increase to 66% and 21% for paying and nonpaying users, respectively, if we exclude zero grades. The gap therefore narrows but does not vanish.

Table 1.
Comparison of Engagement and Outcome Metrics Among Paying and Nonpaying Users.

                                        All Observations     Nonzero Final Grade
Variable                           Nonpaying      Paying   Nonpaying      Paying
Total activity (min.)                 244.11    1,533.34      749.42    1,703.18
                                       (.58)      (9.94)      (1.82)     (10.56)
Average session duration (min.)         23.1       41.94       38.72       43.82
                                       (.03)       (.15)       (.05)       (.15)
Average no. of sessions                 6.93       35.42       18.86       39.17
                                       (.01)       (.19)       (.04)       (.20)
Forum activity (posts/visits)          29.77       90.51        45.2       94.81
                                       (.25)      (1.94)       (.41)      (2.03)
Average grade (%)                       5.91       63.73       33.54        73.7
                                       (.01)       (.27)       (.05)       (.24)
Graduation rate (%)                     3.69       56.72       20.91       65.59
                                       (.01)       (.33)       (.06)       (.34)
Observations                       1,078,057      23,674     298,012      19,645

Together, these results suggest that users who pay stay more engaged with the course content and also fare better in terms of learning outcomes and course completion rates. At first glance, one might conclude that paying for the certificate drove these stark differences between paying and nonpaying users. However, as noted previously, we need to consider the possibility that highly motivated and higher-ability participants may have self-selected into signing up for the certificate. These participants with high motivation and ability might, in turn, have stayed more engaged with the course content and also fared better in terms of learning outcomes.

To address this selection issue, we consider the fact that paying users receive a verified certificate from the institution offering the course upon successful completion, whereas nonpaying users do not. Therefore, we posit that the two types of users are likely to differ in terms of their motivation to achieve the passing grade as a result of the certificate effect. If this motivation plays a major role in time-investment decisions, the intensity of time investment should vary as a function of how far the participant is from achieving the passing grade. For instance, if paying users are investing more time merely to obtain a certificate, this additional motivation should cease when they achieve the minimum necessary grade to pass the course. Consequently, their behavior should become more similar to that of their nonpaying counterparts.

To operationalize this idea, for each participant, we first calculate the time investment in each quiz block. We then consider how this time investment varies depending on the distance from the minimum grade threshold for obtaining the certificate. If participants who signed up for the certificate exhibit greater motivation to pass the threshold than nonpaying users, we should observe that these two groups of participants make different time-investment decisions as a function of their distance from the threshold. We present the average time investment on the quiz blocks for both types of users as a function of their distance from the threshold in Figure 3. To alleviate selection issues, we consider only those participants who have reached a final grade at least 20 points above the passing threshold.¹¹ The figure demonstrates that, on average, paying users spend more time on the course than their counterparts who are taking the course for free. More importantly, this gap tends to widen as users approach the threshold and shrinks quickly after the goal is achieved. These data patterns suggest that paying users might engage differently from nonpaying users as a function of their distance from the passing grade. In the subsequent sections, we examine whether these patterns are robust when we account for prior investments as well as course, quiz, and user-course FEs.
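One way to compute a participant's distance from the passing threshold at each quiz block is sketched below. It assumes that the running grade is the weighted sum of quiz scores earned so far (consistent with the final grade being a weighted sum of quiz grades) and that distance is measured before the quiz block begins. Both conventions, along with the scores, weights, and passing grade, are illustrative assumptions rather than details taken from the data.

```python
def distance_from_threshold(quiz_scores, quiz_weights, passing_grade):
    """For each quiz block, return (grade accumulated before the block) - passing_grade.

    quiz_scores: fraction of credit earned on each quiz, in course order (0 to 1).
    quiz_weights: points each quiz contributes to the final grade.
    Negative values mean the participant has not yet reached the passing grade.
    """
    distances = []
    cumulative = 0.0
    for score, weight in zip(quiz_scores, quiz_weights):
        distances.append(cumulative - passing_grade)
        cumulative += score * weight
    return distances

# Hypothetical participant: four quizzes worth 20, 20, 40, and 20 points,
# with a passing grade of 50 points.
distances = distance_from_threshold(
    quiz_scores=[1.0, 0.75, 0.5, 1.0],
    quiz_weights=[20, 20, 40, 20],
    passing_grade=50,
)
print(distances)
```

In this example the participant crosses the threshold on the third quiz, so only the fourth quiz block has a positive distance. Averaging time investment by distance separately for paying and nonpaying users would yield a comparison of the kind plotted in Figure 3.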

Figure 3.
Average time investment in quiz blocks among nonpaying and paying users at different stages relative to the passing grade.
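The comparison summarized in Figure 3 can be sketched in a few lines of code. The following is a minimal illustration rather than our actual pipeline: the record keys (`paying`, `grade_before`, `passing_grade`, `minutes`), the 10-point bin width, and the toy records are all assumptions introduced for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_time_by_distance(records, bin_width=10):
    """Average time investment per (paying, distance-bin) cell.

    Each record is a dict with hypothetical keys: 'paying' (bool),
    'grade_before' (grade at the start of the quiz block),
    'passing_grade', and 'minutes' (time invested in the block).
    """
    cells = defaultdict(list)
    for r in records:
        distance = r["grade_before"] - r["passing_grade"]
        # Floor division keeps negative distances (before the threshold)
        # in the correct bin, e.g., -15 falls in the bin starting at -20.
        bin_start = int(distance // bin_width) * bin_width
        cells[(r["paying"], bin_start)].append(r["minutes"])
    return {cell: mean(values) for cell, values in cells.items()}


# Toy records, invented purely for illustration.
records = [
    {"paying": True, "grade_before": 55, "passing_grade": 70, "minutes": 90},
    {"paying": True, "grade_before": 58, "passing_grade": 70, "minutes": 110},
    {"paying": False, "grade_before": 55, "passing_grade": 70, "minutes": 60},
    {"paying": True, "grade_before": 75, "passing_grade": 70, "minutes": 50},
    {"paying": False, "grade_before": 72, "passing_grade": 70, "minutes": 45},
]
averages = mean_time_by_distance(records)
gap_before = averages[(True, -20)] - averages[(False, -20)]
gap_after = averages[(True, 0)] - averages[(False, 0)]
```

In this toy example, the paying/nonpaying gap is larger in the bin before the threshold than in the bin after it, which is the qualitative pattern the figure is meant to convey.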

As noted previously, we explore a second component of the temporal effect based on the idea that the mere act of paying for the course might also drive users via the sunk-cost effect to stay more engaged with its content. To this end, we consider engagement among paying users in the weeks following their decision to pay for the signature track. Specifically, we consider two groups of paying participants based on the timing of payment relative to the start of the course: (1) paying before the course began and (2) paying after the course began. Because the motivation to receive the certificate is common among all paying users, we can explore how the mere act of paying is related to their engagement levels.

To illustrate the idea behind this identification, let us consider the time spent on the platform by participants who paid for the signature track after the course began. We report the results from this analysis in Figure 4. These results suggest these payers exhibited a considerable increase in engagement (in terms of time spent on the platform) in the weeks immediately following payment. Moreover, the extra engagement among participants who paid after the course began shrinks considerably in the second week after they made the payment and seems to disappear after four weeks. A concern with this analysis is that the boost in engagement among participants who paid after the commencement of the course might be a result of the certificate effect as well as of the fact that they paid for the course. However, the result that the boost in engagement shrinks as we move further away from the timing of the payment suggests that this pattern is probably a transient effect. Since the certificate effect is likely to be preserved until the participant gets closer to the passing grade (which happens much later), this pattern is more likely to have been driven by the sunk-cost effect.

Figure 4.
The difference in time spent in the weeks following the payment week for users who paid after the course began.

Although these results are consistent with the idea that the sunk cost of paying drove the temporary increase in engagement among participants, two alternative explanations could have led to a similar outcome. First, participants who paid later may have wanted to learn about the match value of the course before making that commitment. Under this explanation, payment and the increase in engagement were both driven by users’ discovery that the course is a good match for them. Although this possibility can explain the concomitant occurrence of payment and an increase in engagement, it cannot rationalize the transient nature of the increase in engagement. The second explanation is that participants who paid later probably had to catch up with the course content. Therefore, they had to increase engagement immediately after paying for the course. Once they had caught up with the course content, however, they reached the same level of engagement as those who paid earlier.

To rule out this second alternative explanation, we investigate the role of recency of payment in increasing engagement by comparing the amount of time paying users spent during the first week of the course as a function of how recently they made their payment. Specifically, we consider only users who paid for the signature track before the course began. This focus eliminates the need to catch up with the contents of the course. We separate these participants into three groups based on the recency of their payments relative to the start of the course, such that the certificate effect was common to all users at the start of the course. If payment has a sunk-cost effect and that effect depreciates over time, we should see that users who paid closer to the first day of the course invested more time during the first week than those who paid well in advance. Furthermore, we would expect the difference between these groups to be less pronounced in subsequent weeks as the effect of payment depreciates over time. We present the results from this analysis in Figure 5. We find that users who paid within a week before the beginning of the course spent more time on the platform than users who paid for the course earlier. Moreover, the three groups of users are statistically indistinguishable during the second week of the course. However, note that the transient nature of the additional engagement immediately after payment implies only that paying users eventually converge to a common level of engagement. As we discuss subsequently, we cannot directly comment on any persistent effect that payment might have on paying users compared with those taking the course for free.

Figure 5.
Comparison of weekly investment during weeks 1 and 2 of the course by users who paid before the first day of classes.

Overall, our model-free analyses have the following implications:

(1) Users change their engagement depending on whether they have reached the minimum grade required to receive the certificate. In particular, paying users tend to exhibit greater engagement before reaching the passing grade. After the passing grade is reached, the difference in engagement between nonpaying and paying users shrinks. This pattern is consistent with the idea that paying users stay more engaged in order to be eligible to receive the certificate.

(2) The mere act of payment increases user engagement. However, this effect is transient and decays in the four weeks after payment. This pattern is consistent with paying participants exhibiting the sunk-cost fallacy.

In our empirical analyses, we examine these temporal effects of payment while controlling for selection.

Empirical Analysis

We attempt to use the variation in the timing of payment and goal progress to assess the relative magnitudes of the sunk-cost and certificate effects. We perform our analysis at the quiz-block level.¹² Let i and q index individuals and quiz blocks, respectively. Also let $S_{iq}$ denote the total grade that user i has achieved at the beginning of quiz block q. Let $c_{i}$ be the course that individual i is enrolled in, and let $p_{c_{i}}$ be the minimum grade required for passing the course.¹³ Finally, let $t_{iq}$ be the week in which quiz block q was initiated, and let $τ_{i}$ be the week of payment for user i. According to this notation, $1_{S_{iq} < p_{c_{i}}}$ is a dummy that indicates if a quiz block was attempted before (1) or after (0) the user crossed the passing threshold, and $1_{t_{iq} \in [τ_{i}, τ_{i} + 3]}$ is a dummy that equals 1 when a quiz block takes place within four weeks following the payment. Finally, let $δ_{i}$ be a dummy indicating whether individual i is a paying user.
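To make the notation concrete, the indicator variables can be written as small functions. This is a sketch under stated assumptions: the function names are hypothetical, weeks are integer indices, and we assume that the payment-window indicator is set to 0 for nonpaying users (represented here by `payment_week=None`).

```python
def before_threshold(grade_at_start, passing_grade):
    """Return 1 if the quiz block starts before the passing grade is reached."""
    return 1 if grade_at_start < passing_grade else 0


def in_payment_window(quiz_week, payment_week, window=4):
    """Return 1 if quiz_week lies in [payment_week, payment_week + window - 1]."""
    if payment_week is None:  # assumption: nonpaying users are never in the window
        return 0
    last_week = payment_week + window - 1
    return 1 if payment_week <= quiz_week <= last_week else 0


def weeks_since_payment(quiz_week, payment_week, window=4):
    """Return (t - tau) inside the payment window and 0 outside it."""
    if in_payment_window(quiz_week, payment_week, window):
        return quiz_week - payment_week
    return 0
```

With the default `window=4`, a user who paid in week 2 is inside the window for quiz blocks initiated in weeks 2 through 5, which corresponds to the interval $[τ_{i}, τ_{i} + 3]$.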

Consider the following specification:
$\begin{aligned} \log (1 + I_{iq}) = {} & \overbrace{α δ_{i} 1_{S_{iq} < p_{c_{i}}} + ξ 1_{S_{iq} < p_{c_{i}}}}^{\text{Certificate Component}} + \overbrace{ω 1_{t_{iq} \in [τ_{i}, τ_{i} + 3]} + γ (t_{iq} - τ_{i}) 1_{t_{iq} \in [τ_{i}, τ_{i} + 3]}}^{\text{Sunk-Cost Component}} \\ & + η_{i} + η_{q, δ_{i}} + β_{- q} \log (1 + {\bar{I}}_{i, - q}) + ε_{iq}, \end{aligned}$
(1)
where $I_{iq}$ is the total time investment by user i during the $q^{th}$ quiz block, ${\bar{I}}_{i, - q}$ is the total time that user i has invested in all quiz blocks that the user has taken until quiz q (to capture potential complementarity in learning across quizzes), and $β_{- q}$ is the corresponding coefficient, which varies at the quiz level. We include this term to control for the possibility that time investment in other quiz blocks might help a participant to achieve learning goals in the focal quiz, q. Also, $η_{i}$ and $η_{q, δ_{i}}$ are user and quiz-paid FEs, respectively. Because average time-investment levels may vary across courses and across quiz blocks within the same course, we use the logarithm of time investment so that we can express the magnitude of effects in percentage terms rather than as absolute changes in time-investment levels.

Next, we present the expressions for the certificate effect and the sunk-cost effect and discuss the intuition behind the identification of these effects:
We can infer the presence of the certificate effect if paying users exhibit a higher level of engagement than nonpaying users until they reach the passing grade. To parse out a variety of confounds, we include controls in the form of FEs. In particular, we include user-course FEs to control for intrinsic differences across participants (and courses) and quiz-paid FEs to control for varying levels of required time commitment across assignments. The quiz-paid FEs allow nonpaying and paying users to have different tendencies to stay engaged with the course content. Recall that the signature track offers other services, such as identity verification; these cross-sectional characteristics that could influence all paid users will also be picked up by these FEs. The coefficient $ξ$ captures the extent to which the engagement of a nonpaying user before the user reached the passing grade differed from the level of engagement after this milestone. Similarly, $α + ξ$ captures the corresponding magnitude of the change for paying users. Thus, $α$ would tell us if paying users were more (positive coefficient) or less (negative coefficient) engaged than nonpaying users before crossing the passing threshold and is a measure of what we refer to as the certificate effect.

The intuition behind the identification of the sunk-cost effect is that considerable variation exists in the timing of payment among paying users. Given our previous evidence that the sunk-cost effect is transient, we can consider the immediate increase in engagement to be limited to the first four weeks after payment. Furthermore, we assume that the sunk-cost effect decays linearly during this four-week period.¹⁴ Therefore, paying users are likely to experience the boost in engagement as a result of the sunk-cost effect over different periods depending on when they pay. The sunk-cost effect consists of two components: (1) the base effect $ω$ , which captures the immediate increase in engagement after payment, and (2) the decay $γ$ that captures how the payment effect tapers off after payment.
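The interpretation of $ξ$ and $α$ can be illustrated with a stylized two-by-two comparison of cell means. The sketch below deliberately ignores the FEs, the prior-investment control, and the sunk-cost terms, so it is a simplification of Equation 1 rather than our estimator; the function name and the cell means are hypothetical.

```python
def did_from_cell_means(means):
    """Recover (alpha, xi) from average log time investment by cell.

    `means` maps (paying, before_threshold) -> mean log time investment.
    xi is the before-minus-after change for nonpaying users, and
    alpha is the additional before-minus-after change for paying users.
    """
    xi = means[(False, True)] - means[(False, False)]
    paying_change = means[(True, True)] - means[(True, False)]
    alpha = paying_change - xi
    return alpha, xi


# Hypothetical cell means in log points, chosen only for illustration.
cell_means = {
    (False, False): 3.00,  # nonpaying, after crossing
    (False, True): 3.15,   # nonpaying, before crossing
    (True, False): 3.30,   # paying, after crossing
    (True, True): 3.53,    # paying, before crossing
}
alpha_hat, xi_hat = did_from_cell_means(cell_means)
```

Note that the level difference between paying and nonpaying users after crossing (3.30 versus 3.00 in this toy example) does not enter $α$; in Equation 1, such time-invariant differences are absorbed by the FEs.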
We present the results from this analysis for the full sample of participants in Table 2 and for the subsample of users who passed the course in Table 3. Regarding the certificate and sunk-cost effects, these results provide four important insights that are consistent with the patterns presented in the “Descriptive Statistics and Model-Free Evidence” section:

(1) Participants generally spend more time on course content before reaching the passing grade.

(2) Paying participants spend significantly more time on course content before reaching the passing grade than those taking the course for free.

(3) The sunk-cost effect leads to an immediate increase in engagement in the weeks following payment.

(4) The effect depreciates as time since the payment increases.

Table 2.
Quantifying the Certificate and Sunk-Cost Effects: Full Sample.

Dependent Variable: Log(Time Investment + 1)

| | Full Sample (1) | Full Sample (2) | Matched Sample (3) |
|---|---|---|---|
| Paying user × Before crossing ( $α$ ) | .081* | .126* | .081* |
| | (.026) | (.029) | (.048) |
| Paying user × Before crossing × Late payer | | −.089* | |
| | | (.025) | |
| Before crossing ( $ξ$ ) | .152* | .152* | .154* |
| | (.010) | (.010) | (.028) |
| Weeks after payment ( $ω$ ) | .165* | .178* | .181* |
| | (.014) | (.023) | (.026) |
| Weeks after payment × Late payer | | −.009 | |
| | | (.028) | |
| Weeks after payment depreciation ( $γ$ ) | −.016 | −.011 | −.032* |
| | (.005) | (.008) | (.010) |
| Weeks after payment depreciation × Late payer | | −.014 | |
| | | (.011) | |
| Observations | 844,939 | 844,939 | 121,127 |
| R² | .529 | .529 | .492 |
| Adjusted R² | .463 | .463 | .426 |
| Residual SE | 1.008 (df = 740,587) | 1.008 (df = 740,584) | .982 (df = 107,144) |

*p < .1. **p < .05. ***p < .01.

Notes: All regressions include quiz-paid, user, and prior investment FEs and controls. All standard errors are clustered at the user-course level.

Table 3.
Quantifying the Certificate and Sunk-Cost Effects: Conditional on Achieving the Passing Grade.

Dependent Variable: Log(Time Investment + 1)

| | Full Sample, Conditional on Passing (1) | Full Sample, Conditional on Passing (2) | Matched Sample, Conditional on Passing (3) |
|---|---|---|---|
| Paying users × Before crossing ( $α$ ) | .087* | .130* | .091* |
| | (.026) | (.029) | (.048) |
| Paying users × Before crossing × Late payer | | −.086* | |
| | | (.025) | |
| Before crossing ( $ξ$ ) | .171* | .171* | .176* |
| | (.011) | (.011) | (.029) |
| Weeks after payment ( $ω$ ) | .151* | .172* | .137* |
| | (.015) | (.024) | (.028) |
| Weeks after payment × Late payer | | −.020 | |
| | | (.029) | |
| Weeks after payment depreciation ( $γ$ ) | −.014 | −.011 | −.022** |
| | (.006) | (.009) | (.011) |
| Weeks after payment depreciation × Late payer | | −.013 | |
| | | (.011) | |
| Observations | 614,892 | 614,892 | 96,148 |
| R² | .487 | .487 | .462 |
| Adjusted R² | .431 | .431 | .405 |
| Residual SE | .985 (df = 553,898) | .985 (df = 553,895) | .963 (df = 86,830) |

*p < .1. **p < .05. ***p < .01.

Notes: All regressions include quiz-paid, user, and prior investment FEs and controls. All standard errors are clustered at the user-course level.

These results reveal that the certificate effect (the coefficient of Paying user × Before crossing, $α$ ) is positive. In terms of magnitude, paying users spent $\exp (.081) - 1 = 8.4 %$ more time on a quiz before reaching the passing grade than their peers taking the course for free. Turning to the sunk-cost effect, the estimates in Table 2, column 1, suggest that paying led to an immediate increase in engagement (positive coefficient of Weeks after payment, $ω$ ). In terms of magnitude, this effect translates to an increase of $\exp (.165) - 1 ≈ 17.9 %$ in engagement levels after payment. Therefore, the sunk-cost effect is roughly twice as large in magnitude as the certificate effect but is transient and vanishes over time.¹⁵ The effects are similar in magnitude across Tables 2 and 3, which suggests that these results are not merely due to churn, because the subsample of users in Table 3 passed the course.
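The conversion from log-point coefficients to percentage effects can be reproduced with a short calculation. The sketch below uses the column 1 estimates from Table 2; the helper names are hypothetical, and the week-by-week path assumes that $t_{iq} - τ_{i}$ equals 0 in the payment week and that the linear decay in Equation 1 holds exactly within the four-week window.

```python
import math


def pct_effect(coef):
    """Percentage change implied by a coefficient on a log outcome."""
    return (math.exp(coef) - 1.0) * 100.0


def sunk_cost_path(omega, gamma, window=4):
    """Implied percentage boost k weeks after payment, for k = 0..window-1."""
    return [pct_effect(omega + gamma * k) for k in range(window)]


# Table 2, column 1: alpha = .081, omega = .165, gamma = -.016.
certificate_pct = pct_effect(0.081)
sunk_cost_pcts = sunk_cost_path(0.165, -0.016)
```

Under these assumptions, `sunk_cost_pcts` declines from week to week but remains positive at the end of the window; outside the window, the specification sets the sunk-cost component to zero by construction.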

Heterogeneity in Timing of Payment

In addition to the main effects, we also explore potential heterogeneity as a function of the timing of payment. Recall that a sizable portion of students enroll in the signature track long before the payment deadline. These students gain nothing by paying early; rather, they would benefit from waiting strategically to obtain more information about both their match value with the course and their performance before signing up for the certificate. Research has documented that agents may choose dominated contracts to serve as a commitment device by restricting their actions in the future (Kaur, Kremer, and Mullainathan 2015). Therefore, we can view a participant's choice to sign up for the signature option before the start of the course as a proxy for their use of it as a commitment device that would stimulate them to complete the course.

We divide the paying users for each course into two groups: early payers and late payers. For any given course, we define early payers as those who were among the first half of payers, and we classify the rest as late payers. We compare the magnitude of the certificate and sunk-cost effects among early and late payers. Specifically, we include an additional three-way interaction term between the following variables in Equation 1: the dummy for whether the individual is a paying user, a dummy for the period before the passing threshold is crossed, and a dummy indicating a late payer. The coefficient of this interaction tells us the extent to which late payers experienced a greater (positive coefficient) or lesser (negative coefficient) certificate effect than the early payers.
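The early/late classification can be sketched as a within-course ranking of payment times. This is a minimal illustration: the tuple layout, the function name, and the tie-breaking rule (with an odd number of payers, the middle payer is counted as early) are assumptions made for the example.

```python
from collections import defaultdict


def label_late_payers(payments):
    """Return {(user_id, course_id): 1 if late payer else 0}.

    `payments` is a list of (user_id, course_id, payment_time) tuples.
    Within each course, the first half of payers (by payment time) are
    early payers and the rest are late payers.
    """
    by_course = defaultdict(list)
    for user_id, course_id, payment_time in payments:
        by_course[course_id].append((payment_time, user_id))

    labels = {}
    for course_id, rows in by_course.items():
        rows.sort()  # earliest payment first
        n_early = (len(rows) + 1) // 2  # assumption: middle payer is early
        for rank, (_, user_id) in enumerate(rows):
            labels[(user_id, course_id)] = 0 if rank < n_early else 1
    return labels


# Toy payment times (days relative to course start), for illustration only.
payments = [
    ("a", "c1", -30), ("b", "c1", -10), ("c", "c1", -2),
    ("d", "c1", 5), ("e", "c1", 12),
    ("x", "c2", -7), ("y", "c2", 3),
]
late = label_late_payers(payments)
```

Because the split is made within each course, the cutoff separating early from late payers can differ across courses.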

We report the results in column 2 of Tables 2 and 3. The results suggest that the certificate effect is significantly stronger for the users who paid earlier (.126) than for those who paid later ( $.126 - .089 = .037$ ). We find similar results when we consider the subset of users who passed; see column 2 of Table 3. These results suggest that the certificate is more effective among users who pay early, probably because it serves as a commitment device. On the other hand, our results demonstrate that the sunk-cost effect does not vary in a meaningful way with timing of payment.

Matching

Although our analysis controls for cross-sectional differences between nonpaying and paying users by including user-course FEs, these two groups of users could exhibit different trends in terms of their engagement before reaching the passing grade. To address this concern, we used a state-of-the-art machine-learning technique, namely, boosted trees, to match the subset of these users who signed up for the signature track two weeks after the first day of the course with those who did not pay for the signature track. When performing this matching, we used user characteristics during the first two weeks of the course (i.e., before either group paid for the signature track), such as forum activity, quiz outcomes, response to demographic surveys, and overall time spent on the platform.¹⁶

To demonstrate the effectiveness of the matching algorithm, we present the propensity scores for nonpaying and paying users in Figure 6. Note that the propensity scores for users who did not end up paying are more skewed toward zero. This suggests that our algorithm has some predictive power in distinguishing between paying and nonpaying users based on the prepayment outcomes. We then matched the users such that each paying user had five nonpaying users in the matched data.
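One simple way to build a five-to-one matched sample from fitted propensity scores is nearest-neighbor matching, sketched below. This is an illustrative assumption rather than a description of our exact procedure: the function name is hypothetical, and matching on the absolute score difference with replacement is only one of several reasonable choices.

```python
def match_controls(treated_scores, control_scores, k=5):
    """Match each treated user to the k controls with the closest scores.

    Both arguments map user_id -> fitted propensity score. Matching is
    with replacement, so a control can be matched to several treated users.
    """
    matches = {}
    for treated_id, score in treated_scores.items():
        ranked = sorted(
            control_scores,
            key=lambda cid: (abs(control_scores[cid] - score), cid),
        )
        matches[treated_id] = ranked[:k]
    return matches


# Toy propensity scores, invented for illustration.
paying = {"p1": 0.60}
nonpaying = {"n1": 0.10, "n2": 0.55, "n3": 0.62, "n4": 0.58,
             "n5": 0.70, "n6": 0.65, "n7": 0.20}
matched = match_controls(paying, nonpaying)
```

In this toy example, the two nonpaying users whose scores are far from .60 are left out of the matched sample.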

Figure 6.
The fitted treatment (payment) propensity scores for control (free) and treatment (paid) users are compared on the left.

We present the density of propensities for the matched control and the treatment sample in Figure 6. As stated previously, we used the matched sample to reestimate Equation 1, and we report the results in Tables 2 and 3. Note that the matched regression is effectively measuring the local average treatment effect on late payers.¹⁷ When we assemble similar groups of late payers and nonpaying users in the matched sample, the treatment effect could remain the same, attenuate, or increase depending on different types of selection that could mask or increase the measured treatment effect. The effects for late payers in column 2 of Tables 2 and 3 were .126 − .089 = .037 and .130 − .086 = .044, respectively. In the matched regression (see column 3 of the same tables), these effects increase to .081 and .091, respectively. In both cases, because our sample size shrinks after matching, we have less statistical power. As a result, the coefficients are significant with p < .1. Given the similarity of these coefficients to the average treatment effect measured in column 1 and the fact that the treatment effect tends to be smaller for late payers as demonstrated in column 2, we believe the coefficients reported in column 1 of Tables 2 and 3 serve as a lower bound for the certificate effect.

Cross-Sectional Differences Between Nonpaying and Paying Users

In the previous section, we show how we use our panel data to estimate the certificate and sunk-cost effects. Recall from our conceptual framework that these effects are reflected in temporal changes in engagement. There, we also note the presence of time-invariant or cross-sectional factors that influence engagement. In this section, we consider how nonpaying and paying users differ cross-sectionally in terms of their engagement, and we provide a comparison with the temporal aspects documented previously. To this end, we focus on the estimated user-course FEs from Table 2, which reflect the average propensity of individual participants to engage with the course content after achieving the passing grade. This user-course FE consists of three components: (1) difference across courses in terms of time commitment requirements, (2) difference in engagement as a result of ability, and (3) difference in engagement as a result of motivation that remains invariant over time.¹⁸

To control for cross-sectional differences that arose because of ability, we exploit the data on engagement and grades of individual participants for each quiz to obtain a metric of a user’s intrinsic ability. To illustrate the idea, in Figure 7, we compare the grades of paying and nonpaying users for each quiz as a function of the time invested in engaging with the course material before the quiz. The figure highlights that for the same effort, paying users achieve higher grades than free users.¹⁹ We argue that this observation is indicative of differences in ability between the two groups of participants. We can extend this idea to obtain a metric of ability for each participant. We present further details on how we build on this intuition to obtain individual user-level measures of ability in the Appendix. We divide the ability measures into five tiers (quintiles) and use them as nonparametric controls for cross-sectional differences across participants.
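The division of ability measures into quintile tiers can be sketched as a rank-based assignment. The function name and the tie-breaking rule (ties broken by user identifier) are assumptions for the example; the construction of the ability measure itself is described in the Appendix and is not reproduced here.

```python
def ability_tiers(ability, n_tiers=5):
    """Map user_id -> tier in 1..n_tiers, where tier 1 is the lowest ability.

    `ability` maps user_id -> ability score. Users are ranked by score and
    split into n_tiers groups of (nearly) equal size.
    """
    ranked = sorted(ability, key=lambda uid: (ability[uid], uid))
    n_users = len(ranked)
    return {uid: (rank * n_tiers) // n_users + 1
            for rank, uid in enumerate(ranked)}


# Ten hypothetical users with evenly spaced ability scores.
scores = {f"u{i}": i / 10 for i in range(10)}
tiers = ability_tiers(scores)
```

The resulting tier labels can then enter a regression as a set of dummies, with the lowest tier as the omitted baseline.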

Figure 7.
The marginal returns to time investment for paid and free users.

To parse out these effects, we regress the estimated FEs for participant $η_{i}$ on course FEs ( $η_{c}$ ), their estimated ability tier ( $η_{A_{i}}$ ), and whether they paid for the course ( $δ_{i}$ ). Formally, we estimate the following regression:
$η_{i} = λ + \sum_{j = 2}^{5} ψ_{j} 1_{\{η_{A_{i}} = j\}} + γ δ_{i} + η_{c} + ε_{i} .$
(2)
We present the results of this analysis in Table 4, which reflects the cross-sectional differences across users after accounting for the certificate and the sunk-cost effects. The results from column 1 suggest that paying users spent $\exp (.308) - 1 = 36 %$ more time on an average quiz than nonpaying users. Moving to column 2, we control for ability tiers to see how much of this 36% gap is explained by ability. Our estimate shrinks to $\exp (.299) - 1 = 34.8 %$ , which shows that less than 2 percentage points of the cross-sectional gap are absorbed after accounting for ability differences. Therefore, most of the cross-sectional difference in engagement between users is probably explained by differences in their intrinsic motivation rather than in their innate ability. At the same time, when we compare the cross-sectional difference explained by the FEs (i.e., 36%) against the $\exp (.081) - 1 = 8.4 %$ change in engagement as a result of the certificate effect, we can infer that the certificate effect is sizable in magnitude compared with the cross-sectional difference in engagement between nonpaying and paying users.

Table 4.
Decomposing the Estimated User-Course FEs.

Dependent Variable: User FE from Specification 1

| | Without Ability Tier Controls (1) | With Ability Tier Controls (2) |
|---|---|---|
| Paying users | .308* | .299* |
| | (.013) | (.013) |
| Ability tier 2 | | .197* |
| | | (.011) |
| Ability tier 3 | | .235* |
| | | (.011) |
| Ability tier 4 | | .183* |
| | | (.011) |
| Ability tier 5 | | −.211* |
| | | (.011) |
| Observations | 104,006 | 104,006 |
| R² | .278 | .295 |
| Adjusted R² | .277 | .294 |
| Residual SE | 1.081 (df = 103,946) | 1.069 (df = 103,942) |

*p < .1. **p < .05. ***p < .01.

Notes: All regressions include course FEs.

The results in column 2 of Table 4 also reveal that engagement levels tend to follow an inverted U-shaped pattern as a function of ability. In particular, results from column 2 indicate that the second ability tier spends an average of $\exp (.197) - 1 = 22 %$ more time than the lowest ability tier (baseline). However, users in the highest ability tier tend to be less engaged than those in other tiers. The inverted U-shaped pattern is consistent with the idea that participants in the lowest tier do not derive sufficient marginal benefit from staying engaged, whereas those in the highest ability tier do not need to stay engaged in order to perform well on the assignments.

In the preceding analysis, we jointly studied the sunk-cost and certificate effects. Our next analysis further examines each effect and demonstrates the robustness of our findings. We consider several threats to validity and address each threat to the extent we can. The analysis also considers an alternative engagement metric, namely, forum activity, and demonstrates that patterns similar to those reported persist.

Further Examination of the Certificate Effect

Precision of timing of treatment

We characterize the certificate effect as the extra time that paying participants spend on the course content before reaching the passing grade. Therefore, our research design uses reaching the passing grade as the treatment. However, participants may anticipate their chances of reaching the passing grade well before reaching it. In this case, they might adjust their engagement before reaching the passing grade. Such a deviation might render the exact timing of the treatment fuzzy.

To verify how the estimated certificate effect changes when we change the time when paying users start adjusting their engagement, we estimated Equation 1 with a different definition of the treatment. Recall that in the original formulation and the corresponding results reported in Table 2, we defined the treatment as the point when the participant has reached the passing grade. As a robustness check, we estimate the model in Equation 1 with three alternative definitions of the treatment (i.e., points where paying users alter their engagement): passing grade − 5, passing grade − 10, and passing grade − 15. The idea is that as we get further away from the passing grade, participants should have limited ability to anticipate their ability to achieve the passing grade. Consequently, the extent to which they adjust their engagement should decrease in magnitude as we move further away from the passing grade.
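The alternative treatment definitions amount to recomputing the before-threshold dummy with a shifted cutoff. The following sketch shows this for a single quiz block; the function name is hypothetical, and grades are assumed to be measured in points on the same scale as the passing grade.

```python
def before_threshold_flags(grade_at_start, passing_grade,
                           shifts=(0, 5, 10, 15)):
    """Return {shift: dummy} where dummy = 1 if the quiz block starts
    before the user has reached (passing_grade - shift)."""
    return {shift: int(grade_at_start < passing_grade - shift)
            for shift in shifts}


# A quiz block started with a grade of 62 in a course that requires 70.
flags = before_threshold_flags(62, 70)
```

In this example, the quiz block counts as "before the threshold" under the original definition and under passing grade − 5, but not under passing grade − 10 or passing grade − 15.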

We present the results from this analysis for the full sample of participants in Table 5. Note that the first column in this table (corresponding to the treatment/threshold defined as the passing grade) is the same as column 1 in Table 2. The subsequent columns in Table 5 move further away from the passing grade. These results suggest that as we move the threshold closer to the beginning of the course, the effect of crossing the threshold decreases in magnitude. More importantly, when we define the threshold as passing grade − 10, we find no statistically significant effect of crossing that grade on engagement among paying users. This analysis also serves as a falsification test of the possibility that the certificate effect only occurs in a narrow band around the passing grade. Note that the coefficient $ξ$ shrinks as we move toward the passing threshold, suggesting that nonpaying users’ engagement is declining and the certificate effect acts by eliminating this decline among the paying users.

Table 5.
Verifying Robustness to the Timing of Treatment.

Dependent Variable: Log(Time Investment + 1)

| | Passing Grade (1) | Passing Grade − 5 (2) | Passing Grade − 10 (3) | Passing Grade − 15 (4) |
|---|---|---|---|---|
| Paying user × Before reaching the threshold ( $α$ ) | .081* | .055 | .026 | −.018 |
| | (.026) | (.025) | (.023) | (.023) |
| Before reaching the threshold ( $ξ$ ) | .152* | .188* | .219* | .289* |
| | (.010) | (.010) | (.010) | (.010) |
| Weeks after payment ( $ω$ ) | .165* | .163* | .161* | .159* |
| | (.014) | (.014) | (.014) | (.014) |
| Weeks after payment depreciation ( $γ$ ) | −.016* | −.016* | −.015* | −.014* |
| | (.005) | (.005) | (.005) | (.005) |
| Observations | 844,939 | 844,939 | 844,939 | 844,939 |
| R² | .529 | .529 | .529 | .530 |
| Adjusted R² | .463 | .463 | .463 | .464 |
| Residual SE (df = 740,587) | 1.008 | 1.007 | 1.007 | 1.006 |

*p < .1. **p < .05. ***p < .01.

Notes: All regressions control for prior investment, quiz-paid, and user-course FEs. All standard errors are clustered at the user-course level.

Alternative engagement metrics

To verify whether the certificate effect persists for other metrics of engagement beyond the time spent on the course portal, we consider activity on course forums, which includes page visits, upvotes, and posts created by users. Each action on the course forum, such as visits to a thread, comments, and other interactions, increases the intensity of forum activity by one unit. Because forum activity can only be performed on the platform, it might provide a clean measure of engagement on the platform. We aggregate total forum activity for each user in the periods before and after reaching the passing threshold and compare it across nonpaying and paying users.²⁰ We present the results from this analysis in Table 6. These results suggest a certificate effect of $\exp (.076) - 1 = 7.9 %$ on the forum activity before the passing grade is reached, relative to the corresponding activity of nonpaying users. This finding is consistent with our previous results based on the total time as the metric of engagement.

Table 6.
Forum Activity Before and After Crossing the Passing Threshold.

Dependent Variable: Log(Forum Activity)

| | Without User FE (1) | With User FE (2) |
|---|---|---|
| Paying user × Before reaching passing grade | .070* | .076 |
| | (.023) | (.036) |
| Before reaching passing grade | 1.110* | 1.122* |
| | (.008) | (.013) |
| Paying user | .335*** | |
| | (.024) | |
| Course FE | Yes | |
| User-course FE | | Yes |
| Observations | 98,818 | 98,818 |
| R² | .270 | .881 |
| Adjusted R² | .270 | .677 |
| Residual SE | 1.461 (df = 98,756) | .971 (df = 36,411) |

*p < .1. **p < .05. ***p < .01.

Notes: All regressions control for user-course FEs. Standard errors are clustered at the user-course level.

Discussion of plausible mechanisms and interpretations

As described previously, users do not benefit from paying well before the payment deadline because they can strategically wait and gather more information. Therefore, we conjectured that these results are consistent with the idea that early payers view the certificate option as a commitment device that will help them complete the course. A nontrivial portion of users pay well in advance, and our results in Tables 2 and 3 show that early payers benefit more from the certificate effect. These observations provide evidence for the commitment-device account.²¹ Here, we discuss plausible interpretations and the mechanism of the certificate effect.²²
Peer effects: We consider whether the certificate effect is a result of peer effects wherein users who opted for the certificate engage more because they are prompted by their fellow users through forum content. Because the forum is shared across nonpaying and paying users, any treatment effect of posts should be present in both groups. Conceivably, however, although the content on the forum at any point in time is the same for everyone, users may complete quizzes at different points in time and thus face different content. Therefore, if paying and nonpaying users systematically differ in terms of when they start quizzes, forum activity may indeed affect them differently. To verify this idea, we counted the number of forum posts and comments within a week of the start time of a quiz block for each user and used this variable as an additional control in Equation 1. If greater time investment by paying users is driven by the activity of other users on the forums, these controls should absorb the treatment effect. We present the results in Table 7. Although the treatment effect shrinks after controlling for forum activity, the magnitude is not statistically different from the one reported in column 1 of Tables 2 and 3.

Churn after reaching the passing grade: Paid users could have switched their focus from one course to another after reaching the passing grade, which is reminiscent of patterns documented in the goal-balancing literature; see, for example, Monin and Miller (2001) and Fishbach and Dhar (2005). Such a shift in focus could have driven the decrease in the gap between nonpaying and paying users after reaching the passing grade. To investigate this possibility, we perform a robustness check wherein we consider the sample of participants (both nonpaying and paying) who persisted with the course beyond the passing grade. In particular, we considered users whose final grade was more than 0, 5, 10, and 15 points above the passing threshold. The idea is that if the effect arose from those who diverted their attention to other courses/activities, the local effect on the subsample of “persisting” users would not lead to similar estimates. We present the results from this analysis in Table 8. The estimates in columns 1–4 are similar to those in column 1 of Table 3. Therefore, our data do not seem to support the idea that the drop in engagement after reaching the passing grade was driven by differential dropout rates between the two types of users.
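The forum-volume control used in the peer-effects check can be sketched as a count of posts in a window before each quiz start. This is a minimal illustration under stated assumptions: the function name is hypothetical, times are measured in days since the start of the course, and the window is taken to be the seven days immediately preceding the quiz start.

```python
import math
from bisect import bisect_left


def log_posts_past_week(post_times, quiz_start, window_days=7):
    """Return log(1 + number of posts in [quiz_start - window_days, quiz_start)).

    `post_times` must be a sorted list of posting times (in days) for all
    posts and comments on the course forum.
    """
    first = bisect_left(post_times, quiz_start - window_days)
    last = bisect_left(post_times, quiz_start)
    return math.log(last - first + 1)


# Toy posting times, invented for illustration.
forum_post_times = [1, 3, 8, 9, 9.5, 12]
control_day_10 = log_posts_past_week(forum_post_times, quiz_start=10)
control_day_0 = log_posts_past_week(forum_post_times, quiz_start=0.5)
```

Because two users in the same course can start the same quiz block on different days, this control can differ across users even though the forum content itself is shared.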

Table 7.
Peer Effects: Controlling for the Volume of Activity on the Forum.

	Dependent Variable: Log(Time Investment + 1)
	Full Sample	Conditional on Passing
	(1)	(2)
Paying users × Before reaching passing grade	.072***	.078***
	(.025)	(.025)
Before reaching passing grade	.131***	.153***
	(.010)	(.010)
Weeks after payment	.165***	.152***
	(.014)	(.015)
Weeks since payment (up to four)	−.015***	−.013**
	(.005)	(.006)
Log(forum posts/comments in past week + 1)	.173***	.149***
	(.003)	(.004)
Observations	844,939	614,892
R²	.533	.491
Adjusted R²	.468	.435
Residual SE	1.003 (df = 740,586)	.981 (df = 553,897)

*p < .1.
**p < .05.
***p < .01.

Notes: All regressions control for prior investment, forum content, quiz-paid, and user-course FEs. All standard errors are clustered at the user-course level.

Table 8.
Effect of the Timing of Churn on the Certificate Effect.

	Dependent Variable: Log(Time Investment + 1)
	Final Grade > Passing Threshold	Final Grade > Passing Threshold + 5	Final Grade > Passing Threshold + 10	Final Grade > Passing Threshold + 15
	(1)	(2)	(3)	(4)
Paying users × Before crossing	.087***	.086***	.091***	.083***
	(.026)	(.026)	(.026)	(.027)
Before crossing	.171***	.172***	.169***	.172***
	(.011)	(.011)	(.011)	(.011)
Weeks after payment	.151***	.146***	.144***	.148***
	(.015)	(.015)	(.015)	(.015)
Weeks after payment depreciation	−.014**	−.013**	−.010*	−.014**
	(.006)	(.006)	(.006)	(.006)
Observations	614,892	588,716	555,549	506,148
R²	.487	.486	.484	.479
Adjusted R²	.431	.430	.428	.423
Residual SE	.985 (df = 553,898)	.981 (df = 530,879)	.975 (df = 501,342)	.969 (df = 457,120)

*p < .1.
**p < .05.
***p < .01.

Further Examination of the Sunk-Cost Effect

Alternative engagement metrics

Similar to the exercises we performed for the certificate effect, we consider activity on course forums to verify whether the sunk-cost effect extends to metrics of engagement other than the time spent on the course portal. As discussed previously, the identification of the sunk-cost effect comes from the variation in the timing of payment among paying users. Therefore, to better reflect the nature of this temporal effect, we use weekly data from paying users. In particular, we use the forum activities described previously and aggregate them at the weekly level for each paying user. We consider two metrics: (1) a dummy variable that reflects the incidence of forum activity in a given week (i.e., the extensive margin), and (2) a continuous variable, Log(total forum activity + 1) (i.e., the intensive margin).
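A minimal sketch of the weekly aggregation for a single paying user follows. The function name, field names, and the example activity log are hypothetical; the sketch only shows how the two outcome variables are derived from raw forum events.

```python
import math
from collections import Counter


def weekly_forum_metrics(activity_weeks, course_weeks):
    """Build one row per course week for a single user.

    activity_weeks: the course week of each forum event (one entry per event).
    course_weeks:   all weeks of the course, so that inactive weeks appear as zeros.
    """
    counts = Counter(activity_weeks)
    return [
        {
            "week": week,
            "any_activity": int(counts[week] > 0),     # extensive margin
            "log_activity": math.log1p(counts[week]),  # intensive margin
        }
        for week in course_weeks
    ]


# Hypothetical user with three forum events in week 2 and one in week 4.
panel = weekly_forum_metrics([2, 2, 2, 4], range(1, 5))
for row in panel:
    print(row)
```

Keeping the inactive weeks in the panel matters: dropping them would remove exactly the zeros that the extensive-margin dummy is meant to capture.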

We use the following specification:
$\begin{aligned} y_{ijt} = {} & α \times 1_{\{t \in [τ_{ij}, τ_{ij} + 3]\}} + β \times (t - τ_{ij}) \times 1_{\{t \in [τ_{ij}, τ_{ij} + 3]\}} \\ & + η_{ij} + η_{jt} + ε_{ijt}, \end{aligned}$
(3)
where i, j, and t index users, courses, and weeks, respectively. The dependent variable $y_{ijt}$ is the outcome of interest: for the extensive margin, whether participant i engaged in any forum activity in course j during week t; for the intensive margin, the intensity of that activity (the log of 1 + total forum activity) during week t. The payment week for user i in course j is denoted by $τ_{ij}$. The parameter $α$ captures the extent to which engagement increases after payment, and $β$ reflects the payment depreciation. For participants who decided to pay for the signature track, the term $1_{\{t \in [τ_{ij}, τ_{ij} + 3]\}}$ switches from 0 to 1 during the four-week period after payment. In essence, our analysis exploits the variation in the timing of payment and compares paying users with paying users, while allowing for user-course ( $η_{ij}$ ) and course-week ( $η_{jt}$ ) FEs to capture cross-sectional differences between users in terms of their time-investment patterns and within-course time trends. Note that different weeks of a given course may require different engagement levels that may vary independently of payment. Using course-week FEs allows us to control for this confound.
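To make the two payment regressors in Equation 3 concrete, the sketch below builds them for a hypothetical user and traces the payment effect implied by the point estimates in column 2 of Table 9. The function names and the example payment week are assumptions, and the fixed effects and the estimation step are deliberately left out.

```python
def sunk_cost_regressors(week, payment_week, window=3):
    """Return (indicator, interaction) for one user-course-week.

    indicator   = 1 if payment_week <= week <= payment_week + window, else 0
    interaction = (week - payment_week) * indicator
    """
    in_window = int(payment_week <= week <= payment_week + window)
    return in_window, (week - payment_week) * in_window


def implied_effect(alpha, beta, weeks_since_payment, window=3):
    """Payment effect alpha + beta * k inside the window, and 0 outside it."""
    if 0 <= weeks_since_payment <= window:
        return alpha + beta * weeks_since_payment
    return 0.0


# Hypothetical user who pays in week 3 of an eight-week course.
rows = [(t, *sunk_cost_regressors(t, payment_week=3)) for t in range(1, 9)]
for row in rows:
    print(row)

# Point estimates from column 2 of Table 9 (intensive margin, in log points).
ALPHA, BETA = 0.256, -0.074
path = [round(implied_effect(ALPHA, BETA, k), 3) for k in range(5)]
print(path)
```

Under these estimates, the implied boost falls from .256 log points in the payment week to .034 three weeks later, which is one way to see why the effect is described as transient.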

We report the results for the extensive margin in column 1 of Table 9 and for the intensive margin in column 2. These results are consistent with those in Tables 4 and 5 in that forum activity increases after payment and this effect depreciates quickly over time.

Table 9.
Sunk-Cost Effect Based on Forum Activity.

	Dependent Variable:
	Forum Activity Dummy	Log(Total Forum Activity + 1)
	(1)	(2)
Weeks after payment	.101***	.256***
	(.005)	(.011)
Weeks since payment (up to four)	−.032***	−.074***
	(.002)	(.004)
Observations	196,750	196,750
R²	.359	.457
Adjusted R²	.278	.388
Residual SE (df = 174,728)	.419	1.070

*p < .1.
**p < .05.
***p < .01.

Notes: All regressions include user-course and course-week FEs. All standard errors are clustered at the user-course level.

Heterogeneity in the sunk-cost effect

Research on the sunk-cost fallacy has documented that it is related to certain demographic characteristics. In particular, research (e.g., Bruine de Bruin, Parker, and Fischhoff 2007; Bruine de Bruin, Strough, and Parker 2014; Strough et al. 2008; Strough, Schlosnagle, and DiDonato 2011) has documented a negative correlation between the sunk-cost fallacy and age. To verify this negative correlation in our context, we consider heterogeneity in the sunk-cost effect. To this end, we use the demographic information on the age, gender, employment, and education status of each individual.²³ Our results reveal that the sunk-cost effect is heterogeneous only along the age dimension, with older individuals being less likely to fall prey to the sunk-cost fallacy. These results align with those reported in the literature, which gives us further confidence that these patterns indeed stem from the sunk-cost fallacy.²⁴

Engagement and Learning Outcomes

So far we have shown the impact of the certificate and sunk-cost effects on engagement. In this section we present evidence that higher engagement, in turn, leads to better course outcomes. To demonstrate this relationship, we exploit the data on engagement and grades of individual participants for each quiz and consider the following specification:
$\log (m_{it}) = β \log (I_{it}) + α δ_{i} + ε_{it},$
where $I_{it}$ is user $i$’s time investment in quiz block t and $m_{it}$ is the grade for quiz t. The variable $δ_{i}$ is a dummy that equals 1 if the user is a paying user. The results are reported in Table 10. Column 1 demonstrates that with the same inputs (time investment), paying users generate more output (grade). Column 2 shows that the gap between paying and nonpaying users persists even after controlling for quiz FEs, and that higher engagement, measured as time spent on the platform, leads to better outcomes. Using quiz FEs controls for cross-sectional differences across courses and across quizzes within a course; therefore, the measured gap between paying and nonpaying users cannot stem from differences in the type of courses/quizzes that the two groups attempt.²⁵ Last, and most important, column 3 of Table 10 shows that more time investment (user engagement) leads to better grades even if we use within-user variation and control for user-course and quiz FEs. Our results show that MOOC platforms could improve learning outcomes by increasing user engagement. This article further illustrates the link between the platform’s monetization objectives and engagement through the certificate and the sunk-cost effects. Together, our results suggest a way forward for monetizing MOOCs while improving learning outcomes.

Table 10.
Transformation of Time to Grades, and the Ability Gap Between Paying and Nonpaying Users.

	Dependent Variable: Log(Grade on the Quiz)
	No FE	Quiz FE	Full Model
	(1)	(2)	(3)
Log time investment	.033***	.072***	.061***
	(.0003)	(.0002)	(.0002)
Paying user	.034***	.012***
	(.002)	(.001)
Constant	1.683***
	(.003)
Quiz FE		Yes	Yes
User-course FE			Yes
Observations	1,871,734	1,871,734	1,871,734
R²	.007	.637	.797
Adjusted R²	.007	.636	.738
Residual SE	.602 (df = 1,871,731)	.364 (df = 1,871,127)	.309 (df = 1,452,986)

*p < .1.
**p < .05.
***p < .01.

Notes: All standard errors are clustered at the user-course level.
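Because both the grade and the time investment enter the specification in logs, the coefficient on log time investment can be read as an elasticity. The sketch below, which is an illustration and not part of the original analysis, converts the three point estimates in Table 10 into the grade change implied by a given proportional increase in time, holding the other terms fixed. The function name and the chosen increases (10% and a doubling) are assumptions.

```python
def implied_grade_change(beta, time_increase):
    """Fractional change in grade implied by a log-log slope `beta`.

    time_increase is the fractional increase in time investment
    (0.10 for +10%, 1.0 for a doubling).
    """
    return (1.0 + time_increase) ** beta - 1.0


# Coefficients on log time investment from columns 1-3 of Table 10.
slopes = {"No FE": 0.033, "Quiz FE": 0.072, "Full model": 0.061}

for label, beta in slopes.items():
    ten_pct = 100 * implied_grade_change(beta, 0.10)
    doubling = 100 * implied_grade_change(beta, 1.0)
    print(f"{label}: +10% time -> {ten_pct:.2f}% grade, "
          f"2x time -> {doubling:.2f}% grade")
```

Read this way, the full-model estimate implies that doubling the time invested in a quiz block is associated with a grade that is roughly 4% higher.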

Discussion

As MOOCs are gaining greater acceptance, content creators (including educators) and hosting platforms need to develop interventions to increase student engagement and completion rates. This issue is important for ensuring that students gain proficiency in the content and perceive that they derive value from these courses. In this respect, we find suggestive evidence that higher engagement rates within a course are related to better grades and, potentially, learning outcomes.²⁶ Collectively, higher engagement, completion rates, and learning will help in monetizing the content. However, little is known about the drivers of student engagement in MOOCs. Our study is an important first step in understanding the role of payment in driving engagement. The freemium pricing structure of MOOCs provides a unique opportunity to study differences between nonpaying and paying users within a course. Furthermore, the ability to track engagement at different points within a course provides rich variation to understand the causal effect of payment on engagement.

In seeking to understand why paying and nonpaying users in MOOCs exhibit significant differences in the extent to which they engage with course content, we propose three plausible explanations. The first explanation is that these two groups of users are intrinsically different, in terms of both their ability and motivation levels. The second rationale is that the possibility of receiving a certificate upon reaching a passing grade motivates paying users to stay more engaged than their nonpaying peers, at least until they reach the passing grade. We call this response the certificate effect. The third explanation is based on the idea that the mere act of paying for the course might trigger these participants to increase their engagement. We propose the sunk-cost effect as one such mechanism that might trigger this behavior. Of these, the certificate effect and the sunk-cost effect are consequences of paying for the certificate and can potentially be used to increase engagement in online courses by altering the course design and payment structure.

We find that the certificate and the sunk-cost effects influence engagement in different ways. The motivation to be eligible to obtain the certificate results in paying users spending approximately 8%–10% more time on the course portal. This effect lasts until they reach the passing grade, which is typically around 70% for the courses that we consider. By comparison, the mere act of payment leads to approximately 17%–20% higher engagement among paying users. However, this effect is transient and lasts only for a few weeks. Thus, although the sunk-cost effect appears, prima facie, to be larger than the certificate effect, the latter lasts considerably longer.

By contrast, we find that intrinsic traits (i.e., ability and intrinsic motivation) lead to paying participants spending approximately 36% more time on the course portal than their nonpaying peers. Although we find evidence that ability can affect user engagement, it explains only a small portion of the intrinsic difference between nonpaying and paying users. Therefore, we conjecture that intrinsic motivation is an important driver of the cross-sectional difference between the two groups.
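The percentage magnitudes quoted in this discussion can be related to the regression tables with the usual conversion for a log dependent variable, 100 × (exp(b) − 1). The sketch below applies it to point estimates that appear in the tables of this article; pairing each coefficient with a particular headline range is our reading rather than something the text states, and because the dependent variable is Log(Time Investment + 1), the result is an approximate percentage change in time.

```python
import math


def log_points_to_percent(coef):
    """Convert a coefficient on a log outcome into an approximate percent change."""
    return 100.0 * (math.exp(coef) - 1.0)


# Point estimates taken from the regression tables (labels are descriptive only).
estimates = {
    "Paying user x Before crossing (.081)": 0.081,
    "Paying users x Before crossing (.087)": 0.087,
    "Weeks after payment (.165)": 0.165,
    "Weeks after payment (.181)": 0.181,
    "Paying users, user FE regression (.308)": 0.308,
}

for label, coef in estimates.items():
    print(f"{label}: {log_points_to_percent(coef):.1f}%")
```

The first two conversions fall in the 8%–9% range, the next two in the 17%–20% range, and the last one is close to 36%, in line with the magnitudes reported in the text.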

Together, these results suggest that temporal effects (i.e., certificate and sunk-cost effects) play a significant role in driving user engagement in MOOCs. This finding implies that MOOC platforms and content providers can increase engagement even if the course content remains unaltered. The presence of the certificate effect suggests that providing tangible rewards tied to outcomes can increase the engagement of participants. Moreover, as more employers begin to consider these certificates when making their hiring and promotion decisions, the higher value of these certificates should strengthen the certificate effect. The transient nature of the sunk-cost effect suggests that staggering course payments over several installments could result in multiple doses of this effect. However, whether the lower monetary value of each installment will reduce the role of the sunk-cost effect in increasing engagement is not clear. We believe MOOC platforms can learn about the value of these interventions by conducting field experiments.

We caution the reader about extrapolating these results to nonpaying users (or users who would have paid if the fee were lower) for two reasons. First, nonpaying and paying users might differ in terms of the value that they place on the certificate and might therefore respond differently. Second, if the platform decides to attract current nonpaying users by lowering the fee structure, and the incentive to achieve the passing grade is related to the fees, then our current estimate of the certificate effect might not translate into the new context.

We also acknowledge that we neither have a formal structural model that can quantify these effects for the purpose of running counterfactuals nor have data that can speak to the impact of staggered payments. However, from our results and the fact that the sunk-cost effect is short-lived, we conjecture that spreading the payments might be a plausible approach to effect more persistent changes in engagement. Regarding the certificate effect, we see that users are fairly responsive to passing thresholds, and this observation provides an opportunity for firms to optimize the overall engagement levels of users.

Conclusion

In recent years, MOOCs have gained considerable prominence for providing free access to education. However, as in the case of other digital platforms such as online newspapers, MOOC platforms have been exploring the possibility of monetizing their digital content. Against this backdrop, we investigate the impact that paying for online courses has on how users engage with content available on MOOC platforms. We exploit the panel structure of the data along with variation in the time when participants pay for the premium service and their state in terms of goal progress to estimate two effects of payment on engagement: the certificate effect and the sunk-cost effect. We find evidence that both effects exist. In particular, the presence of the sunk-cost effect implies that the mere act of payment can increase engagement with course content. Therefore, increasing user engagement and potentially course completion rates is aligned with the incentives of the platform and content creators to monetize their content. However, this effect depreciates over time. As discussed previously, staggering payments appears to be a promising way to constantly remind users about their investments. At the same time, offering a certificate seems to play an important role in incentivizing students to exert more effort. Overall, we believe these results have implications for how MOOC platforms should monetize their courses while ensuring that their participants stay engaged with these courses.

The limitations of our study highlight potential fruitful avenues for future investigation. First, because we do not have information on the course fees, we cannot investigate the role of the magnitude of the payment on the sunk-cost effect. Second, as noted elsewhere in the article, the nonexperimental nature of our data precludes us from studying long-term effects of payments on participants who pay for the certificate. We hope researchers will be able to conduct field experiments to understand the presence of such long-term effects. This ability will give us a more complete understanding of the overall effect of payment on user engagement. Despite these limitations, our research enhances the understanding of how paying for a certificate can potentially increase user engagement in online courses and motivates future research in this area.

	All Observations	Nonzero Final Grade
Total activity (min.)	244.11	1,533.34	749.42	1,703.18
	(.58)	(9.94)	(1.82)	(10.56)
Average session duration (min.)	23.1	41.94	38.72	43.82
	(.03)	(.15)	(.05)	(.15)
Average no. of sessions	6.93	35.42	18.86	39.17
	(.01)	(.19)	(.04)	(.20)
Forum activity (posts/visits)	29.77	90.51	45.2	94.81
	(.25)	(1.94)	(.41)	(2.03)
Average grade (%)	5.91	63.73	33.54	73.7
	(.01)	(.27)	(.05)	(.24)
Graduation rate (%)	3.69	56.72	20.91	65.59
	(.01)	(.33)	(.06)	(.34)
Observations	1,078,057	23,674	298,012	19,645

	Dependent Variable: Log(Time Investment + 1)
Paying user × Before crossing ( $α$ )	.081***	.126***	.081*
	(.026)	(.029)	(.048)
Paying user × Before crossing × Late payer		−.089***
		(.025)
Before crossing ( $ξ$ )	.152***	.152***	.154***
	(.010)	(.010)	(.028)
Weeks after payment ( $ω$ )	.165***	.178***	.181***
	(.014)	(.023)	(.026)
Weeks after payment × Late payer		−.009
		(.028)
Weeks after payment depreciation ( $γ$ )	−.016	−.011	−.032***
	(.005)	(.008)	(.010)
Weeks after payment depreciation × Late payer		−.014
		(.011)
Observations	844,939	844,939	121,127
R²	.529	.529	.492
Adjusted R²	.463	.463	.426
Residual SE	1.008 (df = 740,587)	1.008 (df = 740,584)	.982 (df = 107,144)

	Dependent Variable: Log(Time Investment + 1)
Paying users × Before crossing ( $α$ )	.087***	.130***	.091*
	(.026)	(.029)	(.048)
Paying users × Before crossing × Late payer		−.086***
		(.025)
Before crossing ( $ξ$ )	.171***	.171***	.176***
	(.011)	(.011)	(.029)
Weeks after payment ( $ω$ )	.151***	.172***	.137***
	(.015)	(.024)	(.028)
Weeks after payment × Late payer		−.020
		(.029)
Weeks after payment depreciation ( $γ$ )	−.014**	−.011	−.022**
	(.006)	(.009)	(.011)
Weeks after payment depreciation × Late payer		−.013
		(.011)
Observations	614,892	614,892	96,148
R²	.487	.487	.462
Adjusted R²	.431	.431	.405
Residual SE	.985 (d.f. = 553,898)	.985 (d.f. = 553,895)	.963 (d.f. = 86,830)

	Dependent Variable: User FE from Specification 1
Paying users	.308***	.299***
	(.013)	(.013)
Ability tier 2		.197***
		(.011)
Ability tier 3		.235***
		(.011)
Ability tier 4		.183***
		(.011)
Ability tier 5		−.211***
		(.011)
Observations	104,006	104,006
R²	.278	.295
Adjusted R²	.277	.294
Residual SE	1.081 (df = 103,946)	1.069 (df = 103,942)

	Dependent Variable: Log(Time Investment + 1)
Paying user × Before reaching the threshold ( $α$ )	.081***	.055**	.026	−.018
	(.026)	(.025)	(.023)	(.023)
Before reaching the threshold ( $ξ$ )	.152***	.188***	.219***	.289***
	(.010)	(.010)	(.010)	(.010)
Weeks after payment ( $ω$ )	.165***	.163***	.161***	.159***
	(.014)	(.014)	(.014)	(.014)
Weeks after payment depreciation ( $γ$ )	−.016***	−.016***	−.015***	−.014***
	(.005)	(.005)	(.005)	(.005)
Observations	844,939	844,939	844,939	844,939
R²	.529	.529	.529	.530
Adjusted R²	.463	.463	.463	.464
Residual SE (df = 740,587)	1.008	1.007	1.007	1.006

	Dependent Variable: Log(Forum Activity)
Paying user × Before reaching passing grade	.070***	.076**
	(.023)	(.036)
Before reaching passing grade	1.110***	1.122***
	(.008)	(.013)
Paying user	.335***
	(.024)
Course FE	Yes
User-course FE		Yes
Observations	98,818	98,818
R²	.270	.881
Adjusted R²	.270	.677
Residual SE	1.461 (df = 98,756)	0.971 (df = 36,411)



Appendix: Constructing an Ability Metric

In our conceptualization, we discussed the fact that ability could be an important underlying factor that could explain different engagement levels across nonpaying and paying users. To empirically test this conjecture, we need a measure of ability. We extend the ideas discussed in the “Engagement and Learning Outcomes” section to construct a measure for ability, that is, efficiency in transforming input (time) to output (grades). We use the data at the quiz-attempt level to parse out a metric of ability. More precisely, we regard each student as a production plant that transforms input (time investment) to output (grade). A participant’s performance on a quiz attempt is likely to be a function of the following:

Time that they have invested in course content for the quiz, including the time spent in all previous attempts on the same quiz.

Total number of prior attempts on the same quiz. We conjecture that more prior attempts will enable a participant to obtain better grades on the quiz.

Time that they have invested in course content for all other quizzes prior to the focal quiz. The natural progression of a course might demand accumulation of knowledge. This component would capture engagement with prior material and thus act as a proxy for accumulated knowledge.

Characteristics of the quiz.

The participant’s intrinsic ability.

For each quiz attempt, we observe the total amount of time invested so far on the focal quiz, the rest of the quizzes, and the number of attempts made on the current quiz by each student. Our idea is that once we control for these and other variables such as the quiz FEs, the individual-level FEs reflect the ability of users in converting time investment into grades.

Formally, we consider the following specification:
$\begin{aligned} \log (1 + S_{it}) = {} & A_{i} + η_{q_{t}} + α_{- q_{t}} \log (1 + \bar{I}_{i, - q_{t}, t}) \\ & + α_{q_{t}} \log (1 + \bar{I}_{i, q_{t}, t}) + β_{q_{t}} \log (1 + T_{i, q_{t}, t - 1}) + ε_{it}, \end{aligned}$
(4)
where $S_{it}$ is the score of user i after the $t^{th}$ observation (quiz attempt), $A_{i}$ is the user-course FE, $η_{q_{t}}$ is a quiz FE, $\bar{I}_{i, - q_{t}, t}$ represents the total amount of time spent on other quizzes until the $t^{th}$ attempt, $\bar{I}_{i, q_{t}, t}$ denotes the total amount of time spent on quiz $q_{t}$ until the $t^{th}$ observation, and $T_{i, q_{t}, t - 1}$ is the total number of attempts on quiz $q_{t}$ until step t. For a given level of time investment on current and past quizzes, a participant with a higher user-course FE (i.e., $A_{i}$) is likely to achieve a better grade. In what follows, we view $A_{i}$ as a metric of a participant’s ability.
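The regressors in this specification can be assembled from a chronological log of quiz attempts. The sketch below is a hypothetical illustration for one user-course: the record layout `(quiz_id, minutes_spent, score)`, the function name, and the field names are assumptions, and the fixed-effects estimation that recovers the ability term is not shown.

```python
import math
from collections import defaultdict


def ability_regressors(attempts):
    """Build one row of regressors per quiz attempt for a single user-course.

    attempts: chronological list of (quiz_id, minutes_spent, score).
    Time spent on the current attempt counts toward the focal quiz, and the
    attempt count covers only attempts made before the current one.
    """
    time_by_quiz = defaultdict(float)
    prior_attempts = defaultdict(int)
    rows = []
    for quiz, minutes, score in attempts:
        time_by_quiz[quiz] += minutes
        own_time = time_by_quiz[quiz]
        other_time = sum(time_by_quiz.values()) - own_time
        rows.append({
            "quiz": quiz,
            "log_score": math.log1p(score),
            "log_time_own": math.log1p(own_time),
            "log_time_other": math.log1p(other_time),
            "log_prior_attempts": math.log1p(prior_attempts[quiz]),
        })
        prior_attempts[quiz] += 1
    return rows


# Hypothetical user: two attempts on quiz q1, then one attempt on quiz q2.
log = [("q1", 30.0, 60.0), ("q1", 10.0, 80.0), ("q2", 20.0, 90.0)]
rows = ability_regressors(log)
for row in rows:
    print(row)
```

With rows like these stacked across users, the user-course intercepts from a regression that also includes quiz dummies would play the role of the ability term.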

Note that $A_{i}$ also captures cross-sectional differences across courses in terms of required engagement levels. To perform a fair comparison between the ability of both types of users, we take the z-score of $A_{i}$ within each course and compare the empirical CDF of ability z-scores for the users in Figure A1. Consistent with the results in Figure 7, we see that paying users have greater ability than nonpaying participants. More importantly, the CDF of the estimated ability for paying users seems to first-order stochastically dominate that of nonpaying users.
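The within-course standardization and the comparison of empirical CDFs can be sketched as follows. All data below are made up, the helper names are hypothetical, and the use of the population standard deviation is an assumption; the dominance check simply tests whether one empirical CDF lies weakly below the other at every observed point.

```python
from statistics import mean, pstdev


def zscore_within_course(ability, course_of):
    """Standardize each user's ability using the mean and SD of their own course."""
    by_course = {}
    for user, value in ability.items():
        by_course.setdefault(course_of[user], []).append(value)
    stats = {c: (mean(v), pstdev(v)) for c, v in by_course.items()}
    return {u: (a - stats[course_of[u]][0]) / stats[course_of[u]][1]
            for u, a in ability.items()}


def ecdf(sample, x):
    """Share of the sample that is less than or equal to x."""
    return sum(1 for s in sample if s <= x) / len(sample)


def first_order_dominates(a, b):
    """True if the ECDF of `a` lies weakly below that of `b` everywhere."""
    grid = sorted(set(a) | set(b))
    return all(ecdf(a, x) <= ecdf(b, x) for x in grid)


# Hypothetical ability estimates for two courses measured on different scales.
ability = {"u1": 1.0, "u2": 2.0, "u3": 3.0, "u4": 10.0, "u5": 20.0, "u6": 30.0}
course_of = {"u1": "A", "u2": "A", "u3": "A", "u4": "B", "u5": "B", "u6": "B"}
z = zscore_within_course(ability, course_of)

paying = [z[u] for u in ("u2", "u3", "u6")]
nonpaying = [z[u] for u in ("u1", "u4", "u5")]
print(first_order_dominates(paying, nonpaying))
```

Standardizing within course removes the scale difference between courses A and B, so users are compared only on their position relative to classmates.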

We attempt to verify the validity of our ability metric through a battery of checks. First, participants with more advanced educational achievement plausibly have greater ability. This relationship could arise either because participants with higher educational achievement have the requisite background to assimilate the course material faster than their peers or because participants with higher intellectual ability also possess more advanced degrees. To verify whether this relationship holds, we grouped the users into five tiers according to their estimated ability, $A_{i}$. In Figure A2, we compare the ability tier of users who submitted the demographic survey as a function of the highest degree of schooling they reported in the surveys. We find that users with more advanced degrees tend to have a higher estimated ability.

Next, we compared these ability tiers with the average frequency of quiz attempts in Table A1. The idea is that participants with greater ability do not need to attempt a quiz multiple times to fare well. The results in Table A1 show a negative correlation between the number of quiz attempts and student ability as elicited by the FEs in Equation 4. This is consistent with our expectation. Together, these analyses give us the confidence that we have a valid measure of ability, which we use in our subsequent empirical analyses.
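The tier-based check can be illustrated with a small sketch. The text does not say how the five tiers are formed, so the equal-sized, rank-based tiers used here are an assumption, as are the function names and the toy data.

```python
def ability_tiers(ability, n_tiers=5):
    """Assign tier 1 (lowest ability) to n_tiers (highest) by rank.

    Assumption: tiers are equal-sized groups of the ability ranking.
    """
    ranked = sorted(ability, key=ability.get)
    n_users = len(ranked)
    return {user: 1 + (rank * n_tiers) // n_users
            for rank, user in enumerate(ranked)}


def mean_attempts_by_tier(tiers, attempts):
    """Average number of quiz attempts within each ability tier."""
    totals = {}
    for user, tier in tiers.items():
        running_sum, count = totals.get(tier, (0, 0))
        totals[tier] = (running_sum + attempts[user], count + 1)
    return {tier: s / c for tier, (s, c) in sorted(totals.items())}


# Hypothetical data: ten users whose attempt counts fall as ability rises.
ability = {f"u{i}": float(i) for i in range(10)}
attempts = {f"u{i}": 10 - i for i in range(10)}

tiers = ability_tiers(ability)
means = mean_attempts_by_tier(tiers, attempts)
print(means)
```

In this toy example the average number of attempts declines monotonically from the lowest to the highest tier, which is the pattern of negative correlation the check looks for.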

Acknowledgments

The authors thank the JMR review team. They thank seminar participants at University of Chicago, University of Washington, University College London, University of Houston, Columbia University, Indiana University, and the Marketing Science conference at Duke University. This article has benefited substantially from conversations with Jean-Pierre Dubé, Günter Hitsch, Anita Rao, Sanjog Misra, and Sarah Moshary.

Associate Editor

Randolph Bucklin

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.


References

Arkes

Hal R.

Blumer

Catherine

(1985), “The Psychology of Sunk Cost,” Organizational Behavior and Human Decision Processes, 35 (1), 124–40.

Breugelmans

Els

Bijmolt

Tammo H.A.

Zhang

Jie

Basso

Leonardo J.

Dorotic

Matilda

Kopalle

Praveen

, et al. (2015), “Advancing Research on Loyalty Programs: A Future Research Agenda,” Marketing Letters, 26 (2), 127–39.

Brodie

Roderick J.

Hollebeek

Linda D.

Jurić

Biljana

Ilić

Ana

(2011). “Customer Engagement: Conceptual Domain, Fundamental Propositions, and Implications for Research,” Journal of Service Research, 14 (3), 252–71.

Bruine de Bruin

Wändi

Parker

Andrew M.

Fischhoff

Baruch

(2007), “Individual Differences in Adult Decision-Making Competence,” Journal of Personality and Social Psychology, 92 (5), 938.

Bruine de Bruin

Wändi

Strough

JoNell

Parker

Andrew M.

(2014), “Getting Older Isn’t All That Bad: Better Decisions and Coping When Facing ‘Sunk Costs,’” Psychology and Aging, 29 (3), 642.

Carini

Robert M.

Kuh

George D.

Klein

Stephen P.

(2006), “Student Engagement and Student Learning: Testing the Linkages,” Research in Higher Education, 47 (1), 1–32.

Chen

Tianqi

Tong

Benesty

Michael

Khotilovich

Vadim

Tang

Yuan

(2015), “XGboost: Extreme Gradient Boosting,” R Package Version 0.4-2, 1 (4).

Christensen

Gayle

Steinmetz

Andrew

Alcorn

Brandon

Bennett

Amy

Woods

Deirdre

Emanuel

Ezekiel

(2013), “The MOOC Phenomenon: Who Takes Massive Open Online Courses and Why?” working paper, https://doi.org/10.2139/ssrn.2350964.

Chung

Doug J.

Steenburgh

Thomas

Sudhir

(2014), “Do Bonuses Enhance Sales Productivity? A Dynamic Structural Analysis of Bonus-Based Compensation Plans,” Marketing Science, 33 (2), 165–87.

10.

Clark

Kevin R

. (2015), “The Effects of the Flipped Model of Instruction on Student Engagement and Performance in the Secondary Mathematics Classroom,” Journal of Educators Online, 12 (1), 91–115.

11.

Coursera (n.d.), “Solve Problems with Your Final Grade,” (accessed July 26, 2021), https://learner.coursera.help/hc/en-us/articles/209818773-Solve-problems-with-your-final-grade.

12.

Dillahunt

Tawanna R.

Wang

Brian Zengguang

Teasley

Stephanie

(2014), “Democratizing Higher Education: Exploring MOOC Use Among Those Who Cannot Afford a Formal Education,” International Review of Research in Open and Distributed Learning, 15 (5), 177–96.

13.

Fishbach

Ayelet

Dhar

Ravi

(2005), “Goals as Excuses or Guides: The Liberating Effect of Perceived Goal Progress on Choice,” Journal of Consumer Research, 32 (3), 370–77.

14.

Foster

Lucia

Haltiwanger

John

Syverson

Chad

(2008), “Reallocation, Firm Turnover, and Efficiency: Selection on Productivity or Profitability?” American Economic Review, 98 (1), 394–425.

15.

Glass

Chris R.

Shiokawa-Baklan

Mitsue S.

Saltarelli

Andrew J.

(2016), “Who Takes MOOCs?” New Directions for Institutional Research, 2015 (167), 41–55.

16.

Gourville

John T.

Soman

Dilip

(1998), “Payment Depreciation: The Behavioral Effects of Temporally Separating Payments from Consumption,” Journal of Consumer Research, 25 (2), 160–74.

17.

Heckman

James J.

Kautz

Tim

(2012), “Hard Evidence on Soft Skills,” Labour Economics, 19 (4), 451–64.

18.

Heckman

James J.

Rubinstein

Yona

(2001), “The Importance of Noncognitive Skills: Lessons from the GED Testing Program,” American Economic Review, 91 (2), 145–49.

19.

Teck-Hua

Png

Ivan P.L.

Reza

Sadat

(2018), “Sunk Cost Fallacy in Driving the World’s Costliest Cars,” Management Science, 64 (4), 1761–78.

20.

Hsieh

Chang-Tai

Klenow

Peter J.

(2009), “Misallocation and Manufacturing TFP in China and India,” Quarterly Journal of Economics, 124 (4), 1403–48.

21.

Hughes

Rees

Robert Pace

(2003), “Using NSSE to Study Student Retention and Withdrawal,” Assessment Update, 15 (4), 1–2.

22.

Kahneman

Daniel

Tversky

Amos

(2013), “Prospect Theory: An Analysis of Decision Under Risk,” in Handbook of the Fundamentals of Financial Decision Making: Part I, MacLean

L.C.

Ziemba

W.T.

, eds. Singapore: World Scientific, 99–127.

23.

Kaur

Supreet

Kremer

Michael

Mullainathan

Sendhil

(2015), “Self-Control at Work,” Journal of Political Economy, 123 (6), 1227–77.

24.

Khalil

Hanan

Ebner

Martin

(2014), “MOOCs Completion Rates and Possible Methods to Improve Retention—a Literature Review,” in EdMedia+ Innovate Learning. Waynesville, NC: Association for the Advancement of Computing in Education, 1305–13.

25.

King

Gary

Daniel

Stuart

Elizabeth A.

Imai

Kosuke

(2011), “Matchit: Nonparametric Preprocessing for Parametric Causal Inference,” Journal of Statistical Software 42 (8), 1–28.

26.

Koller

Daphne

Andrew

Chen

Zhenghao

(2013), “Retention and Intention in Massive Open Online Courses,” Educause Review (June 3), https://er.educause.edu/articles/2013/6/retention-and-intention-in-massive-open-online-courses.

27.

Kopalle

Praveen K.

Sun

Yacheng

Neslin

Scott A.

Sun

Baohong

Swaminathan

Vanitha

(2012), “The Joint Sales Impact of Frequency Reward and Customer Tier Components of Loyalty Programs,” Marketing Science, 31 (2), 216–35.

28.

Kumar

Aksoy

Lerzan

Donkers

Bas

Venkatesan

Rajkumar

Wiesel

Thorsten

Tillmanns

Sebastian

(2010), “Undervalued or Overvalued Customers: Capturing Total Customer Engagement Value,” Journal of Service Research, 13 (3), 297–310.

29.

Tong (Joy)

Bradlow

Eric T.

Wesley Hutchinson

(2017), “Binge Consumption of Online Content,” working paper, University of Pennsylvania.

30. Misra, Sanjog and Harikesh S. Nair (2011), “A Structural Model of Sales-Force Compensation Dynamics: Estimation and Field Implementation,” Quantitative Marketing and Economics, 9 (3), 211–57.

31. Monin, Benoit and Dale T. Miller (2001), “Moral Credentials and the Expression of Prejudice,” Journal of Personality and Social Psychology, 81 (1), 33.

32. Onah, Daniel F.O., Jane Sinclair, and Russell Boyatt (2014), “Dropout Rates of Massive Open Online Courses: Behavioural Patterns,” EDULEARN14 Proceedings, 1, 5825–34.

33. Pansari, Anita and V. Kumar (2017), “Customer Engagement: The Construct, Antecedents, and Consequences,” Journal of the Academy of Marketing Science, 45 (3), 294–311.

34. Prelec, Drazen and George Loewenstein (1998), “The Red and the Black: Mental Accounting of Savings and Debt,” Marketing Science, 17 (1), 4–28.

35. Romer, David (1993), “Do Students Go to Class? Should They?” Journal of Economic Perspectives, 7 (3), 167–74.

36. Strough, JoNell, Clare M. Mehta, Joseph P. McFall, and Kelly L. Schuller (2008), “Are Older Adults Less Subject to the Sunk-Cost Fallacy than Younger Adults?” Psychological Science, 19 (7), 650–2.

37. Strough, JoNell, Leo Schlosnagle, and Lisa DiDonato (2011), “Understanding Decisions About Sunk Costs from Older and Younger Adults’ Perspectives,” Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 66 (6), 681–6.

38. Thaler, Richard (1980), “Toward a Positive Theory of Consumer Choice,” Journal of Economic Behavior & Organization, 1 (1), 39–60.

39. Thaler, Richard (1985), “Mental Accounting and Consumer Choice,” Marketing Science, 4 (3), 199–214.

40. Thaler, Richard H. (1999), “Mental Accounting Matters,” Journal of Behavioral Decision Making, 12 (3), 183–206.

41. Venkatesan, Rajkumar, J. Andrew Petersen, and Leandro Guissoni (2018), “Measuring and Managing Customer Engagement Value Through the Customer Journey,” in Customer Engagement Marketing, Robert W. Palmatier, V. Kumar, and Colleen M. Harmeling, eds. Berlin: Springer, 53–74.

42. Vivek, Shiri D., Sharon E. Beatty, and Robert M. Morgan (2012), “Customer Engagement: Exploring Customer Relationships Beyond Purchase,” Journal of Marketing Theory and Practice, 20 (2), 122–46.

43. Weiss, Iris R. and Joan D. Pasley (2004), “What Is High-Quality Instruction?” Educational Leadership, 61 (5), 24.

44. Zhang, Tong and Bin Yu (2005), “Boosting with Early Stopping: Convergence and Consistency,” Annals of Statistics, 33 (4), 1538–79.

45. Zhang, Yao, Eric T. Bradlow, and Dylan S. Small (2014), “Predicting Customer Value Using Clumpiness: From RFM to RFMC,” Marketing Science, 34 (2), 195–208.


	All Observations		Nonzero Final Grade
Variable	Nonpaying	Paying	Nonpaying	Paying
Total activity (min.)	244.11	1,533.34	749.42	1,703.18
	(.58)	(9.94)	(1.82)	(10.56)
Average session duration (min.)	23.10	41.94	38.72	43.82
	(.03)	(.15)	(.05)	(.15)
Average no. of sessions	6.93	35.42	18.86	39.17
	(.01)	(.19)	(.04)	(.20)
Forum activity (posts/visits)	29.77	90.51	45.20	94.81
	(.25)	(1.94)	(.41)	(2.03)
Average grade (%)	5.91	63.73	33.54	73.70
	(.01)	(.27)	(.05)	(.24)
Graduation rate (%)	3.69	56.72	20.91	65.59
	(.01)	(.33)	(.06)	(.34)
Observations	1,078,057	23,674	298,012	19,645

	Dependent Variable: Log(Time Investment + 1)
	Full Sample		Matched Sample
	(1)	(2)	(3)
Paying user × Before crossing ( $α$ )	.081***	.126***	.081*
	(.026)	(.029)	(.048)
Paying user × Before crossing × Late payer		−.089***
		(.025)
Before crossing ( $ξ$ )	.152***	.152***	.154***
	(.010)	(.010)	(.028)
Weeks after payment ( $ω$ )	.165***	.178***	.181***
	(.014)	(.023)	(.026)
Weeks after payment × Late payer		−.009
		(.028)
Weeks after payment depreciation ( $γ$ )	−.016***	−.011	−.032***
	(.005)	(.008)	(.010)
Weeks after payment depreciation × Late payer		−.014
		(.011)
Observations	844,939	844,939	121,127
R²	.529	.529	.492
Adjusted R²	.463	.463	.426
Residual SE	1.008 (df = 740,587)	1.008 (df = 740,584)	.982 (df = 107,144)
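
Because the dependent variable is Log(Time Investment + 1), the coefficients in the table above can be read approximately as proportional changes in time investment. The following is a minimal sketch, not the authors' code, that converts two column (1) point estimates into percentage effects via exp(β) − 1. The function name `pct_effect` and the dictionary labels are illustrative assumptions, the mapping of α to the certificate effect and ω to the sunk-cost effect is inferred from the abstract, and the conversion ignores the +1 inside the logarithm.

```python
import math


def pct_effect(beta: float) -> float:
    """Percentage change in the outcome implied by a log-outcome coefficient."""
    # expm1 computes exp(beta) - 1 accurately for small beta.
    return 100.0 * math.expm1(beta)


# Column (1) point estimates copied from the table above.
# The labels are illustrative assumptions, not the authors' notation.
column_1 = {
    "certificate effect (alpha)": 0.081,
    "sunk-cost effect (omega)": 0.165,
}

for label, beta in column_1.items():
    print(f"{label}: {pct_effect(beta):.1f}%")
```

Under these assumptions, the two estimates correspond to increases of roughly 8.4% and 17.9%, which fall within the 8%–9% and 17%–20% ranges stated in the abstract.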

	Dependent Variable: Log(Time Investment + 1)
	Full Sample		Matched Sample
	Conditional on Passing		Conditional on Passing
	(1)	(2)	(3)
Paying user × Before crossing ( $α$ )	.087***	.130***	.091*
	(.026)	(.029)	(.048)
Paying user × Before crossing × Late payer		−.086***
		(.025)
Before crossing ( $ξ$ )	.171***	.171***	.176***
	(.011)	(.011)	(.029)
Weeks after payment ( $ω$ )	.151***	.172***	.137***
	(.015)	(.024)	(.028)
Weeks after payment × Late payer		−.020
		(.029)
Weeks after payment depreciation ( $γ$ )	−.014**	−.011	−.022**
	(.006)	(.009)	(.011)
Weeks after payment depreciation × Late payer		−.013
		(.011)
Observations	614,892	614,892	96,148
R²	.487	.487	.462
Adjusted R²	.431	.431	.405
Residual SE	.985 (df = 553,898)	.985 (df = 553,895)	.963 (df = 86,830)

	Dependent Variable: User FE from Specification 1
	Without Ability Tier Controls	With Ability Tier Controls
	(1)	(2)
Paying users	.308***	.299***
	(.013)	(.013)
Ability tier 2		.197***
		(.011)
Ability tier 3		.235***
		(.011)
Ability tier 4		.183***
		(.011)
Ability tier 5		−.211***
		(.011)
Observations	104,006	104,006
R²	.278	.295
Adjusted R²	.277	.294
Residual SE	1.081 (df = 103,946)	1.069 (df = 103,942)

	Dependent Variable: Log(Time Investment + 1)
	Passing Grade	Passing Grade − 5	Passing Grade − 10	Passing Grade − 15
	(1)	(2)	(3)	(4)
Paying user × Before reaching the threshold ( $α$ )	.081***	.055**	.026	−.018
	(.026)	(.025)	(.023)	(.023)
Before reaching the threshold ( $ξ$ )	.152***	.188***	.219***	.289***
	(.010)	(.010)	(.010)	(.010)
Weeks after payment ( $ω$ )	.165***	.163***	.161***	.159***
	(.014)	(.014)	(.014)	(.014)
Weeks after payment depreciation ( $γ$ )	−.016***	−.016***	−.015***	−.014***
	(.005)	(.005)	(.005)	(.005)
Observations	844,939	844,939	844,939	844,939
R²	.529	.529	.529	.530
Adjusted R²	.463	.463	.463	.464
Residual SE (df = 740,587)	1.008	1.007	1.007	1.006

	Dependent Variable: Log(Forum Activity)
	Without User FE	With User FE
	(1)	(2)
Paying user × Before reaching passing grade	.070***	.076**
	(.023)	(.036)
Before reaching passing grade	1.110***	1.122***
	(.008)	(.013)
Paying user	.335***
	(.024)
Course FE	Yes
User-course FE		Yes
Observations	98,818	98,818
R²	.270	.881
Adjusted R²	.270	.677
Residual SE	1.461 (df = 98,756)	.971 (df = 36,411)

	Dependent Variable: Log(Time Investment + 1)
	Final Grade > Passing Threshold	Final Grade > Passing Threshold + 5	Final Grade > Passing Threshold + 10	Final Grade > Passing Threshold + 15
	(1)	(2)	(3)	(4)
Paying user × Before crossing	.087***	.086***	.091***	.083***
	(.026)	(.026)	(.026)	(.027)
Before crossing	.171***	.172***	.169***	.172***
	(.011)	(.011)	(.011)	(.011)
Weeks after payment	.151***	.146***	.144***	.148***
	(.015)	(.015)	(.015)	(.015)
Weeks after payment depreciation	−.014**	−.013**	−.010*	−.014**
	(.006)	(.006)	(.006)	(.006)
Observations	614,892	588,716	555,549	506,148
R²	.487	.486	.484	.479
Adjusted R²	.431	.430	.428	.423
Residual SE	.985 (df = 553,898)	.981 (df = 530,879)	.975 (df = 501,342)	.969 (df = 457,120)

	Dependent Variable:
	Forum Activity Dummy	Log(Total Forum Activity + 1)
	(1)	(2)
Weeks after payment	.101***	.256***
	(.005)	(.011)
Weeks since payment (up to four)	−.032***	−.074***
	(.002)	(.004)
Observations	196,750	196,750
R²	.359	.457
Adjusted R²	.278	.388
Residual SE (df = 174,728)	.419	1.070

	Dependent Variable: Log(Grade on the Quiz)
	No FE	Quiz FE	Full Model
	(1)	(2)	(3)
Log time investment	.033***	.072***	.061***
	(.0003)	(.0002)	(.0002)
Paying user	.034***	.012***
	(.002)	(.001)
Constant	1.683***
	(.003)
Quiz FE		Yes	Yes
User-course FE			Yes
Observations	1,871,734	1,871,734	1,871,734
R²	.007	.637	.797
Adjusted R²	.007	.636	.738
Residual SE	.602 (df = 1,871,731)	.364 (df = 1,871,127)	.309 (df = 1,452,986)
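
With a logged grade as the dependent variable and logged time investment as the regressor, the coefficient in the table above can be read approximately as an elasticity. The following is a minimal sketch, not the authors' code, of what the full-model estimate would imply for a given percentage change in time investment. The function name `grade_change_pct` is hypothetical, and the sketch assumes the regressor is a plain logarithm of time investment; the exact transform is not shown in the table.

```python
def grade_change_pct(elasticity: float, time_increase_pct: float) -> float:
    """Percent change in quiz grade implied by a percent change in time investment."""
    # In a log-log model, grade scales with (time ratio) ** elasticity.
    ratio = 1.0 + time_increase_pct / 100.0
    return 100.0 * (ratio ** elasticity - 1.0)


# Full-model point estimate from column (3) of the table above.
FULL_MODEL_ELASTICITY = 0.061

for increase in (10.0, 100.0):
    change = grade_change_pct(FULL_MODEL_ELASTICITY, increase)
    print(f"+{increase:.0f}% time -> {change:+.2f}% quiz grade")
```

Under these assumptions, a 10% increase in time investment is associated with a quiz grade about 0.6% higher, and a doubling of time investment with a grade about 4.3% higher.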