Abstract
Massive open online courses (MOOCs) have the potential to democratize education by improving access. Although retention and completion rates for nonpaying users have not been promising, these statistics are much brighter for users who pay to receive a certificate upon completing the course. We investigate whether paying for the certificate option can increase engagement with course content. In particular, we consider two effects: (1) the certificate effect, which is the boost in motivation to stay engaged to receive the certificate; and (2) the sunk-cost effect, which arises solely because the user paid for the course. We use data from over 70 courses offered on the Coursera platform and study the engagement of individual participants at different milestones within each course. The panel nature of the data enables us to include controls for intrinsic differences between nonpaying and paying users in terms of their desire to stay engaged. We find evidence that the certificate and sunk-cost effects increase user engagement by approximately 8%–9% and 17%–20%, respectively. Whereas the sunk-cost effect is transient and lasts for only a few weeks after payment, the certificate effect lasts until the participant reaches the grade required to be eligible to receive the certificate. We discuss the implications of our findings for how platforms and content creators may design course milestones and schedule payment of course fees. Given that greater engagement tends to improve learning outcomes, our study serves as an important first step in understanding the role of prices and payment in enabling MOOCs to realize their full potential.
Massive open online courses (MOOCs) have the potential to democratize education by improving access (see Christensen et al. 2013; Dillahunt, Wang, and Teasley 2014; Glass, Shiokawa-Baklan, and Saltarelli 2016), especially because many MOOCs allow users to take courses for free, thereby enabling participants from lower socioeconomic strata to access their content. Over the past decade, platforms such as Coursera, edX, XuetangX, Udacity, and FutureLearn have partnered with hundreds of universities, offered thousands of courses, and attracted millions of users. As of November 2018, Coursera alone has attracted 30 million users, offers more than 3,000 courses, and has 177 university partners. Despite this potential to democratize education, low retention and completion rates in MOOCs have raised doubts about their prospects; for example, see Onah, Sinclair, and Boyatt (2014) and Khalil and Ebner (2014).
Although retention and completion rates for nonpaying users have not been promising, these statistics are much brighter for those who pay to receive a certificate upon course completion (Koller, Ng, and Chen 2013). Two factors may contribute to this disparity in performance between paying users and those who take these courses for free: (1) higher-ability participants self-select into signing up for a certificate, and (2) users who pay may stay more engaged with the course content than nonpaying users. Whereas the former is usually viewed as an intrinsic characteristic of users, the latter can potentially be altered via appropriate incentives and course design.
In this article, we investigate whether signing up for the certificate option is associated with greater engagement with course content. Our research is motivated by the premise that increasing user engagement can yield several benefits to the platform, content creators, and users. In a similar vein, research in other contexts has documented that greater engagement can be a good predictor of customer retention, lifetime value (Zhang, Bradlow, and Small 2014), and advocacy and referrals (Pansari and Kumar 2017). Therefore, exploring avenues to improve student engagement is likely to be of interest to all parties involved in online education.
We conceptualize that the relationship between signing up for the certificate option and engagement might be driven by factors that are (1) intrinsic to the student, such as their intrinsic ability and motivation, and (2) related to having paid for the course in order to obtain a certificate. The first of these two factors is consistent with the idea that selection drives the difference between both types of users for a given course. Regarding the second of these factors, we consider that the difference between paying users and their counterparts taking the course for free could stem from factors that influence the engagement of the former group over time. One such feature is that a paying user can obtain a verified certificate after accumulating the minimum points necessary to pass the course. As a result, a paying user is likely to experience a different level of motivation than a user who is taking the course for free. This gap in motivation level can change over time once the user achieves the minimum point threshold. We call this the “certificate effect.” A further difference between the two types of users is that paying users, as the name implies, pay for the course. Thus, they might demonstrate higher engagement merely because they made the payment, plausibly as a result of falling prey to the sunk-cost effect (e.g., Gourville and Soman 1998). Together, the certificate effect and the sunk-cost effect constitute temporal effects that occur in response to paying for the signature track.
From the perspective of the platform, why is identifying which of these two reasons—intrinsic or temporal—drives the differential engagement of paying and nonpaying users important? Evidence in support of the latter reason would suggest that the level of engagement is potentially malleable and can be influenced by modifying the design of the courses and the payment schedule. Therefore, findings from our research investigating these factors can shed light on actions that MOOC platforms can take to increase engagement among their users and, by extension, the learning and course completion rates on the platform. However, as we describe subsequently, empirically separating the various effects can be challenging.
We perform our empirical analysis using data from over 70 courses offered by a large public university on the Coursera platform. Although courses on the platform are generally offered for free, users can choose to pay a fee to obtain a certificate upon successful completion of the course. When our data were collected, the paid service at Coursera was called “signature track.” The signature track offered three services: identity verification, verified certificates, and shareable course records. The certificate is awarded to paying users if they achieve the minimum number of points required to pass the course. The data are granular and contain information on the time spent by individual users in accessing course content, users’ participation in the discussion forums, and users’ performance (grades) on individual quizzes and assignments. We perform our analysis by considering each user’s level of engagement (as measured in terms of time spent accessing course content and participating in discussion forums) with the material corresponding to each quiz/assignment (hereinafter, quiz) within a course. To separate the effects of the intrinsic factors (i.e., ability and intrinsic motivation—the two key drivers of selection that we discussed previously), we leverage the panel nature of our data on engagement by including strict controls in the form of user-course fixed effects (FEs). The panel data also allow us to include quiz FEs to account for differences in the extent to which individual assessments in each of these courses (i.e., quizzes and assignments) demand time commitments from participants.
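The role of the user-course FEs can be illustrated with a small sketch of the within transformation. This is not the estimation procedure used in the article; the function name `demean_within_groups` and the numbers are hypothetical, and the sketch ignores the quiz FEs. It only shows that subtracting each user-course mean removes any level difference that is constant within a user-course pair, such as ability or intrinsic motivation.

```python
from collections import defaultdict

def demean_within_groups(rows):
    """Subtract each group's mean from its values (the within transformation).

    rows: list of (group_key, value) pairs, where group_key identifies a
    hypothetical user-course pair and value is time spent on one quiz.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in rows:
        sums[key] += value
        counts[key] += 1
    return [value - sums[key] / counts[key] for key, value in rows]

# Made-up data: user A always spends 80 more minutes than user B,
# and both spend 10 extra minutes on the second quiz.
rows = [
    (("A", "course1"), 100.0), (("A", "course1"), 110.0), (("A", "course1"), 100.0),
    (("B", "course1"), 20.0), (("B", "course1"), 30.0), (("B", "course1"), 20.0),
]
demeaned = demean_within_groups(rows)
demeaned_a, demeaned_b = demeaned[:3], demeaned[3:]
```

After demeaning, the two users' series coincide: the constant 80-minute gap is gone, and only the within-user variation across quizzes remains.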
To parse out the certificate effect, which is the consequence of offering a verified certificate and any benefits that may accrue from it through shareable course records, we exploit the idea that users who pay for the signature track will receive a certificate only if they obtain a passing grade in the course. 1 Although all users are likely to prefer higher grades (to lower ones) in general, users who have signed up for the signature track are likely to derive disproportionately more utility from passing the course. Therefore, we expect to see a shift in their motivation to stay engaged with the material around the threshold grade for passing the course. To infer the presence of the sunk-cost effect, we exploit the fact that the platform allows users to sign up for the signature track as early as a few weeks before the course commences and as late as a few weeks after the start of the course. We observe that users exhibit considerable heterogeneity in terms of when they pay for the service. Therefore, although early and late payers are both motivated by the certificate (i.e., the certificate effect), at each milestone (quiz), they differ in terms of how recently they paid for the course. Note that the certificate effect is common across paying users. Therefore, the systematic relationship between engagement and the recency of payment among these users helps us detect the presence of any sunk-cost effect.
Our results suggest that both paying and nonpaying users were more engaged with the course before reaching the passing grade. However, this elevated level of engagement before reaching the passing grade was 8%–9% higher among paying users than among their peers taking the course for free. This finding shows that the certificate effect, a consequence of signing up for the signature track, altered the motivation to pass the course and drove some of the difference in engagement between both kinds of users. This result is robust when we include alternative controls and approaches to matching nonpaying and paying users. We further consider the idea that paying for the signature option drove higher engagement as a result of the sunk-cost effect. We find that in the weeks immediately following the payment, users spent 17%–20% more time on the platform. However, this effect depreciated rapidly within four weeks after payment, a pattern consistent with the presence of the sunk-cost effect. Therefore, although the sunk-cost effect cannot fully explain all of the higher engagement among paying users until they cross the passing threshold, it seems to have played a role in driving some of the differential engagement during the period immediately after making the payment. 2 In addition to the main effects, we examine heterogeneity as a function of how early in the course a participant commits to paying for the certificate. Our results suggest that the certificate effect is higher among early payers than among those who pay later in the course. Because users are always weakly better off by delaying the payment, we can view early payment as a potential commitment device that restricts their future actions (see, e.g., Kaur, Kremer, and Mullainathan 2015).
Our results imply that paying for the signature track can increase engagement through the sunk-cost and certificate effects. This finding is likely to be of interest to education platforms and creators of course content, who may be interested in increasing engagement either to improve learning outcomes or to foster future enrollment. Importantly, these benefits are also aligned with the objective of these agents to monetize content by charging users. Some of this malleability in engagement comes from the motivation to receive a certificate. Although the idea that issuing a certificate can motivate participants to stay more engaged might be intuitive, the coexistence of free and paid options in our context enables us to document this effect empirically. Further, the finding that the mere act of paying will induce participants to increase engagement, albeit immediately following the payment, is indeed informative. We discuss the implications of our findings for how platforms and content creators may want to design course milestones and schedule the payment of course fees.
The primary contribution of this paper is to demonstrate the roles of payment and the presence of incentives in the form of certificates on user engagement. This engagement, in turn, is likely to be a key antecedent of educational outcomes; see, for example, Romer (1993), Hughes and Pace (2003), and Carini, Kuh, and Klein (2006). The certificate effect is reminiscent of studies that document the impact of incentive schemes on sales force effectiveness in the marketing literature (Chung, Steenburgh, and Sudhir 2014; Misra and Nair 2011). In contrast to the sales force literature where effort is not observed, a unique feature of our data is that we observe longitudinal variation in student engagement within courses. Furthermore, we are able to leverage the quasi-experimental nature of variation across nonpaying and paying users in different courses. Together, a continuous measure of effort across these different types of users allows us to isolate the temporal effects from cross-sectional ones. The latter has traditionally been the focus of the education literature (e.g., Heckman and Kautz 2012; Heckman and Rubinstein 2001; Romer 1993). Our study therefore highlights the importance of additional temporal factors, namely, the certificate and sunk-cost effects, and how they are influenced by incentive schemes and payments. The findings from this study demonstrate that marketing plays a vital role in influencing online education in a manner different from how educators might approach this issue.
Conceptual Framework
Learning is typically viewed as an outcome of two factors: (1) the innate ability of the participants and (2) their engagement with the course content (a proxy for effort). Of these two, ability can be viewed as a time-invariant trait of a participant, at least within the context of a course. Participants might have higher ability either because they have the requisite background knowledge that enables them to perform well in the course or because of their aptitude to grasp new material. Because ability is unlikely to change within the span of a course, platforms need to consider enhancing engagement as a means to achieving better learning outcomes. Whereas the education literature has had to rely on cross-sectional data and proxies for user engagement and time investment, online courses record information regarding the amount of time participants spend consuming course content. Consistent with prior research (e.g., Brodie et al. 2011; Kumar et al. 2010; Venkatesan, Petersen, and Guissoni 2018; Vivek, Beatty, and Morgan 2012), we can use this direct time spent on the course portal as a proxy for engagement. We can thus investigate plausible drivers of engagement and identify avenues to improve it.
Typically, modifying the teaching style, content, and design of courses is viewed as an avenue for improving engagement. For example, adopting a more active teaching style by moving the exercises and homework to the classroom can improve student engagement and has been the focus of studies in the education literature (Clark 2015; Weiss and Pasley 2004). Changes in course design can also make learning easier and thus enable participants to achieve greater returns for the time that they invest. In a similar vein, Lu, Bradlow, and Hutchinson (2017) study how sequential versus simultaneous release of course content can influence binge consumption of such content. Because such clumpiness in consumption is an important predictor of churn (Zhang, Bradlow, and Small 2014), the timing of release can be used to boost student engagement and retention in online courses. In this section, we consider the role of payment as a driver of engagement. As a starting point, we elaborate on the idea that two broad drivers of engagement exist.
Intrinsic Traits as Drivers of Engagement
We conjecture that two intrinsic traits are relevant. First, a participant’s ability would dictate how their engagement translates into learning outcomes. These differences in ability can have implications for a participant’s engagement. For example, the total factor productivity literature, which has studied the impact of productivity on resource allocation across firms (e.g., see Foster, Haltiwanger, and Syverson 2008; Hsieh and Klenow 2009), shows that more productive firms should produce more output and take in more input factors. Borrowing this analogy, students with greater ability should stay more engaged (i.e., more input) and obtain better grades (i.e., more output). However, participants might sometimes view the final grade as the learning outcome. Because the final grade has an upper bound, the resulting concavity can potentially result in high-ability participants expending less time than their lower-ability counterparts. Therefore, the relationship between the ability of a participant in an online course and their engagement is somewhat ambiguous.
The second intrinsic trait is the time-invariant (within the temporal span of a course) motivation level of a participant. For example, participants might differ, in a cross-sectional sense, in the benefit they derive from learning as well as the cost of time investment. Cost of time would depend on other activities that might compete for a participant’s attention, such as working, commuting, and spending time with their family. As a result, participants with a higher cost of time are likely to stay less engaged with the course material. Similarly, participants who derive greater benefit from learning (i.e., place more value on better learning outcomes) are likely to stay more engaged with the course.
How would paying for the certificate alter these intrinsic traits? We conjecture that payment should not have a direct effect on ability. On the other hand, payment can increase the perceived benefits that participants derive from a course. This perception, in turn, can increase their time-invariant motivation level and lead to greater engagement with the contents of the course. At the same time, we can envision the reverse scenario wherein highly motivated participants choose to pay for the course. The presence of this alternative explanation makes it difficult to parse out the effect of paying for the certificate on the intrinsic motivation of a participant with nonexperimental data. Therefore, we embark on the less ambitious objective of inferring the effect of paying for the certificate on the temporal changes in engagement within the span of a course.
Temporal Effects
We consider two temporal effects that are related to paying for the signature track service: the certificate effect and the mere-payment effect, plausibly driven by the sunk-cost fallacy.
The Certificate Effect
The certificate effect is motivated by the idea that during our period of analysis, participants who signed up for the signature track would receive a certificate upon achieving a minimum passing grade in the course. Therefore, paying users are more likely to be motivated to reach the passing grade than are their peers taking the course for free. These dynamics are similar to those that arise when firms institute tiered customer loyalty programs wherein incentives are offered to customers based on cumulative purchasing behavior (Breugelmans et al. 2015; Kopalle et al. 2012).
In our context, Coursera issued a certificate of completion to paying users who achieved the passing grade for the course. Therefore, per our conceptualization, paying participants are likely to experience a perceptible shift in motivation after they reach the passing grade. Free users are unlikely to experience any such change in motivation. 3 In addition, the certificate effect reflects any incremental benefits, such as the ability to share course records. 4 As a result of these changes, we predict that paying customers are likely to be more engaged with the course content than free users until they reach the passing threshold. Once this threshold is reached, the difference between these two groups of participants should shrink. 5 The presence of such a pattern can be viewed as evidence that payment altered the motivation structure of participants and thus increased engagement. Because this shift in engagement is expected to occur later in the course, after the deadline for signing up for the certificate, it cannot be explained by the idea that highly motivated participants chose the payment option.
The Sunk-Cost Effect
In addition to the certificate effect, the mere act of paying for a course can have an effect on how participants view a course and motivate them to stay more engaged with its content. The sunk-cost effect is one such phenomenon tied to payment that could potentially influence engagement. The sunk-cost fallacy is a behavioral bias wherein users keep investing time and money in projects merely because of their sunk investment. 6 If the sunk-cost effect exists among participants in MOOCs, they would increase their engagement if they end up paying for a course. Research by Gourville and Soman (1998) and Ho, Png, and Reza (2018) has also shown that the sunk-cost effect is transient and depreciates over time after the payment. 7 Such depreciation can have implications for how a MOOC platform should schedule the payment of course fees to keep participants engaged with content. As a result, the sunk-cost fallacy, which is often viewed as throwing good money after bad, might prove to be beneficial in this context by keeping users engaged in online learning platforms.
As noted in the introduction, participants differ in terms of how early in the course they commit to the signature track. If the sunk-cost effect is transient as noted in the literature, the boost in engagement as a result of paying for the course should recede over time. Notably, users might get excited about the course when the knowledge about their payment is salient. This excitement, which is tied to the salience of the payment, is likely to alter engagement as a part of the sunk-cost effect. Therefore, to infer the existence of the sunk-cost effect, we can study how engagement of a paid participant changes depending on the recency of their payment. However, as we discuss subsequently, the timing of payment is an endogenous decision made by a participant. Through a variety of analyses, we present evidence that the mere act of payment alters engagement. However, we exercise caution in interpreting the quantification of the sunk-cost effect as being conclusive.
Data Description and Background
We use data from over 70 courses offered by a large public university on the Coursera platform. The data pertain to courses offered on the platform between 2012 and 2016. An average course is approximately 10 weeks long. The data are highly granular and contain detailed information on the consumption of course material through clickstream data, quiz outcomes, and forum activity. During this period, Coursera employed a freemium model; users could access course material, submit assignments, and get a final grade free of charge. At the same time, interested participants had the option to subscribe to the signature track for a one-time payment. The signature track allowed participants to receive a certificate from the institution upon successful completion of the course. 8
Our data set consists of three components. First, we have information on the time when each participant enrolled in a course. Enrollment is free and enables users to access the material. In addition, we also have information on the time when each participant registered for the signature track by paying the course fees, henceforth referred to as the payment time. Note that participants can choose to register for the signature track at the time of course registration or make the decision subsequently. In our data, 23,674 participants (2% of users) chose to sign up for the signature-track service. We present the distribution of when participants registered for the signature track relative to the first day of the class (represented by 0) in Figure 1. We observe considerable heterogeneity in terms of when participants registered for the signature track, with a significant fraction making the decision a few days after the first day of the course. Nevertheless, almost all participants who registered for the signature track did so within 24 days after the first day of the course, which appears to be the deadline for making this decision. 9

Figure 1. Distribution of time of enrollment in signature track with respect to the first day of courses.
The second component of our data set is the information on consumption and course outcomes. In particular, we have information on the number of occasions (sessions) when a participant accessed course content, the duration of each session, their activities on the course forum (both visits and posts), their performance in the various assessment milestones, their overall course grade, and whether they successfully graduated from the course. 10 Students who are enrolled in the signature track are awarded a verified certificate provided they achieve a minimum passing grade in the course. The final grade is a weighted sum of the grades on individual quizzes (Coursera n.d.).
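As an illustration of how a weighted-sum grade accumulates and where a participant first reaches the passing grade, consider the following sketch. The weights, scores, and passing grade are made up, and the helper names `cumulative_grades` and `first_crossing` are hypothetical; the actual weights and thresholds vary by course.

```python
def cumulative_grades(weights, scores):
    """Running weighted sum of quiz scores (scores on a 0-100 scale)."""
    running = 0.0
    out = []
    for weight, score in zip(weights, scores):
        running += weight * score
        out.append(running)
    return out

def first_crossing(cumulative, passing_grade):
    """Index of the first quiz at which the cumulative grade reaches the
    passing grade, or None if the participant never reaches it."""
    for index, grade in enumerate(cumulative):
        if grade >= passing_grade:
            return index
    return None

# Hypothetical course: four equally weighted quizzes, passing grade of 60.
weights = [0.25, 0.25, 0.25, 0.25]
scores = [90.0, 80.0, 100.0, 60.0]
cumulative = cumulative_grades(weights, scores)
crossing_index = first_crossing(cumulative, passing_grade=60.0)
final_grade = cumulative[-1]
```

In this example the participant reaches the passing grade on the third quiz, so any certificate-related motivation would be expected to weaken for the fourth.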
In addition to these data, we have information from survey data independently collected by Coursera. We find that about 25% of the users in our data completed these surveys. The surveys are used to obtain information about each participant and were not tied to their registration in any particular course. The data contain information on demographic characteristics of participants such as their age, gender, and education.
During the course, each participant is required to complete a series of quizzes and assignments to advance to the next stage. We refer to each unique quiz (or assignment) within a course as a quiz block. A participant might attempt the same quiz multiple times. Therefore, a quiz block might include multiple attempts on the same quiz by the participant. Each observation in the data set corresponds to a quiz/homework attempt by a user, along with corresponding information on the amount of time they spent on the course portal. This information enables us to infer the user’s total time investment between successive attempts on a quiz. Extant research has defined engagement as the intensity of consumption and interaction with the products and services; for example, see Kumar et al. (2010), Brodie et al. (2011), Vivek, Beatty, and Morgan (2012), and Venkatesan, Petersen, and Guissoni (2018) for reviews. In this spirit, we use the user’s time investment (e.g., accessing course content, spending time on forum activity) during the span of a quiz block as a metric of engagement.
We present a visual representation of the data structure for a representative user in Figure 2. The arrow length represents the amount of time this user invested before attempting the quiz. Apart from the quiz attempts, we highlight three other events in Figure 2, namely, the registration, the payment, and the point at which the user crosses the passing threshold (if the user passes the course). We use the variation in the relative timing of these events, which are specific to each user’s calendar, in some of our analyses in the subsequent sections. We find that less than 10% of users attempted a quiz multiple times. Therefore, we perform most of our analyses at the quiz-block level by aggregating the time the participant spent on all the attempts within a quiz block. In our empirical analysis, we have one observation per quiz for each participant.

Figure 2. Timeline of events. In our studies, each quiz attempt is referred to as an observation. At each step, the user decides which quiz to attempt and how much to study for it, which is indicated as the length of the brown arrows (labeled “q”). After each attempt, the user gets a score and may decide to reattempt that quiz or move on to a new one. Each block consists of a set of attempts on the same quiz.
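The aggregation from attempt-level observations to quiz blocks can be sketched as follows. The record layout (user, course, quiz, minutes) and the function name `aggregate_to_quiz_blocks` are assumptions made for illustration, not the structure of the actual Coursera data.

```python
from collections import defaultdict

# Hypothetical attempt-level records: (user_id, course_id, quiz_id, minutes_spent).
attempts = [
    ("u1", "c1", "q1", 30.0),
    ("u1", "c1", "q1", 12.5),  # second attempt on the same quiz
    ("u1", "c1", "q2", 40.0),
    ("u2", "c1", "q1", 20.0),
]

def aggregate_to_quiz_blocks(records):
    """Sum time across all attempts that share (user, course, quiz)."""
    totals = defaultdict(float)
    for user_id, course_id, quiz_id, minutes in records:
        totals[(user_id, course_id, quiz_id)] += minutes
    return dict(totals)

quiz_blocks = aggregate_to_quiz_blocks(attempts)
```

Each key of `quiz_blocks` then corresponds to one observation per quiz for each participant, with repeated attempts folded into a single time total.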
Descriptive Statistics and Model-Free Evidence
Our main objective is to study the relationship between signing up for the certificate and engagement. As a first step, we consider how free and signature-track (paying) participants differed in terms of the various engagement and outcome metrics. We report these descriptive statistics in Table 1. Overall, we find considerable differences between the two groups of participants. In particular, signature-track participants were more engaged in terms of the time spent in accessing course content and being active on the course forum. They also appear to have better outcomes in terms of graduation rate and final grades. For example, the average final score among signature-track users is 63.7% versus an average of 5.9% for the rest of the users. However, a large number of participants, especially among those who took the course for free, did not complete a sufficient number of quizzes and assignments to receive a nonzero final grade. Therefore, we consider whether the gap between the two groups vanishes when we look only at participants who received a nonzero final grade. When we exclude zero grades, the average final scores are 73% and 33% for paying and nonpaying users, respectively, so the gap narrows but does not vanish. The completion rate, that is, the rate of achieving a grade higher than the passing threshold, is 56.7% for signature-track users and 3.68% for free users. These numbers increase to 66% and 21% for paying and nonpaying users, respectively, if we exclude zero grades.
Table 1. Comparison of Engagement and Outcome Metrics Among Paying and Nonpaying Users.
Together, these results suggest that users who pay stay more engaged with the course content and also fare better in terms of learning outcomes and course completion rates. At first glance, one might conclude that paying for the certificate drove these stark differences between paying and nonpaying users. However, as noted previously, we need to consider the possibility that highly motivated and higher-ability participants may have self-selected into signing up for the certificate. These participants with high motivation and ability might, in turn, have stayed more engaged with the course content and also fared better in terms of learning outcomes.
To address this selection issue, we consider the fact that paying users receive a verified certificate from the institution offering the course upon successful completion and nonpaying users would not. Therefore, we posit that the two types of users are likely to differ in terms of their motivation to achieve the passing grade as a result of the certificate effect. If this motivation plays a major role in time-investment decisions, the intensity of time investment should vary as a function of how far the participant is from achieving the passing grade. For instance, if paying users are investing more time merely to obtain a certificate, this additional motivation should cease when they achieve the minimum necessary grade to pass the course. Consequently, their behavior should become more similar to their nonpaying counterparts.
To operationalize this idea, for each participant, we first calculate the time investment in each quiz block. We then consider how this time investment varies depending on the distance from the minimum grade threshold for obtaining the certificate. If participants who signed up for the certificate exhibit greater motivation to pass the threshold than nonpaying users, we should observe that these two groups of participants make different time-investment decisions as a function of their distance from the threshold. We present the average time investment on the quiz blocks for both types of users as a function of their distance from the threshold in Figure 3. To alleviate selection issues, we consider only those participants who have reached a final grade at least 20 points above the passing threshold. 11 The figure demonstrates that, on average, paying users spend more time on the course than their counterparts who are taking the course for free. More importantly, this gap tends to widen as users approach the threshold and shrinks quickly after the goal is achieved. These data patterns suggest that paying users might engage differently from nonpaying users as a function of their distance from the passing grade. In the subsequent sections, we examine whether these patterns are robust when we account for prior investments as well as course, quiz, and user-course FEs.

Figure 3. Average time investment in quiz blocks among nonpaying and paying users at different stages relative to the passing grade.
As noted previously, we explore a second component of the temporal effect based on the idea that the mere act of paying for the course might also drive users via the sunk-cost effect to stay more engaged with its content. To this end, we consider engagement among paying users in the weeks following their decision to pay for the signature track. Specifically, we consider two groups of paying participants based on the timing of payment relative to the start of the course: (1) paying before the course began and (2) paying after the course began. Because the motivation to receive the certificate is common among all paying users, we can explore how the mere act of paying is related to their engagement levels.
To illustrate the idea behind this identification, let us consider the time spent on the platform by participants who paid for the signature track after the course began. We report the results from this analysis in Figure 4. These results suggest these payers exhibited a considerable increase in engagement (in terms of time spent on the platform) in the weeks immediately following payment. Moreover, the extra engagement among participants who paid after the course began shrinks considerably in the second week after they made the payment and seems to disappear after four weeks. A concern with this analysis is that the boost in engagement among participants who paid after the commencement of the course might be a result of the certificate effect as well as of the fact that they paid for the course. However, the result that the boost in engagement shrinks as we move further away from the timing of the payment suggests that this pattern is probably a transient effect. Since the certificate effect is likely to be preserved until the participant gets closer to the passing grade (which happens much later), this pattern is more likely to have been driven by the sunk-cost effect.

Figure 4. The difference in time spent in the weeks following the payment week for users who paid after the course began.
Although these results are consistent with the idea that the sunk cost of paying drove the temporary increase in engagement among participants, two alternative explanations could have led to a similar outcome. First, participants who paid later might have wanted to learn about the match value of the course before making that commitment. In that case, payment and the increase in engagement would both have been driven by users’ discovery that the course is a good match for them. Although this possibility can explain the concomitant occurrence of payment and an increase in engagement, it cannot rationalize the transient nature of the increase in engagement. The second explanation is that participants who paid later probably had to catch up with the course content and therefore had to increase engagement immediately after paying for the course. Once they had caught up, they would have reached the same level of engagement as those who paid earlier.
To rule out this second alternative explanation, we investigate the role of recency of payment in increasing engagement by comparing the amount of time paying users spent during the first week of the course as a function of how recently they made their payment. Specifically, we consider only users who paid for the signature track before the course began. This focus eliminates the need to catch up with the contents of the course. We separate these participants into three groups based on the recency of their payments relative to the start of the course. Because all of them had paid before the course began, the certificate effect was common to all users at the start of the course. If payment has a sunk-cost effect and that effect depreciates over time, we should see that users who paid closer to the first day of the course invested more time during the first week than those who paid well in advance. Furthermore, we would expect the difference between these groups to be less pronounced in subsequent weeks as the effect of payment depreciates over time. We present the results from this analysis in Figure 5. We find that users who paid within a week before the beginning of the course spent more time on the platform than users who paid for the course earlier. Moreover, the differences among the three groups of users are statistically indistinguishable from zero during the second week of the course. However, note that the transient nature of the additional engagement immediately after payment implies only that paying users eventually converge to a common level of engagement. As we discuss subsequently, we cannot directly comment on any persistent effect that payment might have on paying users compared with those taking the course for free.

Figure 5. Comparison of weekly investment during weeks 1 and 2 of the course by users who paid before the first day of classes.
Overall, our model-free analyses have the following implications:
Users change their engagement depending on whether they have reached the minimum grade required to receive the certificate. In particular, paying users tend to exhibit greater engagement before reaching the passing grade. After the passing grade is reached, the difference in engagement between nonpaying and paying users shrinks. This pattern is consistent with the idea that paying users stay more engaged in order to be eligible to receive the certificate.
The mere act of payment increases user engagement. However, this effect is transient and decays in the four weeks after payment. This pattern is consistent with paying participants exhibiting the sunk-cost fallacy.
In our empirical analyses, we examine these temporal effects of payment while controlling for selection.
Empirical Analysis
We use the variation in the timing of payment and in goal progress to assess the relative magnitudes of the sunk-cost and certificate effects. We perform our analysis at the quiz-block level.
Let i and q index individuals and quiz blocks, respectively. Also let
Consider the following specification:
Next, we present the expressions for the certificate effect and the sunk-cost effect and discuss the intuition behind the identification of these effects:
We can infer the presence of the certificate effect if paying users exhibit a higher level of engagement than nonpaying users until they reach the passing grade. To parse out a variety of confounds, we include controls in the form of FEs. In particular, we include user-course FEs to control for intrinsic differences across participants (and courses) and quiz-paid FEs to control for varying levels of required time commitment across assignments. The quiz-paid FEs allow nonpaying and paying users to have different tendencies to stay engaged with the course content. Recall that the signature track offers other services, such as identity verification; these cross-sectional characteristics that could influence all paid users will also be picked up by these FEs. The coefficient
The intuition behind the identification of the sunk-cost effect is that considerable variation exists in the timing of payment among paying users. Given our previous evidence that the sunk-cost effect is transient, we can consider the immediate increase in engagement to be limited to the first four weeks after payment. Furthermore, we assume that the sunk-cost effect decays linearly during this four-week period.
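To make the linear-decay assumption concrete, the sketch below computes a hypothetical sunk-cost boost as a base effect that declines by a constant amount per week and is switched off outside the four-week window. The function name, its parameterization, and the numerical values are illustrative assumptions; they are not the estimated specification or coefficients.

```python
def sunk_cost_boost(weeks_since_payment, base, decay_per_week, window_weeks=4):
    """Hypothetical sunk-cost boost under linear decay.

    The boost equals `base` in the payment week, falls by `decay_per_week`
    for each additional week, and is zero before payment or once
    `window_weeks` have elapsed.
    """
    if not 0 <= weeks_since_payment < window_weeks:
        return 0.0
    return base - decay_per_week * weeks_since_payment

# Illustrative values only: a boost of 0.20 that loses 0.05 per week.
profile = [sunk_cost_boost(week, base=0.20, decay_per_week=0.05)
           for week in range(6)]
```

Because users pay at different times, the same calendar week maps to different values of `weeks_since_payment` across users, which is the variation the identification argument relies on.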
Therefore, paying users are likely to experience the boost in engagement as a result of the sunk-cost effect over different periods depending on when they pay. The sunk-cost effect consists of two components: (1) the base effect
We present the results from this analysis for the full sample of participants in Table 2 and for the subsample of users who passed the course in Table 3. Regarding the certificate effect, these results provide four important insights that are consistent with the patterns presented in the “Descriptive Statistics and Model-Free Evidence” section:
Participants generally spend more time on course content before reaching the passing grade.
Paying participants spend significantly more time on course content before reaching the passing grade than those taking the course for free.
The sunk-cost effect leads to an immediate increase in engagement in the weeks following payment.
The effect depreciates as time since the payment increases.
Table 2. Quantifying the Certificate and Sunk-Cost Effects: Full Sample.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions include quiz-paid, user, and prior investment FEs and controls. All standard errors are clustered at the user-course level.
Table 3. Quantifying the Certificate and Sunk-Cost Effects: Conditional on Achieving the Passing Grade.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions include quiz-paid, user, and prior investment FEs and controls. All standard errors are clustered at the user-course level.
These results reveal that the certificate effect (coefficient of Paying users × Before crossing) is positive. In terms of magnitude, paying users spent
Heterogeneity in Timing of Payment
In addition to the main effects, we also explore potential heterogeneity as a function of the timing of payment. Recall that a sizable portion of students enroll in the signature track long before the payment deadline. These students gain nothing by paying early; on the contrary, they would benefit from waiting strategically to obtain more information about both their match value with the course and their performance before signing up for the certificate. Research has documented that agents may choose dominated contracts to serve as a commitment device by restricting their actions in the future (Kaur, Kremer, and Mullainathan 2015). Therefore, we can view a participant's choice to sign up for the signature option before the start of the course as a proxy for their use of it as a commitment device that would stimulate them to complete the course.
We divide the paying users for each course into two groups: early payers and late payers. For any given course, we define early payers as those who were among the first half of payers, and we classify the rest as late payers. We compare the magnitude of the certificate and sunk-cost effects among early and late payers. Specifically, we include an additional three-way interaction term between the following variables in Equation 1: a dummy for whether the individual is a paying user, a dummy for the period before the passing threshold is crossed, and a dummy indicating a late payer. The coefficient of this interaction tells us the extent to which late payers experienced a greater (positive coefficient) or lesser (negative coefficient) certificate effect than the early payers.
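The split into early and late payers and the resulting interaction terms can be sketched as follows. This is an illustrative construction under stated assumptions: payment timing is summarized as a day relative to the course start, ties keep their input order, and with an odd number of payers the middle payer is treated as late. None of these details is specified in the text.

```python
def classify_payers(payment_days):
    """Split one course's paying users into early and late payers.

    payment_days maps user id -> day of payment relative to the course
    start (negative means before the course began). The first half of
    payers, ordered by payment day, are 'early'; the rest are 'late'.
    """
    ordered = sorted(payment_days, key=lambda user: payment_days[user])
    n_early = len(ordered) // 2  # assumption: the middle payer of an odd count is late
    return {user: ("early" if rank < n_early else "late")
            for rank, user in enumerate(ordered)}

def certificate_terms(is_paying, before_threshold, is_late_payer):
    """Return the two-way and three-way interaction dummies for one
    user-quiz observation."""
    two_way = int(is_paying and before_threshold)
    three_way = int(is_paying and before_threshold and is_late_payer)
    return two_way, three_way

groups = classify_payers({"a": -10, "b": 3, "c": -2, "d": 14})
```

With this coding, the coefficient on the two-way term captures the certificate effect for early payers, and the coefficient on the three-way term captures how much that effect differs for late payers.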
We report the results in column 2 of Tables 2 and 3. The results suggest that the certificate effect is significantly stronger for the users who paid earlier (.126) than for those who paid later (
Matching
Although our analysis controls for cross-sectional differences between nonpaying and paying users by including user-course FEs, these two groups of users could exhibit different trends in terms of their engagement before reaching the passing grade. To address this concern, we used a machine-learning technique, namely, boosted trees, to match the subset of users who signed up for the signature track two weeks after the first day of the course with those who did not pay for the signature track. When performing this matching, we used user characteristics during the first two weeks of the course (i.e., before either group paid for the signature track), such as forum activity, quiz outcomes, responses to demographic surveys, and overall time spent on the platform.
To demonstrate the effectiveness of the matching algorithm, we present the propensity scores for nonpaying and paying users in Figure 6. Note that the propensity scores for users who did not end up paying are more skewed toward zero. This suggests that our algorithm has some predictive power in distinguishing between paying and nonpaying users based on the prepayment outcomes. We then matched the users such that each paying user had five nonpaying users in the matched data.
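The one-to-five matching step can be illustrated with a simple nearest-neighbor rule on already-fitted propensity scores. This sketch is not the matching routine used in the analysis: it takes the scores as given rather than fitting boosted trees, and it assumes matching with replacement (a nonpaying user may be matched to more than one paying user), which the text does not specify.

```python
def match_controls(treated_scores, control_scores, k=5):
    """For each treated (paying) user, select the k control (nonpaying)
    users whose propensity scores are closest.

    Both arguments map user id -> fitted propensity score.
    Assumption: matching is with replacement.
    """
    matches = {}
    for treated_id, score in treated_scores.items():
        nearest = sorted(control_scores,
                         key=lambda cid: abs(control_scores[cid] - score))
        matches[treated_id] = nearest[:k]
    return matches

# Toy scores (hypothetical); k=2 keeps the example short.
treated = {"t1": 0.6}
controls = {"c1": 0.1, "c2": 0.55, "c3": 0.7, "c4": 0.9}
matched = match_controls(treated, controls, k=2)
```

Setting `k=5` corresponds to the five nonpaying users per paying user described above.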

Figure 6. The fitted treatment (payment) propensity scores for control (free) and treatment (paid) users are compared on the left.
We present the density of propensities for the matched control and the treatment sample in Figure 6. As stated previously, we used the matched sample to reestimate Equation 1, and we report the results in Tables 2 and 3. Note that the matched regression effectively measures the local average treatment effect on late payers. When we assemble similar groups of late payers and nonpaying users in the matched sample, the treatment effect could remain the same, attenuate, or increase, depending on the types of selection that could mask or inflate the measured treatment effect. The effect for late payers in column 2 of Tables 2 and 3 was .126 − .089 = .037 and .130 − .086 = .044, respectively. In the matched regression (see column 3 of the same tables), this effect increases to .081 and .091. In both cases, because our sample size shrinks after matching, we have less statistical power; as a result, the coefficients are significant only at p < .1. Given the similarity of these coefficients to the average treatment effect measured in column 1 and the fact that the treatment effect tends to be smaller for late payers as demonstrated in column 2, we believe the coefficients reported in column 1 of Tables 2 and 3 serve as a lower bound for the certificate effect.
Cross-Sectional Differences Between Nonpaying and Paying Users
In the previous section, we show how we use our panel data to estimate the certificate and sunk-cost effects. Recall from our conceptual framework that these effects are reflected in temporal changes in engagement. There, we also note the presence of time-invariant or cross-sectional factors that influence engagement. In this section, we consider how nonpaying and paying users differ cross-sectionally in terms of their engagement, and we provide a comparison with the temporal aspects documented previously. To this end, we focus on the estimated user-course FEs from Table 2, which reflect the average propensity of individual participants to engage with the course content after achieving the passing grade. This user-course FE consists of three components: (1) differences across courses in terms of time commitment requirements, (2) differences in engagement as a result of ability, and (3) differences in engagement as a result of motivation that remains invariant over time.
To control for cross-sectional differences that arose because of ability, we exploit the data on engagement and grades of individual participants for each quiz to obtain a metric of a user’s intrinsic ability. To illustrate the idea, in Figure 7, we compare the grades of paying and nonpaying users for each quiz as a function of the time invested in engaging with the course material before the quiz. The figure highlights that, for the same effort, paying users achieve higher grades than nonpaying users. We argue that this observation is indicative of differences in ability between the two groups of participants. We can extend this idea to obtain a metric of ability for each participant. We present further details on how we build on this intuition to obtain individual user-level measures of ability in the Appendix. We divide the ability measures into five tiers (quintiles) and use them as nonparametric controls for cross-sectional differences across participants.
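Dividing a continuous ability measure into quintile tiers and entering the tiers as dummies can be sketched as follows. The rank-based assignment, the choice of the lowest tier as the omitted reference category, and the toy ability values are assumptions for illustration only.

```python
def ability_tiers(ability, n_tiers=5):
    """Assign each user a tier from 1 (lowest ability) to n_tiers (highest)
    by rank, so that tiers are as close to equal-sized as possible.

    ability maps user id -> estimated ability.
    """
    ordered = sorted(ability, key=lambda user: ability[user])
    n = len(ordered)
    return {user: rank * n_tiers // n + 1 for rank, user in enumerate(ordered)}

def tier_dummies(tier, n_tiers=5):
    """Dummies for tiers 2..n_tiers; tier 1 is the omitted reference
    category (an assumption)."""
    return [int(tier == t) for t in range(2, n_tiers + 1)]

# Toy ability values (hypothetical).
ability = {f"u{i}": 0.1 * i for i in range(10)}
tiers = ability_tiers(ability)
```

Using tier dummies rather than the raw ability measure lets the relationship between ability and engagement take any shape, including a nonmonotonic one.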

Figure 7. The marginal returns to time investment for paid and free users.
To parse out these effects, we regress the estimated FEs for participant
Table 4. Decomposing the Estimated User-Course FEs.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions include course FEs.
The results in column 2 of Table 4 also reveal that engagement levels tend to follow an inverted U-shaped pattern as a function of ability. In particular, results from column 2 indicate that the second ability tier spends an average of
In the preceding analysis, we jointly studied the sunk-cost and certificate effects. Our next analysis further examines each effect and demonstrates the robustness of our findings. We consider several threats to validity and address each threat to the extent we can. The analysis also considers an alternative engagement metric, namely, forum activity, and demonstrates that patterns similar to those reported persist.
Further Examination of the Certificate Effect
Precision of timing of treatment
We characterize the certificate effect as the extra time that paying participants spend on the course content before reaching the passing grade. Therefore, our research design uses reaching the passing grade as the treatment. However, participants may anticipate their chances of reaching the passing grade well before reaching it. In this case, they might adjust their engagement before reaching the passing grade. Such a deviation might render the exact timing of the treatment fuzzy.
To verify how the estimated certificate effect changes when we shift the point at which paying users start adjusting their engagement, we reestimated Equation 1 with alternative definitions of the treatment. Recall that in the original formulation and the corresponding results reported in Table 2, we defined the treatment as the point when the participant has reached the passing grade. As a robustness check, we estimate the model in Equation 1 with three alternative definitions of the treatment (i.e., points where paying users alter their engagement): passing grade − 5, passing grade − 10, and passing grade − 15. The idea is that as we get further away from the passing grade, participants should be less able to anticipate whether they will achieve it. Consequently, the extent to which they adjust their engagement should decrease in magnitude as we move further away from the passing grade.
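The alternative treatment definitions only change the cutoff used to build the "before crossing" dummy. A minimal sketch, assuming the dummy is based on a participant's cumulative grade at the time of a quiz block (the function and argument names are hypothetical):

```python
def before_threshold_flags(cumulative_grade, passing_grade,
                           offsets=(0, 5, 10, 15)):
    """Return {offset: 1 if the participant has not yet reached
    passing_grade - offset, else 0} for each alternative cutoff.

    An offset of 0 reproduces the original treatment definition.
    """
    return {offset: int(cumulative_grade < passing_grade - offset)
            for offset in offsets}

# A participant at 62 points in a course with a passing grade of 70.
flags = before_threshold_flags(62, 70)
```

In this example the participant is still "before crossing" under the original cutoff and the − 5 cutoff, but already "after crossing" under the − 10 and − 15 cutoffs.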
We present the results from this analysis for the full sample of participants in Table 5. Note that the first column in this table (corresponding to the treatment/threshold defined as the passing grade) is the same as column 1 in Table 2. The subsequent columns in Table 5 move further away from the passing grade. These results suggest that as we move the threshold closer to the beginning of the course, the effect of crossing the threshold decreases in magnitude. More importantly, when we define the threshold as passing grade − 10, we find no statistically significant effect of crossing that grade on engagement among paying users. This analysis also serves as a falsification test of the possibility that the certificate effect only occurs in a narrow band around the passing grade. Note that the coefficient
Table 5. Verifying Robustness to the Timing of Treatment.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions control for prior investment, quiz-paid, and user-course FEs. All standard errors are clustered at the user-course level.
Alternative engagement metrics
To verify whether the certificate effect persists for other metrics of engagement beyond the time spent on the course portal, we consider activity on course forums, which includes page visits, upvotes, and posts created by users. Each action on the course forum, such as visits to a thread, comments, and other interactions, increases the intensity of forum activity by one unit. Because forum activity can only be performed on the platform, it might provide a clean measure of engagement on the platform. We aggregate total forum activity for each user in the periods before and after reaching the passing threshold and compare it across nonpaying and paying users.
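The aggregation of forum activity into periods before and after the passing threshold can be sketched as a simple count, with every forum action contributing one unit. The event layout and the treatment of users who never reach the threshold (all of their activity counts as "before") are assumptions made for this illustration.

```python
def forum_activity_by_period(events, crossing_time=None):
    """Count forum actions before and after a user reaches the passing grade.

    events: list of (timestamp, action) pairs; each action counts as one unit.
    crossing_time: time at which the user reached the passing grade, or
    None if the user never reached it.
    """
    if crossing_time is None:
        before = len(events)
    else:
        before = sum(1 for timestamp, _action in events
                     if timestamp < crossing_time)
    return {"before": before, "after": len(events) - before}

# Toy events (hypothetical timestamps measured in days).
events = [(1, "visit"), (2, "post"), (5, "upvote")]
counts = forum_activity_by_period(events, crossing_time=3)
```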
We present the results from this analysis in Table 6. These results suggest a certificate effect of
Table 6. Forum Activity Before and After Crossing the Passing Threshold.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions control for user-course FEs. Standard errors are clustered at the user-course level.
Discussion of plausible mechanisms and interpretations
As described previously, users do not benefit from paying well before the payment deadline because they can strategically wait and gather more information. Therefore, we conjectured that these results are consistent with the idea that early payers view the certificate option as a commitment device that will help them complete the course. A nontrivial portion of users pay well in advance, and our results in Tables 2 and 3 show that early payers benefit more from the certificate effect. These observations provide evidence for the commitment-device account.
Here, we discuss plausible interpretations and the mechanism of the certificate effect.
Peer Effects: Controlling for the Volume of Activity on the Forum.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions control for prior investment, forum content, quiz-paid, and user-course FEs. All standard errors are clustered at the user-course level.
Effect of the Timing of Churn on the Certificate Effect.
*p < .1.
**p < .05.
***p < .01.
Further Examination of the Sunk-Cost Effect
Alternative engagement metrics
Similar to the exercises we performed for the certificate effect, we consider activity on course forums to verify whether the sunk-cost effect extends to other metrics of engagement beyond the time spent on the course portal. As discussed previously, the identification of the sunk-cost effect comes from the variation in the timing of payment among paying users. Therefore, to better reflect the nature of this temporal effect, we use weekly data from paying users. In particular, we use the forum activities described previously and aggregate them at the weekly level for each paying user. We consider two metrics: (1) a dummy variable that reflects the incidence of forum activity in a given week (i.e., the extensive margin), and (2) a continuous variable Log(total forum activity + 1) (i.e., the intensive margin).
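The two weekly forum metrics can be computed directly from weekly activity counts. The sketch below assumes a natural logarithm and a simple mapping from week to total forum actions for one paying user; both are assumptions for illustration.

```python
import math

def weekly_forum_metrics(weekly_counts):
    """Compute the extensive- and intensive-margin metrics for one user.

    weekly_counts maps week -> total forum actions in that week.
    'any_activity' is the incidence dummy (extensive margin);
    'log_activity' is log(total forum activity + 1) (intensive margin).
    """
    return {week: {"any_activity": int(count > 0),
                   "log_activity": math.log1p(count)}
            for week, count in weekly_counts.items()}

# Toy counts (hypothetical): no activity in week 1, three actions in week 2.
metrics = weekly_forum_metrics({1: 0, 2: 3})
```

Adding one inside the logarithm keeps weeks with no forum activity in the sample, where the intensive-margin metric equals zero.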
We use the following specification:
We report the results for the effect on the extensive margin in column 1 of Table 9. Subsequently, we report the impact on the intensive margin in column 2 of Table 9. These results are consistent with those in Tables 4 and 5 such that postpayment forum activity increases, and this effect depreciates quickly over time.
Table 9. Sunk-Cost Effect Based on Forum Activity.
*p < .1.
**p < .05.
***p < .01.
Notes: All regressions include user-course and course-week FEs. All standard errors are clustered at the user-course level.
Heterogeneity in the sunk-cost effect
Research on the sunk-cost fallacy has documented that it is related to certain demographic characteristics. In particular, prior studies (e.g., Bruine de Bruin, Parker, and Fischhoff 2007; Bruine de Bruin, Strough, and Parker 2014; Strough et al. 2008; Strough, Schlosnagle, and DiDonato 2011) have found a negative correlation between the sunk-cost fallacy and age. To verify this negative correlation in our context, we consider heterogeneity in the sunk-cost effect. To this end, we use the demographic information on the age, gender, employment, and education status of each individual. Our results reveal that the sunk-cost effect is heterogeneous only along the age dimension, with older individuals being less likely to fall prey to the sunk-cost fallacy. These results align with those reported in the literature, giving us further confidence that these patterns indeed stem from the sunk-cost fallacy.
Engagement and Learning Outcomes
So far we have shown the impact of the certificate and sunk-cost effects on engagement. In this section we present evidence that higher engagement, in turn, leads to better course outcomes. To demonstrate this relationship, we exploit the data on engagement and grades of individual participants for each quiz and consider the following specification:
Transformation of Time to Grades, and the Ability Gap Between Paying and Nonpaying Users.
*p < .1.
**p < .05.
***p < .01.
Notes: All standard errors are clustered at the user-course level.
Discussion
As MOOCs gain greater acceptance, content creators (including educators) and hosting platforms need to develop interventions to increase student engagement and completion rates. This issue is important for ensuring that students gain proficiency in the content and perceive that they derive value from these courses. In this respect, we find suggestive evidence that higher engagement rates within a course are related to better grades and, potentially, learning outcomes. Collectively, higher engagement, completion rates, and learning will help in monetizing the content. However, little is known about the drivers of student engagement in MOOCs. Our study is an important first step in understanding the role of payment in driving engagement. The freemium pricing structure of MOOCs provides a unique opportunity to study differences between nonpaying and paying users within a course. Furthermore, the ability to track engagement at different points within a course provides rich variation to understand the causal effect of payment on engagement.
In our objective to understand why paying and nonpaying users in MOOCs exhibit significant differences in the extent to which they engage with course content, we propose three plausible explanations. The first explanation is that these two groups of users are intrinsically different, in terms of both their ability and motivation levels. The second rationale is that the possibility of receiving a certificate upon reaching a passing grade motivates paying users to stay more engaged than their nonpaying peers, at least until they reach the passing grade. We call this response the certificate effect. The third explanation is based on the idea that the mere act of paying for the course might trigger these participants to increase their engagement. We propose the sunk-cost effect as one such mechanism that might trigger this behavior. Of these, the certificate effect and the sunk-cost effect are consequences of paying for the certificate and can potentially be used to increase engagement in online courses by altering the course design and payment structure.
We find that the certificate and the sunk-cost effects influence engagement in different ways. The motivation to be eligible to obtain the certificate results in paying users spending approximately 8%–10% more time on the course portal. This effect lasts until they reach the passing grade, which is typically around 70% for the courses that we consider. On the other hand, the mere act of payment leads to approximately 17%–20% higher engagement among paying users. However, this effect is transient and lasts only for a few weeks. Thus, although the sunk-cost effect appears, prima facie, to be larger than the certificate effect, the latter lasts considerably longer.
By contrast, we find that intrinsic traits (i.e., ability and intrinsic motivation) led to paying participants spending approximately 36% more time on the course portal than their nonpaying peers. Although we find evidence that ability can affect user engagement, it explains only a small portion of the intrinsic difference between free and paid users. Therefore, we conjecture that intrinsic motivation is an important driver of the cross-sectional difference between the two groups.
Together, these results suggest that temporal effects (i.e., certificate and sunk-cost effects) play a significant role in driving user engagement in MOOCs. This finding implies that MOOC platforms and content providers can increase engagement even if the course content remains unaltered. The presence of the certificate effect suggests that providing tangible rewards tied to outcomes can increase the engagement of participants. Moreover, as more employers begin to consider these certificates when making their hiring and promotion decisions, the higher value of these certificates will increase the ability of the certificate effect to increase engagement. The transient nature of the sunk-cost effect suggests that staggering course payments over several installments could result in multiple doses of this effect. However, whether the lower monetary value of each installment will reduce the role of the sunk-cost effect in increasing engagement is not clear. We believe MOOC platforms can learn about the value of these interventions by conducting field experiments.
We caution the reader about extrapolating these results to nonpaying users (or users who would have paid if the fee were lower) for two reasons. First, nonpaying and paying users might differ in terms of the value that they place on the certificate and might therefore respond differently. Second, if the platform decides to attract current nonpaying users by lowering the fee structure, and the incentive to achieve the passing grade is related to the fees, then our current estimate of the certificate effect might not translate into the new context.
We also acknowledge that we neither have a formal structural model that can quantify these effects for the purpose of running counterfactuals nor have data that can speak to the impact of staggered payments. However, from our results and the fact that the sunk-cost effect is short-lived, we conjecture that spreading the payments might be a plausible approach to effect more persistent changes in engagement. Regarding the certificate effect, we see that users are fairly responsive to passing thresholds, and this observation provides an opportunity for firms to optimize the overall engagement levels of users.
Conclusion
In recent years, MOOCs have gained considerable prominence for providing free access to education. However, as in the case of other digital platforms such as online newspapers, MOOC platforms have been exploring the possibility of monetizing their digital content. Against this backdrop, we investigate the impact that paying for online courses has on how users engage with content available on MOOC platforms. We exploit the panel structure of the data along with variation in the time when participants pay for the premium service and their state in terms of goal progress to estimate two effects of payment on engagement: the certificate effect and the sunk-cost effect. We find evidence that both effects exist. In particular, the presence of the sunk-cost effect implies that the mere act of payment can increase engagement with course content. Therefore, increasing user engagement and potentially course completion rates is aligned with the incentives of the platform and content creators to monetize their content. However, this effect depreciates over time. As discussed previously, staggering payments appears to be a promising way to constantly remind users about their investments. At the same time, offering a certificate seems to play an important role in incentivizing students to exert more effort. Overall, we believe these results have implications for how MOOC platforms should monetize their courses while ensuring that their participants stay engaged with these courses.
The limitations of our study highlight potential fruitful avenues for future investigation. First, because we do not have information on the course fees, we cannot investigate the role of the magnitude of the payment on the sunk-cost effect. Second, as noted elsewhere in the article, the nonexperimental nature of our data precludes us from studying long-term effects of payments on participants who pay for the certificate. We hope researchers will be able to conduct field experiments to understand the presence of such long-term effects. This ability will give us a more complete understanding of the overall effect of payment on user engagement. Despite these limitations, our research enhances the understanding of how paying for a certificate can potentially increase user engagement in online courses and motivates future research in this area.
Appendix: Constructing an Ability Metric
In our conceptualization, we noted that ability could be an important underlying factor explaining the different engagement levels of nonpaying and paying users. To test this conjecture empirically, we need a measure of ability. We extend the ideas discussed in the “Engagement and Learning Outcomes” section to construct such a measure, defining ability as efficiency in transforming input (time) into output (grades). We use data at the quiz-attempt level to extract this metric. More precisely, we regard each student as a production plant that transforms input (time investment) into output (grade). A participant’s performance on a quiz attempt is likely to be a function of the following:
(1) Time that they have invested in course content for the quiz, including the time spent in all previous attempts on the same quiz.
(2) Total number of prior attempts on the same quiz. We conjecture that more prior attempts will enable a participant to obtain better grades on the quiz.
(3) Time that they have invested in course content for all other quizzes prior to the focal quiz. The natural progression of a course might demand accumulation of knowledge. This component would capture engagement with prior material and thus act as a proxy for accumulated knowledge.
(4) Characteristics of the quiz.
(5) The participant’s intrinsic ability.
For each quiz attempt, we observe the total amount of time invested so far on the focal quiz, the rest of the quizzes, and the number of attempts made on the current quiz by each student. Our idea is that once we control for these and other variables such as the quiz FEs, the individual-level FEs reflect the ability of users in converting time investment into grades.
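As a rough illustration of this idea, the sketch below regresses quiz-attempt grades on the three observed inputs plus quiz and student dummies, and reads the student coefficients as the ability measure. It is a minimal sketch under stated assumptions, not the estimation code used in the article: the linear, additive functional form, the field names (`time_focal`, `prior_attempts`, `time_other`), the helper functions, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_ability(rows):
    """OLS with student and quiz dummies; returns {student: fixed effect}.

    Assumed (hypothetical) row fields: student, quiz, time_focal,
    prior_attempts, time_other, grade.
    """
    students = sorted({r["student"] for r in rows})
    quizzes = sorted({r["quiz"] for r in rows})[1:]  # first quiz is the baseline

    def features(r):
        x = [r["time_focal"], r["prior_attempts"], r["time_other"]]
        x += [1.0 if r["student"] == s else 0.0 for s in students]
        x += [1.0 if r["quiz"] == q else 0.0 for q in quizzes]
        return x

    X = [features(r) for r in rows]
    y = [r["grade"] for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return dict(zip(students, beta[3:3 + len(students)]))


# Synthetic, noise-free data (illustrative values only, not from the article).
rng = random.Random(0)
true_ability = {"s1": 0.0, "s2": 5.0, "s3": 10.0}
quiz_effect = {"q1": 0.0, "q2": -3.0}
rows = []
for s, alpha in true_ability.items():
    for q, delta in quiz_effect.items():
        for attempt in range(3):
            t_focal = rng.uniform(0.5, 3.0)
            t_other = rng.uniform(0.0, 5.0)
            grade = 2.0 * t_focal + 1.0 * attempt + 0.5 * t_other + delta + alpha
            rows.append({"student": s, "quiz": q, "time_focal": t_focal,
                         "prior_attempts": attempt, "time_other": t_other,
                         "grade": grade})

ability = estimate_ability(rows)
```

In this sketch the student effects are identified only relative to the baseline quiz, so their ranking and differences are meaningful rather than their absolute levels. With real data, a dedicated fixed-effects estimator would be preferable to explicit dummies because the number of students is large.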
Formally, we consider the following specification:
Note that
We verify the validity of our ability metric through a battery of checks. First, participants with more advanced educational achievement plausibly have greater ability. This relationship could arise either because participants with higher educational achievement have the requisite background to assimilate the course material faster than their peers or because participants with higher intellectual ability also possess more advanced degrees. To verify whether this relationship holds, we grouped the users into five tiers according to their estimated ability,
Next, we compared these ability tiers with the average frequency of quiz attempts in Table A1. The idea is that participants with greater ability should not need to attempt a quiz multiple times to fare well. The results in Table A1 show a negative correlation between the number of quiz attempts and student ability as elicited by the FEs in Equation 4, which is consistent with our expectation. Together, these analyses give us confidence that we have a valid measure of ability, which we use in our subsequent empirical analyses.
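The tier comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how the five tiers were formed, so equal-sized groups by ability rank are assumed here, and the function names and the toy `ability` and `attempts` values are hypothetical rather than taken from Table A1.

```python
def assign_tiers(ability, n_tiers=5):
    """Rank students by estimated ability and split them into n_tiers
    equal-sized groups (tier 1 = lowest ability). Equal-sized groups are
    an assumption of this sketch."""
    ranked = sorted(ability, key=ability.get)
    n = len(ranked)
    return {s: (idx * n_tiers) // n + 1 for idx, s in enumerate(ranked)}


def mean_attempts_by_tier(tiers, attempts):
    """Average number of quiz attempts per student within each tier."""
    totals, counts = {}, {}
    for s, t in tiers.items():
        totals[t] = totals.get(t, 0.0) + attempts[s]
        counts[t] = counts.get(t, 0) + 1
    return {t: totals[t] / counts[t] for t in sorted(totals)}


# Toy inputs (hypothetical): ability rises with the student index,
# while the average number of attempts per quiz falls.
ability = {f"u{i}": float(i) for i in range(10)}
attempts = {f"u{i}": 3.0 - 0.2 * i for i in range(10)}

tiers = assign_tiers(ability)
tier_means = mean_attempts_by_tier(tiers, attempts)
```

If the ability measure is valid in the sense discussed above, the tier means should fall as the tier index rises, which is the pattern the toy data are constructed to show.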
Acknowledgments
The authors thank the JMR review team. They thank seminar participants at University of Chicago, University of Washington, University College London, University of Houston, Columbia University, Indiana University, and the Marketing Science conference at Duke University. This article has benefited substantially from conversations with Jean-Pierre Dubé, Günter Hitsch, Anita Rao, Sanjog Misra, and Sarah Moshary.
Associate Editor
Randolph Bucklin
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
