A time series approach to player churn and conversion in videogames

Abstract

Players of a free-to-play game are divided into three main groups: non-paying active users, paying active users and inactive users. A State Space time series approach is then used to model the daily conversion rates between the different groups, i.e., the probability of transitioning from one group to another. This allows not only for predictions of how these rates will evolve, but also for a deeper understanding of the impact of in-game planning and calendar effects. It is also used in this work to detect marketing and promotion campaigns about which no information is available. In particular, two different State Space formulations are considered and compared: an Autoregressive Integrated Moving Average process and an Unobserved Components approach, in both cases with a linear regression on explanatory variables. Both yield very close estimates of the covariate parameters and produce forecasts of similar performance for most transition rates. While the Unobserved Components approach is more robust and needs less human intervention with regard to model definition, it produces significantly worse forecasts for the non-paying user abandonment probability. More critically, it also fails to detect a plausible marketing and promotion campaign scenario.

Keywords

Time series, state space models, videogames, ARIMA, structural time series

1. Introduction

Player profiling has become of crucial importance for the video game industry in an increasingly competitive market. For many titles, and especially for free-to-play games, the main source of revenue is in-app purchases [1]. Thus, characterizing players based on their purchasing behavior is a common practice when aiming to improve game monetization. In this study, the total player population is divided into three groups: paying users (PUs), non-paying users (non-PUs) and inactive players, i.e., those that have already abandoned the game or churned. This classification is not as straightforward as it could appear for most titles. As for any service not bound by contract, the definition of churn and of who is or is not a PU is not clear-cut for most video games, as is discussed for example in Guitart et al. [2].

The goal of this work is to understand and predict the evolution of the daily conversion or transition rates from one group to another. In particular, the focus is on churn probability (for both PUs and non-PUs), on the non-PU to PU transition rate (sometimes referred to in the literature simply as player conversion), and on the probability of a PU becoming a non-PU (or purchase churn). These will be modeled using time series State Space Models (SSMs), taking as covariates or explanatory regressors information regarding in-game planning, as well as holidays and other calendar effects. In other words, the aim is to study how the probability of a player of any group transitioning to another one (and thus also of remaining in the same group) evolves over time. These models can be used to predict the daily conversion rates between groups. They also provide a measure of the impact of the different covariates, thus yielding a classification of, for example, in-game events, depending on which transition probabilities they impact, with which sign, and how large the effect is.

Running such a system operationally would have several practical uses. Accurate forecasts could be used to improve resource allocation. While useful, predictions are probably not the most interesting application in this case. Having an estimate of the impact of different events (inside the game and external) could, however, significantly improve in-game planning. Finally, as will be described, this type of modelling can also be used to try to detect missing information, which shows up as large discrepancies between model and reality. In this work, this approach is aimed mainly at the detection of new user acquisition campaigns and promotion campaigns (for which no information is available except that they are known to have run often and with significant impact). In a production setup, where all relevant information would be available, it could be used as a form of automatic monitoring, to detect for example server failures or buggy new releases. It would also make it possible to compare an ongoing event, with all its particularities, against the average effect that type of event has had in the past, i.e., to measure the relative success of any individual event or campaign.

Another interesting path to explore would be to use these predictions in the modeling of individual player behavior, for example for churn predictions [3, 4, 5, 6, 2] or of conversion to PU [7]. By using them as features, the effect of calendar effects, campaigns and in-game events could be easily incorporated, and the models thought of as correcting the probability of a phenomenon (churn, conversion …) in each group to reflect the probability of each individual.

Of course, this approach can be extended to more complex landscapes of user types. Players can be further divided into additional subgroups depending on their specific purchasing behavior (distinguishing for example, between frequent and impulsive purchasers, and/or between top spenders and the rest), skills, play frequency and/or playstyle; and transitions between all these groups studied with this technique. This could provide deep insights about game dynamics and how different types of players are affected differently by different game planning strategies and events outside the game.

While the setup under study is very specific, this approach could be useful outside the realm of videogames. It is directly translatable with identical user grouping to any online platform where purchases are available. It could also be applied to online communities to understand, for example, how users that produce content are affected differently by events than those that merely consume it, and what drives the transition between these two regimes. Even in physical stores a similar approach could be used to model the probability that someone coming into the store will make a purchase. It is basically a valid approach in any setting where there are users or customers of different types, and where it can be interesting to analyze and/or predict how users transition from one type to another.

To the best of our knowledge, this is the first work in video games that predicts transition probabilities between different types of players and provides an analysis of the impact that in-game and calendar events have on them.

This paper is organized as follows: Section 2 compiles some relevant bibliographic references, both of related statistical and machine learning applications to videogames and of time series approaches in other fields. The models used are described in Section 3, and the particular dataset to which they are applied in Section 4. The methodology followed is described in detail in Section 5, and the results are then presented and discussed in Section 6. The paper closes with a summary and conclusions in Section 7 and a brief description of the software used in Section 8.

2. Related works

There is not much work available on time series prediction in video games. In Guitart et al. [8] the total aggregated value of daily sales and playtime is forecasted via different time series modelling techniques. Online traffic generated by online first person shooter games is dealt with from a time series perspective in Cricenti et al. [9]. Previous attempts at player profiling include for example Drachen et al. [10], Saas et al. [11], Fernández del Río et al. [12] and Guitart et al. [2]. Numerous previous studies on in-game player behavior have focused on individual player predictions (and not on aggregated time series as is the case of the present work), for example determining which players are going to abandon the game and when [3, 4, 5, 6, 2], predicting their lifetime value [13, 14, 15] and their purchase decisions [16, 17], or if and when players will become PUs [7].

Time series state space modeling approaches are ubiquitous in the study of economic and social processes, both in academia and in the industry. Examples include crime [18], printed newspaper [19] or automobile [20] demand, stock prices [21, 22], electricity prices [23] or epidemics [24, 25, 26]. Applications go beyond human related processes, and they have also been used to predict for example population growth of animal species [27, 28] or their parasites [29], weather [30] or water quality [31].

3. Time series state space modeling

State Space Models (SSMs) are a broad type of time series models that assume probabilistic dependence between the latent state variable and the observed measurement, thus estimating the state of an unobservable process (latent state) from an observed data set [32, 33, 34, 35]. An early example of an SSM is the Kalman Filter (KF) [36], widely used today.

SSMs are made up of two components. The state transition equation describes the evolution of the so-called latent state $l_{t}\in\mathbb{R}^{L}$. The observation model describes the relation between the latent (not directly observable) state and the time series of observations $z_{t}\in\mathbb{R}$. The state transition equation describes stochastic transition dynamics and the observation model is also probabilistic in nature, so any such model is determined by the distributions $p(l_{t}|l_{t-1})$ and $p(z_{t}|l_{t})$.

Any linear SSM can be expressed in the form:

$\displaystyle l_{t}=T_{t}l_{t-1}+c_{t}+R_{t}\eta_{t}$ (1)

$\displaystyle z_{t}=D_{t}l_{t}+d_{t}+\epsilon_{t}$ (2)

where $T_{t}$ is the transition matrix, $c_{t}$ the latent state intercept, $R_{t}$ the selection matrix, $D_{t}$ the design matrix and $d_{t}$ the observation intercept. The terms $\eta_{t}$ and $\epsilon_{t}$ represent random innovations that are typically considered to be normally distributed, i.e.,

$\displaystyle\eta_{t}\sim\mathcal{N}(0,\Sigma_{t}^{s})$ (3)

$\displaystyle\epsilon_{t}\sim\mathcal{N}(0,\Sigma_{t}^{o})$ (4)

where $\Sigma_{t}^{s}$ is the state covariance matrix and $\Sigma_{t}^{o}$ is the observation covariance matrix.

Many different well known time series models can be described as SSMs. In particular, the two approaches we compare in this paper: Autoregressive Integrated Moving Average (ARIMA) and Unobserved Components (UC) models (both with regressors) have an SSM formulation [32, 33, 34, 35]. Besides, any two given SSMs can be combined. For example:

$\displaystyle z_{t}=D_{1}l_{1,t}+D_{2}l_{2,t}+\epsilon_{t}$ (5)

$\displaystyle l_{1,t}=T_{1}l_{1,t-1}+\eta_{1,t}$ (6)

$\displaystyle l_{2,t}=T_{2}l_{2,t-1}+\eta_{2,t}$ (7)

would become:

$\displaystyle z_{t}=(D_{1}\;D_{2})\left(\begin{array}{c}l_{1,t}\\ l_{2,t}\\ \end{array}\right)+\epsilon_{t}$ (8)

$\displaystyle\left(\begin{array}{c}l_{1,t}\\ l_{2,t}\\ \end{array}\right)=\left(\begin{array}{cc}T_{1}&0\\ 0&T_{2}\\ \end{array}\right)\left(\begin{array}{c}l_{1,t-1}\\ l_{2,t-1}\\ \end{array}\right)+\left(\begin{array}{c}\eta_{1,t}\\ \eta_{2,t}\\ \end{array}\right)$ (9)

This allows for the combination of a linear regression on the explanatory variables (described as an SSM in Section 3.1) with either an ARIMA (described in Section 3.2) or a UC (described in Section 3.3) stochastic component. While a huge variety of filters and smoothers can be written using the linear Gaussian state space formulation, this section only describes in some detail the three aforementioned model families that are used throughout this paper.
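The block structure of Eqs. (8) and (9) can be illustrated with a minimal sketch. It uses plain Python lists as matrices; the helper names (`block_diag`, `combine`, `matvec`) and the two toy components are hypothetical and only serve to show that stacking the states and using a block-diagonal transition reproduces the evolution of each component separately. Noise terms are omitted for brevity.

```python
def block_diag(a, b):
    """Block-diagonal matrix built from two square matrices (lists of rows)."""
    na, nb = len(a), len(b)
    top = [row + [0.0] * nb for row in a]
    bottom = [[0.0] * na + row for row in b]
    return top + bottom


def combine(design_1, trans_1, design_2, trans_2):
    """Concatenate the design rows and build the block-diagonal transition."""
    return design_1 + design_2, block_diag(trans_1, trans_2)


def matvec(matrix, vector):
    """Matrix-vector product for list-based matrices."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical components: a one-dimensional level and a two-dimensional rotation.
D1, T1 = [1.0], [[1.0]]
D2, T2 = [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]]

D, T = combine(D1, T1, D2, T2)

state = [2.0, 1.0, 0.5]          # stacked state (l_1, l_2)
next_state = matvec(T, state)    # one noise-free transition of the combined model
```

Evolving the stacked state with `T` gives the same result as evolving each component with its own transition matrix and concatenating, which is the property the combined formulation relies on.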

3.1 Linear regression

One of the main goals of this work is to understand the deterministic behavior of the series that can be modelled in terms of other exogenous explanatory variables or covariates. This will be done through a linear regression, which can also be expressed as an SSM by setting $T_{t}=R_{t}=0$, $d_{t}=0$ and $c_{t}=\sum_{i}\beta_{i}x_{t}^{i}$ in Eqs. (1) and (2), where $x^{i}$ are the covariates or regressors and $\beta_{i}$ the corresponding parameters to be estimated.

3.2 Autoregressive Integrated Moving Average models

An ARIMA model of order ($p, d, q$) is a stochastic time series model. Each observation has a weighted dependence on the previous $p$ observations (AR terms) and on the previous $q$ noise realizations (MA terms). This results in a parsimonious model in which the dependence of each time step on virtually infinite previous lags can be captured with a few parameters. The term "integrated" refers to the $d$ differences that can be taken on the original series in order to make it stationary and/or reduce its variance.

An ARMA model (identical to the ARIMA described above but without differencing the time series) with regressors is usually written in its better known form as [32]:

$\displaystyle z_{t}=\alpha+\sum_{i=1}^{n}\beta_{i}x_{i,t}+y_{t}$ (10)

$\displaystyle y_{t}=\phi_{1}y_{t-1}+\phi_{2}y_{t-2}+\ldots+\phi_{p}y_{t-p}+\theta_{1}\epsilon_{t-1}+\theta_{2}\epsilon_{t-2}+\ldots+\theta_{q}\epsilon_{t-q}+\epsilon_{t}$ (12)

$\displaystyle\epsilon_{t}\sim\mathcal{N}(0,\sigma^{2})$ (13)

where $p$ and $q$ are the orders of the autoregressive (AR) and moving average (MA) polynomials respectively, $\phi_{1},\ldots,\phi_{p}$ the autoregressive parameters, $\theta_{1},\ldots,\theta_{q}$ the moving average parameters, $x_{1},\ldots,x_{n}$ the explanatory variables (covariates or regressors), $\beta_{i}$ their associated parameters and $\alpha$ the model's intercept.

In SSM format, the ARMA ( $p, q$ ) equation can be written as [33]:

$\displaystyle y_{t}=(1,0,\ldots,0)\,l_{t}$ (14)

$\displaystyle l_{t}=\left(\begin{array}{ccccc}\phi_{1}&1&0&\ldots&0\\ \phi_{2}&0&1&\ldots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ \phi_{r-1}&0&0&\ldots&1\\ \phi_{r}&0&0&\ldots&0\\ \end{array}\right)l_{t-1}+\left(\begin{array}{c}1\\ \theta_{1}\\ \vdots\\ \theta_{r-1}\\ \end{array}\right)\epsilon_{t}$ (15)

where $r=\max(p,q+1)$, $\theta_{i}=0$ for $q<i\leqslant r-1$ and $\phi_{i}=0$ for $p<i\leqslant r$. The first element of the $r$-dimensional state $l_{t}$ is $y_{t}$; the remaining elements carry the lagged autoregressive and moving average contributions.
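As a sanity check of this representation, the following sketch builds the transition matrix and the noise-loading vector in the standard companion form (with $r$ rows and $r-1$ moving average loadings, which is an assumption about the intended layout) and compares a simulation of the state space recursion with the direct ARMA recursion of Eq. (12). All function names and the example coefficients are hypothetical, and both recursions start from zero initial conditions.

```python
import random


def arma_state_space(phi, theta):
    """Transition matrix T and noise-loading vector R for an ARMA(p, q) model."""
    r = max(len(phi), len(theta) + 1)
    phi = list(phi) + [0.0] * (r - len(phi))            # pad AR terms to length r
    theta = list(theta) + [0.0] * (r - 1 - len(theta))  # pad MA terms to length r - 1
    T = [[phi[i]] + [1.0 if j == i + 1 else 0.0 for j in range(1, r)]
         for i in range(r)]
    R = [1.0] + theta
    return T, R


def simulate_state_space(phi, theta, shocks):
    """Run l_t = T l_{t-1} + R eps_t and read y_t as the first state element."""
    T, R = arma_state_space(phi, theta)
    r = len(R)
    state = [0.0] * r
    ys = []
    for eps in shocks:
        state = [sum(T[i][j] * state[j] for j in range(r)) + R[i] * eps
                 for i in range(r)]
        ys.append(state[0])
    return ys


def simulate_direct(phi, theta, shocks):
    """Run the ARMA recursion directly, with zero pre-sample values."""
    ys, es = [], []
    for eps in shocks:
        ar = sum(c * ys[-k] for k, c in enumerate(phi, start=1) if k <= len(ys))
        ma = sum(c * es[-k] for k, c in enumerate(theta, start=1) if k <= len(es))
        ys.append(ar + ma + eps)
        es.append(eps)
    return ys


random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]
phi, theta = [0.5, -0.2], [0.4]      # illustrative ARMA(2, 1) coefficients
y_ssm = simulate_state_space(phi, theta, shocks)
y_direct = simulate_direct(phi, theta, shocks)
max_gap = max(abs(a - b) for a, b in zip(y_ssm, y_direct))
```

Both simulations are driven by the same shocks, so `max_gap` should be at the level of floating point rounding.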

3.3 Unobserved Components models

We use the terms Unobserved Components or Structural Time Series models to refer to formulations in which a time series is explained in terms of underlying trends, cycles or seasonal dependencies. They can be generally expressed as [37, 38]:

$\displaystyle z_{t}=\mu_{t}+\gamma_{t}+c_{t}+\epsilon_{t}$ (16)

where $\mu_{t}$ is the trend component, $\gamma_{t}$ is the seasonal component, $c_{t}$ the cyclic component and $\epsilon_{t}$ a random shock $\epsilon_{t}\sim\mathcal{N}(0,\sigma^{2})$ .

Both the cyclical and seasonal components are intended to capture behavior that repeats itself. The seasonal part does so with a fixed, predefined period $s$ (for example $s=7$ for weekly seasonality of a daily time series):

$\displaystyle\gamma_{t}=-\sum_{j=1}^{s-1}\gamma_{t-j}+w_{t}$ (17)

where $w_{t}$ is random noise with zero mean and a variance that is estimated as an additional parameter. The cyclical part captures longer periods of unknown frequency:

$\displaystyle c_{t+1}=c_{t}\cos\lambda_{c}+c_{t}^{*}\sin\lambda_{c}+u_{t}$ (18)

$\displaystyle c_{t+1}^{*}=-c_{t}\sin\lambda_{c}+c_{t}^{*}\cos\lambda_{c}+u_{t}^{*}$ (19)

where $u_{t}$ and $u_{t}^{*}$ are also normally distributed with zero mean and estimated variance. The cyclic frequency $\lambda_{c}$ is also estimated as a parameter.

The trend component can be expressed as:

$\displaystyle\mu_{t+1}=\mu_{t}+\nu_{t}+\eta_{t+1}$ (20)

$\displaystyle\nu_{t+1}=\nu_{t}+\zeta_{t+1}$ (21)

where $\eta_{t}$ and $\zeta_{t}$ represent white noise (normally distributed with zero mean) whose variances are additional parameters to be estimated. If all the elements in Eqs. (20) and (21) are non-zero the term is referred to as a local linear (stochastic) trend. Other particular behaviors correspond to some of the elements of the equations being null: smooth trend ($\eta_{t}=0$), local (stochastic) level and deterministic trend ($\zeta_{t}=0$), deterministic trend ($\eta_{t}=\zeta_{t}=0$), local (stochastic) level ($\nu_{t}=\zeta_{t}=0$) or a simple constant term ($\nu_{t}=\zeta_{t}=\eta_{t}=0$). As will be discussed shortly, the local level model is used extensively in this paper. Note that in this case the level $\mu_{t}$ follows a random walk.
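To make the local level case concrete, the sketch below implements a scalar Kalman filter for $z_{t}=\mu_{t}+\epsilon_{t}$ with a random walk level. It is only an illustration under stated assumptions: the function name, the diffuse-like prior (`prior_var=1e7`) and the example numbers are hypothetical, and the two variances are taken as known rather than estimated.

```python
def local_level_filter(observations, var_obs, var_level,
                       prior_mean=0.0, prior_var=1e7):
    """Filtered estimates of the level mu_t for a local level model.

    var_obs   -- variance of the observation noise epsilon_t
    var_level -- variance of the level noise eta_t
    """
    mean, var = prior_mean, prior_var   # prediction of mu_t before seeing z_t
    filtered = []
    for z in observations:
        # Update step: blend prediction and observation through the Kalman gain.
        gain = var / (var + var_obs)
        mean = mean + gain * (z - mean)
        var = var * var_obs / (var + var_obs)
        filtered.append(mean)
        # Prediction step: the level is a random walk, so only the variance grows.
        var = var + var_level
    return filtered


noisy_rates = [0.102, 0.097, 0.101, 0.104, 0.099]   # made-up daily rates
levels = local_level_filter(noisy_rates, var_obs=1e-5, var_level=1e-6)
```

With `var_level=0` the filter reduces to a running mean of the observations, while a large `var_level` makes the estimated level follow each observation closely.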

Many structural time series models are related to ARIMA ones. As the UC formulation typically has several random noise terms, these have to be combined to be made equivalent to the single noise term of ARIMA formulations. This normally translates into some regions of the ARIMA parameter space being forbidden. The resulting form, equivalent to the UC model, is usually referred to as the reduced model [38]. For example, the reduced model of a local level is an ARIMA of order (0, 1, 1).
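The local level example can be checked with a short derivation. This is a sketch in the notation of Eqs. (16) and (20), assuming mutually independent noises with variances $\sigma_{\eta}^{2}$ and $\sigma_{\epsilon}^{2}$; the symbols $\xi_{t}$, $\theta$ and $q$ are introduced here only for illustration. For $z_{t}=\mu_{t}+\epsilon_{t}$ with $\mu_{t+1}=\mu_{t}+\eta_{t+1}$, the first difference is

$\displaystyle\Delta z_{t}=z_{t}-z_{t-1}=\eta_{t}+\epsilon_{t}-\epsilon_{t-1}$

with autocovariances

$\displaystyle\gamma_{0}=\sigma_{\eta}^{2}+2\sigma_{\epsilon}^{2},\qquad\gamma_{1}=-\sigma_{\epsilon}^{2},\qquad\gamma_{k}=0\ \ (k\geqslant 2)$

This is the autocovariance structure of an MA(1) process $\Delta z_{t}=\xi_{t}+\theta\xi_{t-1}$, i.e., of an ARIMA of order (0, 1, 1) for $z_{t}$. Matching the first-order autocorrelation, with $q=\sigma_{\eta}^{2}/\sigma_{\epsilon}^{2}$, gives

$\displaystyle\frac{\theta}{1+\theta^{2}}=-\frac{1}{q+2}\quad\Rightarrow\quad\theta=\frac{\sqrt{q^{2}+4q}-q-2}{2}$

for the invertible root, so that $-1\leqslant\theta<0$. Positive values of $\theta$ are therefore an example of a region of the ARIMA parameter space that the local level model cannot reach.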

4. Dataset

The game under study is Age of Ishtaria, a mobile role-playing card freemium game developed by Silicon Studio. Data is available from its launch on September 25, 2014 to May 9, 2017. In these roughly first two and a half years of its history 2,107,166 players went through the game, of which 33,194 made at least one purchase. The game typically had between ten and twenty thousand daily active users (DAU) in this period, with peaks (presumably due mainly to new user acquisition campaigns) of nearly fifty thousand DAU. At the end of the period 18,483 players were considered to still be active.

Three groups of players will be the main focus of attention in this work: non-paying users (active and not purchasing), paying users (active and purchasing) and churned players (inactive players).

As already mentioned in Section 1, the definition of churn is not straightforward for online games. Following the method discussed in Guitart et al. [2], players can be considered inactive when they have not logged in for a fixed number of days. This number is determined so as to be useful in detecting churn as soon as possible, while keeping false churners (players flagged as churned that come back to the game) and missed sales (purchases made by false churners after they come back to the game) under a reasonable threshold. In particular, for this dataset, using the first two months of data, the churn definition is set to 9 days, as this yields less than 10% false churners and less than 1.5% missed sales. Unlike in some previous work related to player purchasing behavior, where all players that have made at least one purchase in their lifetime are considered PUs, here we also consider transitions between active PU and active non-PU. That is, we consider that paying players become non-PUs after a long enough period with no purchasing activity. Purchase churn is defined analogously to login churn, and the period without purchases after which a previous PU is marked as having transitioned back to non-PU is set in this case to 50 days.

Data collected includes individual player-related information such as daily logs into the game and purchases made, and also non-user related such as in-game events. The latter are included in the modelling as exogenous variables, considering all in-game event types provided: Gigant Break, Gift Event, Gacha, Duel Arena, Battle Arena, Battle Event, Mission Event, Mission Bingo, Raid Event, Raid Boss, Item Collection, Poll Event, Call to Arms, Raid Battle and Adveniment. The start date and end date of each event of each of these types throughout the period is known, as well as a measure of the expected impact they had when planned, tagged as 0, 1, 2, 3 or 4. Although numeric, it is better understood as a qualitative measure of expected outcome.

When discussing online videogame data, it is important to note the exceptional quality of these datasets. Every action every player takes in the game is recorded. This makes both the information on daily logs and purchases virtually noise free and eliminates the problem of missing value treatment. Regarding the logs, even if some of the actions were not recorded due to some technical problem, a single action per player would be enough to rightfully count them as logged in that day. As for purchases, players would complain if these were not effective (and thus not recorded). This makes the amount of lost logins and purchases in the dataset negligible.

Aggregating log and purchase information, the following daily time series of populations of interest can be obtained: PUs (number of players that have made a purchase in the last 50 days and have logged in during the previous 9 days), non-PUs (number of players that have logged in during the past 9 days but have made no purchase in the last 50 days) and inactive players (players that have not logged into the game in the previous 9 days).
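A minimal sketch of how these three populations could be counted from raw logs follows. The data layout (integer day indices, one pair of login and purchase lists per player) and the function names are hypothetical. The boundary convention, where a player is still active on the 9th day after the last login and inactive from the 10th, is an assumption chosen to match the 10-day and 51-day transition definitions used in this section.

```python
CHURN_DAYS = 9       # login inactivity threshold quoted in the text
PURCHASE_DAYS = 50   # purchase inactivity threshold quoted in the text


def classify(day, login_days, purchase_days):
    """Return 'PU', 'non-PU' or 'inactive' for one player on a given day."""
    logins = [d for d in login_days if d <= day]
    if not logins:
        return None                      # player has not joined the game yet
    if day - max(logins) > CHURN_DAYS:
        return "inactive"
    purchases = [d for d in purchase_days if d <= day]
    if purchases and day - max(purchases) <= PURCHASE_DAYS:
        return "PU"
    return "non-PU"


def daily_populations(players, day):
    """Count players per group; players maps id -> (login_days, purchase_days)."""
    counts = {"PU": 0, "non-PU": 0, "inactive": 0}
    for login_days, purchase_days in players.values():
        group = classify(day, login_days, purchase_days)
        if group is not None:
            counts[group] += 1
    return counts


# Toy example with three made-up players.
players = {
    "a": ([0, 1, 2], [1]),   # logs in on days 0-2, purchases on day 1
    "b": ([0], []),          # logs in once, never purchases
    "c": ([5, 20], []),      # comes back on day 20
}
populations_day_10 = daily_populations(players, 10)
```

Running `daily_populations` for every day of the period would yield the three population series described above.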

This information also allows for the construction of the (absolute) daily transition time series: new players (users that log in for the first time that day), non-PU to PU (players that purchase that day and had not made a purchase in the previous 50 days), PU to non-PU (players that purchased for the last time 51 days ago), non-PU to inactive (players that have not purchased in the previous 50 days and logged into the game for the last time 10 days ago), PU to inactive (players that have purchased in the previous 50 days and logged into the game for the last time 10 days ago), churned to PU (players that log back into the game and make a purchase on that day after having been deemed churned) and churned to non-PU (inactive players that log back into the game and do not make a purchase on that day). Figure 1 shows the matrix of transition and remaining series (in number of users).

Figure 1. Matrix of daily transitioning or remaining players between the three different segments considered. Top row concerns non-PUs, with plots for the number of them who are remaining non-PU (left), becoming PU (middle), or churning (right). Middle row refers to PUs who are: becoming non-PU (left), remaining PU (middle), or churning (right). Bottom row shows the number of churned players who are: becoming once again active non-PU (left), active PU (middle) or remaining inactive (right).

With the population and transition series, the conversion rates can be easily computed by dividing the daily transitions by the population of the group of origin on the previous day. These represent the daily probability of a user in a given group transitioning to a different one. In this work the transitions of churned players back to life (players that become active again) are not considered, i.e., the false churner probability will not be modeled. Both of these transitions (churned to PU and churned to non-PU) involve a small number of players and are of less interest than the other four conversion rates. The daily new user series is however also taken into account. This series is not only of utmost interest in itself, but is also crucial, as will be described, in the detection of the unknown marketing campaigns (as it is here where they will have the largest impact) and in discriminating these from promotion campaigns (that should have no measurable effect on it).
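The rate computation can be sketched as follows. The function name and the example numbers are made up for illustration; days on which there is no previous-day population (the first day, or an empty group of origin) are returned as `None`, which is a choice made here rather than one stated in the text.

```python
def conversion_rates(transitions, origin_population):
    """Daily rate: transitions on day t over origin population on day t - 1."""
    rates = [None]                       # no previous day for the first entry
    for t in range(1, len(transitions)):
        previous = origin_population[t - 1]
        rates.append(transitions[t] / previous if previous > 0 else None)
    return rates


# Made-up example: non-PU population and non-PU to PU transitions over three days.
non_pu_population = [1000, 1010, 990]
non_pu_to_pu = [12, 10, 5]
rates = conversion_rates(non_pu_to_pu, non_pu_population)
```

Here the second entry of `rates` is 10 transitions over a previous-day population of 1000, i.e., a daily conversion probability of 1%.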

5. Methodology

Although the idea is to define a methodology that requires limited human intervention, the aim of this work is not to find a way of automatically producing forecasts. Human intervention is deemed necessary in acquiring qualitative knowledge of the systems and processes at play, which is one of the most important goals of this exercise. There must be however a fixed procedure guiding and limiting this intervention. This will allow for the use of this framework when defining more complicated segmentation landscapes. It also guarantees that different people will arrive at very similar or identical model definitions.

For each time series two different state space models – an ARIMA and a UC approach – both with covariates – are considered. The following five steps (described in some detail in the subsections below) describe the process followed for each time series to be modeled: (1) model selection; (2) selection of significant exogenous variables; (3) intervention definition; (4) model selection revisited; (5) forecasting and verification.

It is important to stress that the process described, though automatic to some extent, is still time consuming and relies heavily on human expertise. Though some steps could be taken to further simplify and automate the process, this is inevitably going to be the case for the initial modeling phase and/or when using this approach to uncover missing information (marketing and promotion campaigns in this work). Once this phase is completed however, though some expertise would still be necessary periodically for model maintenance, the need for human intervention would be radically diminished.

5.1 Model selection

For the model definition and covariate selection and definition, all the historic data available is employed. The process begins with a general inspection of the series, its regular, weekly and monthly differences, and their correlograms. For the transition series, only additive models (non-transformed series) are analyzed. For the new users series, both the original and the log-transformed series are considered. The latter is selected as its variance is more stable. Regression parameters for explanatory variables in log-transformed series modeling (i.e. in multiplicative models) have a straightforward interpretation as elasticities, which is convenient to intuitively understand the parameters estimated. In the case of conversion rates (that already represent a fraction), parameters of additive models can also be understood in a straightforward manner as the increase in a day due to each unit of increase in the covariate series. This first inspection, together with some basic stationarity tests, decides in favour of a regular difference in the ARIMA case for all series. This leaves weekly structure to be accounted for, but taking a weekly difference yields a much higher anticorrelation and variance in the resulting series than a regular one in all cases.

The starting training date (which will be different for the different time series) is then decided upon. The behavior in the first days after launching the game is almost always very erratic and it is normally advisable to simply eliminate it. Furthermore, because of the nature of the launch, some series may start earlier than others (for example in this case purchases were not possible during the first days). In other cases the definition itself of the player segment (churn takes 9 days to be detected, purchase churn 50 days) accounts for these differences.

The two base models (ARIMA and UC) with which to explore the effects of other variables are selected by brute-force exploration. This means an estimation is run for each possibility in a selected subset of the model space (without linear regression on covariates) and the results are compared in order to select the best option. The selection is then done through human intervention, but only among the 5 best-performing options according to the Akaike information criterion (AIC) [39] after exploring a large number of possibilities. Additionally, the Bayesian information criterion (BIC) [40], the Hannan-Quinn information criterion (HQIC) [41], residual¹ variance, independence and normality, and parameter significance are taken into account to select one option, always favouring fewer parameters for similar performance. Ljung-Box [42] and Jarque-Bera [43] tests are used to assess independence and normality of the residuals respectively. Parameter significance is evaluated using Z-scores [44], with parameters with associated $p$-values under 0.1 considered significant. If less human expertise is available and/or the time devoted to the model definition phase needs to be reduced, the option with the best AIC could simply be selected.

¹ Residuals refer to the unexplained part of the series after modeling. They should correspond to the random noise term described in Section 3.
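The shortlisting step can be sketched as below, using the usual definitions of the three criteria in terms of the maximized log-likelihood, the number of estimated parameters $k$ and the number of observations $n$. The candidate names, log-likelihoods and the function names are invented for illustration; in practice they would come from whatever estimation routine is used.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """AIC, BIC and HQIC from a maximized log-likelihood."""
    return {
        "aic": 2 * n_params - 2 * log_likelihood,
        "bic": n_params * math.log(n_obs) - 2 * log_likelihood,
        "hqic": 2 * n_params * math.log(math.log(n_obs)) - 2 * log_likelihood,
    }


def shortlist(candidates, n_obs, size=5):
    """Keep the `size` candidates with lowest AIC, with all criteria attached.

    candidates -- iterable of (name, log_likelihood, n_params) tuples
    """
    scored = [(information_criteria(ll, k, n_obs), name)
              for name, ll, k in candidates]
    scored.sort(key=lambda item: item[0]["aic"])
    return [(name, criteria) for criteria, name in scored[:size]]


# Made-up candidates: (label, log-likelihood, number of parameters).
candidates = [("model_a", -100.0, 2), ("model_b", -99.5, 4), ("model_c", -120.0, 1)]
best = shortlist(candidates, n_obs=100, size=2)
```

The shortlist only narrows the field; the final choice among the retained options still rests on the residual diagnostics and parameter significance described above.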

In the ARIMA case estimations are run for all possible combinations of weekly and regular ARMA polynomials of order up to 5 (in both AR and MA). This is a very extensive exploration designed to minimize the expertise and time devoted to the preliminary phase of time series inspection, and to make the process as automatic as possible. Given that the different combinations can be run in parallel and the ARIMA estimation is not computationally expensive, this is in general a reasonable approach. The use of ARMA polynomials of order higher than two is however rarely justified, so the parameter space to be explored could be bounded to lower orders. Carefully analyzing the differently differenced series and their correlograms would also allow for a selection of only a few different models to try, and this would also be a valid option.
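To give an idea of the size of this search, the grid can be enumerated as in the sketch below. It assumes that each of the four polynomials (regular AR, regular MA, weekly AR, weekly MA) takes orders 0 to 5 independently, with one regular difference and no weekly difference as decided in the preliminary inspection. The `((p, d, q), (P, D, Q, s))` tuple layout and the function name are hypothetical conventions chosen for this example.

```python
from itertools import product

MAX_ORDER = 5   # highest AR/MA order explored, as stated in the text
WEEK = 7        # seasonal period for daily data with weekly structure


def candidate_orders(max_order=MAX_ORDER):
    """All ((p, d, q), (P, D, Q, s)) combinations with d = 1 and D = 0."""
    grid = []
    for p, q, P, Q in product(range(max_order + 1), repeat=4):
        grid.append(((p, 1, q), (P, 0, Q, WEEK)))
    return grid


grid = candidate_orders()
n_models = len(grid)   # number of estimations to run, possibly in parallel
```

Under these assumptions the grid has $6^{4}=1296$ entries per series, and bounding the orders to two as suggested above would reduce it to $3^{4}=81$.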

Interestingly, although there is a very clear weekly structure in at least the PU churn and purchase churn series (as shown by the significant correlations in the correlograms for lag 7 and some of its multiples), this analysis favours in all cases models with no seasonal ARIMA terms. Weekly effects will be therefore accounted for using day of the week exogenous variables as described in Section 5.2.

In the UC case, the use (or not) of a cyclic term and the use (or not) of weekly and monthly seasonality is explored. Monthly seasonality (as the preliminary analysis suggested) is rejected in all cases. With regard to weekly seasonality, its use is deemed favourable in all cases (and it will be used instead of the day of the week covariates employed for the ARIMA models). For the level-trend, different options are also explored: no trend, fixed intercept (deterministic constant), local level (random walk), fixed slope (deterministic trend), local level with deterministic trend (random walk with drift), local linear trend and smooth trend (integrated random walk). In all cases, the local level type of trend is the best option. As would be expected, other options with more degrees of freedom further reduce the residual variance, but have notably worse information scores. Here again a more careful initial exploratory analysis could limit the number of models to be tried, but this hardly seems justified as UC models are even less computationally expensive than ARIMA ones, and the parameter space explored is in any case smaller.

5.2 Exogenous variable selection

With the information available, the following explanatory time series to be used as covariates are built:

• Day of week: Effects for each day of the week (one variable per day of the week which is 1 that day and 0 elsewhere).
• Calendar effects: First of month, last of month, first of year and last of year effects (estimated separately).
• Holidays: National holidays and school holidays effects are considered separately (with effect estimated jointly for all days in each of these two groups).
• In-game events: All events of the same type and with the same event scale are considered jointly. Out of each type of event and event scale two inputs are built: event on/off (covariate is 1 when there is an event of that type and 0 elsewhere) and event start (covariate is 1 when an event of that type is beginning on that day and 0 elsewhere).
• Number of in-game events: Besides inputs for each event type and event scale combination, two additional inputs with values number of events going on that day and number of events starting on that day are also considered.
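The construction of some of these covariates can be sketched as follows. The function names, the example dates and the single made-up event are hypothetical; events are assumed to be given as inclusive (start, end) date pairs for one event type and scale.

```python
from datetime import date, timedelta


def date_range(start, end):
    """List of consecutive days from start to end, both included."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def day_of_week_dummies(days):
    """One 0/1 series per weekday (0 = Monday, ..., 6 = Sunday)."""
    return {wd: [1 if d.weekday() == wd else 0 for d in days] for wd in range(7)}


def first_of_month(days):
    """Calendar effect: 1 on the first day of each month, 0 elsewhere."""
    return [1 if d.day == 1 else 0 for d in days]


def event_covariates(days, events):
    """Event on/off and event start series for a list of (start, end) events."""
    on = [1 if any(s <= d <= e for s, e in events) else 0 for d in days]
    start = [1 if any(d == s for s, _ in events) else 0 for d in days]
    return on, start


days = date_range(date(2015, 1, 1), date(2015, 1, 7))
events = [(date(2015, 1, 2), date(2015, 1, 4))]     # one made-up event
dummies = day_of_week_dummies(days)
event_on, event_start = event_covariates(days, events)
```

The "number of events" inputs would then simply be the element-wise sums of the on/off and start series over all event types and scales.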

Additionally, interventions will be defined for each of the time series as described in Section 5.3. These interventions will be tried as exogenous variables not only for the series for which they were detected but also for the rest. This means that the methodology described will be repeated twice for all of the series to ensure that any and all interventions are tested for all series.

To decide which of the variables to use with each time series, the available covariates are added progressively in groups, each time discarding those with parameters that are not estimated to be significant. Significance is evaluated using Z-scores [44], and parameters with associated $p$ -value larger than 0.1 are rejected. The grouping and order in which explanatory variables are tried is that of the enumeration above in the first round. When a group is made up of more than ten inputs (i.e. for in-game event covariates), these are tried in groups of ten. The handling of interventions is very similar and is described in more detail in Section 5.3 (as well as the order used in trying all covariates in the second round). After covariates from all groups including interventions have been selected in this way, all variables that have been left out are then tried again one by one to make sure that they are still not significant with the final configuration.
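The significance filter applied at each step can be sketched with the standard two-sided test on the ratio of an estimate to its standard error, assuming asymptotic normality of the estimator. The covariate names and numbers are invented, and the function names are hypothetical.

```python
import math


def two_sided_p_value(estimate, std_error):
    """Two-sided p-value of the Z-score under a standard normal distribution."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def keep_significant(fitted, threshold=0.1):
    """Names of covariates whose p-value does not exceed the threshold.

    fitted -- dict mapping covariate name to (estimate, standard error)
    """
    return [name for name, (estimate, std_error) in fitted.items()
            if two_sided_p_value(estimate, std_error) <= threshold]


# Made-up estimation output for one group of covariates.
fitted = {"monday": (0.020, 0.005), "school_holiday": (0.010, 0.010)}
kept = keep_significant(fitted)
```

In this toy example the first covariate has a Z-score of 4 and is kept, while the second has a Z-score of 1 (p-value of about 0.32) and would be discarded before the next group is added.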

Naturally, for churn probability series exogenous variables are always introduced with a delay corresponding to (regular or purchase) churn definition (10 and 51 days respectively for this game).
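A minimal sketch of this delay follows. The helper name is hypothetical, and padding the first days with zeros is an assumption (those days could equally be dropped from the training period).

```python
def lag_covariate(values, delay, fill=0):
    """Shift a daily covariate forward by `delay` days, padding the start."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    if delay == 0:
        return list(values)
    return ([fill] * delay + list(values))[:len(values)]

CHURN_DELAY = 10            # days, regular churn definition for this game
PURCHASE_CHURN_DELAY = 51   # days, purchase churn definition for this game

# Toy covariate: an event running on days 1 and 2 of a 60-day window
event_on = [0, 1, 1, 0] + [0] * 56
lagged_churn = lag_covariate(event_on, CHURN_DELAY)
lagged_purchase = lag_covariate(event_on, PURCHASE_CHURN_DELAY)
```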
5.3 Interventions

Interventions are exogenous variables defined ad hoc after analyzing the residuals of a previous covariate configuration. They should capture the effect of the most important marketing campaigns aimed at new user acquisition, as well as promotion campaigns aimed at conversion to PU or enhanced spending of PUs, about which, as already noted, no information is available (except for the fact that they did exist and had a significant impact). The detection and classification of these campaigns is one of the main goals of this work.

Marketing and promotion campaigns (or other effects of unknown origin with a significant impact on the transition rates) are expected to leave very large residuals in the absence of these interventions. The procedure to construct them starts with the day with the largest deviation in the residuals. Human inspection is needed to decide the exact shape of the intervention. If, for example, a very large positive residual is followed by a large negative one several days later for the ARIMA model (which always uses a regular difference), a campaign will be assumed to have run starting on the day with the large positive residual and ending the day before the negative one. The model is then re-estimated with the intervention designed to capture the effect seen in the original series and the residuals. If the parameter is significant, the variance is reduced and the Jarque-Bera normality test of the residuals yields a better score, the intervention is kept, and a new intervention is included for the next largest deviation in the new residuals. This process is repeated until adding new interventions makes the residuals less, rather than more, normal.
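The example of a positive residual followed by a negative one can be made concrete with a small sketch. It covers only that one shape, which in this work is decided by human inspection; the function names, the toy residuals and the threshold are hypothetical. It also shows why such a pattern points to a campaign window: a dummy that is 1 while the campaign runs has a regular difference of +1 on its first day and -1 on the day after it ends.

```python
def step_intervention(n_days, start, end):
    """Dummy equal to 1 on days start..end-1 (campaign running), else 0."""
    return [1 if start <= t < end else 0 for t in range(n_days)]

def regular_difference(values):
    """First difference x[t] - x[t-1]; the output is one element shorter."""
    return [b - a for a, b in zip(values, values[1:])]

def campaign_window(residuals, threshold):
    """Locate the largest positive residual and the next large negative one.

    Returns (start, end) or None. `threshold` is a hypothetical cut-off.
    """
    start = max(range(len(residuals)), key=lambda t: residuals[t])
    if residuals[start] < threshold:
        return None
    for t in range(start + 1, len(residuals)):
        if residuals[t] <= -threshold:
            return start, t
    return None

# Toy residuals of a differenced model: spike up on day 2, spike down on day 6
residuals = [0.1, -0.2, 3.9, 0.3, -0.1, 0.2, -3.5, 0.0, 0.1]
window = campaign_window(residuals, threshold=3.0)
dummy = step_intervention(len(residuals), *window)
diff = regular_difference(dummy)
```

Here the campaign is taken to run from day 2 to day 5, ending the day before the negative residual.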

As described in the previous Section 5.2, interventions discovered for any of the series will also be tried on the other ones. Depending on the type of effect and on which series is estimated as significant, interventions are classified as:

•
Marketing interventions: These are outliers which look like they could be a result of marketing (out of game) or new user acquisition campaigns. They should have a strong positive impact in the new user series. Typically, they will also have a positive effect on churn transitions (after 9 days), unless they have been particularly good at getting to the right target (i.e., people that have actually kept playing after the first day they tried the game). This effect is expected to be larger for non-PU churners (as people who try the game and rapidly move to make their first purchase are more likely to be really interested in the game and thus continue playing). They could also possibly have a (limited) positive effect in the conversion to PU series, as it could also encourage spending in people that are already playing but are exposed to the campaign. In this case, they could also have some impact in purchase churn probability 50 days afterwards.
•
Promotion interventions: These should reflect promotions offered to players, and are therefore mainly characterized by having a strong positive impact on the probability of conversion to PU. They will also typically have a measurable effect on the purchase churn probability 50 days later, and the difference of impact in both series for different campaigns will help detect which have been more useful in generating long term conversion. They could have some minimal impact on churn probabilities if they have been particularly bad (if players are spammed with notifications for example). There should never be any measurable effect in new users, as these are promotions that are only available for already existing players.
•
Unknown interventions: Outliers in a different direction from what would be expected due to marketing or promotion campaigns. These could be related to other relevant missing information such as server problems, buggy releases, changes in the game dynamics or content, etc.

Interventions are tried with all series in the same way as the rest of covariates, considering the three groups listed above (and grouped in tens when needed). The series were modeled in the following order: (1) new users (in an attempt to discover as many new user acquisition or marketing campaigns as possible); (2) conversion to PU (in search of missing important promotion campaigns); (3) non-PU churn (for further marketing campaign detection, as promotions should have little to no impact); (4) PU churn; and (5) purchase churn (where further promotions can be detected).

After this first round all additional covariates in form of interventions are assumed to have been detected, and the exogenous variable selection process is then repeated starting from scratch for all series. In this second round, taking into account the nature of each of the series modeled and of the impact the different interventions and in-game events are expected to have, the order in which the different groups are tried varies slightly, with marketing interventions being tried before in-game events and the rest of interventions for new users and before the rest of interventions for both churn series; and promotion interventions being tried before the rest of interventions for conversion to PU and purchase churn. Unknown interventions are tried last in all cases.

This process yielded a plausible campaign scenario, as will be described in Section 6, when working with ARIMA models. The process was more cumbersome and less effective when dealing with UC models, as they have several components and noise terms that can better capture sudden rises and drops in the series without the need for interventions. It was finally decided to carry out the intervention definition process with the ARIMA models only, and then use the resulting interventions for both ARIMA and UC in what has been described as the second round.

This is the most time-consuming part of the process, and the one where expert human intervention is most critical. This is however unavoidable if there is missing information (marketing and promotion campaigns in our case) that should be unveiled in the process. If finding out the most plausible particular scenario is not a priority, a fully automatic simplified approach could be followed. Namely, the largest outliers in the residuals could be corrected using a single additional variable for that day. Analogously to the process described above, these would then be accepted or rejected depending on whether their associated parameter is estimated as significant or not, and whether it improves the normality tests or not. This process would be repeated automatically until the introduced variable is not significant or the normality test is degraded. This would correct outliers, yielding a more consistent model and preventing it from learning from atypical values. The anomalous realizations of the series will however remain unaccounted for. In addition, a threshold could be introduced below which the normality test would be considered valid and the iterative method interrupted since, especially without human control, it is important to avoid overfitting the model. Even in cases where there is no known missing relevant information, it is convenient to follow such a process in order to ensure normality of the residuals, as the models would not be formally valid otherwise, and failure to correct unusual behaviour would result in underfitting.
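The fully automatic variant could be sketched as below. To keep the example self-contained, the state space model is replaced by a deliberately simple stand-in (a constant level plus one pulse dummy per corrected day, fitted by least squares); this substitution, the function names and the toy series are assumptions made for illustration. The acceptance rule follows the text: the new pulse must be significant and must improve the Jarque-Bera statistic.

```python
import math
import random

def jarque_bera(x):
    """Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4); lower is more normal."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def fit_level_with_pulses(y, pulses):
    """Stand-in model: constant level plus one pulse dummy per day in `pulses`.

    Least squares reduces to the mean of the remaining days; each pulse
    coefficient is the deviation of its day from that mean.
    """
    rest = [v for t, v in enumerate(y) if t not in pulses]
    level = sum(rest) / len(rest)
    resid = [0.0 if t in pulses else v - level for t, v in enumerate(y)]
    dof = len(y) - 1 - len(pulses)
    sigma = math.sqrt(sum(r * r for r in resid) / dof)
    se = sigma * math.sqrt(1.0 + 1.0 / len(rest))  # s.e. of a pulse coefficient
    coefs = {t: y[t] - level for t in pulses}
    return resid, coefs, se

def auto_outlier_correction(y, alpha=0.1, max_pulses=20):
    """Add single-day pulses for the largest residuals while they help."""
    pulses = []
    resid, _, _ = fit_level_with_pulses(y, set())
    jb = jarque_bera(resid)
    while len(pulses) < max_pulses:
        t = max(range(len(y)), key=lambda i: abs(resid[i]))
        new_resid, coefs, se = fit_level_with_pulses(y, set(pulses) | {t})
        p_value = math.erfc(abs(coefs[t] / se) / math.sqrt(2.0))
        new_jb = jarque_bera(new_resid)
        if p_value > alpha or new_jb >= jb:
            break  # not significant, or normality not improved
        pulses.append(t)
        resid, jb = new_resid, new_jb
    return pulses

# Toy rate series with two injected outliers
rng = random.Random(0)
series = [0.05 + rng.gauss(0.0, 0.01) for _ in range(120)]
series[20] = 0.05 + 0.12
series[70] = 0.05 - 0.08
detected = auto_outlier_correction(series)
```

The `max_pulses` cap plays the role of a safeguard against overfitting; a threshold on the normality statistic, as suggested above, could be added as a further stopping rule.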
5.4 Model selection revisited

Once a final set of exogenous explanatory variables is available, the model space is revisited and the best AIC scoring options are analyzed in some detail again. Although in some cases there were slight changes from the originally selected model definition described in Section 5.1, the main findings described there hold. Namely, in all cases ARIMA models with a regular difference and without weekly polynomial, and UC models with local level, outperform the others.

5.5 Forecasting and verification

Finally, after selecting the best model definition with the available data, daily forecasts are run for all of 2016 and what is available of 2017 for verification. Replicating a possible production setup, new daily forecasts are run for each month using data until the last day of the previous month to train the model. However, compared to a real production setup, the current model and exogenous variable selection made use of more data (all historic data available). Nevertheless, each training that produces forecasts still only uses the data that would have been available at the time. Besides, as the interventions used are always local, information on future interventions (planned marketing and promotion campaigns, or other unexpected events such as buggy updates or server failures) will not be available to the models, accounting for large forecast errors due to missing relevant information. Monthly Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are then examined and compared for the different models and series to assess forecast accuracy. Note that the aim of computing these validation metrics is to compare the performance of both models, and of each model for different months, rather than to assess the overall goodness of either of the two models, as there is no appropriate baseline to which to compare them.
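The aggregation of daily forecast errors into monthly MAE and RMSE, and then into a mean and standard deviation across months, can be sketched as follows. The function names and toy data are hypothetical, and the use of the sample standard deviation is an assumption; the model fitting itself is left out.

```python
import math
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, stdev

def monthly_errors(days, observed, forecast):
    """Return {(year, month): (MAE, RMSE)} computed from daily forecasts."""
    grouped = defaultdict(list)
    for d, obs, fc in zip(days, observed, forecast):
        grouped[(d.year, d.month)].append(fc - obs)
    out = {}
    for key, errs in sorted(grouped.items()):
        mae = sum(abs(e) for e in errs) / len(errs)
        rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
        out[key] = (mae, rmse)
    return out

def summarize(per_month):
    """Mean and sample SD of the monthly MAE and RMSE values."""
    maes = [v[0] for v in per_month.values()]
    rmses = [v[1] for v in per_month.values()]
    return {"MAE": (mean(maes), stdev(maes)),
            "RMSE": (mean(rmses), stdev(rmses))}

# Toy data: error of +1 every day in January, alternating +/-3 in February
days = [date(2016, 1, 1) + timedelta(days=k) for k in range(60)]
observed = [100.0] * 60
forecast = [101.0 if d.month == 1 else 100.0 + (3.0 if d.day % 2 else -3.0)
            for d in days]
per_month = monthly_errors(days, observed, forecast)
summary = summarize(per_month)
```

In a rolling setup, the forecasts for each month would come from a model trained on data up to the last day of the previous month, and `monthly_errors` would be applied to the concatenated forecasts.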

Figure 2.

Daily new users original series (top), log-transformed (second row), difference of the log-transformed (third row), ACF (bottom left) and PACF (bottom right). Start of the training period is marked with a dashed line and corresponds to October 10, 2014. The dotted lines in the ACF and PACF define the significance region (for values larger than the positive line or smaller than the negative one) with 95% confidence.

Figure 3.

Daily non-PU to PU conversion original series (top), its regular difference (middle), ACF (bottom left) and PACF (bottom right). Start of the training period is marked with a dashed line and corresponds to October 5, 2014. The dotted lines in the ACF and PACF define the significance region (for values larger than the positive line or smaller than the negative one) with 95% confidence.

Figure 4.

Daily PU churning rate original series (top), its regular difference (middle), ACF (bottom left) and PACF (bottom right). Start of the training period is marked with a dashed line and corresponds to October 31, 2014. The dotted lines in the ACF and PACF define the significance region (for values larger than the positive line or smaller than the negative one) with 95% confidence.

Figure 5.

Daily non-PU churning rate original series (top), its regular difference (middle), ACF (bottom left) and PACF (bottom right). Start of the training period is marked with a dashed line and corresponds to October 31, 2014. The dotted lines in the ACF and PACF define the significance region (for values larger than the positive line or smaller than the negative one) with 95% confidence.

Figure 6.

Daily purchase churn rate original series (top), its regular difference (middle), ACF (bottom left) and PACF (bottom right). Start of the training period is marked with a dashed line and corresponds to November 5, 2014. The dotted lines in the ACF and PACF define the significance region (for values larger than the positive line or smaller than the negative one) with 95% confidence.

6. Results

The daily number of new users (players who log into the game for the first time) is shown in Fig. 2’s top plot, with a dashed line marking what has been taken as starting day for the training, October 10, 2014 for this series. The ARIMA used was (2, 1, 1) and the local level model with weekly seasonality used a longer cycle periodicity too. In both cases the series was log-transformed (all prediction error measures given refer to the untransformed forecast though). The log-transformed series together with its regular difference are Fig. 2’s second and third plots from the top respectively (again with a dashed line marking the starting date for the training). The two bottom figures correspond to the Autocorrelation Function (ACF, left) and Partial Autocorrelation Function (PACF, right) of the log-transformed series. For this and all other series, the ACF and PACF were computed leaving out the period before the training starting date. In the correlograms, regions outside the area delimited by the dotted lines correspond to significant correlation values with 95% confidence.
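For reference, a sample ACF and a 95% significance band can be computed as in the sketch below. It assumes the common large-sample approximation of plus or minus 1.96 divided by the square root of the series length; the bands plotted in the figures may have been obtained with a different formula (for example Bartlett's), and the toy series is invented.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def significance_bound(n, z=1.96):
    """Large-sample 95% bound for the correlogram of a white-noise series."""
    return z / math.sqrt(n)

# Deterministic toy series with a 7-day cycle (20 full weeks)
series = [math.sin(2 * math.pi * t / 7) for t in range(140)]
rho = acf(series, max_lag=7)
bound = significance_bound(len(series))
significant_lags = [k for k in range(1, 8) if abs(rho[k]) > bound]
```

A strong weekly pattern shows up as a large autocorrelation at lag 7, well outside the band.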

Figures 3–6 refer to conversion to PU, PU churn, non-PU churn and purchase churn rates respectively. They all show the original series at the top, the differenced series in the middle and the ACF and PACF of the original series (excluding the beginning of the series which is not used in the training and with regions outside the area delimited by the dotted lines corresponding to significant correlation values with 95% confidence) at the bottom. Starting date for the training is shown as a dashed line on both the original and differenced series. The training began on October 5, 2014 and the ARIMA model used was (0, 1, 3) for conversion to PU. Churn modeling began on October 31, 2014 for both PU and non-PU, with (0, 1, 2) used as ARIMA for the former and (1, 1, 2) for the latter. For purchase churn the starting date was November 25, 2014 and the ARIMA chosen was (0, 1, 3). As already mentioned, all UC models used a local level and weekly periodicity, and none (except new users) added a longer cycle component.

Not only are the same covariates selected for both models by following the process described in Section 5, but the parameters estimated by both are very similar, typically differing by less than 10% and in very few cases by more than 20%. The value estimated for a selection of parameters for the different series is displayed in Table 1 for the ARIMA model and in Table 2 for its UC counterparts. The selection is by no means comprehensive and is only meant to illustrate some of the discussions that follow.

Taking into account the multiplicative nature of the model used for the new users series (in that it is the log-transform of an absolute number), estimated parameters for it can be understood as elasticities. This means, for example, that using the ARIMA estimation, a day which is a national holiday would mean nearly 5% more new users than a day which is not. For the rest of the series, which are conversion rates subject to an additive model, parameters should be understood as absolute increases. For example, this would mean that national holidays will make the probability of non-PU churn increase by 0.0033.
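The two readings can be checked numerically with the national holiday estimates of Table 1. In this sketch the function names are illustrative and the 10% baseline rate is a hypothetical value, not one taken from the data.

```python
import math

def log_model_effect(coef):
    """Relative change implied by a dummy coefficient in a log-transformed model."""
    return math.exp(coef) - 1.0

def additive_model_effect(baseline_rate, coef):
    """Rate obtained after adding a dummy coefficient in an additive model."""
    return baseline_rate + coef

# New users (log model): coefficient 4.89e-2 for national holidays
holiday_new_users = log_model_effect(4.89e-2)

# Rate series (additive model): coefficient 3.30e-3, hypothetical 10% baseline
holiday_rate = additive_model_effect(0.10, 3.30e-3)
```

For small coefficients the relative change is close to the coefficient itself, which is why 4.89e-2 reads as nearly 5%.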

Table 1
ARIMA estimates for a selection of parameters for the different series

Parameter	New users	Conversion to PU	Non-PU churn	PU churn	Purchase churn
National holidays	$4.89 \times 10^{-2}$	$1.61 \times 10^{-4}$	$3.30 \times 10^{-3}$	–	$7.67 \times 10^{-4}$
Battle event (start)	–	$2.16 \times 10^{-4}$	–	–	$-2.77 \times 10^{-4}$
Gacha 4	–	$1.03 \times 10^{-3}$	–	–	$1.61 \times 10^{-2}$
Raid event (start)	–	$-1.07 \times 10^{-3}$	–	$3.80 \times 10^{-3}$	$3.23 \times 10^{-3}$
Unknown 2017/02/09	–	$-1.33 \times 10^{-3}$	$1.51 \times 10^{-2}$	$1.37 \times 10^{-4}$	$1.88 \times 10^{-2}$
Marketing 2015/02/05-07	$5.71 \times 10^{-3}$	$9.97 \times 10^{-4}$	$3.45 \times 10^{-2}$	$4.38 \times 10^{-3}$	$1.23 \times 10^{-3}$
Marketing 2015/03/16-25	$6.06 \times 10^{-3}$	–	$3.70 \times 10^{-2}$	$5.14 \times 10^{-3}$	–
Marketing 2015/05/25-31	$6.07 \times 10^{-3}$	–	$2.74 \times 10^{-3}$	–	–
Marketing 2016/09/21-22	$6.25 \times 10^{-3}$	$1.90 \times 10^{-4}$	$2.40 \times 10^{-2}$	–	–
Marketing 2017/02/07-09	$2.06 \times 10^{-3}$	–	$-1.26 \times 10^{-3}$	–	–
Promotion 2015/03/19	–	$1.53 \times 10^{-3}$	–	–	$4.60 \times 10^{-3}$
Promotion 2015/04/23-24	–	$1.74 \times 10^{-3}$	–	–	$2.30 \times 10^{-3}$
Promotion 2016/09/21-23	–	$3.38 \times 10^{-3}$	–	–	–

Table 2

Local level estimates for a selection of parameters for the different series

Parameter	New users	Conversion to PU	Non-PU churn	PU churn	Purchase churn
National holidays	$5.17 \times 10^{-2}$	$1.47 \times 10^{-4}$	$2.83 \times 10^{-3}$	–	$7.10 \times 10^{-4}$
Battle event (start)	–	$2.13 \times 10^{-4}$	–	–	$-2.96 \times 10^{-4}$
Gacha 4	–	$1.01 \times 10^{-3}$	–	–	$1.59 \times 10^{-2}$
Raid event (start)	–	$-1.06 \times 10^{-3}$	–	$3.88 \times 10^{-3}$	$2.50 \times 10^{-3}$
Unknown 2017/02/09	–	$-1.25 \times 10^{-3}$	$1.57 \times 10^{-2}$	$1.31 \times 10^{-4}$	$1.79 \times 10^{-2}$
Marketing 2015/02/05-07	$5.78 \times 10^{-3}$	$9.81 \times 10^{-4}$	$3.56 \times 10^{-2}$	$4.17 \times 10^{-3}$	$2.73 \times 10^{-3}$
Marketing 2015/03/16-25	$6.45 \times 10^{-3}$	–	$3.95 \times 10^{-2}$	$5.17 \times 10^{-3}$	–
Marketing 2015/05/25-31	$5.49 \times 10^{-3}$	–	$1.34 \times 10^{-3}$	–	–
Marketing 2016/09/21-22	$6.12 \times 10^{-3}$	$1.95 \times 10^{-4}$	$2.29 \times 10^{-2}$	–	–
Marketing 2017/02/07-09	$2.72 \times 10^{-3}$	–	$-1.88 \times 10^{-3}$	–	–
Promotion 2015/03/19	–	$1.46 \times 10^{-3}$	–	–	$3.44 \times 10^{-3}$
Promotion 2015/04/23-24	–	$1.74 \times 10^{-3}$	–	–	$2.11 \times 10^{-3}$
Promotion 2016/09/21-23	–	$3.46 \times 10^{-3}$	–	–	–

The estimated weekly structure and the calendar and holiday effects are qualitatively similar for all conversion rates, and mainly reflect general patterns observed in playtime. People tend to play more towards the end of the week and especially during weekends, and less from Monday to Wednesday. National holidays also have a clear positive impact in all series (except PU churn), while school holidays are not estimated as significant in any of the series, suggesting a limited number of school-age players in the game as compared to the older working population.

In-game events have, as expected, no significant effect estimated in the new users series. Nor were any estimated as significant in explaining non-PU churn, and only a couple of them had a low impact in PU churn. Most of them had however a clear impact in conversion and purchase churn, suggesting they drive spending (or lack thereof) more than login engagement. Impact in both conversion to PU and purchase churn can be positive or negative, which implies that, compared to no events at all, some encourage and some discourage spending. A typical event will have a positive effect on both series (see for example Gacha 4 in Tables 1 and 2): such events encourage spending and drive conversion to PU, but at least part of this effect is lost once the event is over, hence their positive delayed effect in purchase churn. Other event types have more interesting effects. Battle event, for example, not only drives conversion to PU, but actually reduces purchase churn. It does seem to motivate non-PUs into becoming PUs, while also driving expenditure in players who are already PUs. On the other hand, there are event types such as Raid event that discourage conversion while having a positive effect in purchase churn and even a small effect in PU churn, which suggests that this event type discourages spending and that it is generally disliked by PUs.

Following the methodology described in Section 5, using ARIMA models yields a reasonable marketing and promotion intervention scenario (as described in some more detail below). The same process using local level models however, as has already been briefly discussed, yielded worse results, with fewer interventions detected and degraded forecasts. Following the covariate selection process with the ARIMA-defined interventions yields exactly the same selection for each series for both models. It was therefore decided to use only the ARIMA models for the intervention definition process. As already noted, the poor performance of UC as compared to ARIMA models in outlier detection can be explained by the fact that the local level model’s multiple noise terms are better able to absorb sudden changes in the series than the more fixed structure provided by ARIMA.

Marketing interventions have a large positive impact in the new users series, and typically also have a positive effect on both churn ones. Of course, campaigns with no effect (or even a negative one) in churn are precisely the most successful campaigns. Comparing the effect of the different new user acquisition campaigns in these series gives an idea of which ones were more effective at targeting potential players that will do more than just try the game. A limited number of marketing interventions were also estimated as significant for the conversion to PU series. Either they were linked with something (for example new content) that encouraged spending, or they were simply good at motivating spending in people who were already players. Marketing type interventions were also tried with one and two days delay in the conversion to PU series. The idea was to account for players newly acquired through these campaigns that could decide to purchase for the first time in the next days of play. No significant impact was estimated for any of these delayed marketing interventions, suggesting that this is not a frequent event, and that players newly acquired through these campaigns will either become PUs the same day they first log in, or will do so later on in the game if at all.

For example, most of the marketing interventions in Tables 1 and 2 have been chosen to have a comparable impact in the new users series (around a 6% increase except for that taking place in February 2017). The four of them also impact non-PU churn and two of them impact PU churn as well. Of these four, the most successful one would probably be that of May 2015: it has a much lower positive effect in non-PU churn and no effect in PU churn. The worst one would be that of March 2015: it has the highest impact on non-PU churn and also increases PU churn. Two of them also impact conversion to PU, and that of February 2015 even impacts purchase churn. The final marketing intervention included (February 2017) is a perfect example of a very good campaign: it actually decreases non-PU churn, so not only has it attracted new users, it also seems to keep all (new and old) players engaged.

Table 3

Monthly forecast MAE: mean and standard deviation (SD) for the ARIMA and local level models

Time series	ARIMA mean	ARIMA SD	Local level mean	Local level SD
New users	440.72	270.73	483.20	282.71
Conversion to PU	$4.6 \times 10^{-4}$	$2.5 \times 10^{-4}$	$5.1 \times 10^{-4}$	$2.5 \times 10^{-4}$
PU churn	$1.7 \times 10^{-3}$	$5.8 \times 10^{-4}$	$1.8 \times 10^{-3}$	$5.8 \times 10^{-4}$
Non-PU churn	$1.4 \times 10^{-2}$	$7.0 \times 10^{-3}$	$5.2 \times 10^{-2}$	$6.8 \times 10^{-3}$
Purchase churn	$3.3 \times 10^{-3}$	$1.9 \times 10^{-3}$	$3.3 \times 10^{-3}$	$1.6 \times 10^{-3}$

Table 4

Monthly forecast RMSE: mean and standard deviation (SD) for the ARIMA and local level models

Time series	ARIMA mean	ARIMA SD	Local level mean	Local level SD
New users	634.64	461.44	677.91	463.86
Conversion to PU	$7.1 \times 10^{-4}$	$5.0 \times 10^{-4}$	$8.2 \times 10^{-4}$	$5.2 \times 10^{-4}$
PU churn	$2.1 \times 10^{-3}$	$7.0 \times 10^{-4}$	$2.2 \times 10^{-3}$	$6.8 \times 10^{-4}$
Non-PU churn	$1.9 \times 10^{-2}$	$9.4 \times 10^{-3}$	$5.6 \times 10^{-2}$	$8.9 \times 10^{-3}$
Purchase churn	$5.1 \times 10^{-3}$	$3.7 \times 10^{-3}$	$4.9 \times 10^{-3}$	$3.0 \times 10^{-3}$

Figure 7.

Mean RMSE for all successive monthly forecasts for new users (top), conversion to PU (second row), PU churn (third row), non-PU churn (fourth row) and purchase churn (bottom). ARIMA is shown with a solid line and local level with a dashed line.

No promotion interventions or in-game covariates were found to be significant for the new users series as expected (only players would notice any of them). Many of the promotion interventions have also a noticeable impact in purchase churn (50 days later), but a lot of them do not, pointing at the promotions that made lasting conversions. For example, considering promotion interventions shown in Tables 1 and 2, the most successful one would have been that of September 2016 (highest impact in conversion and no effect in purchase churn), while the least successful one that of March 2015.

Only a handful of unknown interventions needed to be defined, most of them concerning purchase churn, which appears to be the series whose dynamics are least explained by the information available. Some of the interventions, though, do seem to point at some clear effect negatively affecting engagement: they negatively impact conversion to PU while increasing both churn series and purchase churn. These could correspond to buggy releases or server failures that annoyed players. This could be the case of the unknown intervention shown in Tables 1 and 2 for February 9, 2017, which increases churn and purchase churn while negatively affecting PU conversion.

With regard to the actual marketing and promotion campaign planning revealed by the intervention definition process, it suggests that a lot was happening on both sides during the first months of the game, which makes sense after a new launch. After that, important new player acquisition campaigns seem to have run approximately every second month, except for the summer months, which have campaigns on and off during the whole period. Promotion campaigns after the beginning of 2015 were shorter and sparser. This changed towards the end of 2016. Especially during 2017 there is an unprecedented length, intensity and frequency of promotion campaigns, with shorter and longer promotions constantly starting or ending. This also makes sense in that it seems to be an effort to maintain the total revenue generated by the game after it starts losing steam (in-game sales, DAUs and total playtime start decreasing in the second half of 2016, and these campaigns appear to manage to bring purchases back to their previous level).

The mean values and standard deviation of the monthly MAE and RMSE of forecasts can be found in Tables 3 and 4 respectively. Figure 7 shows the monthly values of RMSE for the daily forecasts produced by both models for each series. Note that the goal of these validation metrics is merely to compare both models and to assess how their performance varies depending on the month and on the amount and importance of the missing interventions in the modeling.

As Fig. 7 shows, forecast accuracy is comparable but tends to be slightly worse for the local level models than for ARIMA, except in the case of non-PU churn, where the local level model performs significantly worse. This could be related to the much higher signal-noise ratio (ratio between the local level and the irregular or residual variances) of this model as compared to that of the other series. Higher values mean that fewer lags of past observations are taken into account for the forecasting [38], which could make the correct level of the series harder to capture. High peaks in Fig. 7 correspond to months with many campaign days and/or high-impact campaigns, whose effect will be captured by interventions once that month becomes part of the training period.
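The link between the signal-noise ratio and the number of past observations that matter can be illustrated for the simplest case. The sketch assumes a pure local level model (no seasonal or regression terms) in steady state, whose one-step forecast is an exponentially weighted average of past observations; the fitted models in this work are richer, so this only indicates the direction of the effect. The function names and the 95% weight share are illustrative choices.

```python
import math

def steady_state_gain(q):
    """Steady-state Kalman gain of a pure local level model.

    q is the signal-noise ratio (level variance / irregular variance), q > 0.
    """
    return (math.sqrt(q * q + 4.0 * q) - q) / 2.0

def lags_for_weight(q, share=0.95):
    """Number of past observations needed to accumulate `share` of the weight."""
    k = steady_state_gain(q)
    # Weights are k * (1 - k) ** j, so the first m of them sum to 1 - (1 - k) ** m
    return math.ceil(math.log(1.0 - share) / math.log(1.0 - k))

for q in (0.01, 0.1, 1.0):
    print(q, round(steady_state_gain(q), 3), lags_for_weight(q))
```

As the ratio grows, the gain approaches one and the forecast leans almost entirely on the most recent observations, so a few atypical days can pull the estimated level away.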

In general these findings match those found by Harvey [38] and Andrews [45] (which offer discussions on the similarities and differences of ARIMA and UC, and a comparison of their performance). The results are typically very similar, while UC models are simpler and need much less human intervention. The difference for non-PU churn might arise from the daily nature of the series (as compared to the weekly and monthly series examined in the references noted above). This could account for a larger number of outliers (that are not smoothed out through average or aggregation), that could in turn degrade the forecasts significantly particularly for large signal-noise ratios. Studying how this type of modeling behaves depending on the time granularity used seems like an interesting path to explore.

7. Summary and conclusions

Time series modeling of conversion rates between different groups of players can provide insights into the effect of different in-game and external events, and a way of detecting relevant missing information. It is a particularly interesting approach for understanding how different elements of in-game dynamics affect different types of players in different ways.

Two different SSM approaches were considered and compared: ARIMA and UC. The latter is more robust and needs much less human intervention in regards to model definition, while still providing the same covariate selection and a very similar parameter estimation to that of its statistically more complex ARIMA counterpart. If the only or main interest is to classify in-game or external events in terms of their effects on the different conversion rates, in order to better understand game dynamics and improve game planning, a local level UC model would be the preferred option. If the focus is predictive accuracy, the picture varies slightly. While both models typically have very similar forecasting performance, ARIMA tends to do better in more cases, and can very drastically outperform local level models for some transitions (non-PU churn in this case). Finally, if the detection of missing campaigns or other pieces of missing relevant information is needed, the ARIMA models again do a much better job, providing a plausible scenario consistent across all transition rates.

Both models (and particularly UC) are not costly to re-estimate, so a plausible setup would be to combine the monthly or weekly forecasts for planning and resource allocation with daily updated estimations when a new data point is available (and possibly updated forecasts if these could be useful for updated planning). This would allow for early detection of anything unusual going on (e.g. server failures or buggy releases). This anomalous behaviour detection could be carried out using the residuals at the end of the estimated series (large absolute values or several successive values of the same sign) and/or large discrepancies between very short range forecasts and actual observed values. This process could be fully automatic, raising alarms only when unexpected deviations occur. It would also provide an immediate assessment of how well or badly in-game events are doing as compared to similar ones in the past.
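A possible form for such automatic alarms is sketched below. The two rules mirror the criteria just mentioned (large absolute residuals, or several successive residuals of the same sign); the function name, window length, threshold and run length are hypothetical and would need tuning for each series.

```python
from statistics import pstdev

def residual_alarms(residuals, window=7, z_threshold=3.0, run_length=5):
    """Flag anomalies in the last `window` residuals of an estimated series.

    Rule 1: a residual larger than `z_threshold` times the historical scale.
    Rule 2: at least `run_length` consecutive residuals with the same sign.
    """
    history, recent = residuals[:-window], residuals[-window:]
    scale = pstdev(history)
    large = [i for i, r in enumerate(recent) if abs(r) > z_threshold * scale]
    longest, current, prev_sign = 0, 0, 0
    for r in recent:
        sign = (r > 0) - (r < 0)
        if sign != 0 and sign == prev_sign:
            current += 1
        else:
            current = 1 if sign else 0
        prev_sign = sign
        longest = max(longest, current)
    return {"large": large, "same_sign_run": longest >= run_length}

# Toy residuals: a quiet history followed by a week with a spike and a long run
history = [0.5, -0.5] * 20
unusual_week = [0.2, 0.3, 0.1, 0.4, 0.2, 2.0, -0.1]
quiet_week = [0.2, -0.3, 0.1, -0.4, 0.2, -0.1, 0.3]
alarm = residual_alarms(history + unusual_week)
no_alarm = residual_alarms(history + quiet_week)
```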

Future extensions of this work will include increasingly complex player type landscapes, trying additional time series models to evaluate their performance, and assessing the impact of using the resulting forecasts in individual player behavior modeling.

8. Software used

All analysis and predictions were performed with Python 3.6.7, making use of the datetime, numpy [46], pandas [47] and statsmodels [48] libraries. Plots were produced using the matplotlib [49] library.

References

1. Fields, Mobile and Social Game Design: Monetization Methods and Mechanics, 2nd edn, CRC Press, 2014, 2–64.

2. Guitart, del Río A.F. and Periáñez Á., Understanding Player Engagement and In-Game Purchasing Behavior with Ensemble Learning, in: GAME-ON’2019 AI and Simulation in Games, Grigg, ed., eurosis, 2019, pp. 78–85. ISBN 978-94-92859-08-02.

3. Periáñez Á., Saas, Guitart and Magne, Churn Prediction in Mobile Social Games: Towards a Complete Assessment Using Ensembles, in: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2016, pp. 564–573. doi: 10.1109/DSAA.2016.84.

4. Bertens, Guitart and Periáñez Á., Games and Big Data: A Scalable Multi-Dimensional Churn Prediction Model, in: 2017 IEEE Conference on Computational Intelligence and Games (CIG), 2017, pp. 33–36. doi: 10.1109/CIG.2017.8080412.

5. Kim K.-J., Yoon, Jeon, Yang S.-i., Lee S.-K., Lee, Jang, Kim D.-W., Chen P.P., Guitart, Bertens, Periáñez Á., Hadiji, Müller, Joo, Lee and Hwang, Game Data Mining Competition on Churn Prediction and Survival Analysis using Commercial Game Log Data, IEEE Transactions on Games (2018), 1–1.

6. Chen P.P., Guitart and Periáñez Á., The winning solution to the IEEE CIG 2017 game data mining competition, Machine Learning and Knowledge Extraction 1(1) (2019), 252–264.

7. Guitart, Tan S.H., del Río A.F., Chen P.P. and Periáñez Á., From Non-Paying to Premium: Predicting User Conversion in Video Games with Ensemble Learning, in: FDG ’19: Proceedings of the 14th International Conference on Foundations of Digital Games, ACM, 2019, pp. 1–9. doi: 10.1145/3337722.3341855.

8. Guitart, Chen P.P., Bertens and Periáñez Á., Forecasting Player Behavioral Data and Simulating in-Game Events, in: 2018 IEEE Conference on Future of Information and Communication Conference (FICC), 2018.

9. Cricenti A.L., Branch P.A. and Armitage G.J., Time-series modelling of server to client IP packet length in first person shooter games, in: 2007 15th IEEE International Conference on Networks, IEEE, 2007, pp. 507–512.

10. Drachen, Sifa, Bauckhage and Thurau, Guns, swords and data: Clustering of player behavior in computer games in the wild, in: 2012 IEEE conference on Computational Intelligence and Games (CIG), IEEE, 2012, pp. 163–170.

11. Saas, Guitart and Periáñez Á., Discovering playing patterns: Time series clustering of free-to-play game data, in: Computational Intelligence and Games (CIG), 2016 IEEE Conference on, 2016, pp. 1–8.

12. Fernández del Río, Chen P.P. and Periáñez Á., Profiling Players with Engagement Predictions, in: 2019 IEEE Conference on Games (CoG), IEEE, 2019, pp. 1–4. doi: 10.1109/CIG.2019.8848074.

13. Sifa, Runge, Bauckhage and Klapper, Customer Lifetime Value Prediction in Non-Contractual Freemium Settings: Chasing High-Value Users Using Deep Neural Networks and SMOTE, in: Proceedings of the 51st Hawaii International Conference on System Sciences, 2018, pp. 923–932. doi: 10.24251/HICSS.2018.115.

14. Drachen, Pastor, Liu, Fontaine D.J., Chang, Runge, Sifa and Klabjan, To Be or Not to Be …Social: Incorporating Simple Social Features in Mobile Game Customer Lifetime Value Predictions, in: Proceedings of the Australasian Computer Science Week Multiconference (ACSW), 2018, Article no. 40. doi: 10.1145/3167918.3167925.

15. Chen P.P., Guitart, Periáñez Á. and Fernández del Río, Customer Lifetime Value in Video Games Using Deep Learning and Parametric Models, in: IEEE International Conference on Big Data, 2018, pp. 2134–2140.

16. Sifa, Hadiji, Runge, Drachen, Kersting and Bauckhage, Predicting purchase decisions in mobile free-to-play games, in: Proceedings of the Eleventh AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-15), AAAI, 2015, pp. 79–85. https://www.aaai.org/ocs/index.php/AIIDE/AIIDE15/paper/view/11544.

17. Bertens, Guitart, Chen P.P. and Periáñez Á., A Machine-Learning Item Recommendation System for Video Games, in: 2018 IEEE Conference on Computational Intelligence and Games (CIG), 2018, pp. 1–4.

18. Chen, Yuan and Shu, Forecasting crime using the ARIMA model, 2008, 627–630. doi: 10.1109/FSKD.2008.222.

19. Permatasari C.I., Sutopo and Hisjam, Sales forecasting newspaper with ARIMA: A case study, 2018. doi: 10.1063/1.5024076.

20. Shakti, Hassan, Zhenning, Caytiles and Iyenger N.C.S.N., Annual automobile sales prediction using ARIMA model, International Journal of Hybrid Information Technology 10 (2017), 13–22. doi: 10.14257/ijhit.2017.10.6.02.

21. Adebiyi, Adewumi and Ayo, Stock price prediction using the ARIMA model, 2014. doi: 10.1109/UKSim.2014.67.

22. Mondal, Shit and Goswami, Study of effectiveness of time series modeling (ARIMA) in forecasting stock prices, International Journal of Computer Science, Engineering and Applications 4(2) (2014). doi: 10.5121/ijcsea.2014.4202.

23. Contreras, Espinola, Nogales F.J. and Conejo A.J., ARIMA models to predict next-day electricity prices, IEEE Transactions on Power Systems 18(3) (2003), 1014–1020. doi: 10.1109/TPWRS.2002.804943.

24. Zhang, Young A.A. and Li, Applications and comparisons of four time series models in epidemiological surveillance data, PLOS ONE 9(2) (2014), 1–16. doi: 10.1371/journal.pone.0088075.

25. C.C. and Yee, Time Series Analysis and Forecasting of Dengue Using Open Data, in: Advances in Visual Informatics, B.Z.H. et al., ed., Springer, 2015. doi: 10.1007/978-3-319-25939-0_5.

26. Zeng, Huang, Wang, Zhang, Tang and Zhou, Time series analysis of temporal trends in the pertussis incidence in mainland China from 2005 to 2016, Nature Scientific Reports 6 (2016). doi: 10.1038/srep32367.

27. Bjørnstad O.N. and Grenfell B.T., Noisy clockwork: time series analysis of population fluctuations in animals, Science 293(5530) (2001), 638–643. doi: 10.1126/science.1062226. https://science.sciencemag.org/content/293/5530/638.

28. Tolimieri, Holmes E.E., Williams G.D., Pacunski and Lowry, Population assessment using multivariate time-series analysis: a case study of rockfishes in Puget Sound, Ecology and Evolution 7 (2017). doi: 10.1002/ece3.2901.

29. Elghafghug, Vanderstichel, St-Hilaire and Stryhn, Using state-space models to predict the abundance of juvenile and adult sea lice on Atlantic salmon, Epidemics 24 (2018), 76–87. doi: 10.1016/j.epidem.2018.04.002.

30. Tektas, Weather forecasting using ANFIS and ARIMA MODELS, Environmental Research, Engineering and Management 51 (2010). doi: 10.5755/j01.erem.51.1.58.

31. Faruk D.O., An hybrid neural network and ARIMA model for water quality time series prediction, Engineering Applications of Artificial Intelligence 23 (2010), 586–594. doi: 10.1016/j.engappai.2009.09.015.

32. Box G.E.P. and Jenkins G.M., Time Series Analysis: Forecasting and Control, Holden-Day, 1976.

33. Brockwell and Davis R.A., An Introduction to Time Series and Forecasting, Vol. 39, Springer, 2002. doi: 10.1007/978-1-4757-2526-1.

34. Hamilton J.D., Time Series Analysis, 1st edn, Princeton University Press, 1994. ISBN 0691042896. http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0691042896.

35. Shumway R.H. and Stoffer D.S., Time Series Analysis and Its Applications: With R Examples, Springer Texts in Statistics, Springer New York, 2010. ISBN 9781441978646. https://books.google.co.jp/books?id=dbS5IQ8P5gYC.

36. Kalman R.E., A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82(1) (1960), 35–45.

37. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, 1991. https://EconPapers.repec.org/RePEc:cup:cbooks:9780521405737.

38. Harvey, Forecasting with Unobserved Components Time Series Models, in: Handbook of Economic Forecasting, Vol. 1, 1st edn, Elliott, Granger and Timmermann, eds, Elsevier, 2006, pp. 327–412, Chapter 07. https://EconPapers.repec.org/RePEc:eee:ecofch:1-07.

39. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control 19(6) (1974), 716–723.

40. Schwarz, Estimating the dimension of a model, The Annals of Statistics 6(2) (1978), 461–464. doi: 10.1214/aos/1176344136. http://projecteuclid.org/euclid.aos/1176344136.

41. Hannan and Quinn, The determination of the order of an autoregression, Journal of the Royal Statistical Society. Series B: Statistical Methodology 41(2) (1979), 190–195.

42. Ljung G.M. and Box G.E.P., On a measure of lack of fit in time series models, Biometrika 65(2) (1978), 297–303. doi: 10.1093/biomet/65.2.297.

43. Jarque C.M. and Bera A.K., Efficient tests for normality, homoscedasticity and serial independence of regression residuals, Economics Letters 6(3) (1980), 255–259. https://EconPapers.repec.org/RePEc:eee:ecolet:v:6:y:1980:i:3:p:255-259.

44. Kreyszig and Norminton E.J., Advanced Engineering Mathematics, Tenth edn, Wiley, Hoboken, NJ, 2011. ISBN 0470458364.

45. Andrews R.L., Forecasting performance of structural time series models, Journal of Business and Economic Statistics 12(1) (1994), 129–133. https://www.jstor.org/stable/1391929.

46. Oliphant, NumPy: A guide to NumPy, 2006. http://www.numpy.org/.

47. McKinney, Data Structures for Statistical Computing in Python, in: Proceedings of the 9th Python in Science Conference, van der Walt and Millman, eds, 2010, pp. 51–56.

48. Seabold and Perktold, Statsmodels: Econometric and statistical modeling with python, in: 9th Python in Science Conference, 2010.

49. Hunter J.D., Matplotlib: a 2D graphics environment, Computing In Science & Engineering 9(3) (2007), 90–95.

50. Herniter, A probabilistic market model of purchase timing and brand selection, Management Science 18(4–part–ii) (1971), P102–P113.

51. Batislam E.P., Denizel and Filiztekin, Empirical validation and comparison of models for customer base analysis, International Journal of Research in Marketing 24(3) (2007), 201–209.

52. Platzer and Reutterer, Ticking away the moments: timing regularity helps to better predict customer activity, Marketing Science 35(5) (2016), 779–799.

53. Platzer, Customer Base Analysis with BTYDplus, 2016, Available at https://rdrr.io/cran/BTYDplus/f/inst/doc/BTYDplus-HowTo.pdf.

54. Wheat R.D. and Morrison D.G., Estimating purchase regularity with two interpurchase times, Journal of Marketing Research 27(1) (1990), 87–93.

55. Chatfield and Goodhardt G.J., A consumer purchasing model with erlang inter-purchase times, Journal of the American Statistical Association 68(344) (1973), 828–835.

56. Gupta, Stochastic models of interpurchase time with time-dependent covariates, Journal of Marketing Research 28(1) (1991), 1–15.

57. Cox D.R., Regression models and life-tables, Journal of the Royal Statistical Society, Series B (Methodological) 34(2) (1972), 187–220.

58. Bengio, Learning deep architectures for AI, Foundations and Trends® in Machine Learning 2(1) (2009), 1–127.

59. Clark T.G., Bradburn M.J., Love S.B. and Altman D.G., Survival analysis part i: basic concepts and first analyses, British Journal of Cancer 89(2) (2003), 232–238. doi: 10.1038/sj.bjc.6601118.

60. Shaw and Stone, Database Marketing Strategy and Implementation, John Wiley & Sons, Inc., 1991.

61. McCarty J.A. and Hastak, Segmentation approaches in data-mining: a comparison of RFM, CHAID, and logistic regression, Journal of Business Research 60(6) (2007), 656–662. doi: 10.1016/j.jbusres.2006.06.015.

62. Stone and Shaw, Database marketing for competitive advantage, Long Range Planning 20(2) (1987), 12–20.

63. Hoekstra J.C. and Huizingh E.K., The lifetime value concept in customer-based marketing, Journal of Market-Focused Management 3(3–4) (1999), 257–274. doi: 10.1023/A:1009842805871.

64. Shaw and Stone, Database Marketing, Gower, 1988.

65. Farris P.W., Bendle N.T., Pfeifer P.E. and Reibstein D.J., Marketing Metrics: The Definitive Guide to Measuring Marketing Performance, 2nd edn, Pearson Education, Upper Saddle River, New Jersey, 2010.

66. Fader P.S. and Hardie B.G.S., A note on deriving the Pareto/NBD model and related expressions, 2005, Available at http://www.brucehardie.com/notes/009/pareto_nbd_derivations_2005-11-05.pdf.

67. Fader P.S., Hardie B.G.S. and Lee K.L., RFM and CLV: using iso-value curves for customer base analysis, Journal of Marketing Research 42(4) (2005), 415–430. doi: 10.1509/jmkr.2005.42.4.415.

68. Pfeifer P.E., Haskins M.E. and Conroy R.M., Customer lifetime value, customer profitability, and the treatment of acquisition spending, Journal of Managerial Issues 17(1) (2005), 11–25. https://www.jstor.org/stable/40604472.

69. Hothorn, Hornik and Zeileis, Unbiased recursive partitioning: a conditional inference framework, Journal of Computational and Graphical Statistics 15(3) (2006), 651–674. doi: 10.1198/106186006X133933.

70. Hothorn, Bühlmann, Dudoit, Molinaro and Van Der Laan M.J., Survival ensembles, Biostatistics 7(3) (2005), 355–373.

71. Kaplan E.L. and Meier, Nonparametric estimation from incomplete observations, Journal of the American Statistical Association 53(282) (1958), 457–481.

72. Zeileis, Hothorn and Hornik, Model-based recursive partitioning, Journal of Computational and Graphical Statistics 17(2) (2008), 492–514.

73. Mogensen U.B., Ishwaran and Gerds T.A., Evaluating random forests for survival analysis using prediction error curves, Journal of Statistical Software 50(11) (2012), 1.

74. Dwyer F.R., Customer lifetime valuation to support marketing decision making, Journal of Direct Marketing 11(4) (1997), 6–13. doi: 10.1002/(SICI)1522-7138(199723)11:4<6::AID-DIR3>3.0.CO;2-T.

75. Berger P.D. and Nasr N.I., Customer lifetime value: marketing models and applications, Journal of Interactive Marketing 12(1) (1998), 17–30. doi: 10.1002/(SICI)1520-6653(199824)12:1<17::AID-DIR3>3.0.CO;2-K.

76. Hothorn, Hornik, Strobl and Zeileis, Party: A laboratory for recursive partytioning, 2010.

77. Hothorn, Hornik, Strobl, Zeileis and Hothorn M.T., Package ‘party’, Package Reference Manual for Party Version 0.9–998 16 (2015), 37.

78. Therneau T.M. and Lumley, Package ‘survival’, Verze, 2015.

79. Sing, Sander, Beerenwinkel and Lengauer, ROCR: visualizing classifier performance in R, Bioinformatics 21(20) (2005), 3940–3941.

80. Robnik-Sikonja and Savicky, CORElearn – Classification, Regression, Feature Evaluation and Ordinal Evaluation, The R Project for Statistical Computing, 2012.

81. Robnik-Sikonja, Savicky and Robnik-Sikonja M.M., Package ‘CORElearn’, 2013.

82. Alfons, cvTools: cross-validation tools for regression models, R Package Version 0.3 2(5) (2012).

83. Gerds, pec: Prediction error curves for risk prediction models in survival analysis, R package version 2.4–4, 2014.

84. Kuhn, Wing, Weston, Williams, Keefer, Engelhardt, Cooper, Mayer, Kenkel, Team et al., caret: Classification and regression training, R package version 6.0–21, CRAN: Wien, Austria, 2014.

85. Wang and Reddy C.K., Machine learning for survival analysis: A survey, arXiv preprint arXiv:1708.04649, 2017.

86. Chamberlain B.P., Cardoso, Liu C.H.B., Pagliari and Deisenroth M.P., Customer Lifetime Value Prediction Using Embeddings, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017, pp. 1753–1762. doi: 10.1145/3097983.3098123.

87. Schmittlein D.C., Morrison D.G. and Colombo, Counting your customers: who are they and what will they do next, Management Science 33(1) (1987), 1–24. doi: 10.1287/mnsc.33.1.1.

88. Dziurzynski, Wadsworth and McCarthy, BTYD: implementing buy ’til you die models, URL http://CRAN.R-project.org/package=BTYD. R package version 2 (2014).

89. Wadsworth, Buy’Til You Die-A Walkthrough, 2012.

90. Delignette-Muller M.L., Dutang et al., Fitdistrplus: an R package for fitting distributions, Journal of Statistical Software 64(4) (2015), 1–34.

91. Fleming T.R. and Lin, Survival analysis in clinical trials: past developments and future directions, Biometrics 56(4) (2000), 971–983.

92. Hougaard, Fundamentals of survival data, Biometrics 55(1) (1999), 13–22.

93. and Ma, Survival analysis in medicine and genetics, CRC Press, 2013.

94. Predicting customer churn in the telecommunications industry – An application of survival analysis modeling using SAS, SAS User Group International (SUGI27) Online Proceedings, 2002, 114–127.

95. Stepanova and Thomas, Survival analysis methods for personal loan data, Operations Research 50(2) (2002), 277–289.

96. El-Nasr M.S., Drachen and Canossa, Game analytics, Springer, 2016.

97. Luton, Free2Play: Making Money from Games You Give Away, New Riders, San Francisco, California, 2013.

98. Davidovici-Nora, Innovation in business models in the video game industry: free-to-play or the gaming experience as a service, The Computer Games Journal 2(3) (2013), 22–51. doi: 10.1007/BF03392349.

99. Voigt and Hinz, Making digital freemium business models a success: predicting customers’ lifetime value via initial purchase information, Business & Information Systems Engineering 58(2) (2016), 107–118. doi: 10.1007/s12599-015-0395-z.

100. Chawla N.V., Bowyer K.W., Hall L.O. and Kegelmeyer W.P., SMOTE: synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16 (2002), 321–357. doi: 10.1613/jair.953.

101. Kumar, Ramani and Bohling, Customer lifetime value approaches and best practice applications, Journal of Interactive Marketing 18(3) (2004), 60–72. doi: 10.1002/dir.20014.

102. Chen and Guestrin, XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016, pp. 785–794. doi: 10.1145/2939672.2939785.

103. Dietterich T.G., Ensemble methods in machine learning, in: International Workshop on Multiple Classifier Systems, Springer, 2000, pp. 1–15.

104. Friedman J.H., Greedy function approximation: a gradient boosting machine, Annals of Statistics, 2001, 1189–1232.

105. Meyer, Dimitriadou, Hornik, Weingessel and Leisch, e1071: Misc Functions of the Department of Statistics (e1071), TU Wien. R package version 1.6–3, Retrieved from, 2014.

106. Simm, de Abril I.M. and Sugiyama, Tree-Based Ensemble Multi-Task Learning Method for Classification and Regression, IEICE Transactions on Information and Systems 97(6) (2014), 1677–1681. http://CRAN.R-project.org/package=extraTrees.

107. Drucker, Burges C.J., Kaufman, Smola A.J. and Vapnik, Support vector regression machines, in: Advances in Neural Information Processing Systems 9 (NIPS 1996), 1997, pp. 155–161. http://papers.nips.cc/paper/1238-support-vector-regression-machines.pdf.

108. Tkachenko, Autonomous CRM control via CLV approximation with deep reinforcement learning in discrete and continuous action space, 2015.

109. Kim, Choi, Lee and Rhee, Churn prediction of mobile and online casual games using play log data, PLoS ONE 12(7) (2017), e0180735. doi: 10.1371/journal.pone.0180735.

110. Schmidhuber, Deep learning in neural networks: an overview, Neural Networks 61 (2015), 85–117. doi: 10.1016/j.neunet.2014.09.003.

111. Prechelt, Early stopping – But when? in: Neural Networks: Tricks of the Trade, Orr G.B. and Müller K.-R., eds, Lecture Notes in Computer Science, Springer Berlin Heidelberg, 1998, pp. 55–69. doi: 10.1007/3-540-49430-8_3.

112. Bradburn M.J., Clark T.G., Love S.B. and Altman D.G., Survival analysis part II: multivariate data analysis – an introduction to concepts and methods, British Journal of Cancer 89(3) (2003), 431–436. doi: 10.1038/sj.bjc.6601119.

113. Demediuk, Murrin, Bulger, Hitchens, Drachen, Raffe W.L. and Tamassia, Player retention in league of legends: a study using survival analysis, in: Proceedings of the Australasian Computer Science Week Multiconference (ACSW), 2018, Article no. 43. doi: 10.1145/3167918.3167937.

114. Guo and Huang, An extended support vector machine forecasting framework for customer churn in e-commerce, Expert Systems with Applications 38(3) (2011), 1425–1430. doi: 10.1016/j.eswa.2010.07.049.

115. Tran V.T., Pham H.T., Yang B.-S. and Nguyen T.T., Machine performance degradation assessment and remaining useful life prediction using proportional hazard model and support vector machine, Mechanical Systems and Signal Processing 32 (2012), 320–330. doi: 10.1016/j.ymssp.2012.02.015.

116. Tsymbalov, Churn Prediction for Game Industry Based on Cohort Classification Ensemble, in: Proceedings of the Third Workshop on Experimental Economics and Machine Learning, 2016, pp. 94–100. http://ceur-ws.org/Vol-1627/paper8.pdf.

117. Glorot and Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2010, pp. 249–256. http://proceedings.mlr.press/v9/glorot10a.html.

118. Kingma D.P. and Ba, Adam: A method for stochastic optimization, in: Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), 2015, arXiv:1412.6980.

119. Hyndman R.J. and Koehler A.B., Another look at measures of forecast accuracy, International Journal of Forecasting 22(4) (2006), 679–688. doi: 10.1016/j.ijforecast.2006.03.001.

120. Willmott C.J. and Matsuura, On the use of dimensioned measures of error to evaluate the performance of spatial interpolators, International Journal of Geographical Information Science 20(1) (2006), 89–102.

121. Armstrong J.S., Long-range forecasting, Wiley, New York, 1985.

122. Glorot, Bordes and Bengio, Deep sparse rectifier neural networks, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2011, pp. 315–323.

123. Scherer, Müller and Behnke, Evaluation of pooling operations in convolutional architectures for object recognition, in: Artificial Neural Networks – ICANN 2010, Lecture Notes in Computer Science, Springer, 2010, pp. 92–101.

124. LeCun and Bengio, Convolutional networks for images, speech, and time series, in: The Handbook of Brain Theory and Neural Networks, Arbib M.A., ed., MIT Press, 1995, pp. 276–279.

125. Turaga S.C., Murray J.F., Jain, Roth, Helmstaedter, Briggman, Denk and Seung H.S., Convolutional networks can learn to generate affinity graphs for image segmentation, Neural Computation 22(2) (2010), 511–538.

126. LeCun, Boser, Denker J.S., Henderson, Howard R.E., Hubbard and Jackel L.D., Backpropagation applied to handwritten zip code recognition, Neural Computation 1(4) (1989), 541–551.

127. Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.

128. Babu G.S., Zhao and Li X.-L., Deep Convolutional Neural Network Based Regression Approach for Estimation of Remaining Useful Life, in: Database Systems for Advanced Applications. (DASFAA), Lecture Notes in Computer Science, 2016, pp. 214–228.

129. Tsantekidis, Passalis, Tefas, Kanniainen, Gabbouj and Iosifidis, Forecasting stock prices from the limit order book using convolutional neural networks, in: 2017 IEEE 19th Conference on Business Informatics (CBI), Vol. 1, IEEE, 2017, pp. 7–12.

130. Yang J.B., Nguyen M.N., San P.P., X.L. and Krishnaswamy, Deep Convolutional Neural Networks on Multichannel Time Series for Human Activity Recognition, in: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), 2015, pp. 3995–4001.

131. Hochreiter and Schmidhuber, Long Short-Term Memory, Neural Computation 9(8) (1997), 1735–1780.

132. Patil, Huard and Fonnesbeck C.J., PyMC: bayesian stochastic modelling in python, Journal of Statistical Software 35(4) (2011), 1–81.

133. Koller and Friedman, Probabilistic graphical models: principles and techniques, MIT press, 2009.

¹ Residuals refer to the unexplained part of the series after modeling. They should correspond to the random noise term described in Section 3.

Table 1. ARIMA estimates for a selection of parameters for the different series