Abstract
While it is relatively easy to start an online advertising campaign, obtaining a high Key Performance Indicator (KPI) can be challenging. A large body of work on this subject has already been performed and platforms known as DSPs are available on the market that deal with such an optimization. From the advertiser’s point of view, each DSP is a different black box, with its pros and cons, that needs to be configured. In order to take advantage of the pros of every DSP, advertisers are well-advised to use a combination of them when setting up their campaigns. In this paper, we propose an algorithm for advertisers to add an optimization layer on top of DSPs. The algorithm we introduce, called SKOTT, maximizes the chosen KPI by optimally configuring the DSPs and putting them in competition with each other. SKOTT is a highly specialized iterative algorithm loosely based on gradient descent that is made up of three independent sub-routines, each dealing with a different problem: partitioning the budget, setting the desired average bid, and preventing under-delivery. In particular, one of the novelties of our approch lies in our taking the perspective of the advertisers rather than the DSPs. Synthetic market data is used to evaluate the efficiency of SKOTT against other state-of-the-art approaches adapted from similar problems. The results illustrate the benefits of our proposals, which greatly outperforms the other methods.
Keywords
Introduction
Online advertising is a vast market, worth several billion USD per year [11]. It is easy to understand the importance of optimization in such a market: every percent of increase in efficiency has a value on the order of millions of USD. From the advertiser’s point of view, however, optimization is often very difficult due to the constraints imposed by the structure of the ecosystem of online advertising. Let us see why.
There are currently two main paradigms in the market: sponsored search and Real-Time Bidding (RTB) auctions. Sponsored search, the main source of revenues for search engines, consists of showing relevant ads whenever a user inputs a query. For example, typing “football shoes” in the search bar will provide the user with several links to online shops selling sports apparel. In order to appear amongst the sponsored links, an advertiser must place a bid on the queries or keywords to which it wants to be connected.
Typically, advertisers pay if and only if their sponsored link is clicked on; this creates a collaborative effort between search engine and advertiser to show the most promising ad. Sponsored search has been the subject of lots of research papers over the years, dealing with topics like budget optimization [8, 12] and click-through rate prediction [21, 28], for example.
The RTB paradigm, which is the focus of our optimization work, is inherently different. In the case, advertisers participate in auctions to buy the available ad space on a website. The winning advertiser is allowed to display its ad (in the technical jargon, it has bought an “impression”). Unlike sponsored search, advertisers pay for all impressions they buy, even if they do not “generate a click”, i.e., users do not click on them to be redirected to the advertiser’s page. This changes everything as it removes any interest for the auctioneer to find an ad that is a good match to the current inventory, a task that is now completely left to the advertisers. As a consequence of these differences, the optimization results obtained in sponsored search are not directly applicable to RTB campaigns. As we already mentioned, this optimization process is made difficult by the very structure of the market that, in its simplest approximation, looks as follows (cf. Fig. 1).
A simple, approximate representation of the RTB auctions market structure. The proposed algorithm, SKOTT, would work as an interface between advertisers and DSPs.
The central entity is the AdExchange, whose job is to run real-time bidding auctions and assign all available inventory (i.e., the screen space on which the advertising should be published) to its corresponding winning advertiser; On one side of AdExchanges there are Supply Side Platforms (SSPs), which provide the inventory and are in direct contact with the publishers (e.g., the owner of a web page); On the other side, on which we focus, are Demand Side Platforms (DSPs). DSPs bid on the available inventory on behalf of the advertisers according to the advertisers’ necessities.
Each step of this chain brings its own constraints and optimizations. For example, a DSP might work only with certain AdExchanges, effectively limiting the amount of inventory available to the advertiser, but it might also offer better performances on some particular indicators due to internal optimization algorithms.
It is easy for an advertiser to set up a campaign with a DSP and monitor its effectiveness by calculating certain Key Performance Indicators (KPIs). On the flip side, the use of a DSP prevents the advertiser from directly determining the bid on each individual auction, only allowing it to fix some average parameters to associate with larger sets of impressions. We call each of these abstract entities made of a set of impressions and its corresponding parameters a media object.
Each media object can almost be treated as a separate entity with its own associated budget that can change over time. The only global constraint is that the total budget of the campaign is fixed. A single media object makes the optimization easy to handle but inefficient because it treats all impressions in the same way, bidding roughly the same amount of money for all impressions and showing the same advertising to all users. A correct choice of the parameters of the media objects can therefore have a huge impact on the global optimization of the campaign.
But this setup leaves advertisers with many questions: What is the most appropriate DSP to use amongst the many available on the market? How to parametrize it? How much budget to assign to each of its media objects? These questions are often answered by human experts that base their decisions on past experience, intuition, and some off-line data analysis. However, there is no guarantee that the goals set by the experts are reachable, nor that they are optimal.
In recent decades, researchers have focused mainly on market models [10, 18, 26] and bidding algorithms for DSPs [5, 27, 9, 16, 20]. However, to the best of our knowledge, no paper tackles the optimal management of a campaign from the point of view of an advertiser using a DSP. The new constraints that come with such a perspective make other optimization algorithms studied in the literature difficult to compare with ours. For example, our need to partition the budget arises from the necessity to work with media objects: when an impression arrives that is a good fit to a particular media object, we need to be sure that the media object has a sufficient budget to spend. This problem does not exist when one can decide how much money to spend on a single impression basis, with the only constraint of the total campaign budget.
The main contribution of this paper is the a new algorithm, the SKOTT algorithm, that solves this understudied problem: SKOTT automatically handles advertising campaigns, finding the best parameters to put inside each DSP in order to maximize the performance. It also allows the contemporary use of different DSPs that are put in competition to further increase the performance. Therefore, not only does it give a recipe to fully take advantage of any single DSP, but it also adds a new layer of optimization on top of them. The algorithm reacts quickly to market variations and scales linearly with respect to the number of media objects.
The remainder of this paper is organized as follows. In Section 2 we discuss a few algorithms that deal with a similar problem and have been an inspiration for our work. In Section 3 we state the problem at hand and our goals. Sections 4.1–4.3 deal with the three independent sub-routines that compose SKOTT. The conducted experiments and results obtained are discussed in Section 5. Conclusions are drawn in Section 6. Finally, appendices give details on the creation of the synthetic market data, the algorithms we chose for comparison of the results, and the technical part of the implementation.
The problem of how to spend a budget in order to maximize the profit during an advertising campaign has never been studied, to the best of our knowledge, from the point of view of an advertiser using DSPs. Nevertheless, there are many works in the literature that are relevant because they try to solve a similar problem. We will cite and discuss some of them in this section, mainly to highlight what is the difference in our approach.
In [8] the authors propose to use a randomized uniform strategy for choosing how much to bid on every keyword in a sponsored search. That could be applied to our problem of Real-Time Bidding auctions through the use of DSPs problem by analogy, associating every keyword with a different media object. (We refer to the introduction and to the problem statement for the definition of media objects.) However, their model assumes a complete knowledge of the bidding landscape, that is, the probability distribution of the winning bids for each impression. This is information that advertisers don’t have in the case of RTB auctions through DSPs. Also, the model in [8] requires the bidding landscape to be static, a hypothesis that we don’t require.
The problem of collecting the largest possible reward of an advertising campaign with a constraint on the budget can be also written in the linear programming formalism. This is done, for example, in [5]. There, however, the authors: 1. take the point of view of a DSP, and 2. try to optimize the revenue of publishers and not of advertisers. In particular, they assign each piece of inventory to a different advertising campaign assuming that the total Cost-Per-Click (CPC) of each campaign is fixed. This last point is clearly not the case when we look at it from the advertiser’s side because, as we said, we consider the case where advertisers pay for the inventory they buy regardless of it being clicked on. This implies that the CPC is then determined by the ratio between the average of Cost-Per-Impression (CPM) and the Click-Through Rate (CTR) of a media object, therefore it’s not constant, as we will thoroughly see when we analyze the SKOTT algorithm.
A similar work, explicitly based on [5], is [16], where a linear programming algorithm is proposed for DSPs wanting to maximize their revenues. Besides the different perspective, the authors assume a fixed CTR, known in advance. This is not required in our approach, where we infer the CTR from market data in real time and the only assumption we make on its analytical form is that it varies slowly with the bid. A linear programming approach inspired by these works will be tested against SKOTT in the simulations.
A third approach to the problem is via reinforcement learning. In this case, the bidder is considered as an agent that learns how much to bid for every individual auction. To direct its choice, it is aware of the budget constraints, the goals of the campaign, and all context information from the impression it is trying to buy. It is studied for example in [4] where, due to the huge size of the space of possible actions to take, the authors help the decision process by using a model-based approach. This approach is extremely interesting but ineffective in our case for two main reasons. First of all, the constraints under which we work prevent us from accessing individual auctions and, secondly, we obtain information not in real time but in batches that come in at larger time scales (roughly an hour). Nevertheless, an approach inspired by reinforcement learning is tested against SKOTT in the simulations: A multi-armed bandit algorithm [1, 3] is used to allocate the budget to the different media objects according to their results.
Problem statement
As stated in the introduction, advertisers can participate in RTB auctions by using a DSP. Typically, several DSPs are employed in order to increase the amount of people reached and to better respond to business necessities. Each DSP needs to be configured. The details of the configuration might differ from DSP to DSP, but there is a central core of abstraction that is common to all of them. We call it a media object: it is a set of instructions given by the advertisers, some of which are qualitative and set once and for all at instantiation while others are quantitative and can be changed at any moment. An example of the former is the creative associated to the media object, i.e., the actual advertising being shown.
Another example of qualitative instructions are the filters on incoming auctions that select on which impressions to place a bid depending on the user and inventory characteristics. The hourly budget and the base bid for the auctions, alternatively, belong to the latter. It is important to note that the media object is the most precise layer of abstraction that is accessible to advertisers. The only influence that they can have on the auctions, and therefore their only possibility for optimization, lies in the parameters of the media objects.
We consider an advertising campaign to be defined by: a total budget
During the campaign lifetime, the advertiser receives information on the behavior of every media object from the different DSPs. Each data point contains hourly information on the impressions bought, the clicks generated, the money spent, and possibly other such quantities. Since advertisers don’t have access to the auctions individually but only through the media objects, they should only consider the average effect of their optimizations over all impressions. Therefore, for each media object we take an hourly average of the information received. In practice, we consider only the Click-Through Rate (CTR, the ratio of clicks generated to impressions bought) and Cost-Per-Click (CPC, the ratio of money spent to clicks generated).
The main goal of our algorithm is to change the media objects parameters in such a way as to optimize a certain KPI while keeping the desired delivery over time. In order to demonstrate a practical case, we have chosen to optimize the total number of clicks generated in the campaign. This is often a valid indicator to optimize because a user that is interested to purchase something from the advertiser’s website will probably click directly on the ads, while only a small fraction of the people that click on the ads will actually make a purchase. The click is then correlated to the monetary return of the campaign while not being as rare as an actual purchase.
The design choices: SKOTT
SKOTT is an iterative algorithm made up of three subroutines: budget partitioning which rewards high-quality media objects by giving them more money; base bid setting which controls the bid of each media object separately with the goal of increasing the media object’s quality; and pacing control which prevents under-delivery. Figure 2 provides a schematic view of the different steps of the algorithm. The three sub-routines act independently one from another and can therefore be analyzed in any order.
A schematic view of the SKOTT algorithm.
In this section we deal with the budget partitioning that defines which percentage of the hourly budget should be allocated to each media object.
A budget partition is a vector of
The ideal algorithm for budget partitioning should:
return a list of non-negative weights that sum to one at every decision epoch optimize a specific KPI (in our case the total number of clicks), promptly react to changes in the market, be they sudden or slow.
A very important point to consider when devising the algorithm is that the data must be bought through winning auctions. Reducing the budget of a media object well below the expected CPC will result in no clicks being bought, thereby gaining little to no useful information to estimate the quality of the media object. An advertiser may spend some time and money to explore the market randomly then concentrate their money on the best performing media objects. This would likely lead to an increase in the return of the campaign on average, but the price to pay is a high risk of getting stuck on a sub-optimal media object. A more dynamic algorithm that keeps exploring over time seems therefore a more reasonable choice.
There is a balance to strike between exploration and exploitation. The former is expensive, but mitigates risks and gives a better long-term investment. The latter increases the short-term reward, but might prove catastrophic over the long-term, locking the investment on media objects that are ultimately bound to fail.
Algorithm for budget partitioning.
The algorithm we propose is a variation of the exponentiated gradient descent method originally proposed in [14]. At each iteration, it updates the weight vector
Given a vector of weights at epoch
Here,
The update rule defined by the exponentiated gradient descent is the following:
where
The rest of this section is devoted to write what is the value of
where
Since
Let us notice that
Let us also mention that, since the quality of the media objects depends on external factors, a rescaling is needed to ensure the relative importance of the regularization parameter (hence the uselessness of the global multiplicative factor
which ensures all the elements of the vector to be positive and not larger than 1.
Under these conditions, we can rewrite the derivative of Eq. (1) as:
Due to the stochastic nature of the data coming from the market, there are a few corrections to make to the model of the quality vector in order to improve the precision and the stability of the results.
Let us consider the quality factor as defined in Eq. (5). The problem is that clicks are extremely rare: A typical CTR is 0.1%, meaning that only one impression out of a thousand generates a click. However, it is always possible, albeit rare, that a media object buys a small amount of impressions and generates a click. This is, of course, just sampling noise due to the very nature of the quantities we are dealing with. However, if not taken into account, it would dominate the response of the algorithm and lead to very unstable situations. Even worse, it could lock the algorithm into a strategy whereby it puts all its money into a single, sub-optimal media object for a long time. To deal with that we make two corrections.
First of all, we put a hard bound on the gradient between
We claim that a better estimation of the value of the quality of a strategy can be done using a cumulative discounted version of the clicks and budgets, i.e., a variable that takes into account not only the latest data but also past data weighted by a discount factor
where
We have said that an important feature of our algorithm is the relevance of the regularization parameter, that decides the trade-off between exploration and exploitation. Here, we explain what we chose in our simulations and why.
The regularization parameter that we used is defied as:
where
The interest in rescaling with
The presence of the term
In this section, we present the algorithm that dynamically changes the base bid of a media object. The base bid of a media object represents a sort of default value that is adjusted by the DSP depending on how valuable it deems a certain piece of inventory for said media object. Many DSPs make this adjustment by multiplying the base bid by a score calculated from data about a specific item of inventory. Typically, however, the base bid will represent the average bid offered during the campaign.
Clearly, a high base bid will lead to chronic overbidding. This is indicated by the fact that the average cost per impression is significantly below the bid, assuming that the inventory is priced based on the second highest bid (as is overwhelmingly the case, cf. [25]). Overbidding is very risky because it might lead to a very large expense on non-valuable impressions if another player in the market is making the same mistake. Conversely, a low base bid can cause the inability of a media object to buy inventory deemed valuable by the DSP, leading to an under-utilized budget.
Still following the example for which our goal is to maximize the number of clicks, we write a vectorial loss function
where
In the following, we analyze separately the functions that appear on the right hand side of Eq. (10), starting with the CPC. The full result is presented at the end of the section and is also resumed in Fig. 4.
Algorithm for setting the base bid.
One iteration of Nadam, adapted from [6].
The CPC can be rewritten as:
where CPM is the average Cost Per Mille, that is, the average price to pay for a thousand impressions, and CTR is the Click-Through Rate that we have already introduced. First of all, we assume the CTR to be independent of the base bid and we estimate it from the market data as:
where
We now have to find the CPM. Following [27], let us assume that the probability of winning an auction with a base bid of
where
that gives the percentage of inventory whose winning bid is exactly
RTB auctions often employ a second-price model to enforce truthful bidding [23, 17, 7]. In such a situation, and remembering that the bid is typically expressed in total offer per thousand impressions, the CPM is given by the total money spent divided by the total number of impressions bought:
where
We can notice the logarithmic increase of the average CPM at infinite bids representing competitors placing extremely high bids to acquire inventory, a strategy that gets rarer and rarer with increasing bids. From Eq. (15) we can estimate the value of the parameter
The CPC as a function of all the basic quantities of the problem then reads:
To obtain the derivative of the CPC part of the loss function we thus need only derive the CPM in Eq. (15). Calculations lead to:
In Eq. (13) we have made an assumption about the probability of winning an auction based on the base bid
In the case of under-delivery, a media object buys the entire inventory that is available to it (because if more was available, it would buy it with the remaining money). This quantity can be estimated as:
where
where the factor 1 000 comes from the fact that the bid are expressed in offer per thousand impressions. The derivative of Eq. (20) with respect to the bid is given by:
We could have found the first equality also applying the fundamental theorem of calculus to Eq. (20), while the second equality comes from the substitution
In case of good delivery, instead, some pieces of inventory are not bought by the media object. A change in the base bid would most probably modify the number of such pieces of inventory but won’t change the total amount of money spent. Therefore, in this case,
In order to discriminate between the two delivery regimes, we use a Heaviside step function
We can now give the gradient with respect to the bids of the loss function
which, with the results found so far, becomes
We notice that this equation is always well-defined, except when
With this loss function, we perform a Nadam gradient descent [6, 13] and then bound the result to be in between a minimal and a maximal bid set by the client. Unlike in budget partitioning, we choose an additive gradient descent because we don’t need any normalization.
As a last remark on the base bid setting, there is currently a resurgence in first price auctions. Our method is still applicable even in this situation, provided a few changes are made to the form of the equations. In particular, Eqs (15) and (20) would read respectively:
giving rise to different, but nevertheless well-defined update rules.
Recently, another research paper that deals with the bidding algorithm was published [20]. While there are similarities between their approach and ours, we chose to maximize directly the number of clicks instead of defining another utility function that needs other hyperparameters such as the monetary value of each click. Moreover, the method we propose in this paper does not need to have one data point per impression (an information that we assume is not at our disposal) but only the average over a certain period of time.
The third and last sub-routine is the one that controls the delivery ratio. It checks that the total amount of money spent in the campaign so far follows the desired profile. If that’s not the case, it increases the total budget available for the next epoch. Notice that our goal is not to determine what is the best delivery profile of the campaign over time, but only to stick to it as well as possible. This sub-routine is the simplest one since it only sets a single scalar parameter, unlike the previous two who sets a vectorial one.
Algorithm for pacing control.
Typically, advertisers want to control exactly how much money they spend during the campaign. For example, the simplest delivery profile is the uniform one, where the ideal amount of money spent until
Before the budget partitioning and base bid setting sub-routines can react to the under-delivery and adapt their parameters, the actual delivery of the campaign will have lost ground to the ideal one. It is desirable then to take some measures in order to catch up with the ideal spent as soon as possible.
The hourly budget
where
A schematic view of the algorithm is presented in Fig. 6.
We tested our algorithm on a simulated environment. (More on the characteristics of the market simulator in Section A.) We will show the results as follows: First we will compare different budget partitioning algorithms while keeping no optimization on the base bids or on the pacing. Then, we will compare base bid setting algorithms on top of the budget partitioning we presented in this paper. Finally, we will show the advantages of introducing the pacing control algorithm on top of the budget partitioning and base bid setting algorithm we chose.
The comparison of budget partitioning algorithms
We compared our algorithm to three other algorithms: (1) A vanilla algorithm (codenamed vnl) that does absolutely nothing. (2) A multi-armed bandit algorithm inspired by [1, 3], codenamed mab. (3) A linear optimization algorithm inspired by [5], codenamed lop, that maximizes the clicks under the constraints of the total available budget and an interval of admitted budgets for every media object. More information about these algorithms can be found in Section D.
Optimization results for budget partitioning
Optimization results for budget partitioning
The evolution of the budget partitioning for 10 media objects, according to the four different algorithms we test, taken over a single run of the optimization. The solid lines show a smoothed version of the dotted lines for better visualization.
The four metrics we use to evaluate the algorithms: money spent, clicks obtained, CPC, and divergence from the uniform distribution as obtained from an average of 20 optimizations on different randomly chosen starting points. The solid lines show a smoothed version of the dotted lines for better visualization.
Figure 7 gives the comparisons of these algorithms. We present a simulation with day parting, i.e., with a different algorithm running for each hour of the day. The total number of epochs is 30, the number of days in a month. The first thing we want to point out is that lop is quite greedy, as we expected, while mab is almost like vnl. This is due to the fact that only one media object per epoch is updated, giving just 30 small kicks to the initial situation. Our proposed algorithm, skt1, seems to strike in between: It moves quickly without becoming greedy.
To quantify this result, we measure greediness by calculating the KL-divergence [15] of the proposed budget repartition with respect to the uniform distribution
A value of
These qualitative discussions find their quantitative conclusions in Fig. 8 and in Table 1: The first column of numerical values (
If one considers only the total number of assigned clicks as the metric to measure the performance of the algorithms, lop wins over skt1 by a small margin. Also, its total calculated CPC is slightly lower, meaning that every click costs less money. However, on the bottom right panel of Fig. 8, we can see how greedy lop is. This reflects on the total amount of money spent on the top left panel: if the desired media object doesn’t have enough available inventory, this algorithm can not react decisively. Nothing grants us that this situation won’t happen in real life with even more damaging results, in particular, a severe under-delivery of the budget. On the contrary, skt1 manages to spend almost all the available budget (represented by the purple line) even without an explicit optimization on the bid and the pacing. The increased adaptability makes it more resistant to the real market test and thus more valuable.
We now choose the skt1 algorithm for the budget partitioning and study the effect of different base bid setters. This time, we compare the new skt2 algorithm to other two algorithms: again vnl where bids are not changed, and also pst, an algorithm that uses a predetermined set of rules. We also add in the analysis the comparison of skt2 on top of skt1 with the full SKOTT algorithm, which also contains the pacing control sub-routine.
Let us explain first a little bit about pst. The predetermined set of rules analyzes the CPC first: if it is higher than a certain goal set at the beginning of the campaign, the bid is reduced by a certain fixed multiplicative constant unless the media object is under-delivering. In that case pst will still try to slightly increase the bid. We can see many inconveniences in this approach compared to the algorithm we presented in Section 4.2: first of all we need an additional external parameter, that is, the goal CPC. Furthermore, it makes very little use of the data from the market, reducing the adaptability.
The evolution of the base bid for 10 media objects, according to the four different algorithms we test, taken over a single run of the optimization. The solid lines show a smoothed version of the dotted lines for better visualization.
Optimization results for base bid and pacing
Same as Fig. 8, but for the base bid setting algorithms. The solid lines show a smoothed version of the dotted lines for better visualization.
These limitations have an effect on the results, as can be seen from Fig. 10 and from Table 2. Differently from before, the baseline for the column (
From the top left corner of Fig. 10 one can see that the amount of money spent oscillates quite a bit at the beginning of our proposed algorithm. We see this stems from a similar oscillation in the algorithm’s attempts to find the appropriate bids, as can be seen from Fig. 9: the peaks of money spent correspond to bids slightly higher than the optimal and vice versa. Finally, we see that adding the third and last part of the algorithm manages to spend almost the entire initial budget, obtaining a small increase in clicks at the expense of a slightly higher CPC.
It is interesting to notice that the last plot, containing the KL-divergence, shows that the budget repartition changes slightly depending on which bidding algorithm we choose. This is understandable because, by changing the base bid, we actually modify the perceived quality of the media object and thus generate different inputs for the different iterations of the budget partitioning algorithm. However, these modifications are small enough to be neglected at first order.
We have introduced a method for advertisers to optimize the management of Demand Side Platforms when running an advertising campaign composed of many separate media objects. The method, that we call SKOTT algorithm, is an iterative method that makes only a few general assumptions on the mathematical model of the market. We present it here applied to a campaign for the optimization of the number of generated clicks.
The SKOTT algorithm is composed of three complementary parts: Firstly, the best partitioning for the budget across all media objects is calculated. This is achieved by estimating the quality of each media object and trying to obtain the maximum number of clicks through an exponentiated gradient descent method. Second, the best base bid for each media object is calculated. Here we use the assumption and corresponding evidence from [27] that relates the bid to the probability of winning the corresponding auction. We expand on this assumption to propose a model relating variation in bids to variation in the number of clicks obtained by each media object. We finally apply a Nadam technique [6, 13] on the market data to find the best base bids for maximizing the number of clicks. The third and last part determines the amount of budget to use at every epoch in order to stay as close as possible to the desired spend profile.
The proposed algorithm has been tested on a simulated environment that we created for the occasion and that we present in the appendices. Under these circumstances, the proposed algorithm gives impressive results, more than doubling the total amount of obtained clicks in the considered experiments.
Footnotes
Acknowledgments
We thank Soufian Aboulfaouz for numerous discussions (mainly concerning the market analysis as well as the mab and lop algorithms), and especially, Yiming Wu for carefully reading and commenting the manuscript. We also want to express our appreciation to MediaMath for allowing us to use their platforms and for their technical assistance.
Appendix
Model of the market
We created a back-test platform for analyzing different campaign management algorithms. The work-flow of the platform is conceptually divided in five steps:
Parameters are chosen that describe the problem we have at hand. Data is created using these parameters. Loss functions are chosen. All algorithms are launched independently. They work on the same data and their goal is to maximize the total number obtained during the campaign. The results of the different algorithms are compared: we plot budget repartition, base bids, greediness of the algorithms, cumulative CPC, spend profiles, and collected clicks over time.
Dealing with time variations
As already mentioned in Section A, there can be variations in the quality of media objects with time. While the causes are various, the main effect is the double periodicity induced by the day-night cycle and the weekly cycle [24]. In particular, there is a big drop in the volume of impressions dealt and the number of clicks at night that often leads advertisers to forgo buying impressions at these times.
But variations are not always periodic, nor globally affecting all the media objects at the same time. A change in the relative quality of the media objects can happen over time due to external factors. A typical example could be a media object advertising a live event: the distance in time from the day of the event is an important parameter for users to decide whether to buy a ticket.
Finally, some changes might be due to correlations not considered in our model. For example, a change in the base bid that we offer during auctions might lead to a modification of the CTR of the impressions we are able to buy.
The solution we take in order to deal with such issues is to launch 24 different algorithms, one for every hour. The advantage is clear: in case some media objects are turned off at a certain moment of the day, they won’t affect the perceived quality of the media object during other hours. However, there are obvious disadvantages as well: we discard data that might still give valuable information and convergence is 24 times slower.
For the aperiodic modifications over time, we want to have fast responses to the changes in the market. Since information on the changes is obtained through the purchase of impressions, we try to keep the algorithm as far from greedy as possible while still increasing the number of obtained clicks.
Data pre-processing
In the real world, when we receive the data, there are often missing values that need to be filled before starting the optimization. Here, we fill the missing values using a combination of three different approaches: a) backward filling; where we propagate the next valid observation, b) linear interpolation method; which is a method of curve fitting using linear polynomials to construct the missing values, and c) weighted moving averaging approach; which is an averaging that has multiplying factors to give different weights to data points at different positions.
Let
Next, we fill the nans up to epoch
Last, we fill the nans between the epoch
All the missing values
A quick glance at the competing algorithms
In Section 5 we mention three algorithms that we use as a comparison against our own. The vnl algorithm needs no explanation, because it corresponds to a non-optimized campaign in which the initial parameters are kept constant. On the other hand, mab and lop deserve a few words, which we will spend in the next paragraphs.
