Optimal bidding strategies in online display advertising by marcotav65

Optimal Bidding Strategies in Online Display Advertising Prior Solutions March 7, 2018 Abstract

Contents

1 Introduction
2 Generalities
3 Outline of the problem
4 A simpler strategy as warm-up
5 Non-linear bidding strategy
  5.0.1 Remarks
  5.1 How to simplify the problem
  5.2 Tuning λ
6 Data
  6.1 Description of the Logs
7 Experiment
8 Training/testing
  8.1 Training
  8.2 Testing
  8.3 Impact of budget
A Real-time bidding
B Predicting Winning Price in RTB with Censored Data
C Click-through rate model
D Bid Landscape Forecasting
E Conversions and the problem with sparsity

Introduction

In these notes I describe a novel bid optimization strategy for display advertising campaigns based on real-time bidding (RTB) (see appendix A). When a campaign is launched, the advertiser uploads ad creatives and defines the target rules (such as user segmentation (see footnote 1), time and location) and the lifetime budget. Our goal here is to build an optimal bidding engine that maximizes, for a given budget, some chosen key performance indicator (KPI), such as the number of conversions or clicks.

Generalities

Outline of the problem

The bidding strategy is a function that maps the individual impression evaluation to a bid value:

\[ b : \text{individual impression evaluation} \to \text{bid value} \tag{1} \]

Examples of individual impression evaluations are the click-through rate (CTR) and the conversion rate (CVR), which are defined as:

\[ \mathrm{CTR} = \frac{\#\text{clicks}}{\#\text{impressions}} \times 1000 \tag{2} \]

\[ \mathrm{CVR} = \frac{\#\text{conversions}}{\#\text{clicks}} \times 1000. \tag{3} \]
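As a small illustration of Eqs. (2) and (3), the sketch below computes both rates from raw counts. The function names and the `scale` parameter are assumptions made for this example: `scale=1000` follows the equations above, while `scale=100` gives the percentages used in the dataset statistics reported later in these notes. The counts are those of advertiser 1458 in the training data.

```python
def rate(numerator: int, denominator: int, scale: float = 1000.0) -> float:
    """Return scale * numerator / denominator, or 0.0 if the denominator is zero."""
    if denominator == 0:
        return 0.0
    return scale * numerator / denominator


def ctr(clicks: int, impressions: int, scale: float = 1000.0) -> float:
    """Click-through rate as in Eq. (2): clicks per impression, times scale."""
    return rate(clicks, impressions, scale)


def cvr(conversions: int, clicks: int, scale: float = 1000.0) -> float:
    """Conversion rate as in Eq. (3): conversions per click (not per impression)."""
    return rate(conversions, clicks, scale)


if __name__ == "__main__":
    # Advertiser 1458, training data: 3,083,056 impressions, 2,454 clicks, 1 conversion.
    print(ctr(2454, 3083056))             # convention of Eq. (2)
    print(ctr(2454, 3083056, scale=100))  # as a percentage
    print(cvr(1, 2454, scale=100))        # as a percentage
```

Returning 0.0 for an empty denominator is a choice made for this sketch; campaigns with no clicks have an undefined CVR, and a real pipeline might prefer to flag them instead.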

Note that the CVR in the data is the conversion rate per click and not per impression (which is more common). The campaign’s individual impression evaluation is chosen by the advertisers. Both CTR and CVR are estimated in real time based on the impression features (see Fig. 6). The bid request corresponding to some impression can be represented by a high-dimensional vector x. Examples of entries are:

• Auction features such as the Ad slot ID, which identifies the location where the impression will show up
• Ad features such as the creative ID
• Region ID
• Paying price, a.k.a. market price, which is the auction winning price, i.e. the highest bid from the competitors; it is what the advertiser will pay for the impression
• User feedback on the ad impressions (number of clicks and conversions)

The vector x is then:

\[ x = (\text{Log type} : \text{Ad slot ID} : \text{Paying price} : \text{Region ID} \ldots)^T \tag{4} \]

Footnote 1: Examples of segments are shoes & bags, travelling, clothing and others.

The output is the bidding price b(k(x)), where the argument k(x) is some predicted key performance indicator (KPI) chosen by the advertiser, such as CTR and CVR. We denote by p_x(x) the prior distribution of the feature vectors matching the campaign target rules.

[Figure 1: Real-time bidding (RTB) mechanism for display advertising (simplified). The diagram shows the RTB workflow (less than 100ms) between the User, the Webpage, the Ad Exchange, the Demand-Side Platform, the Advertiser and the Data Management Platform (user information such as demography and segmentations): 0. Ad Request; 1. Bid Request (user, context); 2. Bid Response (ad, bid); 3. Ad Auction; 4. Win Notice (paying price); 5. Ad (with tracking); 6. User Feedback (click, conversion, etc.).]

For each campaign, the historic bidding and feedback data (CTR or CVR) are used by the advertiser to predict the KPI for the ad impression being auctioned. We denote the predicted KPI of a bid request x as k(x) (different advertisers might consider different KPIs). If, for example, the goal of a campaign is to maximise the total number of clicks, then k(x) is the predicted CTR (pCTR) for that impression. If, instead, the goal is the maximization of the total number of conversions, then k(x) is the pCVR for that impression.

It is shown here that the optimal bidding function is a non-linear, concave function of impression evaluation metrics, such as CTR and CVR. As we shall see, in comparison with the linear bidding function/strategy that is widely used in the industry nowadays, this non-linear concave bid function b(k(x)) leads to much better results and to novel (in relation to the linear strategy) recommendations for buying impressions. The function b(k(x)) will be obtained from the analysis of the winning probability, or win-rate, w(b(k(x)), x).

The procedure is schematically as follows. Let’s use conversions and the conversion rate (CVR) to exemplify:

(1) We receive a bid request, which is a log with features such as the one shown in Figure 6, represented by a vector x_0.
(2) We use the training dataset to build a predictive model for the CVR and use it to calculate the pCVR for the impression corresponding to that particular bid request log x_0.
(3) The pCVR is plugged into the bidding engine, and the output is the bid price associated with that pCVR.

In section 5 it will be argued that the arguments of the win-rate function and the bidding function can be simplified to:

\[ w(b(k(x)), x) \to w(b), \qquad b(k(x), x) \to b(k) \tag{5} \]

It must be noted that the argument of the bidding function will be just the predicted KPI k, in this case pCVR,

\[ b = b(\mathrm{pCVR}) \tag{6} \]

and therefore b does not depend explicitly on the feature vector. The other factors, such as the budget, winning rate, market price and prior distribution, are nevertheless implicitly embodied in the functional form of the bidding function, since b(k) is obtained using a functional optimization technique which takes all those factors into account.

[Figure 2: Inputs and output of the bidding engine.]

Schematically, the problem is to find

\[ \max_{b(\ldots)} \; k_{\text{target}}(b(pk)) \quad \text{subject to} \quad \text{cost} \le B, \tag{7} \]

where b(...) is the optimal bidding function, k_target is some chosen KPI to be maximized, pk is some predicted KPI rate corresponding to k_target (e.g. if the latter is the number of conversions, the former is the predicted conversion rate pCVR) and B is the campaign’s budget. This is a functional optimization problem. A few examples are:

• Maximize the number of clicks
• Maximize the number of conversions
• Maximize a combination of clicks and conversions. In fact, in the iPinYou competition [3], the KPI was the following linear combination: #clicks + N × #conversions, where the parameter N was included to show the relative importance of clicks and conversions.

To predict the CTR or CVR one can use machine learning algorithms applied to the training dataset.
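The linear combination #clicks + N × #conversions mentioned in the list of examples can be written as a one-line helper. This is only an illustrative sketch; the function name is an assumption, and the numbers in the usage lines are those reported for advertiser 3476 in the test data of the dataset statistics (302 clicks, 11 conversions, N = 10).

```python
def combined_kpi(clicks: int, conversions: int, n_weight: float) -> float:
    """Score = #clicks + N * #conversions, with N weighting conversions against clicks."""
    return clicks + n_weight * conversions


if __name__ == "__main__":
    # Advertiser 3476, test data: 302 clicks, 11 conversions, N = 10.
    print(combined_kpi(302, 11, 10))
    # With N = 0 the score reduces to the number of clicks.
    print(combined_kpi(543, 0, 0))
```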

A simpler strategy as warm-up

[Figure 3: Dataset statistics of the iPinYou competition for the impression/click/conversion logs. The figure reproduces a table titled "Table 3: Dataset statistics".]

Training data
Adv.   Period       Bids         Imps         Clicks   Convs   Cost        Win Ratio   CTR      CVR       CPM     eCPC
1458   6-12 Jun.    14,701,496   3,083,056    2,454    1       212,400     20.97%      0.080%   0.041%    68.89   86.55
2259   19-22 Oct.   2,987,731    835,556      280      89      77,754      27.97%      0.034%   31.786%   93.06   277.70
2261   24-27 Oct.   2,159,708    687,617      207      0       61,610      31.84%      0.030%   0.000%    89.60   297.64
2821   21-23 Oct.   5,292,053    1,322,561    843      450     118,082     24.99%      0.064%   53.381%   89.28   140.07
2997   23-26 Oct.   1,017,927    312,437      1,386    0       19,689      30.69%      0.444%   0.000%    63.02   14.21
3358   6-12 Jun.    3,751,016    1,742,104    1,358    369     160,943     46.44%      0.078%   27.172%   92.38   118.51
3386   6-12 Jun.    14,091,931   2,847,802    2,076    0       219,066     20.21%      0.073%   0.000%    76.92   105.52
3427   6-12 Jun.    14,032,619   2,593,765    1,926    0       210,239     18.48%      0.074%   0.000%    81.06   109.16
3476   6-12 Jun.    6,712,268    1,970,360    1,027    26      156,088     29.35%      0.052%   2.532%    79.22   151.98
Total  -            64,746,749   15,395,258   11,557   935     1,235,875   23.78%      0.075%   8.090%    80.28   106.94

Test data
Adv.   Period       Imps        Clicks   Convs   Cost      CTR      CVR       CPM      eCPC     N
1458   13-15 Jun.   614,638     543      0       45,216    0.088%   0.000%    73.57    83.27    0
2259   22-25 Oct.   417,197     131      32      43,497    0.031%   24.427%   104.26   332.04   1
2261   27-28 Oct.   343,862     97       0       28,795    0.028%   0.000%    83.74    296.87   0
2821   23-26 Oct.   661,964     394      217     68,257    0.060%   55.076%   103.11   173.24   1
2997   26-27 Oct.   156,063     533      0       8,617     0.342%   0.000%    55.22    16.17    0
3358   13-15 Jun.   300,928     339      58      34,159    0.113%   17.109%   113.51   100.77   2
3386   13-15 Jun.   545,421     496      0       45,715    0.091%   0.000%    83.82    92.17    0
3427   13-15 Jun.   536,795     395      0       46,356    0.074%   0.000%    86.36    117.36   0
3476   13-15 Jun.   523,848     302      11      43,627    0.058%   3.642%    83.28    144.46   10
Total  -            4,100,716   3,230    318     364,243   0.079%   9.845%    88.82    112.77   -

I will briefly describe the linear bidding strategy since it is the most used strategy in the market nowadays (the same strategy is used throughout the whole lifetime of the campaigns). This strategy is based on a celebrated theorem from game theory according to which, for strategic competitors, the dominant strategy for pure Vickrey or second-price auctions is truth-telling, which means that bidders will bid their private values. Suppose the advertiser’s valuation of a conversion (or click) is equal to some target k_target(x). As a participant in a real-time auction, the advertiser subscribes to a stream of bid requests, each of them representing an opportunity to show an ad to some particular user. The expected value of the click or conversion (supposing the advertiser won the auction) is simply:

\[ v_{\text{lin}}(k(x)) = k(x) \times k_{\text{target}}(x) \tag{8} \]

where k(x) is the predicted KPI. If the bidding is truthful, b_lin = v_lin and therefore:

\[ b_{\text{lin}}(k(x)) = k(x) \times k_{\text{target}}(x) \tag{9} \]

It should be noted that the use of this theorem changes the ill-defined problem of finding an optimal bidding price into a well-defined problem, which is to predict the value of some KPI, i.e. to find k(x). If the KPI is the number of clicks, the problem is well known and can be solved using a large variety of models (see next section). However, if the KPI is the number of conversions, there are issues with the sparsity of the data and one has to model the conversions at different hierarchical levels, as will be discussed later (see appendix E).
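The linear (truthful) strategy of Eqs. (8) and (9), together with the second-price rule it relies on, can be sketched in a few lines. The function names and the numerical values (a predicted CTR of 0.001 and a value of 100,000 price units per click) are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def linear_bid(predicted_kpi: float, target_value: float) -> float:
    """Truthful bid of Eq. (9): predicted KPI times the advertiser's value per click/conversion."""
    return predicted_kpi * target_value


def second_price_outcome(bid: float, market_price: float) -> Tuple[bool, float]:
    """Second-price auction: the bidder wins if the bid exceeds the market price
    (the highest competing bid) and then pays that market price."""
    won = bid > market_price
    cost = market_price if won else 0.0
    return won, cost


if __name__ == "__main__":
    bid = linear_bid(predicted_kpi=0.001, target_value=100_000.0)  # hypothetical numbers
    print(bid)
    print(second_price_outcome(bid, market_price=80.0))   # wins, pays the market price
    print(second_price_outcome(bid, market_price=120.0))  # loses, pays nothing
```

Because the winner pays the market price rather than its own bid, the bid only decides whether the impression is bought; this is the mechanism behind the truth-telling result quoted above.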

Non-linear bidding strategy

The rationale above, however, is not valid if the budget is fixed or if the advertiser is participating in more than one second-price auction. In Ref. [9] the competitive landscape in Ad Exchanges is studied. It is shown that, in repeated budget-constrained auction games, i.e. when advertisers have budget limits and participate in multiple Vickrey auctions over the lifetime of the campaign, dynamic interactions among them must be taken into account when the bidding landscape is characterized, and truth-telling is no longer a dominant strategy. Though the model used in Ref. [9] leads to useful insights, the approach taken here is more pragmatic and uses simple statistical methods to model the market price and winning probability, instead of assuming the advertisers have well-defined strategies and will bid their private value.

In this analysis, the bid price b will depend on many other practical factors, and not only on the predicted KPI value of the impression being auctioned. Among those factors are the budget constraint, the auction winning probability (or win-rate), the market price distribution, the feature vector of this particular impression and the prior distribution of x. The problem then becomes significantly more convoluted (as will be discussed later, for very weak budget constraints, the theorem about Vickrey auctions can be used and the corresponding linear strategy works reasonably well).

Market prices are modelled as a stochastic variable since it is virtually impossible to analyse the bidding strategy of all auction participants. To predict them we can either use the data statistics or a regression model with censored data. The first option is more straightforward and simpler to use in practice. The second option, however, is more thorough and will be discussed in appendix B, where an algorithm is heuristically derived that learns the distribution of winning bids based on partial information (as in this case) and suggests bidding actions to maximize the number of impressions given a budget constraint.

By analytically solving the optimization problem, it will be shown that the win-rate function w(b) is much more important than the distribution of the feature vector x in the process of building the optimal bidding function b (the probability distribution of x is found to be only weakly correlated with the form of b). Before deriving the new bidding function taking this information into consideration, we need to obtain a mathematical expression for the winning rate. Following [2] we will derive an expression for the winning rate using functional optimization.

The predicted KPI will be denoted by k(x). For example, if the campaign goal is to maximise conversions, k(x) denotes the predicted conversion rate pCVR for that impression. We also denote by p_k(k) the prior distribution of the predicted KPI per bid request. Note that when the campaign starts there is no available data. To deal with this issue, one possibility is to spend a small fraction of the budget, bid randomly and learn some statistics. In [5], for example, a bid landscape prediction, i.e. the auction volume forecast, is used to estimate the auction statistics for a given setting and budget constraint. The estimated number of bid requests for the target rules during the lifetime T is denoted by N(T). Hence we see that the approach has two stages: the first is learning the statistics to obtain p_x(x) and N(T), and the second is to find the functional form for the winning rate and for the corresponding bid. To derive a functional form for the winning rate w(b(k(x))), let us formulate the problem more precisely.
Let us suppose the goal of the campaign is to maximize some specific KPI, say, the number of clicks or conversions. Now, for concreteness, let’s consider CVR to be the quantity inside the argument of the bid function and write k(x) = n(x). The functional optimization problem can then be stated as:

\[
\begin{aligned}
b(n(x)) = {} & \arg\max_{b(n(x))} \; N(T) \int dx \, n(x) \, w(b(n(x))) \, p_x(x) \\
& \text{subject to} \quad N(T) \int dx \, b(n(x)) \, w(b(n(x))) \, p_x(x) \le B
\end{aligned}
\tag{10}
\]

5.0.1 Remarks

A few remarks are in order:

1. The product of the first two factors inside the integrand of the objective in Eq. (10) is:

\[ n(x) \, w(b(n(x))) = \text{conversion rate} \times \Pr(\text{winning auction}) \tag{11} \]

which is the probability of winning the auction and obtaining a conversion, for a given feature vector x. The integral sums over feature vectors and is then multiplied by the number of auctions N(T) during the interval T.

2. This is a functional optimization problem, i.e. we are after a functional w(b(x)) = f(b(x)), or “a function of a function” of x, and x is not explicitly present in w(b(x)).

3. The function p_x(x) is the probability distribution of the feature vectors x. It is the probability (density) that the bid request will be, say, for an impression placed in the Desktop news feed, on a Monday, at 11pm, etc. Naturally, this will be available only after the learning phase.

4. The product b(n(x)) w(b(n(x))) inside the integrand of the constraint in Eq. (10) is the expected value of the bid for a given feature vector x. Integrating over x and multiplying by the number of bid requests, we sum over all bids, and this sum must be restricted by some budget B. Note that, due to the reserve price setting, the cost is usually higher than the second highest bid, and we use the bid price, through the product b(n(x)) w(b(n(x))), as the upper bound of the cost of winning.

5. We note that, as stated, the problem is not explicitly dynamical, but the time-dependency is included by considering date and time as features.

6. Notice that by simultaneously maximizing the number of clicks (or conversions) and fixing a budget, we are indirectly minimizing the predicted cost of each click or conversion, i.e. CPC or CPA. These are extremely important metrics in display advertising.

7. The budget constraint and the campaign’s lifetime are taken into account: the former by the condition to which the optimization is subject and the latter by the factor N(T).
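To make the objective and the constraint of Eq. (10) more tangible, the sketch below replaces the integrals by averages over a sample of predicted KPI values, which plays the role of draws from the prior distribution. Everything here is an assumption made for illustration: the function names, the sample values, and in particular the win-rate form w(b) = b / (c + b), which is the α = 1 case of the expression introduced later in these notes.

```python
from typing import Callable, Sequence, Tuple


def win_rate(bid: float, c: float) -> float:
    """Assumed win-rate w(b) = b / (c + b); c is a constant fitted to the data."""
    return bid / (c + bid)


def expected_kpi_and_cost(
    kpi_samples: Sequence[float],
    bid_fn: Callable[[float], float],
    c: float,
    n_requests: int,
) -> Tuple[float, float]:
    """Sample-average version of Eq. (10).

    Returns (expected KPI count, upper bound on expected cost), i.e.
    N(T) * mean(n * w(b(n))) and N(T) * mean(b(n) * w(b(n))).
    """
    m = len(kpi_samples)
    kpi = sum(n * win_rate(bid_fn(n), c) for n in kpi_samples) / m
    cost = sum(bid_fn(n) * win_rate(bid_fn(n), c) for n in kpi_samples) / m
    return n_requests * kpi, n_requests * cost


if __name__ == "__main__":
    samples = [0.001, 0.002]                 # hypothetical predicted conversion rates
    linear = lambda n: 50_000.0 * n          # a linear bidding function, for comparison
    kpi, cost = expected_kpi_and_cost(samples, linear, c=50.0, n_requests=1000)
    print(kpi, cost)  # a bidding function is feasible when cost <= budget B
```

Any candidate bidding function can be plugged in as `bid_fn`, so the same helper can compare a linear strategy with a non-linear one under the same budget.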

5.1 How to simplify the problem

The problem can be simplified by writing the probability distribution of x in terms of the prior probability distribution of n(x), which we will denote by p_n(n). Their ratio is the gradient ∇n(x), and changing variables inside the integral the optimization problem reads:

\[
\begin{aligned}
b(n) = {} & \arg\max_{b(n)} \; N(T) \int dn \, n \, w(b(n)) \, p_n(n) \\
& \text{subject to} \quad N(T) \int dn \, b(n) \, w(b(n)) \, p_n(n) \le B
\end{aligned}
\tag{12}
\]

The problem can be solved by writing down a Lagrangian and using basic calculus of variations. The constrained Lagrangian reads:

\[ \mathcal{L}(b(n), \lambda) = \int dn \, n \, w(b(n)) \, p_n(n) - \lambda \left[ \int dn \, b(n) \, w(b(n)) \, p_n(n) - \frac{B}{N(T)} \right] \]

Optimization of L leads to the canonical Euler-Lagrange equations, which applied to L give:

\[ n \, p_n(n) \, \frac{\partial w(b(n))}{\partial b(n)} - \lambda \, p_n(n) \left[ w(b(n)) + b(n) \, \frac{\partial w(b(n))}{\partial b(n)} \right] = 0 \tag{13} \]

\[ \lambda \, w(b(n)) - \left[ n - \lambda \, b(n) \right] \frac{\partial w(b(n))}{\partial b(n)} = 0 \tag{14} \]

This equation has no unique solution until the form of w(b) is specified. A quite general form for w(b) is:

\[ w(b) = \frac{b^{\alpha}(n)}{c^{\alpha} + b^{\alpha}(n)} \tag{15} \]

where α ≤ 2 for analytical solutions and c is a constant. Analyzing the dependency of the winning bid prices (the market prices) on the bid request features [2], one finds that the winning price distributions have no clear relation to the feature values. This is consistent with this expression for w(b), according to which the bid price is the only argument, i.e. we can write w(b, x) ≡ w(b). In [2] the values α = 1, 2 are chosen. Using α = 1 we obtain the following expression for the bidding function:

\[ b(n) = c \left[ \sqrt{\frac{1}{c\lambda} \, n + 1} - 1 \right] \tag{16} \]
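For completeness, the intermediate algebra behind the α = 1 bidding function can be spelled out. This is a sketch of the missing step, assuming the win-rate form with α = 1 and the stationarity condition λ w − [n − λ b] ∂w/∂b = 0 stated above. With α = 1,

\[ w(b) = \frac{b}{c + b}, \qquad \frac{\partial w}{\partial b} = \frac{c}{(c + b)^2}. \]

Substituting into the stationarity condition and multiplying by (c + b)^2 gives a quadratic equation in b:

\[ \lambda \, b \, (c + b) = (n - \lambda b) \, c \quad \Longrightarrow \quad \lambda b^2 + 2 \lambda c \, b - c \, n = 0. \]

Keeping the positive root (bids cannot be negative),

\[ b(n) = -c + \sqrt{c^2 + \frac{c \, n}{\lambda}} = c \left[ \sqrt{\frac{n}{c\lambda} + 1} - 1 \right]. \]

Since this is a square root of an affine function of n, shifted by a constant, it is concave in n and vanishes at n = 0.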

where n can be the estimated CTR or the estimated CVR; in what follows we revert to the generic notation k for the predicted KPI. This expression is clearly concave. The open parameter c is obtained from the dataset. To find λ, b(k) must be substituted into the constraint of the Lagrangian (where the inequality becomes an equality):

\[ \int dk \, b(k, \lambda) \, w(b(k, \lambda)) \, p_k(k) = \frac{B}{N(T)}, \tag{17} \]

which states that the whole budget is used. The variable λ was included for convenience (see below). This equation can be solved numerically. In [2], however, a different approach is suggested, which takes λ as a tuning parameter based on the training dataset. From equation (16) we see that, as λ decreases, b(k) increases. The win-rate function is

\[ w(b(k)) = 1 - \left( \frac{k}{c\lambda} + 1 \right)^{-1/2}, \tag{18} \]

and it is not difficult to check that b(k) w(b(k)) increases monotonically as λ decreases. The same behavior occurs for the integral, since p_k(k) is independent of λ. Hence, increasing the budget per case B/N(T) implies that λ decreases, and that corresponds to a higher bid price. The trend of the optimal λ as a function of B/N(T) is shown in Figure 5.

[Figure 4: Example of bidding functions where the argument is the CTR.]
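The statement that the budget equation can be solved numerically for λ can be illustrated with a short sketch. It assumes the α = 1 bidding function b(k) = c[sqrt(k/(cλ) + 1) − 1], replaces the integral over p_k(k) by an average over a sample of predicted KPI values, and uses bisection on a logarithmic scale, which works because the spend per request decreases monotonically as λ increases. The function names, the search interval and the sample values are assumptions made for this example, and this is not the tuning procedure of [2].

```python
import math
from typing import Sequence


def optimal_bid(k: float, c: float, lam: float) -> float:
    """Bidding function for alpha = 1: b(k) = c * (sqrt(k / (c * lam) + 1) - 1)."""
    return c * (math.sqrt(k / (c * lam) + 1.0) - 1.0)


def win_rate(bid: float, c: float) -> float:
    """Win-rate for alpha = 1: w(b) = b / (c + b)."""
    return bid / (c + bid)


def spend_per_request(kpi_samples: Sequence[float], c: float, lam: float) -> float:
    """Sample average of b(k) * w(b(k)), the left-hand side of the budget equation."""
    total = 0.0
    for k in kpi_samples:
        b = optimal_bid(k, c, lam)
        total += b * win_rate(b, c)
    return total / len(kpi_samples)


def solve_lambda(kpi_samples: Sequence[float], c: float, budget_per_request: float,
                 lo: float = 1e-12, hi: float = 1.0, iters: int = 200) -> float:
    """Find lam such that spend_per_request equals budget_per_request (B / N(T))."""
    if not (spend_per_request(kpi_samples, c, lo) > budget_per_request
            > spend_per_request(kpi_samples, c, hi)):
        raise ValueError("budget is not bracketed by the search interval [lo, hi]")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)      # geometric midpoint: lam spans many orders of magnitude
        if spend_per_request(kpi_samples, c, mid) > budget_per_request:
            lo = mid                  # spending too much: lam must increase
        else:
            hi = mid                  # spending too little: lam must decrease
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    samples = [0.0005, 0.001, 0.002]  # hypothetical predicted CTR values
    lam = solve_lambda(samples, c=50.0, budget_per_request=10.0)
    print(lam, [optimal_bid(k, 50.0, lam) for k in samples])
```

Raising `budget_per_request` in this sketch yields a smaller λ and therefore higher bids, which is the trend described in the text.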

5.2 Tuning λ

Data

6.1 Description of the Logs

There are four types of logs in the training dataset:

• Bid logs
• Impression logs
• Click logs
• Conversion logs

Examples are shown in Figure 6. The log on the left-hand side is the bid log, where the first entry, Bid ID, uniquely identifies one ad impression. Corresponding to this bid log there are two other logs, namely, one impression log and one click log (or conversion log), both with the same Bid ID. Hence, the Bid ID can be used to join the bidding, impression and click logs. The 24 features in the dataset are shown in Figure 6. The dataset is from a well-known DSP company and contains more than 15 million impressions with user feedback of 9 campaigns from multiple advertisers in an interval of 10 days in 2013. The logs are organized on a row-per-record basis and come in two formats: bid logs, and impression/click/conversion logs.

[Figure 5: Tuning of the λ parameter.]

For each log there is information about the user, about the advertiser (characteristics of the creative such as format and size, etc.) and about the publisher (the auction reserve price, page domain, URL and ad slot, etc.). For each bid request the user’s clicks/conversions are recorded if the advertiser won the auction. More specifically, for each record (row), some of the following types of entries/columns [3] are present:

• Auction features (columns 1, 2, 4-12)
  – The bid ID identifies the event log. Note that a record (row) does not contain only information about clicks/conversions: part of the logs correspond to impressions.
  – The date format is yyyyMMddHHmmssSSS.
  – The Log type is either 1, 2 or 3, corresponding respectively to an impression, a click or a conversion.
  – The User-Agent column describes the device, OS and browser of the user.
  – Columns 10 and 11 correspond to the hosting webpage of the ad slot being auctioned, and the corresponding URL.
  – Column 18 defines a minimum bid to win the auction.

[Figure 6: The four types of logs in the training set. The column on the right represents the three columns, one for each log type. The figure reproduces the bidding log format and the impression, click and conversion log format from Tables 2 and 3 of "iPinYou Global RTB Bidding Algorithm Competition Dataset". The 24 columns are: 1 Bid ID, 2 Timestamp, 3 Log Type, 4 iPinYou ID, 5 User-Agent, 6 IP, 7 Region ID, 8 City ID, 9 Ad Exchange, 10 Domain, 11 URL, 12 Anonymous URL, 13 Ad Slot ID, 14 Ad Slot Width, 15 Ad Slot Height, 16 Ad Slot Visibility, 17 Ad Slot Format, 18 Ad Slot Floor Price, 19 Creative ID, 20 Bidding Price, 21 Paying Price, 22 Landing Page URL (the "Key Page URL" of the table caption), 23 Advertiser ID, 24 User Profile IDs. Most columns are the same as in the bidding log, except Log Type, Paying Price and Key Page URL; columns marked with ∗ are hashed or modified before the log is released, and the price unit is RMB/CPM.]

– Auction winning price (column 21) is the highest bid from competitors (since we have only second-price auctions). If the bidding engine gives a

ipinyou.com.

bidding price higher that the auction winning price, thesupport advertiser The competition the strong of iPinYougets man- the – Column 18 define wins a minimum bid to win the auction.

ad impression 5. ACKNOWLEDGMENTS

agement team. As far as we know, this is the first time ever

– Auction winning (column iscorporation the highestwill bid from competitors corresponding toa that Bid ID,price and this21)arecord occur in worldwide competition integrate production (since we have only second-price auctions)

the impression log. The distribution of the market price may be estimated • Ad features (columns 13-19 and 22-24) examining the data statistics or using a regression model, which is called vi bid landscape forecasting. The knowledge of the market price distribution

allows one to estimate the probability of winning a given ad auction and the corresponding cost given a bid price. • Ad features (columns 13-19 and 22-24) – Column 16 describes how fast the ad is viewed depending on its position. For example, if the ad slot is above the fold it is classified as “FirstView". If one has to “scroll down” to see the ad, it is classified as “SecondView” and so on. – Column 17 can be “Fixed” (fixed size and position), “Pop” (pop-up), etc • User feedback, measured by clicks or conversions, on that impression (column 3). The “User Tags” column will tell us the segments if which the user in interested in e.g. buying something, travel, readings, etc. The auction and ad features are sent to the bidding engine to generate a bid response. Note that log type, paying price and the key page URL are present only in the impression/click/conversion logs (and not in the bid logs).
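The remark above, that the market price distribution gives the probability of winning and the corresponding cost for a given bid, can be illustrated with a minimal sketch. It assumes that a list of logged market prices (column 21) is available and that the auction is second-price, so the winner pays the market price. The function name `win_rate_and_cost` and the sample prices are hypothetical.

```python
from bisect import bisect_left

def win_rate_and_cost(market_prices, bid):
    """Empirical win probability and expected cost per auction for a bid.

    Assumption: the bid wins when it is strictly higher than the logged
    market price, and the winner then pays that market price.
    """
    prices = sorted(market_prices)
    n_won = bisect_left(prices, bid)   # number of prices strictly below the bid
    win_rate = n_won / len(prices)
    # Cost is averaged over all auctions (lost auctions cost nothing).
    expected_cost = sum(prices[:n_won]) / len(prices)
    return win_rate, expected_cost

# Hypothetical market prices, in the same unit as the logs.
prices = [20, 35, 50, 50, 80, 120, 200, 300]
rate, cost = win_rate_and_cost(prices, bid=60)
print(rate, cost)
```

Evaluating the function on a grid of bid values gives an empirical win-rate curve, which is one simple way of "examining the data statistics" before fitting a parametric form.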


Experiment

The dataset was described in section 6; we briefly review it here. Consider Figure 6. Note that the log on the left-hand side, called the bid log, does not have information about Log type, paying price and Key Page URL. The Log type entry, present in the three logs on the right-hand side, determines whether the log corresponds to an impression, a click or a conversion. The auction winning price (also called paying price or market price) in column 21 is the highest bid from the competitors. The logs contain information about:
• The user. For example, User tags determines the user segmentation (whether the user is interested in travelling, reading, etc.)
• The advertiser (e.g. format and size of creative)
• The publisher (e.g. ad slot size)

In Figure 8, the relation between the auction winning rate and the bid value from the training dataset is shown. Eq. (16) is therefore a reasonable form for w(b). The winning bid price (column 21) is shown in ?? not to depend strongly on the feature values (operationally, this is obtained by comparing, for each impression log (Log type = 1), the value in column 21 with the feature entries of that log). This suggests that the features are not very important in determining the winning rate, which corroborates the ansatz w(b(x), x) = w(b(x)). Since, according to these curves, the impressions with low cost (low bid value) are more cost effective, this bidding strategy encourages the advertiser to bid on more low-cost impressions. In other words, the bidding strategy should try to bid on more impressions instead of focusing only on the small set of impressions with high value.
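As an illustration of how a feature-independent win-rate curve could be fitted, the sketch below assumes the one-parameter form w(b) = b / (c + b) used in [2]; whether this coincides with Eq. (16) of these notes is an assumption. The helper names `win_rate` and `fit_c`, the candidate grid and the synthetic data are all hypothetical.

```python
def win_rate(bid, c):
    """Assumed win-rate form w(b) = b / (c + b), following [2]."""
    return bid / (c + bid)

def fit_c(bids, observed_rates, candidates):
    """Return the candidate c with the smallest squared error
    against the observed win rates (plain grid search)."""
    def sse(c):
        return sum((win_rate(b, c) - w) ** 2
                   for b, w in zip(bids, observed_rates))
    return min(candidates, key=sse)

# Hypothetical (bid, win rate) pairs, as could be read off a curve like Figure 8.
bids = [10, 25, 50, 100, 200, 300]
rates = [win_rate(b, 50) for b in bids]   # synthetic data generated with c = 50
c_hat = fit_c(bids, rates, candidates=range(1, 301))
print(c_hat)
```

A grid search is used only to keep the sketch dependency-free; any one-dimensional least-squares routine would serve the same purpose.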

8 Training/testing

8.1 Training

The data is split with ratio 2:1 by Timestamp sequence. The training data is used to:
• Train the predicted CTR or CVR estimator, extracting the feature values from the log data. Machine learning models are used here.
• Tune the parameters c and λ of the bidding function.
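The 2:1 split by timestamp can be sketched as follows. This is a minimal illustration, assuming each record is a dictionary with a sortable `timestamp` field; the function name `chronological_split` and the sample records are hypothetical.

```python
def chronological_split(records, train_parts=2, test_parts=1):
    """Split records into train/test sets by time order (default ratio 2:1).

    The earliest records go to the training set, so the test set never
    contains information from before the end of the training period.
    """
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = len(ordered) * train_parts // (train_parts + test_parts)
    return ordered[:cut], ordered[cut:]

# Hypothetical records; fixed-width timestamp strings sort chronologically.
records = [{"timestamp": t} for t in
           ["20130606000104000", "20130601120000000", "20130612235959000",
            "20130603080000000", "20130609101500000", "20130610000000000"]]
train, test = chronological_split(records)
print(len(train), len(test))
```

Integer arithmetic is used for the cut point to avoid rounding surprises with fractions such as 2/3.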

8.2 Testing

The test data is a list of records consisting of the features of a bid request, the auction winning price and user feedback information (clicks or conversions, for example). The bidding engine reads each record and generates a bid price. The bid price is compared with the auction winning price and, if it is higher, the ad is shown and the clicks/conversions and charged price of that record are used to update the performance and the total cost. The evaluation protocol is summarised in Figure 7 and detailed in the numbered steps that follow it.

Once we define the optimal bidding strategy and choose a budget (for some test period), we proceed with the following steps:

Initialize: the bidding engine is initialized and the cost and performance measures (such as conversions) are set to zero.
Bid request: the bid request is passed (using the timestamp as ordering) to the bidding engine.
Bid calculation: a bid is calculated based on information about budget, market price and performance indicators.
Auction simulation: using the impression log, if the bid price is higher than the winning price, the impression is won.
Matching: click and conversion events are matched in the logs.


Figure 7: Evaluation protocol.


1. First, initialise the bidding engine, setting a predefined budget and initialising the cost and performance to zero. Measures of performance are, for example, the number of clicks and the number of conversions achieved.
2. A bid request is passed (ordered by the timestamp feature) to the bidding engine. It contains contextual and behavioural data of the corresponding auction and also the ad data, as shown in Figure 6.
3. With information such as the budget, current cost and achieved performance, the bidding engine computes a bid for this request. If at some point the cost surpasses the budget, all remaining bid requests are skipped (the bid responses are set to zero after this point).
4. The auction is simulated by referencing the impression logs (see Figure 6, on the right): if the bid price is higher than both the auction winning price in the log (paying price in column 21) and the floor price (column 18), the bidding engine wins the auction and gets the ad impression.
5. If the auction is won, the next step is to match the click and conversion events in the logs for this impression. The performance of the bidding engine is updated and saved, and the cost increases by the paying price.
6. Check for any bid requests left in the test data and terminate the process if there aren’t any.
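The evaluation protocol can be sketched as a simple replay loop. This is only an illustration under stated assumptions: each test record is a dictionary with hypothetical keys `timestamp`, `win_price` (column 21), `floor_price` (column 18) and `click`; `bid_fn` stands for any bidding strategy; and bidding stops as soon as the accumulated cost reaches the budget. Costs are accumulated in the logged price unit.

```python
def evaluate(records, bid_fn, budget):
    """Replay logged auctions in timestamp order and accumulate results."""
    cost, clicks, impressions = 0.0, 0, 0          # step 1: initialise
    for rec in sorted(records, key=lambda r: r["timestamp"]):   # step 2
        if cost >= budget:                         # budget used up: skip the rest
            break
        bid = bid_fn(rec, budget - cost)           # step 3: compute the bid
        # Step 4: win if the bid beats both the logged winning price
        # and the floor price.
        if bid > rec["win_price"] and bid > rec["floor_price"]:
            impressions += 1
            cost += rec["win_price"]               # second price: pay market price
            clicks += rec["click"]                 # step 5: match user feedback
    return {"impressions": impressions, "clicks": clicks, "cost": cost}

# Hypothetical test records and a constant-bid strategy.
records = [
    {"timestamp": 1, "win_price": 50, "floor_price": 10, "click": 0},
    {"timestamp": 2, "win_price": 120, "floor_price": 10, "click": 1},
    {"timestamp": 3, "win_price": 30, "floor_price": 40, "click": 1},
    {"timestamp": 4, "win_price": 60, "floor_price": 5, "click": 1},
]
result = evaluate(records, lambda rec, remaining: 80, budget=80)
print(result)
```

Because only won impressions carry user feedback in the logs, the loop can credit clicks only for auctions that were also won historically; auctions lost in the log contribute nothing to the simulated performance.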

8.3 Impact of budget

The budget was chosen to be equal to 1/64, 1/32, 1/16, 1/8, 1/4 and 1/2 of the original total cost of the test data. It is found that when the budget is small, the non-linear bidding strategy improves the results (number of clicks or conversions) significantly. If the budget is less restricted, simpler strategies also deliver reasonably good results. Nevertheless, non-linearity is found to be optimum in all scenarios. In Figure 10 we see the comparison between the strategy currently used in the market and the strategy explained here. In [2] it is shown that the non-linear strategy outperforms all others both in number of clicks and eCPC.
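The budget levels used in this experiment are simple fractions of the total logged cost of the test data. A small sketch, with a hypothetical total cost and a hypothetical helper name `budget_levels`:

```python
from fractions import Fraction

def budget_levels(total_cost, denominators=(64, 32, 16, 8, 4, 2)):
    """Map each budget fraction (1/64, ..., 1/2) to an absolute budget."""
    return {Fraction(1, d): total_cost / d for d in denominators}

# Hypothetical total cost of the test data, in the logged price unit.
levels = budget_levels(6400)
for fraction, budget in levels.items():
    print(fraction, budget)
```

Each absolute budget would then be passed to the evaluation procedure of section 8.2, once per bidding strategy, to compare performance across budget conditions.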

Figure 8: Win-rate curve for one of the campaigns. We see that lower evaluated impressions are more cost effective.

A Real-time bidding

In real-time bidding (instantaneous) auctions [1], advertising inventory is bought and sold on a per-impression basis. Advertisers bid on individual impressions and, if the bid is won, the corresponding ad is displayed instantly on the publisher’s website. Typically the process begins when the user visits a website, triggering a bid request to an ad exchange. The bid request contains data such as the user’s demographics, browsing history, location, etc. Ad exchanges submit the request to advertisers, who automatically bid, in real time, to place their ads (this is done on each impression as it is served and it is repeated for every ad slot on the webpage). The highest bidder wins the impression and their ad is served on the webpage.

Figure 9: Strategies comparison. Only linear and ORTBs are relevant in this discussion.

Figure 10: The number of clicks improves using the strategy described here (compared with the linear strategy).

B Predicting Winning Price in RTB with Censored Data

Consider the linear regression

$$z = \beta^T x + \epsilon \tag{19}$$

where the noise $\epsilon$ is Gaussian. The winning price is only observed when the bid wins the auction, which implies that the observed instances $(x, z)$ are right-censored and biased. In [12] a censored linear regression is used to model the winning price as a function of the auction features. For an auction that is won, the log-likelihood of the observed winning price is (up to an additive constant independent of $\beta$)

$$\log p(z) = \log \phi\left(\frac{z - \beta^T x}{\sigma}\right) \tag{20}$$

where $\phi$ is the standard normal density function. For the auctions lost, the data are $(x, b)$, where $b$ is the bid price, and the only information we have is that the winning price is higher than the bid, $z > b$. The corresponding log-probability is

$$\log P(z > b) = \log \Phi\left(\frac{\beta^T x - b}{\sigma}\right) \tag{21}$$

where $\Phi$ is the cumulative standard normal distribution. We conclude that, given the observed data $W = \{(x, z)\}$ and the censored data $L = \{(x, b)\}$, the censored linear regression is trained by solving

$$\min_{\beta} \; - \sum_{(x,z) \in W} \log \phi\left(\frac{z - \beta^T x}{\sigma}\right) - \sum_{(x,b) \in L} \log \Phi\left(\frac{\beta^T x - b}{\sigma}\right) \tag{22}$$
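The objective in Eq. (22) can be written down directly. The sketch below only evaluates the negative log-likelihood for a given β; it is not the training procedure of [12], and the function names, the fixed σ and the toy data are assumptions made for illustration.

```python
import math

def log_phi(u):
    """Log of the standard normal density."""
    return -0.5 * u * u - 0.5 * math.log(2.0 * math.pi)

def log_Phi(u):
    """Log of the standard normal CDF, via the complementary error function.
    Note: for very negative u the CDF underflows to 0 and math.log fails."""
    return math.log(0.5 * math.erfc(-u / math.sqrt(2.0)))

def censored_nll(beta, won, lost, sigma=1.0):
    """Negative log-likelihood of Eq. (22).

    won  : list of (x, z) pairs, with z the observed winning price
    lost : list of (x, b) pairs, with b the losing bid (so z > b)
    """
    def predict(x):
        return sum(bi * xi for bi, xi in zip(beta, x))

    nll = 0.0
    for x, z in won:
        nll -= log_phi((z - predict(x)) / sigma)
    for x, b in lost:
        nll -= log_Phi((predict(x) - b) / sigma)
    return nll

# Toy data with a single feature (hypothetical values).
won = [([1.0], 2.0)]
lost = [([1.0], 1.0)]
print(censored_nll([2.0], won, lost), censored_nll([0.0], won, lost))
```

In this toy example β = 2 predicts the observed winning price exactly and places the prediction above the losing bid, so it yields a lower objective value than β = 0. Any generic numerical optimiser could then be applied to `censored_nll` to estimate β.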

C Click-through rate model

D Bid Landscape Forecasting

E Conversions and the problem with sparsity

References

[1] Wikipedia contributors, Real-time bidding, Wikipedia, The Free Encyclopedia, 2 June 2017.
[2] Zhang et al., Optimal Real-Time Bidding for Display Advertising.
[3] Weinan Zhang, Shuai Yuan, Jun Wang, Xuehua Shen, Real-Time Bidding Benchmarking with iPinYou Dataset, arXiv:1407.7073.
[4] Feedback Control of Real-Time Display Advertising.
[5] Y. Cui, R. Zhang, W. Li, and J. Mao, Bid landscape forecasting in online ad exchange marketplace. In KDD, pages 265–273. ACM, 2011.
[6] Real-Time Bidding Benchmarking with iPinYou Dataset.
[7] Statistical Arbitrage Mining for Display Advertising.
[8] More precisely, the winner pays “one cent” more than the second highest bid.
[9] S. R. Balseiro, O. Besbes, and G. Y. Weintraub, Repeated auctions with budgets in ad exchanges: Approximations and design. Management Science, 61(4): 864–884, 2015.
[10] Graepel et al., Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine.
[11] S. Yuan and J. Wang, Real-Time Bidding: A New Frontier of Computational Advertising Research, a WSDM 2015 tutorial.
[12] Wush Chi-Hsuan Wu, Mi-Yen Yeh, Ming-Syan Chen, Predicting Winning Price in Real Time Bidding with Censored Data, KDD 2015.
[13] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich, Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft’s Bing search engine.
[14] M. Richardson, E. Dominowska, and R. Ragno, Predicting clicks: estimating the click-through rate for new ads.
