Market Efficiency in Person-to-Person Betting on Golf
FACULTY OF SOCIAL SCIENCES
UNIVERSITY OF COPENHAGEN
Master’s Thesis
Sebastian Schock
Market Efficiency in Person-to-Person Betting on Golf Tournaments
Supervisor:
Rasmus Jørgensen
April 7, 2014
Abstract
In this thesis, I examine market efficiency for golf betting markets on Betfair.com, the leading provider of person-to-person betting exchanges. I create
and analyze a novel dataset containing winning market prices for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012 as well as
historical golf results from the PGA Tour, European Tour, Champions Tour
and Nationwide Tour from the beginning of 2002 to the end of 2012. A set
of betting strategies is created with the aim of exploiting weak-form and semi-strong form inefficiencies. I find no evidence of weak-form inefficiency. However, the most profitable proposed betting strategy, which is based on historical golf results, enables a bettor to more than double his starting wealth over the two-year period from 2011 to 2012. My findings thus indicate semi-strong form inefficiency in Betfair's golf markets.
Preface
The main idea for the thesis arose while I was watching a PGA golf tournament with my good friend Christian Kragh. While we watched the tournament, we followed the live odds on the online betting site Betfair.com and
discussed whether the odds reflected the golfers’ true winning probabilities.
The idea was further developed by talking to another good friend, Mathias Trock, who wrote his bachelor thesis evaluating the efficiency of betting
markets for Danish horse races. Mathias has furthermore been helpful with
comments and ideas to improve this thesis.
My supervisor, Rasmus Jørgensen, has also been of great help. We have
had lengthy talks about efficiency on betting markets for a wide range of
events - from US presidential elections to Swedish horse races.
Contents

1 Introduction
  1.1 Contributions
  1.2 Literature review
      1.2.1 Prediction markets
      1.2.2 Sport betting markets
2 Empirical evidence of inefficiencies in Betfair's golf betting markets
  2.1 Domain knowledge
      2.1.1 Golf
      2.1.2 Betfair's sport betting exchange
  2.2 Data
      2.2.1 Historical golf results
      2.2.2 Historical odds for golf events
      2.2.3 Merging the two data sources
      2.2.4 Extracting attributes for statistical modeling
      2.2.5 Removing outliers
      2.2.6 Weaknesses of the dataset
  2.3 Modeling
      2.3.1 One-step estimation: Conditional logit model
      2.3.2 One-step estimation: Conditional logit model with variable attribute discounting
      2.3.3 Two-step estimation
  2.4 Betting strategies
      2.4.1 Technical betting strategies
      2.4.2 Fundamental betting strategies
  2.5 Performance evaluation of the statistical models and betting strategies
      2.5.1 Performance by following the strategies in the 139 tournaments
      2.5.2 Performance by going against the strategies in the 139 tournaments
      2.5.3 Robustness
3 Conclusion
Bibliography
A Appendix
  A.1 Golf dictionary
  A.2 LR testing of parameter significance
Chapter 1
Introduction
One of the dominant subjects of today is Big Data. We are generating ever-increasing amounts of data, which we wish to analyze in order to gain greater insight into complicated systems in diverse fields. Many of the analyses we wish to perform are focused on the question: "What is the probability of...?" (Arrow et al., 2008)
Examples of such questions regarding binary events could be: Will the President be re-elected? Will Russia declare war on Ukraine? Is the new iPhone going to generate more than X billion dollars in revenue next year? Are HP's next quarterly sales going to be in a given range? Will more than Y million people watch the next episode of the Netflix TV series 'House of Cards'?
One way to estimate probabilities for these questions would be the development of analytical theories and sophisticated simulation or regression algorithms that incorporate lots of data. Another approach would be asking the crowd. Recent studies have suggested that asking the crowd by letting people trade in speculative futures markets (so-called prediction markets) could yield good probability estimates (Tziralis & Tatsiopoulos, 2012). The idea is that dispersed information among the various individuals participating in the market can be aggregated via the free market mechanism and thereby provide accurate estimates of event probabilities.
But how do prediction markets stand up against an analytical approach that incorporates historical data in yielding precise probability estimates? In this thesis I evaluate whether prediction market prices provide accurate estimates of winning probabilities for golfers in golf tournaments (i.e. whether golf prediction markets are efficient). I create a novel dataset containing: (1) market prices from the biggest public prediction market, Betfair (Franck et al. (2010) document that Betfair is the biggest public prediction market measured by the amount of money traded on the platform), for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012, and (2) historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012.
I evaluate whether golf prediction markets are efficient by testing if market prices are well adjusted to two sets of relevant historical data. First, I perform weak-form tests to see if prices are efficiently adjusted to historical prices. Secondly, I perform semi-strong form tests to see if prices are efficiently adjusted to other information that is undoubtedly publicly available (e.g. results from previous golf tournaments). These are two of the three tests proposed by Fama (1970) to test the market efficiency hypothesis. Given my dataset, I am not able to conduct the final strong-form test, which is concerned with whether given investors have monopolistic access to information relevant for price formation.
I perform the weak-form test by building a set of naive betting strategies with the aim of exploiting the commonly observed favorite-longshot bias, whereby bets on favorites tend to yield a higher return than bets on longshots. The strategies show weak signs of the bias but do not enable a bettor to achieve positive returns. I thus find no evidence of weak-form inefficiency. This finding is in line with e.g. Shmanske (2005), who searches for the favorite-longshot bias in the PGA Golf Tour in 2002. Shmanske also finds weak evidence of the bias, but he is not able to create a profitable betting strategy either.
The semi-strong form tests are performed by creating a set of analytical models as well as betting strategies with the aim of achieving a positive return by betting on the golf markets. The most profitable model is a conditional logit model that models winning probabilities based on results from previous golf tournaments. This model is operationalized via the widely used fractional Kelly betting strategy (Kelly, 1956). This setup enables a bettor to produce a positive return of 121% by betting on tournaments from the beginning of 2011 to the end of 2012. The possibility of achieving a positive return indicates that market prices are not well adjusted to publicly available information. The market thus seems semi-strong form inefficient.
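Since the fractional Kelly strategy plays a central role in the result above, a minimal sketch of the staking rule may help. The Kelly criterion for a single binary bet at decimal odds is standard (Kelly, 1956); the function names, the one-half fraction and the example numbers below are illustrative assumptions, not the thesis's actual implementation.

def kelly_fraction(p, decimal_odds):
    """Kelly-optimal fraction of wealth to stake on one binary bet.

    p: estimated winning probability; decimal_odds: total payout per unit
    staked (e.g. 5.5 means a 1 unit stake returns 5.5 units if it wins).
    """
    b = decimal_odds - 1.0           # net winnings per unit staked
    f = (p * b - (1.0 - p)) / b      # classic Kelly criterion
    return max(f, 0.0)               # do not bet when the edge is negative

def fractional_kelly_stake(wealth, p, decimal_odds, fraction=0.5):
    """Stake a fixed fraction (here one-half) of the full Kelly amount,
    trading some growth rate for lower variance."""
    return wealth * fraction * kelly_fraction(p, decimal_odds)

# Example: the model estimates a 25% win probability; the market offers 5.5.
print(fractional_kelly_stake(1000.0, 0.25, 5.5))  # ~41.7 units of 1000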
The indication of semi-strong form inefficiencies in Betfair's golf betting markets suggests that an analytical approach could yield more precise estimates of winning probabilities for golfers in golf tournaments than prediction markets.
My approach to evaluating prediction market efficiency differs somewhat from other papers dealing with the topic. For example, Cowgill & Zitzewitz (2013), Forsythe et al. (1992) and Smith et al. (2006) aim to evaluate whether prediction markets provide more precise probability estimates than corporate experts, exit polls and bookmakers, respectively. I evaluate prediction markets by building analytical strategies with the aim of exploiting market inefficiencies. I build heavily upon the literature on horse race wagering markets where, for example, Bolton & Chapman (1986), Benter (1994) and Sung & Johnson (2012) try to achieve positive returns by exploiting market inefficiencies.
1.1 Contributions
I contribute to the literature on prediction markets and sports betting markets in three major ways:

1. This is by far the most comprehensive study of efficiency in golf markets. To the best of my knowledge, only one article deals with efficiency in golf betting markets: Shmanske (2005) analyzes data for the PGA Tour in 2002 with odds from a casino bookmaker. He focuses on both weak-form tests and semi-strong form tests, but he only analyzes a small fraction of the amount of data analyzed in this thesis.
2. Most empirical studies of prediction markets compare their predictive ability against forecasts by experts, exit polls or bookmakers (see e.g. Forsythe et al., 1992; Cowgill & Zitzewitz, 2013; Smith et al., 2006; Schmidt & Werwatz, 2002) and find that prediction markets give more precise probability estimates.

I compare the predictive ability of prediction markets against a set of analytical models that I develop in this thesis. I find evidence that prices on golf prediction markets are not well adjusted to results from previous golf tournaments and that the market thus seems semi-strong form inefficient.

I suggest that it would be a good idea to perform more studies, such as this, before embracing the alleged predictive ability of prediction markets.
3. I generalize the conditional logit model proposed by Bolton & Chapman (1986) for calculating winning probabilities in horse races based on historical information. Bolton & Chapman model winning probabilities based on, among other attributes, the sum of winnings last year. I model the probabilities based on, among other attributes, the discounted sum of winnings last year, where the discounting factor is found via maximum likelihood optimization together with the other model parameters.

I test whether my way of modeling the probabilities fits the data better than the approach proposed by Bolton & Chapman. I find that my approach results in a statistically significant improvement in model fit (p-value < 0.00001).

Even though Bolton & Chapman created their model back in 1986, it is still used and thus still relevant to refine. The model is e.g. used by Sung & Johnson (2012) to model winning probabilities for UK horse races.

The approach of estimating optimal discounting rates when fitting regression models to datasets with a time dimension might furthermore be applicable in a wide range of other research areas.
1.2 Literature review
This thesis builds upon literature from two semi-overlapping research fields: (1) prediction markets and (2) sport betting markets. The following describes the articles mentioned in the introduction in greater detail and introduces other related articles. The aim of this section is to illustrate that prediction markets have lately received a great deal of attention. Little focus has, however, been devoted to evaluating the markets for semi-strong form efficiency. By using ideas from the sport betting literature, this gap in the literature can be explored.
1.2.1 Prediction markets
Prediction markets are forums for trading contracts that yield payments
based on the outcome of uncertain future events (Arrow et al., 2008).
The literature can roughly be split into four major categories: (1) descriptive work; (2) theoretical work; (3) empirical applications; (4) law and policy. The categories are not mutually exclusive, which means that the same paper can overlap multiple categories. The idea of the categorization is to give a quick overview of the work that has been done on prediction markets.
(1) Descriptive: Tziralis & Tatsiopoulos (2012) describe the market structure and make a survey of the literature written about prediction markets.
They report that the research area has gained popularity in recent years. The
publication trend could be “roughly described as being of exponential growth”
(Tziralis & Tatsiopoulos, 2012).
Arrow et al. (2008) emphasize the potential of prediction markets to improve decisions. The range of applications is, according to the authors, "limitless" - from helping businesses make better investment decisions to helping governments make better fiscal policy decisions.
(2) Theoretical work: Manski (2006) shows mathematically, under a
wide range of assumptions, that the probabilities derived from prediction
markets typically do not correspond closely to the actual probability beliefs
of the market participants, unless the market probability is near either 0 or
1. Manski suggests that directly asking a group of participants to estimate
probabilities may lead to better results.
Wolfers & Zitzewitz (2006) show mathematically that for a broader class
of models, prediction market prices are usually close to the mean beliefs of
traders. Wolfers & Zitzewitz (2006) thus contradict Manski (2006). Wolfers
& Zitzewitz find that “Manski’s special case is in fact a worst-case scenario”.
(3) Application: Most empirical studies of prediction market applications benchmark the markets against forecasts from experts, bookmakers or exit polls.
Forsythe et al. (1992) document the first application of a prediction market mechanism: the Iowa Electronic Market, designed to predict the results of US presidential elections. Forsythe et al. compare exit polls with prices at the Iowa Electronic Market. They find that "the market worked extremely well, dominating polls in forecasting the outcome of the 1988 presidential election".
Cowgill & Zitzewitz (2013) examine results from corporate prediction
markets from Google, Ford, and Koch Industries. They compare market
prices with forecasts from experts in the different companies and conclude
that prediction markets yield more precise estimates.
Smith et al. (2006) examine prediction markets for UK horse racing. They
benchmark market prices with prices announced by bookmakers and find that
prediction market prices are more efficient.
Schmidt & Werwatz (2002) analyze prediction markets for the Euro 2000
Championship in soccer and conclude that prediction markets are more efficient than bookmaker markets.
(4) Law and policy: Arrow et al. (2008) argue for liberalization of US
laws regarding gambling markets, so that companies and governments can
leverage the “power of prediction markets”.
1.2.2 Sport betting markets
There is a long and established literature examining the efficiency of sports
betting markets (see Hausch et al., 1994, 2008, for literature surveys). Studies
have been justified in their own right due to the sheer size of betting markets
but have also enhanced the understanding of more far-ranging environments:
"Wagering markets are especially simple financial markets, in which the pricing problem is reduced. As a result, wagering markets can provide a clear view of pricing issues which are complicated elsewhere." (Sauer, 1998)
The literature, which is mainly focused on horse racing, can roughly be split
into two categories: (1) building strategies to achieve positive return; (2)
explaining observed inefficiencies in betting markets.
Building strategies to achieve positive return

This part of the literature seeks to build strategies that achieve a positive return by betting via strategies that are either mainly technical or fundamental in their nature.

Technical strategies are built to exploit weak-form inefficiencies. Weak-form inefficiency occurs if prices are not well adjusted to historical prices. An example of such an inefficiency is the commonly observed favorite-longshot bias (the finding is typically attributed to Griffith, 1949).
Shmanske (2005) builds a naive betting strategy for betting on golf at a
casino with the aim of exploiting the favorite-longshot bias. He is not able
to create a profitable strategy.
Ziemba (2008) surveys the literature on efficiency in financial, sports and lottery markets. He collects data from a large number of horse race studies to illustrate the favorite-longshot bias. Based on data on over 50,000 horse races and 300,000 horses, he illustrates a clear positive relationship between the expected return and the likelihood of winning: the more likely you are to win, the higher your expected return.
Fundamental strategies are focused on modeling probabilities based
on historical fundamental information such as past results. These strategies
thus seek to exploit semi-strong form inefficiencies.
Bolton & Chapman (1986) were among the pioneers in building fundamental betting strategies for horse races. Bolton & Chapman develop a conditional logit model. They conclude that "[their] betting strategy appears to offer the promise of positive expected returns". My thesis employs many of the ideas developed by these authors.

Benter (1994) develops a so-called two-step model, which he operationalizes with a fractional Kelly betting strategy. The first step of his model is closely related to the one proposed by Bolton & Chapman (1986); the second step incorporates market odds in a separate stage of the modeling process. Benter claims that he has bet according to his model for a number of years and earned a significant positive return.

Sung & Johnson (2012) compare the effectiveness of one- and two-step conditional logit models for predicting UK horse races (i.e. they compare the types of models proposed by Bolton & Chapman (1986) and Benter (1994), respectively).
Explaining the observed inefficiencies in betting markets

A lot of energy has been put into explaining the favorite-longshot bias. Ottaviani & Sørensen (2008) point out that the favorite-longshot bias is a "widely documented empirical fact" observed across different events, countries, and market structures. The favorite-longshot bias is often perceived to be an important deviation from the market efficiency hypothesis. Ottaviani & Sørensen (2008) present an overview of the main proposed theoretical explanations.
Chapter 2

Empirical evidence of inefficiencies in Betfair's golf betting markets
In this chapter, I develop a set of betting models with the aim of exploiting
potential inefficiencies in Betfair’s golf markets for the PGA Tour and the
European Tour. The majority of the chapter is focused on exploiting potential semi-strong form inefficiencies that could arise because odds are poorly
adjusted to historical information.
2.1 Domain knowledge
The following seeks to establish a domain-specific knowledge foundation from which decisions with regard to data scraping, feature extraction, data modeling, outlier detection etc. can be made.

Two main areas are introduced: (a) golf tournaments as well as economic incentives and psychological aspects of the game; (b) golf betting markets and how person-to-person sports betting markets work in practice.
2.1.1 Golf
Golf is a precision club and ball sport in which competing players (golfers)
use clubs to hit a ball into a series of holes on a course. Table A.1 presents
the most frequently used golf terms in this thesis. The table can be used as
a reference for readers not familiar with the sport.
Courses: Golf is one of the few ball games that does not require a standardized playing area. The game is played on a course, typically consisting of 18 holes. Each of the holes on the course contains a tee-box to start from and a putting green containing the hole. In between are other forms of terrain such as the fairway, rough, and hazards. Virtually all courses are unique in their specific layout and arrangement. Furthermore, courses change over time: the tee-box can be moved around, altering e.g. the length of the hole, and the length of the grass may also change. The holes can furthermore be modified for specific tournaments in order to e.g. increase the level of difficulty.
Tournaments: Golf competitions are generally played for the lowest number of strokes by an individual golfer, known as stroke play. The winner of a tournament is the golfer who completes typically four rounds of typically 18 holes using the lowest number of strokes. Most tournaments have a "cut" after the second round, in which a minimum aggregate score is selected to eliminate about half of the golfers, i.e. only the best half of the golfers play the remaining third and fourth rounds.
Pro golf tours: A small elite of professional golfers are tournament pros who compete on international pro tours. There are a number of pro tours. The PGA Tour tends to attract the best golfers, since the tournaments on the PGA Tour have the highest prize money. The second most prestigious tour worldwide is the European Tour (Beard, 2009). There are furthermore a number of second-tier tours, e.g. the Nationwide Tour, the second-tier tour to the PGA Tour. (The tour is currently called the Web.com Tour, but in this thesis it is referred to as the Nationwide Tour.)
Major championships: The major championships are the four most prestigious men’s tournaments of the year. The championships are: The Masters,
the U.S. Open, The Open Championship and the PGA Championship. The
top golfers from all over the world compete in these championships due to
the prestige and high prize money. The amount of competition and prestige
in these tournaments is thus second to none.
Weather: Weather plays a role in golf. Weather conditions such as rain or wind heavily influence golfers' performance. The courses are exposed to weather to varying degrees: some are very exposed to wind while others lie more protected, e.g. in forests.

Some golfers are better than others in windy weather; some like rain more than others; and so on. The role of weather becomes even more complex due to the fact that tournaments are played over entire days, i.e. some golfers in a tournament may experience different weather than other golfers in the same tournament.
Economic incentives and psychological aspects of the golf tournament

Economists have long studied how the marginal return to effort influences the performance of workers, executives and even golf players.
Using data from the PGA Tour in 1984 and the European Tour in 1987, Ehrenberg & Bognanno (1990) find that golfers' performance tends to vary positively with both (1) the total prize money in a tournament and (2) the marginal return to effort. These findings have later been confirmed by, among others, Tanaka & Ishino (2012).

Tanaka & Ishino also study the effect on golfers' performance when a superstar (i.e. a very good golfer) is playing in a given tournament; they find that the presence of a superstar adversely affects the other golfers' scores.
Although the above findings have been contradicted by some (see e.g.
Orszag, 1994), the findings may suggest that:
• Incentives for playing optimal golf might diminish if the chance of winning disappears.
• All golfers have a chance of winning in the first round of a tournament;
the golfers’ performance in the first round might therefore give a good
indication of their optimal performance potential.
Psychologists have also spent much time analyzing golfers and their performance under stress. One of their main focus areas has been the concept of choking, which refers to failing under high pressure, i.e. missing the shots when they matter most (see e.g. Beilock & Carr, 2001).

There are many anecdotes of golfers who have choked under pressure throughout history. One of them involves Kenny Perry, who could taste victory with a two-shot lead and only two holes to go at the prestigious 2009 Masters. Perry had played his career's best golf until that point. He made a bogey at the second-to-last hole due to a shot that sailed over the green. At the final hole he could, after a series of shaky strokes, putt for the championship, but he failed and ended third after Angel Cabrera and Chad Campbell. The next day Perry was quoted as saying: "Great players make it happen [...] Your average players don't. And that's the way it is." The dividing line between winning and losing is thus not only about talent, technique or athleticism.
2.1.2 Betfair's sport betting exchange
Betfair is an online provider of an exchange which gives its users the option of backing or laying bets on events. An event could for example be a golf tournament. Betfair is thus not a bookmaker, which announces the odds, but an exchange provider. Betfair makes money by taking a commission on all winnings. This commission is about 5%.

Betfair is a person-to-person exchange where individuals contract their contrasting opinions with each other. Users can post the prices and amounts at which they are willing to place a bet - on or against - a given event, e.g. Jason Day winning the Accenture Match Play 2014. The demand for bets is then displayed in the order book, which shows the most attractive odds with the corresponding available volume (see Figure 2.1).
Bettors have the choice of either (1) placing a limit order, which is an order to buy or sell a bet at a specific price, and waiting for another user to match the bet, or (2) placing a market order and thereby directly matching a bet that has already been offered by another user.
[Figure 2.1: A screenshot of Betfair's online sport exchange for the event Accenture Match Play 2014. The table shows the three most attractive odds at which to back and lay bets for the eight leading golfers.]

Figure 2.1 shows a screenshot of the sporting exchange for the Accenture Match Play 2014. On this market bettors have the option of either:
• Placing a market order backing e.g. Jason Day at odds of 5.5, with up to £800 available (see Figure 2.1). If the bettor backs Jason Day with £1 and Jason wins, the bettor would get £5.5 · (1 − 0.05) = £5.23, where 0.05 is Betfair's take.

• The bettor could also choose to place a limit order backing Jason Day at different odds of e.g. 5.7. Only if another bettor chooses to lay this bet will it be matched.

• Placing a market order laying e.g. Jason Day at odds of 5.7. If the bettor lays £1 on Jason Day and Jason wins, the bettor will have to pay £5.7; otherwise the bettor will get to keep the £1 (minus Betfair's cut).

• The bettor could also choose to place a limit order to lay a bet of £1 on Jason Day at lower odds, say 5.5, and then hope to have the bet matched.
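To make the payoff arithmetic above concrete, the following is a minimal sketch that follows the simplified convention used in the example (the 5% commission applied to the full payout, as in £5.5 · (1 − 0.05)); Betfair's actual commission rules differ in detail, so this is illustrative only.

COMMISSION = 0.05  # Betfair's take, as in the example above

def back_result(stake, odds, won):
    """Backer's cash flow under the simplified convention: the full payout
    is reduced by the commission if the bet wins; the stake is lost otherwise."""
    return stake * odds * (1 - COMMISSION) if won else -stake

def lay_result(stake, odds, won):
    """Layer's cash flow after accepting a back bet of `stake` at `odds`:
    pay out if the golfer wins, keep the commission-reduced stake otherwise."""
    return -stake * odds if won else stake * (1 - COMMISSION)

print(back_result(1.0, 5.5, won=True))   # 5.225, i.e. ~5.23 as above
print(lay_result(1.0, 5.7, won=False))   # 0.95: keep the pound minus the cut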
Betfair has, due to its status as the leading provider of an online sporting exchange, been the focus of much academic attention (see e.g. Franck et al., 2010; Smith et al., 2009). Betfair accounted for 90% of all exchange-based betting activity worldwide in 2010 and claimed at the time to process five million trades a day (Franck et al., 2010).
2.2 Data
The data used in this thesis comes from two different publicly available sources (see References for descriptions and URLs):

1. Historical golf results and golf course characteristics are extracted from: Yahoo (2014a)

2. Historical odds and betting volumes (how much money is betted) for golf events are extracted from: Betfair (2014)

A novel dataset is created in which data from the two above-mentioned sources are merged. The merged dataset contains attributes that capture:

• Information associated with golfers' historical results and performance;

• Characteristics of golf courses;

• The odds the golfers were traded at on Betfair's sports exchange prior to the beginning of the tournaments, as well as the amount of money that was traded (odds and volume information are only available for a subset of the tournaments).
From the merged dataset a number of attributes are extracted to create the final dataset. The final dataset is split in two: (1) a training set and (2) a test set. The first set is used for training models; the second is used to simulate betting strategies and evaluate whether they are profitable. The idea of using an out-of-sample test set is to avoid over-fitting the data. The aim is to be able to make good out-of-sample predictions.

It is beyond the scope of this thesis to give an in-depth description of the technicalities of the data-scraping process. (Data scraping is a technique in which a computer program extracts data from human-readable output coming from another program.) The focus of the following will be on dataset attributes, summary statistics and data processing.
2.2.1 Historical golf results
The website Yahoo has a sub-site that gives access to historical results for golf tournaments in the pro golf tours: PGA Tour; European Tour; Champions Tour; Nationwide Tour and the Ladies Professional Golf Association (LPGA). The data is consistent in format from 2002 until now (January 2014). Access to the data is granted for non-commercial use (Yahoo, 2014c). Figure 2.2 shows a screenshot from Yahoo in which the results for a sample tournament (The Players Championship, 2013) are displayed.
Using a scraping algorithm specifically written for the task, all data from the tournaments in the date interval from January 2002 to December 2012 in the pro tours PGA Tour, European Tour, Champions Tour and Nationwide Tour is scraped and saved in a database.

[Figure 2.2: Origin of the historical golf results used in this thesis. The screenshot illustrates the data available for 'The Players Championship', 2013. Source: Yahoo (2014b)]
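A minimal sketch of the kind of scraping step described above is shown below, using the widely available requests and BeautifulSoup libraries. The URL pattern and the HTML selector are hypothetical placeholders; the thesis's actual scraping algorithm and Yahoo's real page structure are not reproduced here.

import requests
from bs4 import BeautifulSoup

def scrape_leaderboard(url):
    """Fetch one tournament page and return one row of cells per golfer.
    The selector below is a placeholder for the real page structure."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table.leaderboard tr"):  # hypothetical selector
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip header rows without <td> cells
            rows.append(cells)  # e.g. pos, name, round1..round4, strokes, purse
    return rows

# One such request per tournament, January 2002 to December 2012; the
# resulting rows are then written to the database (URL is hypothetical).
# rows = scrape_leaderboard("http://sports.yahoo.com/golf/pga/leaderboard/...")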
Second-tier pro tours such as the Champions Tour and the Nationwide Tour are included in the dataset because they qualify golfers for the PGA Tour and the European Tour, which I aim to estimate winning probabilities for. Historical results from the Champions Tour and the Nationwide Tour thus provide valuable information.

Data for the LPGA is not used in the thesis because the amount of data for male golfers is much greater.
The data extracted via the scraping algorithm contains the attributes listed
in Table 2.1. The dataset contains one observation with values for the listed
attributes for each golfer in each tournament.
Table 2.1: Attributes in the dataset scraped from Yahoo (2014a). Class=1: tournament specific; Class=2: golfer specific.

Attribute    Class  Description
id             1    Unique tournament id
protour        1    Pro tour, e.g. PGA
date           1    Date of tournament
tournament     1    Name of tournament
course         1    Name of the course
par            1    Par for one round on the course
yardage        1    Length of course (yards)
name           2    Name of golfer
pos            2    Position in tournament
t              2    Tie between two or more players
round1         2    Strokes used in round 1
round2         2    Strokes used in round 2
round3         2    Strokes used in round 3
round4         2    Strokes used in round 4
playoff        2    Strokes used in playoff
strokes        2    Strokes used in tournament
purse          2    Prize money (USD)
There are basically two classes of attributes:

1. The first seven attributes are tournament specific, and their values are thus constant for all golfers in the same tournament.

2. The eighth attribute, name, is golfer specific, and the remaining nine attributes describe how that golfer performed in a specific tournament and how much prize money he won.
Summary statistics

The data has two main dimensions: (1) time: the data lies in the time interval from January 2002 to December 2012; (2) pro tour: PGA Tour; European Tour; Champions Tour and Nationwide Tour. Summary statistics of the attributes, especially of the attribute purse, vary along these two dimensions. Key insights into the data in terms of size and characteristics are given in the following. Table 2.2 gives summary statistics for the dataset; Figure 2.3 depicts boxplots of the purses for the tournaments' winners in the pro tours for each year; and Figure 2.4 depicts the average purse distribution for the four pro tours, i.e. for each pro tour the plot illustrates the average purse as a percentage of the total amount of prize money for each position.
Table 2.2: Summary statistics for the historical golf results. '#' refers to 'number of'.

Pro tour     # observations  # tournaments  # golfers  # courses
All tours       218,141          1,714        6,887       465
PGA              73,872            550        2,474       161
European         64,352            478        3,079       143
Champions        28,654            337        1,390       100
Nationwide       51,263            349        2,741        81
Main observations from Table 2.2, Figure 2.3 and Figure 2.4:

[Figure 2.3: Boxplot of purse (in 1,000 US dollars) for the tournament winners in the pro tours by year.]

[Figure 2.4: Average purse distribution for the four pro tours as a function of position.]

• An average of 218,141/1,714 ≈ 127 observations per tournament, i.e. 127 golfers on average in each tournament.

• An average of 218,141/6,887 ≈ 32 observations per golfer, i.e. each golfer is on average part of 32 tournaments.

• Pro tours share both golfers and courses (the sum of the numbers of golfers and courses across the pro tours is higher than the number of golfers and courses for the entire dataset).

• Purses for winners of tournaments in the PGA Tour are much greater than for the Champions Tour and the Nationwide Tour.

• There is great variance in purses for winners of tournaments in the European Tour. While most purses are on par with the purses in the Champions Tour, some are on par with PGA purses and some are on par with Nationwide purses.

• Purses for tournaments in the PGA Tour and the European Tour have experienced rapid growth in terms of both average and variance over the time period.

• The purse distributions for the four pro tours are close to identical.
2.2.2 Historical odds for golf events
Betfair gives free access to historical odds data with semi-detailed timestamps (Betfair, 2014). The attributes of the dataset are listed in Table 2.3 together with descriptions (Betfair, 2014). After deleting historical Betfair data for sports other than golf, the dataset contains roughly 700,000 observations divided over the 139 tournaments in 2011 and 2012. See Table 2.4 for summary statistics for the data.
Table 2.3: Attributes in the dataset from Betfair (2014).

Attribute        Type       Description
EVENT ID         integer    Betfair's event id
EVENT            string     Name of event (golf tournament)
SELECTION ID     integer    Betfair's golfer id
SELECTION        string     Name of golfer
ODDS             double     The odds
NUMBER BETS      integer    Number of bets on golfer
VOLUME MATCHED   integer    Volume matched (GBP)
LATEST TAKEN     timestamp  When the odds were last matched on the selection
FIRST TAKEN      timestamp  When the odds were first matched on the selection
WIN FLAG         binary     Win: 1. Lose: 0
IN PLAY          binary     In-play: 1. Pre-event: 0
The format of the data is not ideal. The problem with the dataset is that the data is not properly timestamped. Each observation contains two timestamps, FIRST TAKEN and LATEST TAKEN, i.e. each observation contains a date-time range in which a given odds was traded for a given golfer in a given tournament, either in-play or pre-event.

To illustrate the complexity of determining the odds for a given golfer prior to the start of a tournament, the odds and volume matched in pound sterling (GBP) for four golfers prior to the start of the Omega Dubai Desert Classic (starting February 10, 2011) are depicted in Figure 2.5.
The complexity arises because each golfer is traded at various odds leading
up to the start of the tournament. The key takeaway from Figure 2.5 is that
the vast majority of the volume traded for each player prior to tournament
start is matched at approximately the same odds.
The strategy chosen in order to match odds to the golfers prior to the beginning of each tournament is to choose the odds with the highest trading volume. Of the total volume traded for golfers prior to the start of the tournaments, 43% is matched at the odds with the highest trading volume, and 89% is matched at odds within ±10% of the odds with the highest trading volume. Given these statistics, I find it reasonable to perceive the odds with the highest trading volume as the market odds.
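The selection rule just described - take, for each golfer in each tournament, the pre-event odds at which the most volume was matched - can be sketched with pandas as follows. The column names mirror Table 2.3, but the exact schema of the data is an assumption.

import pandas as pd

def market_odds(df):
    """For each (event, golfer) pair, return the pre-event odds with the
    highest matched volume, treated here as the market odds."""
    pre = df[df["IN_PLAY"] == 0]  # keep only pre-event observations
    idx = pre.groupby(["EVENT_ID", "SELECTION_ID"])["VOLUME_MATCHED"].idxmax()
    return pre.loc[idx, ["EVENT_ID", "SELECTION_ID", "ODDS", "VOLUME_MATCHED"]]

def share_near_market(df, tol=0.10):
    """Share of pre-event volume matched within +/-tol of the market odds
    (the 89% figure above was computed in this spirit)."""
    pre = df[df["IN_PLAY"] == 0]
    mkt = market_odds(df).rename(columns={"ODDS": "MKT_ODDS"})
    joined = pre.merge(mkt[["EVENT_ID", "SELECTION_ID", "MKT_ODDS"]],
                       on=["EVENT_ID", "SELECTION_ID"])
    near = (joined["ODDS"] / joined["MKT_ODDS"] - 1).abs() <= tol
    return joined.loc[near, "VOLUME_MATCHED"].sum() / joined["VOLUME_MATCHED"].sum()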
Summary statistics

Table 2.4 lists summary statistics for the historical golf odds from Betfair. From the table it is clear that betting on tournaments in the PGA Tour is more popular than betting on tournaments in the European Tour.
Table 2.4: Summary statistics for the historical golf odds from Betfair.

Pro tour   # Observations  # Tournaments  Avg. matched (GBP)*
All tours     698,345           139           3,198,346
PGA           420,050            76           4,687,875
European      278,295            63           1,377,812

* Average volume of bets in GBP matched during the tournaments
[Figure 2.5: Odds and volume matched prior to the start of the Omega Dubai Desert Classic (Feb. 10, 2011) for the two favourites, Martin Kaymer and Tiger Woods, as well as the winner and the runner-up, Alvaro Quiros and James Kingston. Panels: (a) Martin Kaymer - the favorite prior to the start of the tournament; (b) Tiger Woods - the second favorite; (c) Alvaro Quiros - the winner; (d) James Kingston - number two. The vast majority of the volume for each player is matched at approximately the same odds. Since the data is not fully timestamped, it is unclear when the odds were actually matched; the volume matched at each odds is here plotted at the time it was first matched (i.e. FIRST TAKEN).]
2.2.3 Merging the two data sources

In order to merge the two datasets, a table is manually created that links the unique tournament id from Table 2.1 to the EVENT ID from Table 2.3. The second link needed in order to successfully merge the datasets is between the attribute name from Table 2.1 and SELECTION from Table 2.3, combined with the linking logic described in subsection 2.2.2 (matching the odds with the highest trading volume).
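A sketch of the merge described above is given below, assuming the scraped results, the market odds and the manually created linkage table each live in a pandas DataFrame; all names are illustrative.

import pandas as pd

def merge_sources(results, odds, links):
    """results: one row per golfer per tournament (Table 2.1 attributes).
    odds: highest-volume pre-event odds per golfer per event (Table 2.3).
    links: manually created mapping between `id` and Betfair's EVENT ID."""
    merged = results.merge(links, on="id")       # attach EVENT_ID to results
    merged = merged.merge(
        odds,
        left_on=["EVENT_ID", "name"],            # golfer names must match
        right_on=["EVENT_ID", "SELECTION"],
        how="left",                              # odds only exist for the
    )                                            # 2011-2012 tournaments
    return merged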
2.2.4 Extracting attributes for statistical modeling
Observations for each of the 1,714 tournaments have 20 attributes, implying that the total number of attributes available for calculating winning probabilities for a future golf tournament could be very high. The number of attributes could be high because: (1) all previous information could potentially be used in determining future winning probabilities; (2) observations for each tournament are located in different places in the space spanned by the date and protour dimensions. It is clear that some form of dimensionality reduction is needed in order to capture the important information in the dataset more effectively. (The methodology and techniques used for feature extraction and variable transformation in the following come from Tan et al., 2013, Chap. 2.)

I turn to the existing literature on sports betting for ideas to reduce the dimensionality of my dataset. The literature on estimating winning probabilities for golfers in golf tournaments contains, to my present knowledge, one article: Shmanske (2005) models winning probabilities for golfers based on summary statistics provided by the PGA Tour; he does thus not directly use past golf results in his model. However, many articles have been written on horse racing with this focus (see e.g. Bolton & Chapman, 1986; Lessmann et al., 2007; Sung & Johnson, 2012). Many aspects of horse racing are comparable to golf tournaments: horses compete against other horses of varying quality and form just as golfers compete against other golfers of varying quality and form; each horse runs in many races just as each golfer plays in many tournaments; the courses vary in length; etc. Table 2.5 lists some of the aggregating attributes that have been used in the horse-racing literature in order to reduce dataset dimensionality.
Table 2.5: Attributes used for winning probability estimation in horse racing.

No.  Attribute description
1    Speed rating for the previous race in which the horse ran
2    The average of a horse's speed rating in its last 4 races; zero when there is no past run
3    Total prize money earnings (finishing first, second or third) to date / number of races entered
4    The percentage of the races won by the horse in its career
5    The natural logarithm of the normalised final odds probability

Only attributes deemed relevant in the golf context are included. The complete attribute list can be found in Sung & Johnson (2012). The first four attributes were proposed by Bolton & Chapman (1986); the last was proposed by Benter (1994).
Table 2.5 contains attributes whose goal is to proxy: (1) a horse's quality, via e.g. the attributes for prize money earnings and win percentages; (2) the horse's form, via e.g. the speed rating attributes; (3) potential inside information encapsulated in the odds.

The idea is to incorporate attributes which capture both the underlying, probably slowly time-varying, horse quality and a probably more volatile measure of current form. The attributes given in the table are likely strongly correlated and thus capture aspects of both of the underlying measures.
It is clear from the table that the academics who have used these attributes have made some arbitrary choices with regard to the dimensionality reduction, e.g. averaging the speed rating over the last four races. There is, to my present knowledge, no a priori reason why four is the right number. Furthermore, issues could arise due to the fact that the time dimension in the dataset has not been incorporated into the attributes. A horse could, for example, have been sick for a year, and the four previous races (averaged over in the attribute) would then have been prior to the horse's sickness. The attribute is therefore not likely to be a good estimator of the horse's current quality and form.

I propose two sets of attributes to be used in predicting the winning probabilities for golfers: (1) a set of attributes resembling the attributes used in the literature for horse-race estimation (Table 2.5); (2) a set of attributes with less arbitrary choices with regard to the dimensionality reduction.

I introduce some notation for golfers and tournaments in order to make the following easier to read. The dataset contains n tournaments, denoted j = 1, 2, . . . , n. In tournament j, mj golfers compete against each other; these golfers are denoted i = 1, 2, . . . , mj.
Static, arbitrary dimensionality reduction

A set of attributes resembling the attributes used in the literature for horse-race winning probability estimation (see Table 2.5) is created based on the original dataset (described in subsection 2.2.1 and subsection 2.2.2). The idea from the horse-racing literature of including attributes to proxy form and quality is used. The attributes are listed in Table 2.6.

The basic ideas from the economic literature on economic incentives and psychological aspects of the golf tournament (section 2.1.1) are used to create the following two attributes (see the sketch after Table 2.6 for how they can be computed):

1. A substitute for speed rating (from the horse-racing literature, see Table 2.5). The feature substitute is named score rating and is given by the number of strokes used by golfer i in the first round of tournament j minus the median number of strokes used by golfers in round 1 of tournament j.

There is a difference between making a good score in a first-tier tour such as the PGA Tour and in a second-tier tour such as the Nationwide Tour. I have analyzed the difference by looking at score ratings for golfers participating in both first-tier and second-tier pro tours in the same calendar year. I find that the score rating is on average 1.43 strokes higher in first-tier than in second-tier pro tours.

I compensate for this difference by adding 1.43 to all score ratings from tournaments in the Champions Tour and the Nationwide Tour.
2. An attribute to proxy a golfer's ability to perform under pressure. The attribute is named keep cool and is given by the number of wins divided by the number of top 10 positions. I assume that this attribute gives some indication of a golfer's ability not to choke under pressure.

The amount of pressure golfers are under is higher in first-tier tours than in second-tier tours. I make the simplifying assumption that a victory in a first-tier tour should count four times as much as a victory in a second-tier tournament.
Table 2.6: Attribute set no. 1.

Attribute              Attribute description
avg score rating year  The average of a golfer's score rating* over the last year.
avg keep cool year     The average of a golfer's wins compared to top 10 positions over the last year.
avg purse 2years       Total prize money earnings over the last two years divided by the number of tournaments entered over the last two years.
ln odds                The natural logarithm of the normalized final odds probability.

* Score rating is given by the number of strokes used by golfer i in round 1 of tournament j minus the median number of strokes used by golfers in round 1 of tournament j. 1.43 is added to score ratings from second-tier tours to compensate for the difference in level between first-tier and second-tier pro tours.
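As a sketch of how the two attributes above can be computed from a DataFrame of Table 2.1 results: the 1.43 adjustment and the four-to-one win weighting follow the text, while the column names and the exact way the weighting enters keep cool are assumptions (the text does not spell them out).

import pandas as pd

SECOND_TIER = {"Champions", "Nationwide"}
TIER_ADJUSTMENT = 1.43  # average first-tier vs second-tier score rating gap

def add_score_rating(results):
    """score rating = round-1 strokes minus the tournament's round-1 median,
    with 1.43 added for second-tier tours to put all tours on one scale."""
    median = results.groupby("id")["round1"].transform("median")
    rating = results["round1"] - median
    rating += results["protour"].isin(SECOND_TIER) * TIER_ADJUSTMENT
    return results.assign(score_rating=rating)

def keep_cool(first_tier_wins, second_tier_wins, top10s):
    """Wins divided by top-10 finishes, with a first-tier win counted four
    times as much as a second-tier win (one plausible reading of the text)."""
    return (4 * first_tier_wins + second_tier_wins) / max(top10s, 1)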
The list of attributes furthermore includes measures of previous winnings and winning percentages. Betfair odds are included for the part of the dataset where they are available.
Dynamic dimensionality reduction

I create a new set of attributes (listed in Table 2.7). The set contains the same sort of information as the static set (Table 2.6), but the attributes proposed in this subsection reduce the original dataset less in terms of dimensionality. This attribute set contains vectors instead of single numbers: e.g. avg purse 2years (from Table 2.6) contains one number per golfer per tournament, averaging the previous two years' winnings, whereas in the attribute set in this subsection, historical purse information is captured in a vector, πij, with D elements, where each element, πij,d, contains the purse won d days prior to the start of tournament j. The following table lists all the attribute vectors.
Table 2.7: Attribute set no. 2.

Attribute  Attribute description
πij        A vector containing D elements, where the dth element, πij,d, specifies the purse won by golfer i, d days prior to the beginning of tournament j.
γij        A vector containing D elements, where the dth element, γij,d, specifies the score rating of golfer i, d days prior to the beginning of tournament j.
wij        A vector containing D elements, where the dth element, wij,d, specifies whether golfer i won a tournament d days prior to the beginning of tournament j.
ψij        A vector containing D elements, where element d, ψij,d, is a binary attribute that specifies whether golfer i participated in a tournament d days prior to tournament start.
cij        A vector containing D elements, where element d, cij,d, is a binary attribute that specifies whether golfer i ended in the top 10 in a tournament d days prior to tournament start.
2.2.5 Removing outliers
The major championships are by far the four most prestigious annual tournaments in professional golf (Beard, 2009). Elite players from all over the world participate in them, and the reputations of the greatest players in golf history are to a large degree based on the number and variety of major championship victories they accumulate (see subsection 2.1.1).

I assume that it is very likely that a different set of parameters influences the chance of winning a major championship compared to a regular non-major championship, and that a specific model should therefore be built in order to estimate winning probabilities for major championships. I consequently choose to remove all major championships from the list of tournaments I estimate winning probabilities for. I thus estimate winning probabilities for non-major tournaments in the PGA Tour and the European Tour based on a dataset containing both major and non-major tournaments from the four pro tours: PGA, European, Nationwide and Champions.
2.2.6 Weaknesses of the dataset
My dataset has three main weaknesses compared to studies of market efficiency in betting markets for e.g. horse races. The weaknesses are inherent in the way professional golf tournaments are structured. Firstly, I have relatively few tournaments with odds in my dataset: even though I have odds for all tournaments in 2011 and 2012, this amounts to only 139 tournaments. Secondly, many golfers participate in each tournament, and many of these golfers have a realistic chance of winning. Thirdly, I assume that there are a lot of factors influencing golfers' performance that are very difficult to quantify; these factors could include e.g. the psychological state of the golfers.

The three weaknesses given above lead to a couple of issues: (a) they make it harder to model the golf tournament process and to predict future events; (b) they weaken my conclusions; and (c) they do not enable me to perform the same weak-form test as the horse-race literature, because the horse race studies rely on larger datasets. For example, Ziemba (2008) uses data from over 50,000 horse races to illustrate a clear positive relationship between the expected return and the likelihood of winning.
The literature on horse racing has two solutions for the first weakness - that there are few tournaments in the dataset.

Firstly, Bolton & Chapman (1986) describe an 'explosions' principle to double the size of the dataset. The general idea of the 'explosions' principle is to assume that the horse finishing second would have won if the winning horse had not participated. The principle works by duplicating every race in the dataset and removing the winner from the duplicated races (see the sketch below). The idea is well suited for horse races, where getting over the line first is all that matters. I find this approach ill-suited for golf tournaments because (a) there is a big difference between finishing first and second in terms of pressure (due to e.g. the risk of choking); (b) often multiple players finish second because they have the same score.
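For concreteness, the 'explosions' principle sketched in code: every race is duplicated with the winner removed, so the runner-up is treated as the winner of the copy. The data structure is illustrative; this is the idea the thesis rejects for golf.

def explode(races):
    """races: list of finishing orders, each a list of ids with the winner
    first. Returns the original races plus a copy of each race with the
    winner removed, doubling the number of races available for estimation."""
    exploded = []
    for race in races:
        exploded.append(race)          # original race, actual winner first
        if len(race) > 1:
            exploded.append(race[1:])  # duplicated race: runner-up "wins"
    return exploded

print(explode([["A", "B", "C"]]))  # [['A', 'B', 'C'], ['B', 'C']]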
Secondly, and probably most obviously, one could collect more data. I have chosen to limit my odds data collection to two years because: (a) there is a non-trivial amount of work involved in merging the odds data with the remaining dataset; (b) Betfair's golf markets only started picking up pace in the last part of 2010, so by including previous years in the dataset I would face issues with e.g. illiquid markets; it would thus only be possible to include one more year (2013), and that would not solve the issues described above. Two years strikes a good balance between obtaining results of reasonable accuracy and leaving time to focus on the analysis. More data could be added in the future to strengthen the results.
2.3 Modeling
The eventual goal is to find a model that can predict the winning probabilities for golfers in the PGA Tour and the European Tour in order to be able to determine the degree of semi-strong form efficiency in golf betting markets. Four models are proposed, starting with simple models and ending with models of higher complexity.
The starting point is the existing literature on winning probability estimation for sports events, where the predominant focus has been on horse racing. Two main approaches have been used:

1. A one-step model (Bolton & Chapman, 1986) in which either only fundamental variables (see attributes no. 1-4 in Table 2.5) or both fundamental variables and the market-generated variable (i.e. the odds) are used. The conditional logit model has been the most widely used statistical classification model due to its ability to account for both independent variables measuring a horse's potential and within-race competition.

2. A two-step model (Benter, 1994) in which the modeling as well as the dataset is split in two parts: (1) step one models the fundamental winning probabilities based on only fundamental variables, using only the first part of the dataset; (2) step two models the winning probabilities based on the market-generated variable as well as the predicted fundamental winning probabilities (from the model fitted in step 1), using the second part of the dataset. The most widely used statistical classification model has been the conditional logit. However, lately new machine learning algorithms have been proposed; e.g. Lessmann et al. (2007) use least-squares support vector regression for step (1).

I create two models based on the one-step model and two models based on the two-step model.
Aspects of the proposed models deviate from standard off-the-shelf statistical models. The modeling has therefore been written in the programming language Python, with help from, among others, the open-source libraries scikit-learn (Pedregosa et al., 2011) and SciPy (Jones et al., 2001-).
2.3.1 One-step estimation: Conditional logit model
The aim of the model is to predict the winning probability, ρij, for golfer i = 1, 2, . . . , mj in tournament j = 1, 2, . . . , n, where n is the number of tournaments and mj is the number of golfers in tournament j.

Golf tournaments are, just like horse races, highly competitive; a good probability estimate of golfer i's chance of winning tournament j is thus more likely to be obtained if his chance of winning is regarded as conditional on the information available for the other golfers in the tournament. The conditional logit model (proposed by McFadden, 1974) seems to fit the task well.

The model should predict the winning probabilities based on a matrix, Xj, capturing relevant information for all the golfers in tournament j. Xj should only contain information that is publicly available prior to the start of tournament j.

A general specification of a statistical model of the golf-tournament process could be proposed as:
process could be proposed as:
ρij = ρ(Xj )
(2.1)
ρ(Xj) should satisfy the standard axioms of non-negative probabilities as well as probabilities summing to one for all golfers in the same tournament. The conditional logit model satisfies these axioms (Bolton & Chapman, 1986). The model furthermore captures the competitive nature of golf tournaments.

It is assumed that the true ability, uij, of golfer i in tournament j is composed of two parts: (1) a deterministic part vij = v(xij), where xij is vector i in the information matrix, Xj, and (2) a stochastic part εij which reflects measurement errors in the modeling process.

To move on from here, it is assumed that the stochastic part, εij, is independent of the deterministic component, vij, as well as identically and independently distributed according to the double exponential distribution. Under these assumptions, McFadden (1974) shows that the conditional probability, ρij, is given by:

\[ \rho_{ij} = \frac{\exp(v_{ij})}{\sum_{k=1}^{m_j} \exp(v_{kj})} \tag{2.2} \]
The predominant way to specify vij is via a linear-in-parameters specification:

\[ v_{ij} = \beta x_{ij} \tag{2.3} \]

where xij is a column vector capturing relevant information for golfer i prior to tournament j, β is a row vector of coefficients, and βxij is the dot product between the two vectors. I normalize each attribute in the information matrix, Xj, to have zero mean and unit standard deviation (by subtracting the mean and dividing by the standard deviation for each attribute). By normalizing the attributes, the coefficient vector, β, will measure the relative importance of the elements of xij in determining the winning golfer.
The β vector is estimated on the training dataset via maximization of the log-likelihood function

\[ L = \sum_{j=1}^{n} \sum_{i=1}^{m_j} y_{ij} \ln(\rho_{ij}) \]

(McFadden, 1974), where yij ∈ {0, 1} denotes whether golfer i won tournament j. Edelman (2007), Sung & Johnson (2012) and Bolton & Chapman (1986) have all used Equation 2.2 and Equation 2.3 to model the winning probabilities of race horses.
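A minimal sketch of this estimation, using SciPy (one of the tools the thesis names), is shown below. The data layout - one (attribute matrix, winner index) pair per tournament, with attributes already normalized - and all function names are illustrative, not the thesis's actual code.

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, tournaments):
    """tournaments: list of (X, winner) pairs, where X is the (m_j x k)
    matrix of normalized attributes and `winner` indexes the winning golfer."""
    nll = 0.0
    for X, winner in tournaments:
        v = X @ beta                          # Equation 2.3: v_ij = beta . x_ij
        v = v - v.max()                       # shift for numerical stability
        log_p = v - np.log(np.exp(v).sum())   # Equation 2.2 on the log scale
        nll -= log_p[winner]                  # only the winner enters L
    return nll

def fit_conditional_logit(tournaments, k):
    """Maximize L by minimizing -L with a quasi-Newton method."""
    res = minimize(neg_log_likelihood, x0=np.zeros(k),
                   args=(tournaments,), method="BFGS")
    return res.x  # beta-hat

# Toy example: two tournaments, two normalized attributes per golfer.
X1 = np.array([[0.5, -1.0], [0.0, 0.3], [-0.5, 0.7]])
X2 = np.array([[0.2, 0.1], [-0.3, 0.4], [0.1, -0.5]])
beta_hat = fit_conditional_logit([(X1, 0), (X2, 1)], k=2)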
Estimation and results

Table 2.8: Conditional logit - results*

Attribute              Coefficient estimate (β̂)
avg score rating year        −0.648
avg purse 2years              0.784
avg keep cool year            0.120

Summary statistics
L(β = β̂)        −3294.32
L(β = 0)        −3887.54
McFadden R²      0.15

* Estimated on data from 795 tournaments. All attributes are significant at the 1% level. Significance is calculated based on the LR test; see Appendix A.2.
The coefficients in Table 2.8 are quite intuitive:
• The avg score rating attribute has a negative coefficient, i.e. the more strokes a golfer used, compared to the median golfer in the tournaments he competed in, the less likely he is - ceteris paribus - to win a future tournament.
• The avg purse 2years attribute has a positive coefficient, i.e. the more money a golfer on average has won over the last couple of years, the more likely he is - ceteris paribus - to win a future tournament.
• The coefficient for avg keep cool year indicates that golfers who have a high ratio of wins compared to top 10 positions are likely to win future tournaments.
2.3.2 One-step estimation: Conditional logit model with variable attribute discounting
A new dimensionality-reducing function, ξ, is introduced, which aggregates the values of a fundamental attribute, Q, (e.g. historic purses, score ratings or 'keep cool') and weights the aggregate by another fundamental attribute, W, (e.g. participations or top 10 positions). The intuition behind ξ is that new information should weigh more than old information in proxying a golfer's quality and form. Let the aggregated value of attribute Q for golfer i weighted by W in the time interval [1; DQ] prior to the beginning of tournament j be given by:
ξij,Q,W = ξ(Q, W, θ, DQ, i, j) = ( Σ_{d=1}^{DQ} θQ^d · Qij,d ) / ( 1 + Σ_{d=1}^{DQ} θQ^d · Wij,d )   (2.4)
Where θQ is a discounting factor; Q ∈ {π, γ, w} and W ∈ {ψ, c} are arrays of fundamental attribute values (see Table 2.7): πij is a vector containing Dπ elements, where the dth element, πij,d, specifies the purse won by golfer i, d days prior to the beginning of tournament j; γij contains Dγ elements with the historic score ratings of golfer i in the interval [1; Dγ]; wij contains Dw elements, where wij,d ∈ {0, 1} denotes whether golfer i won a tournament on day d; ψij contains Dψ elements, where ψij,d ∈ {0, 1} denotes whether golfer i participated in a tournament on day d; finally, cij contains Dc elements, where cij,d ∈ {0, 1} denotes whether golfer i finished in the top 10 of a tournament on day d.
One is added to the denominator of Equation 2.4 in order to (a) make
sure that single observations in the past are discounted; (b) avoid dividing
by zero.
The function, ξ, returns a discounted, weighted average of either purse
(π); score rating (γ) or ’keep cool’ (w) weighted by either participations (ψ)
or top 10 positions (c) for golfer i in tournaments in the date interval [1; DQ ]
prior to the start of tournament j.
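As an illustration, a direct transcription of Equation 2.4 into Python could look as follows; the example golfer and purse values are hypothetical.

```python
# Discounted weighted average from Equation 2.4. q[d-1] and w[d-1] hold the
# attribute and weight values d days before the tournament; theta discounts.
import numpy as np

def xi(q, w, theta):
    d = np.arange(1, len(q) + 1)
    weights = theta ** d
    # one is added to the denominator to discount single past observations
    # and to avoid division by zero
    return (weights * q).sum() / (1.0 + (weights * w).sum())

# Hypothetical golfer: purses won 1, 30 and 300 days before the tournament,
# weighted by participation indicators
purse = np.zeros(730)
participation = np.zeros(730)
for day, amount in [(1, 50_000), (30, 10_000), (300, 150_000)]:
    purse[day - 1] = amount
    participation[day - 1] = 1.0
print(xi(purse, participation, theta=0.998))  # recent results weigh more
```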
ξij,π,ψ is a generalization of the attribute avg purse last 2year from the static set (Table 2.6): avg purse last 2year = ξij,π,ψ for θπ = 1, Dπ = 730. In the same fashion, ξij,γ,ψ = avg score rating year for θγ = 1, Dγ = 365, and ξij,w,c = avg keep cool year for θw = 1, Dw = 365.
ξj,Q,W is a vector that contains the values of the discounted average of attribute Q weighted by W for all mj golfers in tournament j: ξj,Q = [ξ1j,Q, ξ2j,Q, . . . , ξmjj,Q]
The discounted sum of purses/score ratings is weighted by the discounted count of tournament participations because not all golfers participate in the same number of tournaments. It seems reasonable to assume that the number of tournaments a golfer participates in is not directly a determining factor of his ability. Tiger Woods, for example, is known for participating in relatively few golf tournaments while being extraordinarily good. A weighted average thus seems reasonable. The discounted sum of 'keep cool' is weighted by the discounted count of tournament top 10 positions.
The exponential discounting function has been chosen due to its dominant position in the economic literature.
A modification of the linear-in-parameters specification of the deterministic part, vij, given in Equation 2.3 is proposed:

vij = α1 · z(ξj,π, i) + α2 · z(ξj,γ, i) + α3 · z(ξj,c, i)   (2.5)
Where α is a vector of coefficients that measures the relative importance of the three new terms, and z is a function that returns the standard scores, i.e. transforms the elements of the vector ξj,Q to have zero mean and unit standard deviation (std). The idea of the standardization is to center the data in order to ease the optimization problem (described later) as well as parameter interpretation. z is defined as:

z(ξj,Q, i) = (ξij,Q − mean(ξj,Q)) / std(ξj,Q)   (2.6)
The winning probability for golfer i in tournament j is still defined as in Equation 2.2:

ρij = exp(vij) / Σ_{i=1}^{mj} exp(vij)   (2.7)

But now both of the vectors α and θ - each with three elements - are to be estimated on the training dataset via maximization of the log-likelihood function L = Σ_{i=1}^{m} Σ_{j=1}^{n} yij ln(ρij) (McFadden, 1974), where yij ∈ {0, 1} denotes whether golfer i won tournament j.
This specification of the likelihood function is not a convex function, and care thus has to be taken in the optimization process in order not to end up in a local maximum. A randomized grid search is performed in which the optimization algorithm is initialized with random α and θ values.
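A sketch of such a randomized multi-start procedure is given below; `neg_log_likelihood` stands for the non-convex objective over the stacked (α, θ) parameters and is assumed to be defined along the lines of the earlier sketch, extended with the discounting. The start ranges and number of restarts are my own illustrative choices.

```python
# Randomized multi-start optimization: run a local optimizer from several
# random (alpha, theta) initializations and keep the best local optimum.
import numpy as np
from scipy.optimize import minimize

def multi_start(objective, data, n_terms=3, n_starts=25, seed=1):
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        alpha0 = rng.normal(0.0, 0.5, n_terms)    # coefficients near zero
        theta0 = rng.uniform(0.9, 1.0, n_terms)   # discount factors near one
        x0 = np.concatenate([alpha0, theta0])
        res = minimize(objective, x0, args=(data,), method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res                            # highest likelihood so far
    return best
```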
Estimation and results
To make the results comparable to the results from the simple conditional logit model (subsection 2.3.1), the values of DQ are chosen so that both models have the same fundamental data foundation: two years of data for historic purses are included (Dπ = 730); one year of score ratings and 'keep cool' is included (Dγ = Dc = 365).
Table 2.9: Conditional logit with variable discounting of historic purses and score ratings - Results*

Attribute                                        α̂        θ̂
Discounted weighted avg. score rating, ξij,γ    −0.491    0.996
Discounted weighted avg. purse, ξij,π            0.624    0.998
Discounted weighted avg. keep cool, ξij,c        0.113    0.999

Summary Statistics
L(α = α̂, θ = θ̂)    −3273.37
L(α = 0, θ = 0)     −3887.54
McFadden R²          0.16

*Estimated on data from 795 tournaments. All attributes are significant at a 1% level of confidence. Significance is calculated based on the LR test; see Appendix A.2.
The α̂ coefficients in Table 2.9, which measure the relative importance of the three attributes, are quite intuitive:
• The discounted weighted avg. score rating has a negative coefficient (−0.491), i.e. the more strokes a golfer used, compared to the median golfer in the tournaments he competed in, the less likely he is - ceteris paribus - to win a future tournament.
• The discounted weighted avg. purse has a positive coefficient (0.624), i.e. the more money a golfer on average has won over the last couple of years, the more likely he is - ceteris paribus - to win a future tournament.
• The keep cool attribute has a positive coefficient (0.113), i.e. golfers with a high ratio of wins compared to top 10 positions are likely to win future tournaments.
Furthermore, the discounting factors, θ̂, are in line with what one would expect. The weights used when calculating the weighted average of the historic purses, score ratings and 'keep cool' are depicted as a function of days until tournament start, d, in Figure 2.6.

Figure 2.6: Weights as a function of days until tournament start, d, used when calculating the discounted weighted average of the historic purses, score ratings and 'keep cool'.
It is clear from the figure that historic score ratings are discounted much harder than purses and 'keep cool'. This is in line with the idea that the discounted weighted average score rating is a proxy for the golfer's form, while the discounted weighted average purse as well as 'keep cool' are proxies for the golfer's quality. A priori, it makes sense that a golfer's form is less persistent than his quality.
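One way to read the estimated θ̂ values is through the half-life of the weight θ^d, i.e. the number of days after which an observation counts half as much. A quick computation (my own illustration) with the values from Table 2.9:

```python
# Half-life of the weight theta^d: solve theta^d = 0.5 for d.
import math
for name, theta in [("score rating", 0.996), ("purse", 0.998), ("keep cool", 0.999)]:
    half_life = math.log(0.5) / math.log(theta)
    print(f"{name}: ~{half_life:.0f} days")   # ~173, ~346, ~693 days
```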
Comparing the conditional logit model with variable attribute discounting with the conditional logit model
The 'conditional logit model' is a special case of the 'conditional logit model with variable attribute discounting': if the discounting factors, θ, equal one, the models are identical. The fact that the one is a special case of the other implies that I can use a likelihood ratio test to compare the fit of the two models. The test is based on the likelihood ratio, which expresses how many times more likely the data is under one model than the other.
The test statistic, LR, is given by:

LR = −2 ln(likelihood for conditional logit) + 2 ln(likelihood for conditional logit with variable discounting)
   = −2 · (−3294.32) + 2 · (−3273.37) = 41.9

The test statistic is asymptotically chi-squared distributed (Verbeek, 2008) with degrees of freedom equal to df2 − df1 = 6 − 3 = 3, where df1 is the number of free parameters in the 'conditional logit model' and df2 is the number of free parameters in the 'conditional logit model with variable attribute discounting'. The p-value for the test is very low (less than 0.0001).
The results show that adding the discounting functionality to the model results in a statistically significant improvement in model fit.
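The p-value can be verified directly with SciPy (the likelihood values are taken from the text above):

```python
# Likelihood ratio test of the discounted model against the plain model.
from scipy.stats import chi2
lr = -2 * (-3294.32) + 2 * (-3273.37)   # = 41.9
print(lr, chi2.sf(lr, df=3))            # survival function: p well below 0.0001
```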
2.3.3 Two-step estimation
There will always be a significant amount of information about golf tournaments that cannot easily be incorporated in a statistical model. This includes inside information as well as not easily quantifiable information, e.g. new special workouts; the golfer's intentions and motivation; off-course issues affecting the golfer's performance etc. To exemplify such information, consider the 2013 edition of the RBC Canadian Open at Glen Abbey Golf Club in Oakville, Ontario. After round 3, Hunter Mahan was two strokes ahead of his nearest competitor and looked like the sure winner of the prestigious tournament and close to $1 million, when he suddenly decided to withdraw from the tournament to be present for the birth of his first child.
Benter (1994) pointed out that this kind of information will always be
available to certain parties who will no doubt take advantage of it. Their
betting will be reflected in the odds.
Benter (1994) proposed a two-step estimation approach where the first step is the estimation of a fundamental model (such as subsection 2.3.1 or subsection 2.3.2). The second step is the new conditional logit model:

cij = exp(β1 · ln(ρij) + β2 · ln(oij)) / Σ_{i=1}^{mj} exp(β1 · ln(ρij) + β2 · ln(oij))   (2.8)

Where ρij is the fundamental winning probability given by either Equation 2.2 (the simple conditional logit model) or Equation 2.7 (the conditional logit model with variable discounting); oij is the public's implied winning probability estimate, i.e. the inverse of the odds (oij = 1/Betfair odds); ln() is the natural logarithm. cij is thus a combined probability estimate.
Equation 2.8 should be evaluated using fundamental probability estimates predicted by one of the two previously described models, fitted on a separate sample of tournaments. The idea of using out-of-sample estimates is, as Benter points out, to avoid overestimating the fundamental model's significance by over-fitting the model.
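A sketch of the second-step combination in Equation 2.8 follows; the β values below are arbitrary toy numbers, whereas the thesis estimates them by maximum likelihood on a separate sample.

```python
# Combined probability estimate c_ij from Equation 2.8 for one tournament.
import numpy as np

def combined_probabilities(rho, o, beta1, beta2):
    s = beta1 * np.log(rho) + beta2 * np.log(o)   # log-linear combination
    s = s - s.max()                               # numerical stability
    e = np.exp(s)
    return e / e.sum()                            # softmax over the field

rho = np.array([0.10, 0.05, 0.85])  # hypothetical fundamental estimates
o = np.array([0.20, 0.05, 0.75])    # hypothetical 1/odds (public's beliefs)
print(combined_probabilities(rho, o, beta1=0.5, beta2=0.5))
```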
Estimation and results
One problem arises quickly: the size of the dataset. Benter points out that:

“In the author's experience the minimum amount of data needed for adequate model development and testing samples is in the range 500 to 1000 races.” (Benter, 1994, p. 185)

The dataset used in this thesis only contains odds for 139 tournaments (Table 2.4). This data sample must be used for both estimation and testing, and it thus contains far less information than Benter deemed necessary. Golf tournaments are, furthermore, harder to predict than horse races because the competition is much fiercer: 6-20 horses compete in a race, whereas 80-150 golfers compete in a tournament. If anything, this would imply the need for even more data.
To illustrate the issue, standard errors for the parameters in the models are calculated via bootstrapped resampling (Fox, 2008). Standard errors are in parentheses in Table 2.10.
Table 2.10: Two-step models - Results*

Coefficient estimates**
Model used in step 1                                     β1         β2
1. Conditional logit                                    −1.80      39.92
                                                        (7.28)     (9.60)
2. Conditional logit with variable historic purse       −2.94      41.13
   and score-rating discounting                         (24.87)    (16.67)

McFadden R²**
1. Conditional logit                                     0.113 (0.03)
2. Conditional logit with variable historic purse        0.114 (0.06)
   and score-rating discounting

* Estimated on data from 139 tournaments; ** Standard errors of parameters (in parentheses) are estimated via 5-fold bootstrapped resampling.
Benter (1994) uses the difference between the McFadden R² for the one-step model and the two-step model, ∆R², as a heuristic measure of the potential profitability increase from going from a one-step model to a two-step model. ∆R² is negative for both models and thus does not indicate that the second step has improved the one-step models.
In summary, it must be concluded that odds data for a significantly higher number of tournaments is most likely needed for the two-step model to improve on the one-step fundamental models.
2.4 Betting strategies
Just as in the stock market, betting strategies at a golf course or any other sport event are usually either primarily technical or primarily fundamental in nature:
• Technical betting strategies take the odds for the individual golfers as their starting point. The betting strategies are constructed to take advantage of e.g. simple biases such as the commonly reported favorite-long-shot bias (Hausch et al., 1994). The bias is such that favorites tend to be underbet (and thus have too high odds compared to their true winning probability) and long-shots tend to be overbet;
• Fundamental betting strategies utilize an underlying model of fundamental attributes, such as historic results, to estimate a fundamental winning probability.⁵ Based on the fundamental winning probabilities and the public's belief probabilities (given by the inverse of the odds), a betting algorithm such as the Kelly criterion (Kelly, 1956) is employed to place bet(s).
⁵ The two-step model described in subsection 2.3.3 incorporates both odds and fundamental attributes and thus places itself somewhere between a technical and a fundamental model. The two-step models will in the following be used with the fundamental betting strategies because they are primarily fundamental.

The technical/fundamental betting strategies proposed in the following have two parameters that define how they work:
1. Number of bets: The betting strategies can either advise the bettor to place one bet or several bets.
Single bet: The bettor places a bet of $1 on golfer k. If k wins the tournament (wk = 1) the bettor gets ok; otherwise the bettor loses the $1. The profit, m, is thus given as:

m = ok · wk − 1   (2.9)
Betfair charges a fee of 5% if m is greater than zero.
Multiple bets: The bettor places bets on a set of golfers, S. If i ∈ S the bettor bets $Ki on golfer i. If golfer k wins the tournament (wk = 1) and k ∈ S (i.e. if a bet has been placed on k), the bettor gets ok. The profit, m, is thus given as:

m = Σ_{i∈S} Ki · oi · wi − Σ_{i∈S} Ki   (2.10)

Betfair charges a fee of 5% if m is greater than zero.
2. Long/short bets: Unlike most other betting markets, Betfair offers the possibility of taking both a long and a short position in any bet. You can, for example, bet both that Tiger Woods wins the tournament and that he does not win the tournament. The spread between the two bets is quite small.
The mathematical foundation for short-betting on many mutually exclusive events (e.g. golfers winning a golf tournament) has, to the best of my knowledge, not been developed. I therefore choose only to allow the betting strategies to place long bets.
2.4.1 Technical betting strategies
Two types of technical betting strategies are proposed:
1. A random strategy in which random bets are placed. The motivation for creating a random strategy is to have a benchmark against which other strategies can be measured.
2. Strategies designed to exploit the favorite-long-shot bias.
Random betting

To have a valid benchmark against which other strategies can be compared, a random betting scheme is proposed. The strategy is basically to take a bet of a fraction, K, of total wealth on one or multiple random golfer(s). In order to minimize the variance of such a strategy, and since the dataset only contains 139 tournaments with odds, a restriction is put on the strategy: only bet if the belief probability is higher than 5%⁶, i.e. do not bet on extreme long-shots, since the number of observations is so limited.

⁶ The 5% limit is arbitrarily set before estimation of results.
Favorite-long-shot bias

High-probability, low-payoff gambles have been shown to have a higher expected return than low-probability, high-payoff gambles (Hausch et al., 1994, 2008). Favorites thus win more often than projected by their odds. This discrepancy between the true probability and the probability implied by the odds (the belief probability given by the inverse of the odds) is called the favorite-long-shot bias. The first documentation of this bias is attributed to Griffith (1949). Ottaviani & Sørensen (2008) present an overview of the main theoretical explanations for this bias proposed in the literature.
The bias challenges normative assumptions because it means that the expected return increases with the probability of winning.
Shmanske (2005) proposes a series of naive betting strategies which he tests on the PGA Tour in 2002. He proposes to bet a flat amount of $1 on one of the following four options: (1) the favorite in each tournament; (2) a group of favorites in each tournament; (3) the long-shot in each tournament; (4) a group of long-shots in each tournament.
All four strategies are basically attempts to profit from the favorite-long-shot bias and have also been utilized in the horse-racing literature. I follow Shmanske (2005) and create the same betting strategies. If any of these strategies produces a positive return, it would be an indication of weak-form inefficiency. It is furthermore evidence of the favorite-long-shot bias if the 'favorite' strategies achieve a higher return than the 'long-shot' strategies.
2.4.2 Fundamental betting strategies
To operationalize the models proposed in section 2.3, a betting strategy is needed. Two versions of essentially the same betting strategy are introduced:
1. Kelly criterion: a formula used to determine the optimal size of a series of independent bets in order to maximize the expected logarithm of wealth. This strategy has been utilized in many articles on horse-racing (e.g. Lessmann et al., 2007; Sung & Johnson, 2012). The strategy has three important properties (Hausch et al., 1994, p. 87): (1) it maximizes the asymptotic growth of capital; (2) asymptotically, it minimizes the expected time to reach a specific goal; (3) it outperforms in the long run any other essentially different strategy almost surely.
2. Fractional Kelly betting: The downside of betting according to the Kelly criterion is that (1) it is a rough ride, i.e. very volatile; (2) it exposes the bettor to non-deterministic errors in the advantage calculations (the difference between predicted odds and the market odds).
Kelly wagering strategy
The Kelly criterion was introduced by Kelly (1956) to assist AT&T with its
long distance telephone signal noise issues. The gambling community quickly
got wind of the theory and it has since been widely used as a betting strategy for horse-racing (e.g. Lessmann et al., 2007; Sung & Johnson, 2012). The
strategy is also referred to as the pure Kelly strategy.
It is important to note that the Kelly strategy is not related in any way
to e.g. a Martingale betting system (which is to double the bet after every
loss, so that the first win would recover all previous losses). Kelly strategies
are only profitable in the long run, if the bettor is able to estimate event
probabilities better than the market.
Single bet: For simple bets with two outcomes, (1) losing the entire amount of the bet or (2) winning the bet amount multiplied by the odds, o, the Kelly bet is given by:

K = ((o − 1) · ρ − (1 − ρ)) / (o − 1)   (2.11)

Where ρ is the winning probability. The fraction is equal to the expected net winnings divided by the net winnings if the bet is won. The Kelly bet, K, is the fraction of total wealth that should be placed on the bet.
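Equation 2.11 translates directly into code; the odds and probability below are hypothetical:

```python
# Single-bet Kelly fraction from Equation 2.11 (decimal odds o, win prob rho).
def kelly_fraction(o, rho):
    return ((o - 1) * rho - (1 - rho)) / (o - 1)

print(kelly_fraction(5.0, 0.25))   # 0.0625 -> bet 6.25% of total wealth
```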
Multiple bets: Kelly's criterion can be generalized to gambling on many mutually exclusive outcomes, as in horse races or golf tournaments.
Smoczynski & Tomkins (2010) propose an algorithm for Kelly betting on many mutually exclusive outcomes. The algorithm is split into 4 steps:
1. Calculate the expected revenue rates for the m different golfers:

re,i = ρi · oi · (1 − t)   (2.12)

Where ρi is the probability that golfer i wins the tournament; oi is the odds for him winning and t is the take of the bet exchange (Betfair's take is 5%).

2. Reorder the vector of expected revenue rates, re, so that the new vector is non-increasing. Thus re,1 will be the bet with the highest expected revenue rate.

3. Set S = ∅ and R(S) = 1. S is the set of bets to be made; R(S) is the reserve rate, that is, the fraction of the gambler's wealth that is not bet on any golfer.

4. Loop through the m golfers: If re,i > R(S) then:
• Insert i into the set of bets to be made: S = S ∪ {i}
• Recalculate R(S) according to:

R(S) = ( 1 − Σ_{i∈S} ρi ) / ( 1 − Σ_{i∈S} (1 − t)/oi )   (2.13)

else stop the loop, set Sopt = S and calculate the optimal Kelly bets, Kopt, according to:

Kopt,i = (re,i − R(Sopt)) / (oi · (1 − t))   (2.14)
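The following is a compact sketch of the four steps - my own transcription of Smoczynski & Tomkins (2010), not the thesis code; the toy probabilities and odds are hypothetical:

```python
# Kelly betting on many mutually exclusive outcomes (steps 1-4 above).
import numpy as np

def kelly_multi(probs, odds, t=0.05):
    probs, odds = np.asarray(probs, float), np.asarray(odds, float)
    rev = probs * odds * (1 - t)            # step 1: expected revenue rates
    order = np.argsort(-rev)                # step 2: sort, non-increasing
    S, R = [], 1.0                          # step 3: empty bet set, reserve 1
    for i in order:                         # step 4: add bets while r_e,i > R(S)
        if rev[i] <= R:
            break
        S.append(i)
        R = (1 - probs[S].sum()) / (1 - ((1 - t) / odds[S]).sum())
    bets = np.zeros(len(probs))
    bets[S] = (rev[S] - R) / (odds[S] * (1 - t))
    return bets                             # Kelly fractions of total wealth

probs = [0.116, 0.055, 0.020, 0.012]        # toy field of four golfers
odds = [1 / 0.059, 1 / 0.038, 1 / 0.020, 1 / 0.015]
print(kelly_multi(probs, odds).round(3))    # roughly [0.06, 0.019, 0.001, 0]
```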
Example:
To exemplify the Kelly betting strategy for many mutually exclusive bets, consider the 2012 edition of the KLM Open, in which the outsider Peter Hanson won. Table 2.11 lists the ids and names of the 8 golfers with the highest expected revenue rates (the model probabilities multiplied by the odds and by one minus the track take, as given in Equation 2.12; the simple one-step conditional logit model described in subsection 2.3.1 is used to calculate the probabilities).
In the table, steps 1+2 of the betting algorithm have already been performed (1: calculate re,i; 2: sort by expected revenue rates). According to the model described in subsection 2.3.1, Martin Kaymer is the golfer with the highest expected revenue rate. The public has thus valued the skill of Martin Kaymer less than the model.
Table 2.11: Betting strategy example for the KLM Open, 2012*. Steps 1+2

Id    Name               Odds, 1/oi   Prob., ρi   Exp. revenue rate, re,i
130   Martin Kaymer      0.059        0.116       1.934
154   Peter Hanson       0.038        0.055       1.395
155   Richie Ramsay      0.020        0.020       0.981
140   Marcus Fraser      0.015        0.012       0.752
119   David Lynn         0.045        0.033       0.710
146   Anders Hansen      0.026        0.019       0.698
125   Danny Willett      0.017        0.011       0.667
96    Victor Dubuisson   0.012        0.008       0.589
...

*Only the eight golfers with the highest expected revenue rate are listed.
In Table 2.12, steps 3+4 of the betting algorithm are illustrated. In the first iteration of the loop, the set of bets to be made, S, is empty. Martin Kaymer is the first golfer to be considered for addition to S. Since re,i > R(S), he is added. In the next iteration, Peter Hanson is considered and also added. The loop stops in iteration 4, after the golfers with ids 130, 154 and 155 have been added.
The optimal bets are calculated according to Equation 2.14.
Peter Hanson ends up winning the tournament. The profit from the bets is therefore equal to:

(1 − 0.05) · [0.019 · 1/0.038 − (0.061 + 0.019 + 0.001)] = 0.40   (2.15)

The bets in the tournament thus increase the bankroll by 40%, corresponding to a return of roughly 490% on the total amount staked (Betfair takes a 5% fee of net winnings).
Table 2.12: Betting strategy example for the KLM Open, 2012*. Steps 3+4

Iteration   i     S before iteration    R(S)    re,i    Bet   Kopt,i
1           130   {}                    1.000   1.934   Yes   0.061
2           154   {130}                 0.938   1.395   Yes   0.019
3           155   {130, 154}            0.916   0.981   Yes   0.001
4           140   {130, 154, 155}       0.914   0.752   No
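The profit calculation in Equation 2.15 can be checked with the numbers from Table 2.12:

```python
# Verify Equation 2.15: Peter Hanson (id 154) wins at belief probability 0.038.
stakes = {130: 0.061, 154: 0.019, 155: 0.001}    # Kelly fractions from Table 2.12
payout = stakes[154] / 0.038                      # stake times decimal odds
profit = 0.95 * (payout - sum(stakes.values()))  # 5% fee on net winnings
print(round(profit, 2))                           # 0.40
```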
Fractional Kelly wagering strategy

A fractional Kelly wagering strategy is a strategy derived from the pure Kelly strategy. The idea is basically to bet a fixed fraction of the amount recommended by the Kelly wagering strategy. Benter (1994) argues for fractional Kelly betting and proposes that betting 1/2 or 1/3 of the Kelly bet is a good idea.
The reason for betting only a fraction of the pure Kelly bet is to be less exposed to non-deterministic errors in the advantage calculations (the difference between predicted odds and the market odds (Hausch et al., 1994)).
In the following, I exemplify the issues with the pure Kelly strategy arising from non-deterministic errors in the bettor's advantage via Monte Carlo simulations.
Monte Carlo simulations of betting strategies:
To test the performance of the pure Kelly strategy and the 0.3 fractional Kelly strategy compared to betting randomly, a Monte Carlo simulation experiment is set up. MacLean et al. (1992) create a mathematical framework which illustrates the fractional Kelly strategies' ability to decrease the variance of returns. There is, to my present knowledge, no literature evaluating fractional Kelly strategies via Monte Carlo simulations.
A tournament simulation framework is created in which m simulated golfers compete against each other. Three sets of probabilities are drawn from random distributions. True winning probabilities are drawn randomly, and the public as well as a 'bettor' make noisy estimates of the true winning probabilities:
1. The true winning probabilities for golfers i = 1, 2, . . . , m are drawn randomly from a squared Poisson distribution with λ = 1. I find that this distribution approximates the winning probabilities implied by the odds given in the dataset.
2. The public's belief probabilities (1/odds) are given by the true winning probabilities with added noise: the winning probabilities are multiplied by random draws from a normal distribution with mean 1 and standard deviation 0.2.
3. The bettor uses a "statistical model" to calculate winning probabilities. Unfortunately, the model is not flawless, so noise is added: the true probabilities are multiplied by random draws from a normal distribution with mean 1 and standard deviation 0.1.
Each of the three sets of probabilities is finally divided by its sum, so that each probability set sums to one.
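A sketch of this probability-generating process follows; the distributions are as stated above, while the function name, seed and the small defensive clipping are my own.

```python
# Generate true, public and bettor probabilities for one simulated tournament.
import numpy as np
rng = np.random.default_rng(42)

def simulate_probabilities(m=127):
    true_p = rng.poisson(lam=1, size=m).astype(float) ** 2  # squared Poisson
    true_p += 1e-9                                          # avoid an all-zero field
    public = true_p * rng.normal(1.0, 0.2, size=m)          # noisier beliefs
    bettor = true_p * rng.normal(1.0, 0.1, size=m)          # better model
    sets = [true_p, np.clip(public, 0, None), np.clip(bettor, 0, None)]
    return [p / p.sum() for p in sets]                      # each sums to one

true_p, public_p, bettor_p = simulate_probabilities()
```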
The tournament is simulated 140 times with m = 127 golfers (my dataset contains 140 tournaments with odds, and there are on average 127 golfers in a tournament, see Table 2.4). Returns from the different betting strategies are depicted in a boxplot in Figure 2.7 for one simulation of the experiment. That is, for each of the 140 simulated tournaments the betting strategies "place bets" on a subset of the 127 golfers. The return for each simulated tournament is then plotted in the boxplot for each strategy.

Figure 2.7: Boxplots of Monte Carlo experiment returns for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.
The random strategy has a well-defined minimum return, given by the fraction of wealth bet in each tournament. The pure Kelly strategy has a high variance (this is what Benter (1994) referred to as a “rough ride”) but overall the highest mean. The fractional Kelly strategy has less variance than the pure Kelly strategy.
The above simulation of tournaments j = 1, 2, . . . , 140 is repeated 100 times, and the end wealth is calculated for each of the 100 simulations. A starting capital of $1 is chosen for simplicity, and the end wealth is calculated as Π_{j=1}^{140} (1 + rj), where rj is the return in simulated tournament j.
Figure 2.8 contains the boxplots of the end wealth in each of the 100 simulations for the three strategies. The fractional Kelly strategy is the only one with an expected end wealth above $1.
However, if a bettor with a perfect betting model was betting, i.e. if he could calculate the true winning probabilities, the results would be different. Figure 2.9 depicts the boxplots of end wealth for the three strategies with no noise in the bettor's probability estimates. Given this boxplot, it is clear why the Kelly strategy has received so much attention.
Robustness of simulation results: The general picture of Figure 2.8 and Figure 2.9 is robust to a number of different processes for generating the three sets of probabilities: (1) the true winning probabilities; (2) the public's belief probabilities; and (3) the bettor's probabilities. I have simulated the Monte Carlo experiments using the t-distribution (with heavier tails than the normal distribution) as well as a skewed normal distribution to draw noise from (to simulate the public's and the bettor's noisy probability estimates).

Figure 2.8: Boxplots of the end wealth in the Monte Carlo experiment for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.
The main conclusions from the Monte Carlo simulation experiments are as follows:
• The fractional Kelly strategy has the potential to be profitable as long as the noise in the bettor's probability estimates is in general smaller than the noise in the public's belief probabilities.
• The bigger the difference between the noise in the public's belief probabilities and the noise in the bettor's probability estimates, the larger the optimal fraction to use in the fractional Kelly strategy, i.e.:
– if the public is bad at assessing the true winning probabilities (lots of noise) and the bettor is good at assessing them (little or no noise), then he should use a fraction close to one in his fractional Kelly strategy (this is the case in Figure 2.9).
– if, on the other hand, the bettor is only marginally better than the public, the fraction of the Kelly bet should be closer to zero (this is the case in Figure 2.8).

Figure 2.9: Boxplots of the end wealth in the Monte Carlo experiment with no noise in the bettor's probabilities for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.
2.5 Performance evaluation of the statistical models and betting strategies
The following is a race between the different models proposed in section 2.3 as well as the betting strategies proposed in section 2.4. The models are estimated on one dataset and evaluated with respect to performance on a different dataset. Non-overlapping datasets are used in order to assess the models' abilities to predict winners in out-of-sample tournaments.
The strategies are ultimately compared by the end-of-period wealth they create, i.e. whether they enable a bettor to achieve a positive return. Without loss of generality, it is assumed that a bettor starts with $1. His end wealth is calculated as:

Π_{j=1}^{140} (1 + rj)   (2.16)

Where rj is the return of the bets in tournament j.
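Equation 2.16 amounts to compounding the per-tournament returns; the numbers below are hypothetical:

```python
# End wealth from Equation 2.16 with a $1 starting capital.
import numpy as np
returns = np.array([0.05, -0.03, 0.40, -0.02])  # hypothetical r_j values
print(np.prod(1 + returns))                      # about 1.40
```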
In the following, I evaluate the performance of three sets of strategies: (1) technical strategies that try to exploit weak-form inefficiencies - arising if market prices are not well adjusted to previous market prices; (2) fundamental strategies that try to exploit semi-strong form inefficiencies - arising if market prices are not well adjusted to results from previous golf tournaments; (3) benchmark strategies, included to have a valid comparison for the other strategies. I only consider strategies that possibly place multiple bets in each tournament.
1. Technical strategies:
(a) Bet 3% of wealth on the five golfers with the lowest odds (the favorites).
(b) Bet 3% of wealth on the five golfers with the highest odds (the long-shots).
2. Fundamental strategies: Bet according to the pure Kelly strategy and the 0.3 fractional Kelly strategy using the following four different models to calculate probabilities:
(a) The one-step conditional logit model.
(b) The one-step conditional logit model with variable attribute discounting.
(c) The two-step model with the conditional logit model as step one.
(d) The two-step model with the conditional logit model with variable attribute discounting as step one.
3. Benchmark:
(a) Bet 3% of wealth on five randomly chosen golfers.
The numbering in the above list is used for reference in the following. Conditional logit is abbreviated to CL.
2.5.1 Performance by following the strategies in the 139 tournaments
The returns for each of the 139 tournaments are depicted in boxplots in Figure 2.10 for the technical betting strategies, the fundamental betting strategies and the benchmark strategies. End wealth is calculated according to Equation 2.16 for all strategies and listed in Table 2.13.
The best overall performing strategy is the '2d.fractional Two Step Discounted CL' strategy. By following this strategy, a bettor who started with $1 would have $0.54 after betting for two years. The performance is thus not overwhelming.
The fact that the expected return (the average return per tournament) of strategy '1a. Favorite' outperforms '1b. Longshot' can be seen as evidence of the favorite-long-shot bias. However, the evidence is weak because of the relatively small size of the dataset.
Figure 2.10: Boxplots of returns for the technical, fundamental and benchmark betting strategies. The numbering (e.g. 1a) refers to the numbered strategy list in section 2.5; pure/fractional refer to the pure and fractional Kelly strategy, respectively. CL is an abbreviation for conditional logit.
Table 2.13: End wealth; betting strategies with start wealth of $1

Strategy                                End wealth   # bets
1a. Favorite                            0.45         695
1b. Longshot                            0.05         695
2a.fractional CL                        0.37         781
2b.fractional Discounted CL             0.42         741
2c.fractional Two Step CL               0.53         111
2d.fractional Two Step Discounted CL    0.54         109
2a.pure CL                              0.00         781
2b.pure Discounted CL                   0.00         741
2c.pure Two Step CL                     0.00         111
2d.pure Two Step Discounted CL          0.00         109
3a. Random                              0.40         695

The numbering (e.g. 1a) refers to the numbered strategy list in section 2.5; pure/fractional refer to the pure and fractional Kelly strategy, respectively.
2.5.2 Performance by going against the strategies in the 139 tournaments
Since Betfair allows both long bets and short bets, I have the option of going against my own strategies. That is, instead of backing a set of golfers, I will lay them (i.e. short the golfers). At first, this idea seems very counterintuitive, but I will argue that it can be justified. When shorting the strategies, the pure Kelly strategies bet more than 100% of wealth. Only the technical strategies, the fractional Kelly strategies and the benchmark strategy are therefore considered. End wealth is calculated for the strategies according to Equation 2.16 and listed in Table 2.14.
Table 2.14: End wealth; 0.3 fractional Kelly betting strategy with start wealth of $1

Strategy                                End wealth, $   Number of bets
1a. Favorite                            0.89            695
1b. Longshot                            0.01            695
2a.fractional CL                        2.16            781
2b.fractional Discounted CL             2.21            741
2c.fractional Two Step CL               0.07            111
2d.fractional Two Step Discounted CL    0.04            109
3a. Random                              0.25            695

The numbering (e.g. 1a) refers to the numbered strategy list in section 2.5; fractional refers to the fractional Kelly strategy.
The conditional logit strategy, as well as the strategy where historic results are discounted with a variable discounting coefficient, produce a large positive result. By following the shorted versions of these strategies in 2011 and 2012, a bettor would have been able to more than double his starting wealth.

Interpretation of results

The seemingly strange observation that a bettor could double his starting wealth by shorting the strategies could be explained by the following: (1) the people who are betting have a biased perception of the true winning probabilities. This bias leads to market prices that do not reflect the true winning probabilities. (2) The model generates probability estimates which are biased in the same direction as the public's, only more extreme. That is, for each golfer, the model's probability estimate is further away from the true probability than the public's belief probability.
2.5.3 Robustness
The robustness of the end wealth calculations is assessed by evaluating the results' sensitivity to two factors: (1) which golfers are bet on; (2) which tournaments are bet on. To keep the following simple, only the two best performing strategies are considered.
(1) Which golfers are bet on: I test whether the results are driven by bets on golfers with specific values of 'odds' and 'volume matched'. I run a grid search where the two betting strategies are evaluated for golfers within different ranges of odds and matched volume. That is, I evaluate the betting strategies on golfers with low odds (< 50) and high odds (≥ 50) as well as liquid golfers (golfers with more than £5000 matched) and less liquid golfers (golfers with £1000 to £5000 matched). I have chosen not to look at illiquid golfers, because only very small bets can be placed on them. From Table 2.15 it is clear that the positive return is mainly driven by bets on golfers with odds < 50.
Table 2.15: End wealth by betting on select groups of golfers

                2a.fractional CL               2b.fractional Discounted CL
                £1000 to £5000    ≥ £5000      £1000 to £5000    ≥ £5000
Odds < 50       1.54              1.34         1.72              1.46
Odds ≥ 50       0.96              1.05         0.97              1.03
(2) Which tournaments are bet on: To test whether the results are driven by bets on only a few tournaments, I create 1000 sets of bootstrapped resamples (with replacement) of the 139 tournaments with odds (see Table 2.4). For each of the 1000 bootstrapped samples of 139 tournaments, I calculate the end wealth of the strategies according to Equation 2.16. Figure 2.11 contains boxplots of the end wealth for the 1000 bootstrapped samples for the two betting strategies '2a.fractional CL' and '2b.fractional Discounted CL'. The results indicate that the strategies could enable a bettor to make a very large positive return. The positive return does not appear to be driven by a few specific tournaments in the sample.
Figure 2.11: Boxplots of end wealth for the bootstrapped resamples for the two best performing strategies. The numbering (e.g. 2a) refers to the numbered strategy list in section 2.5; fractional refers to the fractional Kelly strategy.
How much to bet?
Before deciding whether to implement the above proposed betting system in the real world, it would be relevant to investigate how much money one could expect to win. It is clear that the odds are going to change if you place big bets on a golfer: if you back a golfer, the odds will likely fall; if you lay a golfer, the odds will likely increase. The size of the increase/decrease is likely related to the liquidity of the golfer, but it is not possible to derive an explicit formula for how the odds will change as a function of the size of the bets placed.
Chapter 3
Conclusion
In this thesis I evaluate whether Betfair's golf prediction markets are efficient. I create a novel dataset containing: (1) winning market prices from the biggest public prediction market, Betfair, for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012 and (2) historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012.
I evaluate whether the golf prediction markets are efficient by testing whether market prices are well adjusted to two sets of relevant historical data. First, I perform weak-form tests to see if prices are efficiently adjusted to historical prices. Secondly, I perform semi-strong form tests to see if prices are efficiently adjusted to results from previous golf tournaments.
I test for weak-form and semi-strong form efficiency by building betting strategies which aim at achieving a positive return. A positive return indicates market inefficiency. My approach to evaluating prediction market efficiency differs a bit from other papers dealing with prediction markets. For example, it is the aim of Cowgill & Zitzewitz (2013), Forsythe et al. (1992) and Smith et al. (2006) to evaluate whether prediction markets provide more precise probability estimates than corporate experts, exit polls and bookmakers, respectively. My approach is, however, in line with other studies of market efficiency in sports markets, see e.g. Bolton & Chapman (1986), Benter (1994) and Sung & Johnson (2012).
My analyses lead to the following two main findings:
Firstly, I am not able to achieve a positive return using the naive betting strategies - betting on favorites and long-shots, respectively. I thus find no sign of weak-form inefficiency in the market.
Secondly, by using the most profitable proposed betting strategy based on results from previous golf tournaments, a bettor is able to more than double his starting wealth over the two-year period from the beginning of 2011 to the end of 2012. The robustness of this finding is evaluated via (1) bootstrapped resampling of the tournaments as well as (2) simulations of the betting strategies for golfers in different ranges of odds and matched volume. The result stands up to the robustness tests. The fact that the betting strategy enables a bettor to achieve a positive return over the two-year period indicates that prices are not well adjusted to e.g. results from previous golf tournaments; Betfair's golf markets thus seem semi-strong form inefficient.
My conclusions are weakened by the fundamental structure of golf tournaments: I have relatively few tournaments, each with many golfers, much competition and possibly important attributes that are hard to quantify (e.g. the psychological state of the golfers). The shown ability to achieve a positive return could have been caused by e.g. a sampling bias, whereby some tournament outcomes are overrepresented in the two-year sample I analyze compared to the 'true population of golf tournament outcomes'. More tournaments could be included in the analyses in the future to strengthen the conclusions.
The findings in this thesis lead to the following two considerations.
Firstly, it would be worth investigating the possibilities of implementing the betting strategy in real life. The ideas and approaches proposed in this thesis could be further refined in order to improve performance before a real-life implementation. The existing attributes could be incorporated into the model in different ways, and new attributes could be added to the model in order to obtain a better and more unbiased fit. The added attributes could describe the golfers' individual weather preferences, psychological state, preferences with regard to type of course etc. Benter (1994) writes that it took him “approximately five man-years of effort to [...] organize the database and develop a handicapping model”. More time is thus likely to improve the performance of the proposed models and strategies.
Secondly, this thesis provides evidence that suggests that prediction markets are inefficient estimators of event probabilities. I propose that more studies such as this one are needed before prediction markets can confidently be used as efficient probability estimators.
When deciding which method to use in order to answer the question “What
is the probability of...?”, the amount of available data should play a key role.
My findings indicate that prices in prediction markets do not always correspond to true event probabilities even for very liquid markets such as Betfair’s
golf markets. An analytical approach could potentially provide more efficient
estimates of event probabilities than prediction markets.
Bibliography
Arrow, K. J., Forsythe, R., Gorham, M., Hahn, R., Hanson, R., Ledyard, J. O., Levmore, S., Litan, R., Milgrom, P., Nelson, F. D. et al. (2008). The promise of prediction markets. Science 320(5878), 877.
Beard, H. (2009). Golf: An Unofficial and Unauthorized History of the World's Most Preposterous Sport. Simon and Schuster.
Beilock, S. L. & Carr, T. H. (2001). On the fragility of skilled performance: What governs choking under pressure? Journal of experimental
psychology: General 130(4), 701.
Benter, B. (1994). Computer based horse race handicapping and wagering
systems: a report. In: Efficiency of Racetrack Betting Markets (Hausch,
D. B., Lo, V. S. & Ziemba, W. T., eds.). Academic Press, pp. 183 –
198.
Betfair (2014). Historical golf odds. http://data.betfair.com/. [Online;
accessed: 2014-01-18].
Bolton, R. N. & Chapman, R. G. (1986). Searching for positive returns
at the track: A multinomial logit model for handicapping horse races.
Management Science 32(8), 1040–1060.
Cowgill, B. & Zitzewitz, E. (2013). Corporate prediction markets: Evidence from Google, Ford, and Koch Industries.
Edelman, D. (2007). Adapting support vector machine methods for horserace odds prediction. Annals of Operations Research 151(1), 325–336.
Ehrenberg, R. G. & Bognanno, M. L. (1990). The incentive effects of
tournaments revisited: Evidence from the european pga tour. Industrial
and Labor Relations Review , 74S–88S.
Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance 25(2), 383–417.
Forsythe, R., Nelson, F., Neumann, G. R. & Wright, J. (1992).
Anatomy of an experimental political stock market. American Economic
Review 82, 1142–1142.
Fox, J. (2008). Applied regression analysis and generalized linear models.
Sage.
Franck, E., Verbeek, E. & Nüesch, S. (2010). Prediction accuracy of
different market structures, bookmakers versus a betting exchange. International Journal of Forecasting 26(3), 448–459.
Griffith, R. M. (1949). Odds adjustments by american horse-race bettors.
The American Journal of Psychology .
Hausch, D. B., Lo, V. S. & Ziemba, W. T. (1994). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.
Hausch, D. B., Lo, V. S., Ziemba, W. T. & Ziemba, W. (2008). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.
Jones, E., Oliphant, T., Peterson, P. et al. (2001–). SciPy: Open
source scientific tools for Python. URL http://www.scipy.org/.
Kelly, J. L. (1956). A new interpretation of information rate. Information
Theory, IRE Transactions on 2(3), 185–189.
Lessmann, S., Sung, M.-C. & Johnson, J. E. (2007). Adapting least-squares support vector regression models to forecast the outcome of horse races. The Journal of Prediction Markets 1(3), 169–187.
MacLean, L., Ziemba, W. T. & Blazenko, G. (1992). Growth versus
security in dynamic investment analysis. Management Science 38(11),
1562–1585.
Manski, C. F. (2006). Interpreting the predictions of prediction markets.
economics letters 91(3), 425–429.
McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior.
Orszag, J. M. (1994). A new look at incentive effects and golf tournaments.
Economics Letters 46(1), 77–88.
Ottaviani, M. & Sørensen, P. N. (2008). The favorite-longshot bias:
an overview of the main explanations. Handbook of Sports and Lottery
Markets (eds. Hausch, DB and Ziemba, WT), North-Holland/Elsevier ,
83–102.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss,
R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,
Brucher, M., Perrot, M. & Duchesnay, E. (2011). Scikit-learn:
Machine learning in Python. Journal of Machine Learning Research 12,
2825–2830.
Sauer, R. D. (1998). The economics of wagering markets. Journal of
economic Literature 36(4), 2021–2064.
Schmidt, C. & Werwatz, A. (2002). How accurate do markets predict the outcome of an event?
Shmanske, S. (2005). Odds-setting efficiency in gambling markets: Evidence from the PGA Tour. Journal of Economics and Finance 29(3), 391–402.
Smith, M. A., Paton, D. & Williams, L. V. (2006). Market efficiency
in person-to-person betting. Economica 73(292), 673–689.
Smith, M. A., Paton, D. & Williams, L. V. (2009). Do bookmakers possess superior skills to bettors in predicting outcomes? Journal of
Economic Behavior & Organization 71(2), 539–549.
Smoczynski, P. & Tomkins, D. (2010). An explicit solution to the problem of optimizing the allocations of a bettor’s wealth when wagering on
horse races. Mathematical Scientist 35(1).
Stevenson, A. & Lindberg, C. (2010). New Oxford American Dictionary,
Third Edition. Oxford University Press.
Sung, M. & Johnson, J. E. (2012). Comparing the effectiveness of one- and two-step conditional logit models for predicting outcomes in a speculative market. The Journal of Prediction Markets 1(1), 43–59.
Tan, P.-N., Steinbach, M. & Kumar, V. (2013). Introduction to data
mining. Pearson Education India.
Tanaka, R. & Ishino, K. (2012). Testing the incentive effects in tournaments with a superstar. Journal of the Japanese and International
Economies 26(3), 393 – 404. URL http://www.sciencedirect.com/
science/article/pii/S0889158312000196.
Tziralis, G. & Tatsiopoulos, I. (2012). Prediction markets: An extended literature review. The journal of prediction markets 1(1), 75–91.
Verbeek, M. (2008). A guide to modern econometrics. John Wiley & Sons.
Wolfers, J. & Zitzewitz, E. (2006). Interpreting prediction market
prices as probabilities. Tech. rep., National Bureau of Economic Research.
Yahoo (2014a). Historical golf results. http://sports.yahoo.com/golf/.
[Online; accessed: 2014-01-18].
Yahoo (2014b). THE PLAYERS Championship. http://sports.yahoo.
com/golf/pga/leaderboard/2013/13. [Online; accessed: 2014-01-18].
Yahoo (2014c). Yahoo data license. http://info.yahoo.com/guidelines/
us/yahoo/ydn/ydn-3955.html. [Online; accessed: 2014-01-18].
Ziemba, W. T. (2008). Chapter 10 - efficiency of racing, sports, and lottery
betting markets. In: Handbook of Sports and Lottery Markets (Hausch,
D. B. & Ziemba, W. T., eds.), Handbooks in Finance. San Diego: Elsevier, pp. 183 – 222.
Appendix A

A.1 Golf dictionary
Table A.1: Golf terms

Bogey: A score of one over par.
Fairway: The area between the tee box and the putting green where the grass is cut even and short.
Golfer: A person who plays golf.
Handicap: A numerical measure of a golfer's potential playing ability based on the tees played for a given course.
Hazard: Special areas on the golf course that have additional rules for play. There are generally two types: (1) water hazards, e.g. ponds, lakes, and rivers; and (2) bunkers, which are sand traps.
Par: The pre-determined number of strokes that a scratch (or 0 handicap) golfer should require to complete a hole or a round.
Rough: A grass area on the golf course where the grass is cut higher than the grass on the fairway and the green. It is typically a disadvantageous area to hit from.
Stroke play: The most common scoring system in golf. It involves counting the total number of strokes used on each hole during a given round, or series of rounds. The winner is the player who has used the fewest number of strokes over the course of the round, or rounds.
Tee box: The starting point of a golf hole.

Source: Stevenson & Lindberg (2010)
A.2 LR testing of parameter significance

The likelihood-ratio test is used to assess the contribution of individual attributes to the models. The LR test statistic is given by (Verbeek, 2008):

D = −2 ln( likelihood of the null model / likelihood of the alternative model )   (A.1)

The null model is the model without the attribute whose contribution is to be assessed. The alternative model contains the attribute. The test statistic, D, is approximately chi-squared distributed (Verbeek, 2008) with 1 degree of freedom (the difference between the number of free parameters in the two models). Based on the test, it can be concluded whether there is a significant association between the attribute (the predictor) and the outcome.