Market Efficiency in Person-to-Person Betting on Golf
Transcription
FACULTY OF SOCIAL SCIENCES, UNIVERSITY OF COPENHAGEN

Master's Thesis

Sebastian Schock

Market Efficiency in Person-to-Person Betting on Golf Tournaments

Supervisor: Rasmus Jørgensen

April 7, 2014

Abstract

In this thesis, I examine market efficiency for golf betting markets on Betfair.com, the leading provider of person-to-person betting exchanges. I create and analyze a novel dataset containing winning market prices for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012, as well as historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012. A set of betting strategies is created with the aim of exploiting weak-form and semi-strong form inefficiencies. I find no evidence of weak-form inefficiency. However, the most profitable proposed betting strategy, which is based on historical golf results, enables a bettor to more than double his starting wealth over the two-year period from 2011 to 2012. My findings thus indicate semi-strong form inefficiency in Betfair's golf markets.

Preface

The main idea for the thesis arose while I was watching a PGA golf tournament with my good friend Christian Kragh. While we watched the tournament, we followed the live odds on the online betting site Betfair.com and discussed whether the odds reflected the golfers' true winning probabilities. The idea was further developed by talking to another good friend, Mathias Trock, who wrote his bachelor thesis evaluating the efficiency of betting markets for Danish horse races. Mathias has furthermore been helpful with comments and ideas to improve this thesis. My supervisor, Rasmus Jørgensen, has also been of great help. We have had lengthy talks about efficiency in betting markets for a wide range of events - from US presidential elections to Swedish horse races.

Contents

1 Introduction
  1.1 Contributions
  1.2 Literature review
    1.2.1 Prediction markets
    1.2.2 Sport betting markets
2 Empirical evidence of inefficiencies in Betfair's golf betting markets
  2.1 Domain knowledge
    2.1.1 Golf
    2.1.2 Betfair's sport betting exchange
  2.2 Data
    2.2.1 Historical golf results
    2.2.2 Historical odds for golf events
    2.2.3 Merging the two data sources
    2.2.4 Extracting attributes for statistical modeling
    2.2.5 Removing outliers
    2.2.6 Weaknesses of the dataset
  2.3 Modeling
    2.3.1 One-step estimation: Conditional logit model
    2.3.2 One-step estimation: Conditional logit model with variable attribute discounting
    2.3.3 Two-step estimation
  2.4 Betting strategies
    2.4.1 Technical betting strategies
    2.4.2 Fundamental betting strategies
  2.5 Performance evaluation of the statistical models and betting strategies
    2.5.1 Performance by following the strategies in the 139 tournaments
    2.5.2 Performance by going against the strategies in the 139 tournaments
    2.5.3 Robustness
3 Conclusion
Bibliography
A Appendix
  A.1 Golf dictionary
  A.2 LR testing of parameter significance

Chapter 1

Introduction

One of the dominant subjects of today is Big Data. We are generating ever-increasing amounts of data, which we wish to analyze in order to gain greater insight into complicated systems in diverse fields. Many of the analyses we wish to perform are focused on the question: "What is the probability of...?" (Arrow et al., 2008). Examples of such questions regarding binary events could be: Will the President be re-elected? Will Russia declare war on Ukraine? Is the new iPhone going to generate more than X billion dollars in revenue next year? Is the next quarterly sale of HP going to be in a given range? Will more than Y million people watch the next episode of the Netflix TV series 'House of Cards'?

One way to estimate probabilities for these questions would be the development of analytical theories and sophisticated simulation or regression algorithms which incorporate lots of data. Another approach would be asking the crowd. Recent studies have suggested that asking the crowd by letting people trade in speculative futures markets (so-called prediction markets) could yield good probability estimates (Tziralis & Tatsiopoulos, 2012). The idea is that dispersed information among the various individuals participating in the market can be aggregated via the free market mechanism and thereby provide accurate estimates of event probabilities. But how do prediction markets stand up against an analytical approach that incorporates historical data in yielding precise probability estimates?

In this thesis I evaluate whether prediction market prices provide accurate estimates of winning probabilities for golfers in golf tournaments (i.e. whether golf prediction markets are efficient). I create a novel dataset containing: (1) market prices from the biggest public prediction market, Betfair (Franck et al. (2010) document that Betfair is the biggest public prediction market measured by the amount of money traded on the platform), for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012, and (2) historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012.

I evaluate whether golf prediction markets are efficient by testing if market prices are well adjusted to two sets of relevant historical data. First, I perform weak form tests to see if prices are efficiently adjusted to historical prices. Secondly, I perform semi-strong form tests to see if prices are efficiently adjusted to other information that is undoubtedly publicly available (e.g. results from previous golf tournaments). These are two of the three tests proposed by Fama (1970) to test the market efficiency hypothesis. Given my dataset, I am not able to conduct the final strong form test, which is concerned with whether given investors have monopolistic access to any information relevant for price formation.
I perform the weak form test by building a set of naive betting strategies with the aim of exploiting the commonly observed favorite-longshot bias, whereby bets on favorites tend to yield higher returns than bets on longshots. The strategies show weak signs of the bias, but do not enable a bettor to achieve positive returns. I thus find no evidence of weak-form inefficiency. This finding is in line with that of e.g. Shmanske (2005), who searches for the favorite-longshot bias in the PGA Tour in 2002. Shmanske also finds weak evidence of the bias, but he is likewise not able to create a profitable betting strategy.

The semi-strong form tests are performed by creating a set of analytical models as well as betting strategies with the aim of achieving a positive return by betting on the golf markets. The most profitable model is a conditional logit model that models winning probabilities based on results from previous golf tournaments. This model is operationalized via the widely used fractional Kelly betting strategy (Kelly, 1956), sketched at the end of this introduction. This setup enables a bettor to produce a positive return of 121% by betting on tournaments from the beginning of 2011 to the end of 2012. The possibility of achieving a positive return indicates that market prices are not well adjusted to publicly available information. The market thus seems semi-strong form inefficient.

The indication of semi-strong form inefficiencies in Betfair's golf betting markets suggests that an analytical approach could yield more precise estimates of winning probabilities for golfers in golf tournaments than prediction markets.

My approach to evaluating prediction market efficiency differs somewhat from that of other papers dealing with the topic. For example, Cowgill & Zitzewitz (2013), Forsythe et al. (1992) and Smith et al. (2006) aim to evaluate whether prediction markets provide more precise probability estimates than corporate experts, exit polls and bookmakers, respectively. I evaluate prediction markets by building analytical strategies with the aim of exploiting market inefficiencies. I build heavily upon the literature on horse race wagering markets where, for example, Bolton & Chapman (1986), Benter (1994) and Sung & Johnson (2012) try to achieve positive returns by exploiting market inefficiencies.
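To make the fractional Kelly strategy mentioned above concrete, the following is a minimal sketch of fractional Kelly staking for a single back bet. The function name, the choice of Kelly fraction and the commission handling are illustrative assumptions, not the exact setup used later in the thesis.

```python
def fractional_kelly_stake(wealth, p, odds, fraction=0.5, commission=0.05):
    """Suggested stake for backing a golfer at decimal `odds` when the model
    estimates his winning probability to be `p`.

    The full Kelly criterion (Kelly, 1956) maximizes the expected logarithm of
    wealth; betting only a fraction of the Kelly stake trades growth for lower
    variance, which is why fractional Kelly is widely used in practice.
    """
    b = (odds - 1.0) * (1.0 - commission)  # net winnings per unit staked, after commission
    full_kelly = (b * p - (1.0 - p)) / b   # fraction of wealth under full Kelly
    return wealth * fraction * max(full_kelly, 0.0)  # only bet when the edge is positive

# A golfer the model gives a 5% chance of winning, offered at odds of 30:
print(fractional_kelly_stake(wealth=1000.0, p=0.05, odds=30.0))  # roughly 7.8
```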
1.1 Contributions

I contribute to the literature on prediction markets and sports betting markets in three major ways:

1. This is by far the most comprehensive study of efficiency in golf betting markets. To the best of my knowledge, only one article deals with efficiency in golf betting markets: Shmanske (2005) analyzes data for the PGA Tour in 2002 with odds from a casino bookmaker. He focuses on both weak form tests and semi-strong form tests, but he only analyzes a small fraction of the amount of data analyzed in this thesis.

2. Most empirical studies of prediction markets compare the predictive ability against forecasts by experts, exit polls or bookmakers (see e.g. Forsythe et al., 1992; Cowgill & Zitzewitz, 2013; Smith et al., 2006; Schmidt & Werwatz, 2002) and find that prediction markets give more precise probability estimates. I compare the predictive ability of prediction markets against a set of analytical models that I develop in this thesis. I find evidence that prices on golf prediction markets are not well adjusted to results from previous golf tournaments and that the market thus seems semi-strong form inefficient. I suggest that it would be a good idea to perform more studies such as this one before embracing the alleged predictive ability of prediction markets.

3. I generalize the conditional logit model proposed by Bolton & Chapman (1986) for calculating winning probabilities in horse races based on historical information. Bolton & Chapman model winning probabilities based on, among other attributes, the sum of winnings last year. I model the probabilities based on, among other attributes, the discounted sum of winnings last year, where the discounting factor is found via maximum likelihood optimization together with the other model parameters. I test whether my way of modeling the probabilities fits the data better than the approach proposed by Bolton & Chapman, and find that my approach results in a statistically significant improvement in model fit (p-value < 0.00001). Even though Bolton & Chapman created their model back in 1986, it is still used and thus still relevant to refine; the model is e.g. used by Sung & Johnson (2012) to model winning probabilities for UK horse races. The approach of estimating optimal discounting rates when fitting regression models to datasets with a time dimension might furthermore be applicable in a wide range of other research areas.

1.2 Literature review

This thesis builds upon literature from two semi-overlapping research fields: (1) prediction markets and (2) sport betting markets. The following seeks to describe the articles mentioned in the introduction in greater detail as well as introduce other related articles. The aim of this section is to illustrate that prediction markets have lately received lots of attention. Little focus has, however, been devoted to evaluating the markets for semi-strong form efficiency. By using ideas from the sport betting literature, this gap in the literature can be explored.

1.2.1 Prediction markets

Prediction markets are forums for trading contracts that yield payments based on the outcome of uncertain future events (Arrow et al., 2008). The literature can roughly be split into four major categories: (1) descriptive work; (2) theoretical work; (3) empirical applications; (4) law and policy. The categories are not mutually exclusive, which means that the same paper potentially overlaps multiple categories. The idea of the categorization is to give a quick overview of the work that has been done on prediction markets.

(1) Descriptive: Tziralis & Tatsiopoulos (2012) describe the market structure and survey the literature written about prediction markets. They report that the research area has gained popularity in recent years; the publication trend could be "roughly described as being of exponential growth" (Tziralis & Tatsiopoulos, 2012).

Arrow et al. (2008) emphasize the potential of prediction markets to improve decisions. The range of applications is, according to the authors, "limitless" - from helping businesses make better investment decisions to helping governments make better fiscal policy decisions.

(2) Theoretical work: Manski (2006) shows mathematically, under a wide range of assumptions, that the probabilities derived from prediction markets typically do not correspond closely to the actual probability beliefs of the market participants, unless the market probability is near either 0 or 1.
Manski suggests that directly asking a group of participants to estimate probabilities may lead to better results. Wolfers & Zitzewitz (2006) show mathematically that for a broader class of models, prediction market prices are usually close to the mean beliefs of traders. Wolfers & Zitzewitz (2006) thus contradict Manski (2006), finding that "Manski's special case is in fact a worst-case scenario".

(3) Application: Most empirical studies of prediction market applications benchmark the markets against forecasts from experts, bookmakers or exit polls.

Forsythe et al. (1992) document the first application of a prediction market mechanism, the Iowa Electronic Market, designed to predict the results of US presidential elections. Forsythe et al. compare exit polls with prices at the Iowa Electronic Market. They find that "the market worked extremely well, dominating polls in forecasting the outcome of the 1988 presidential election".

Cowgill & Zitzewitz (2013) examine results from corporate prediction markets at Google, Ford, and Koch Industries. They compare market prices with forecasts from experts in the different companies and conclude that prediction markets yield more precise estimates.

Smith et al. (2006) examine prediction markets for UK horse racing. They benchmark market prices against prices announced by bookmakers and find that prediction market prices are more efficient. Schmidt & Werwatz (2002) analyze prediction markets for the Euro 2000 Championship in soccer and conclude that prediction markets are more efficient than bookmaker markets.

(4) Law and policy: Arrow et al. (2008) argue for liberalization of US laws regarding gambling markets, so that companies and governments can leverage the "power of prediction markets".

1.2.2 Sport betting markets

There is a long and established literature examining the efficiency of sports betting markets (see Hausch et al., 1994, 2008, for literature surveys). Studies have been justified in their own right due to the sheer size of betting markets but have also enhanced the understanding of more far-ranging environments:

"Wagering markets are especially simple financial markets, in which the pricing problem is reduced. As a result, wagering markets can provide a clear view of pricing issues which are complicated elsewhere." (Sauer, 1998)

The literature, which is mainly focused on horse racing, can roughly be split into two categories: (1) building strategies to achieve positive returns; (2) explaining observed inefficiencies in betting markets.

Building strategies to achieve positive returns

This part of the literature seeks to build strategies that achieve a positive return by betting via strategies that are either mainly technical or mainly fundamental in nature.

Technical strategies are built to exploit weak-form inefficiencies. Weak-form inefficiency occurs if prices are not well adjusted to historical prices. An example of such an inefficiency is the commonly observed favorite-longshot bias (the finding is typically attributed to Griffith, 1949). Shmanske (2005) builds a naive betting strategy for betting on golf at a casino with the aim of exploiting the favorite-longshot bias. He is not able to create a profitable strategy. Ziemba (2008) surveys the literature on efficiency in financial, sports and lottery markets. He collects data from a large number of horse race studies to illustrate the favorite-longshot bias.
Based on data on over 50,000 horse races and 300,000 horses, he illustrates a clear positive relationship between the expected return and the likelihood of winning: the more likely you are to win, the higher your expected return.

Fundamental strategies are focused on modeling probabilities based on historical fundamental information such as past results. These strategies thus seek to exploit semi-strong form inefficiencies. Bolton & Chapman (1986) were among the pioneers in building fundamental betting strategies for horse races. Bolton & Chapman develop a conditional logit model and conclude that "[their] betting strategy appears to offer the promise of positive expected returns". My thesis employs many of the ideas developed by these authors.

Benter (1994) develops a so-called two-step model, which he operationalizes with a fractional Kelly betting strategy. The first step of his model is closely related to the one proposed by Bolton & Chapman (1986); the second step incorporates market odds in a separate stage of the modeling process. Benter claims to have bet according to his model for a number of years and earned a significant positive return.

Sung & Johnson (2012) compare the effectiveness of one- and two-step conditional logit models for predicting UK horse races (i.e. they compare the types of models proposed by Bolton & Chapman (1986) and Benter (1994), respectively).

Explaining the observed inefficiencies in betting markets

A lot of energy has been put into explaining the favorite-longshot bias. Ottaviani & Sørensen (2008) point out that the favorite-longshot bias is a "widely documented empirical fact" observed across different events, countries, and market structures. The favorite-longshot bias is often perceived to be an important deviation from the market efficiency hypothesis. Ottaviani & Sørensen (2008) present an overview of the main proposed theoretical explanations.

Chapter 2

Empirical evidence of inefficiencies in Betfair's golf betting markets

In this chapter, I develop a set of betting models with the aim of exploiting potential inefficiencies in Betfair's golf markets for the PGA Tour and the European Tour. The majority of the chapter is focused on exploiting potential semi-strong form inefficiencies that could arise because odds are poorly adjusted to historical information.

2.1 Domain knowledge

The following seeks to establish a domain-specific knowledge foundation from which decisions with regard to data scraping, feature extraction, data modeling, outlier detection etc. can be drawn. Two main areas are introduced: (a) golf tournaments as well as economic incentives and psychological aspects of the game; (b) golf betting markets and how person-to-person sports betting markets work in practice.

2.1.1 Golf

Golf is a precision club-and-ball sport in which competing players (golfers) use clubs to hit a ball into a series of holes on a course. Table A.1 presents the golf terms used most frequently in this thesis; the table can be used as a reference for readers not familiar with the sport.

Courses: Golf is one of the few ball games that does not require a standardized playing area. The game is played on a course, typically consisting of 18 holes. Each of the holes on the course contains a tee-box to start from and a putting green containing the hole. In between are other forms of terrain such as the fairway, rough, and hazards. Virtually all courses are unique in their specific layout and arrangement.
Furthermore, courses change over time: the tee-box can be moved around, thus altering e.g. the length of the hole, and the length of the grass may also change. The holes can furthermore be modified for specific tournaments, e.g. in order to increase the level of difficulty.

Tournaments: Golf competitions are generally played for the lowest number of strokes by an individual golfer, known as stroke play. The winner of a tournament is the golfer who completes typically four rounds of typically 18 holes using the lowest number of strokes. Most tournaments have a "cut" after the second round, in which a minimum aggregate score is selected to eliminate about half of the golfers, i.e. only the best half of the golfers play the remaining third and fourth rounds.

Pro golf tours: A small elite of professional golfers are tournament pros who compete in international pro tours. There are a number of pro tours. The PGA Tour tends to attract the best golfers, since the tournaments in the PGA Tour have the highest prize money. The second most prestigious tour worldwide is the European Tour (Beard, 2009). There are furthermore a number of second-tier tours, e.g. the Nationwide Tour, the second-tier tour to the PGA Tour (the tour is currently called the Web.com Tour, but in this thesis it is referred to as the Nationwide Tour).

Major championships: The major championships are the four most prestigious men's tournaments of the year. The championships are: The Masters, the U.S. Open, The Open Championship and the PGA Championship. The top golfers from all over the world compete in these championships due to the prestige and high prize money. The amount of competition and prestige in these tournaments is thus second to none.

Weather: Weather plays a role in golf. Weather conditions such as rain or wind heavily influence golfers' performance. Courses are exposed to weather to varying degrees: some are very exposed to wind, while others lie more protected, e.g. in forests. Some golfers are better than others in windy weather, some like rain more than others, etc. The role of weather becomes even more complex due to the fact that tournaments are played over entire days, i.e. golfers in the same tournament may experience different weather conditions depending on when they play.

Economic incentives and psychological aspects of the golf tournament

Economists have long studied how the marginal return to effort influences the performance of workers, executives and even golf players. Using data from the PGA Tour in 1984 and the European Tour in 1987, Ehrenberg & Bognanno (1990) find that golfers' performance tends to vary positively with both (1) total prize money in tournaments and (2) the marginal return to effort. These findings have later been confirmed by, among others, Tanaka & Ishino (2012). Tanaka & Ishino also study the effect on golfers' performance if a superstar (i.e. a very good golfer) is playing in a given tournament; they find that the presence of a superstar adversely affects the other golfers' scores.

Although the above findings have been contradicted by some (see e.g. Orszag, 1994), the findings may suggest that:

• Incentives for playing optimal golf might diminish if the chance of winning disappears.

• All golfers have a chance of winning in the first round of a tournament; the golfers' performance in the first round might therefore give a good indication of their optimal performance potential.
Psychologists have also spent much time analyzing golfers and their performance under stress. One of their main focus areas has been the concept of choking, which refers to failing under high pressure, i.e. missing the shots when they matter most (see e.g. Beilock & Carr, 2001).

There are many anecdotes of golfers who have choked under pressure throughout history. One of them involves Kenny Perry, who could taste victory with a two-shot lead and only two holes to go at the prestigious 2009 Masters. Perry had played his career's best golf until that point. He made a bogey at the second-to-last hole due to a shot that sailed over the green. At the final hole he could, after a series of shaky strokes, putt for the championship, but he failed and ended third after Angel Cabrera and Chad Campbell. The next day Perry was quoted as saying: "Great players make it happen [...] Your average players don't. And that's the way it is." The dividing line between winning and losing is thus not only about talent, technique or athleticism.

2.1.2 Betfair's sport betting exchange

Betfair is an online provider of an exchange which gives its users the option of backing or laying bets on events. An event could for example be a golf tournament. Betfair is thus not a bookmaker, which announces the odds, but an exchange provider. Betfair makes money by taking a commission on all winnings; this commission is about 5%.

Betfair is a person-to-person exchange where individuals contract their contrasting opinions with each other. Users can post the prices and amounts at which they are willing to place a bet - on or against - a given outcome, e.g. Jason Day winning the Accenture Match Play 2014. The demand for bets is then displayed in the order book, which shows the most attractive odds with the corresponding available volume (see Figure 2.1). Bettors have the choice of either (1) placing a limit order, which is an order to buy or sell a bet at a specific price, and waiting for another user to match the bet, or (2) placing a market order and thereby directly matching a bet that has already been offered by another user.

Figure 2.1 shows a screenshot of the sporting exchange for the Accenture Match Play 2014.

Figure 2.1: A screenshot of Betfair's online sport exchange for the event Accenture Match Play 2014. The table shows the three most attractive odds at which to back and lay bets for the eight leading golfers.

On this market bettors have the option of either:

• Placing a market order backing e.g. Jason Day at odds of 5.5 for up to £800 (see Figure 2.1). If the bettor backs Jason Day with £1 and Jason wins, the bettor would get £5.5 · (1 − 0.05) = £5.23, where 0.05 is Betfair's take.

• The bettor could also choose to place a limit order backing Jason Day at different odds of e.g. 5.7. Only if another bettor chooses to lay this bet will it be matched.

• Placing a market order laying e.g. Jason Day at odds of 5.7. If the bettor lays £1 on Jason Day and Jason wins, the bettor will have to pay £5.7; otherwise the bettor will get to keep the £1 (minus Betfair's cut).

• The bettor could also choose to place a limit order to lay a bet of £1 on Jason Day at lower odds, say 5.5, and then hope to have the bet matched.
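The payout arithmetic in the examples above can be summarized in a few lines of code. This is a sketch that follows the text's own arithmetic (commission is deducted from the full payout of a winning back bet, and a layer of £1 pays the full odds if the golfer wins); the function names are mine.

```python
COMMISSION = 0.05  # Betfair's approximate take on winnings

def back_return(stake, odds, golfer_wins):
    """Total amount returned to a backer, following the arithmetic in the text:
    the payout stake * odds is reduced by Betfair's commission."""
    return stake * odds * (1.0 - COMMISSION) if golfer_wins else 0.0

def lay_return(stake, odds, golfer_wins):
    """Outcome for a bettor laying `stake` at `odds`: pay stake * odds if the
    golfer wins, otherwise keep the backer's stake minus Betfair's cut."""
    return -stake * odds if golfer_wins else stake * (1.0 - COMMISSION)

# Backing Jason Day with GBP 1 at odds 5.5, as in the first example:
print(back_return(1.0, 5.5, golfer_wins=True))   # 5.225, i.e. ~GBP 5.23
# Laying GBP 1 on Jason Day at odds 5.7, as in the third example:
print(lay_return(1.0, 5.7, golfer_wins=True))    # -5.7
```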
Betfair has, due to its status as the leading provider of an online sporting exchange, been the focus of much academic attention (see e.g. Franck et al., 2010; Smith et al., 2009). In 2010, Betfair accounted for 90% of all exchange-based betting activity worldwide and claimed to process five million trades a day (Franck et al., 2010).

2.2 Data

The data used in this thesis comes from two different publicly available sources (see References for descriptions and URLs):

1. Historical golf results and golf course characteristics are extracted from: Yahoo (2014a)

2. Historical odds and betting volumes (how much money is betted) for golf events are extracted from: Betfair (2014)

A novel dataset is created in which data from the two above-mentioned sources are merged. The merged dataset contains attributes that capture:

• Information associated with golfers' historical results and performance;

• Characteristics of golf courses;

• The odds at which the golfers were traded on Betfair's sports exchange prior to the beginning of the tournaments, as well as the amount of money that was traded (odds and volume information is only available for a subset of the tournaments).

From the merged dataset a number of attributes are extracted to create the final dataset. The final dataset is split in two: (1) a training set and (2) a test set. The first set is used for training models; the second is used to simulate betting strategies and evaluate whether they are profitable. The idea of using an out-of-sample test set is to avoid over-fitting the data; the aim is to be able to make good out-of-sample predictions.

It is beyond the scope of this thesis to give an in-depth description of the technicalities of the data-scraping process (data scraping is a technique in which a computer program extracts data from human-readable output coming from another program). The focus of the following will be on dataset attributes, summary statistics and data processing.

2.2.1 Historical golf results

The website Yahoo has a sub-site that gives access to historical results for golf tournaments in the golf pro tours: PGA Tour, European Tour, Champions Tour, Nationwide Tour and the Ladies Professional Golf Association (LPGA). The data is consistent in format from 2002 until now (January 2014). Access to the data is granted for non-commercial use (Yahoo, 2014c). Figure 2.2 shows a screenshot from Yahoo in which the results for a sample tournament (The Players Championship, 2013) are displayed.

Figure 2.2: Origin of the historical golf results used in this thesis. The screenshot illustrates the data available for 'The Players Championship', 2013. Source: Yahoo (2014b)

Using a scraping algorithm specifically written for the task, all data from the tournaments in the date interval from January 2002 to December 2012 in the pro tours PGA Tour, European Tour, Champions Tour and Nationwide Tour are scraped and saved in a database.
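To give a flavor of what such a scraping algorithm involves, here is a minimal sketch; the URL pattern and HTML selectors are hypothetical placeholders rather than Yahoo's actual page structure, which the real scraper had to follow.

```python
import requests
from bs4 import BeautifulSoup

def scrape_leaderboard(tournament_id):
    """Download one tournament page and return its result rows.

    The URL below is a hypothetical placeholder; the actual scraper has to
    follow the structure of Yahoo's leaderboard pages.
    """
    url = f"https://sports.yahoo.com/golf/pga/leaderboard?id={tournament_id}"
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # one row per golfer: pos, name, round scores, strokes, purse
            rows.append(cells)
    return rows
```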
Second-tier pro tours such as the Champions Tour and the Nationwide Tour are included in the dataset because they qualify golfers for the PGA Tour and the European Tour, which I aim at estimating winning probabilities for. Historical results from the Champions Tour and the Nationwide Tour thus provide valuable information. Data for the LPGA is not used in this thesis due to the fact that the amount of data for male golfers is much greater.

The data extracted via the scraping algorithm contains the attributes listed in Table 2.1. The dataset contains one observation, with values for the listed attributes, for each golfer in each tournament.

Table 2.1: Attributes in the dataset scraped from Yahoo (2014a). Class=1: tournament specific; Class=2: golfer specific.

Attribute   Class  Description
id          1      Unique tournament id
protour     1      Pro tour, e.g. PGA
date        1      Date of tournament
tournament  1      Name of tournament
course      1      Name of the course
par         1      Par for one round on the course
yardage     1      Length of course (yards)
name        2      Name of golfer
pos         2      Position in tournament
t           2      Tie between two or more players
round1      2      Strokes used in round 1
round2      2      Strokes used in round 2
round3      2      Strokes used in round 3
round4      2      Strokes used in round 4
playoff     2      Strokes used in playoff
strokes     2      Strokes used in tournament
purse       2      Prize money (USD)

There are basically two classes of attributes:

1. The first seven attributes are tournament specific, and their values are thus constant for all golfers in the same tournament.

2. The eighth attribute, name, is golfer specific, and the remaining nine attributes describe how that golfer performed in a specific tournament and how much money he won in prize money.

Summary statistics

The data has two main dimensions: (1) time: the data lies in the time interval from January 2002 to December 2012; (2) pro tour: PGA Tour, European Tour, Champions Tour and Nationwide Tour. Summary statistics for the attributes, especially for the attribute purse, vary along these two dimensions. Key insights into the data in terms of size and characteristics are given in the following. Table 2.2 gives summary statistics for the dataset; Figure 2.3 depicts boxplots of the purses for the tournaments' winners in the pro tours for each year; and Figure 2.4 depicts the average purse distribution for the four pro tours, i.e. for each pro tour the plot illustrates the average purse as a percentage of the total amount of prize money for each position.

Table 2.2: Summary statistics for the historical golf results. '#' refers to 'number of'.

Pro tour    # observations  # tournaments  # golfers  # courses
All tours   218,141         1,714          6,887      465
PGA         73,872          550            2,474      161
European    64,352          478            3,079      143
Champions   28,654          337            1,390      100
Nationwide  51,263          349            2,741      81

Figure 2.3: Boxplot of purse (in 1000 US dollars) for the tournament winners in the pro tours by year.

Main observations from Table 2.2, Figure 2.3 and Figure 2.4:

• An average of 218,141 / 1,714 = 127 observations per tournament, i.e. 127 golfers on average in each tournament.

• An average of 218,141 / 6,887 = 32 observations per golfer, i.e. each golfer is on average part of 32 tournaments.

• Pro tours share both golfers and courses (the sum of the number of golfers and courses in each pro tour is higher than the number of golfers and courses for the entire dataset).

• Purses for winners of tournaments in the PGA Tour are much greater than for the Champions Tour and Nationwide Tour.

• There is great variance in purses for winners of tournaments in the European Tour. While most purses are on par with the purses in the Champions Tour, some are on par with PGA purses and some are on par with Nationwide purses.

• Purses for tournaments in the PGA Tour and the European Tour have experienced rapid growth in terms of both average and variance over the time period.

• The purse distributions for the four pro tours are close to identical.

Figure 2.4: Average purse distribution for the four pro tours as a function of position.

2.2.2 Historical odds for golf events

Betfair gives free access to historical odds data with semi-detailed timestamps (Betfair, 2014). The attributes of the dataset are listed in Table 2.3 together with descriptions (Betfair, 2014).
After having deleted historical Betfair data for sports other than golf, the dataset contains 700,000 observations divided over the 139 tournaments in 2011 and 2012. See Table 2.4 for summary statistics for the data.

Table 2.3: Attributes in the dataset from Betfair (2014).

Attribute       Type       Description
EVENT ID        integer    Betfair's event id
EVENT           string     Name of event (golf tournament)
SELECTION ID    integer    Betfair's golfer id
SELECTION       string     Name of golfer
ODDS            double     The odds
NUMBER BETS     integer    Number of bets on golfer
VOLUME MATCHED  integer    Volume matched (GBP)
LATEST TAKEN    timestamp  When the odds was last matched on the selection
FIRST TAKEN     timestamp  When the odds was first matched on the selection
WIN FLAG        binary     Win: 1. Lose: 0
IN PLAY         binary     In-play: 1. Pre-event: 0

The format of the data is not ideal. The problem with the dataset is that the data is not properly timestamped. Each observation contains two timestamps, FIRST TAKEN and LATEST TAKEN, i.e. each observation contains a date-time range in which a given odds was traded for a given golfer in a given tournament, either in-play or pre-event.

To illustrate the complexity of determining the odds for a given golfer prior to the start of a tournament, the odds and volume matched in pound sterling (GBP) for four golfers prior to the start of the Omega Dubai Desert Classic (starting February 10, 2011) are depicted in Figure 2.5. The complexity arises because each golfer is traded at various odds leading up to the start of the tournament. The key takeaway from Figure 2.5 is that the vast majority of the volume traded for each player prior to tournament start is matched at approximately the same odds.

The strategy chosen in order to match odds to the golfers prior to the beginning of each tournament is to choose the odds with the highest trading volume. 43% of the total volume traded for golfers prior to the start of the tournaments is matched at the odds with the highest trading volume, and 89% is matched at odds within ±10% of the odds with the highest trading volume. Given these statistics, I find it reasonable to perceive the odds with the highest trading volume as the market odds.

Summary statistics

Table 2.4 lists summary statistics for the historical golf odds from Betfair. From the table it is clear that it is more popular to bet on tournaments in the PGA Tour than on tournaments in the European Tour.

Table 2.4: Summary statistics for the historical golf odds from Betfair.

Pro tour   # Observations  # Tournaments  Avg. matched (GBP)*
All tours  698,345         139            3,198,346
PGA        420,050         76             4,687,875
European   278,295         63             1,377,812
* Average volume of bets in GBP matched during the tournaments.

Figure 2.5: Odds and volume matched prior to the start of the Omega Dubai Desert Classic (Feb. 10, 2011) for the two favorites as well as the winner and the runner-up: (a) Martin Kaymer, the favorite prior to the start of the tournament; (b) Tiger Woods, the second favorite prior to the start of the tournament; (c) Alvaro Quiros, the winner of the tournament; (d) James Kingston, number two in the tournament. The vast majority of the volume for the players is matched at approximately the same odds. Since the data is not fully timestamped, it is unclear when the odds were actually matched; the volume matched for the odds is here added at the time it was first matched (i.e. FIRST TAKEN).
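The rule of taking, for each golfer in each tournament, the pre-event odds at which the most volume was matched can be expressed compactly with pandas. A sketch, assuming the Betfair data has been loaded into a DataFrame with underscored versions of the column names from Table 2.3:

```python
import pandas as pd

def market_odds(betfair: pd.DataFrame) -> pd.DataFrame:
    """For each golfer in each tournament, return the pre-event odds at which
    the highest volume was matched - the 'market odds' used in this thesis."""
    pre_event = betfair[betfair["IN_PLAY"] == 0]
    # Row with the largest matched volume per (tournament, golfer) pair
    idx = pre_event.groupby(["EVENT_ID", "SELECTION_ID"])["VOLUME_MATCHED"].idxmax()
    return pre_event.loc[idx, ["EVENT_ID", "SELECTION_ID", "SELECTION", "ODDS"]]
```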
2.2.3 Merging the two data sources

In order to merge the two datasets, a table is manually created that adds a linkage between the unique tournament id from Table 2.1 and the EVENT ID from Table 2.3. The second link needed in order to successfully merge the datasets is between the attributes name from Table 2.1 and SELECTION from Table 2.3, combined with the linking logic described in subsection 2.2.2 (matching the odds with the highest trading volume).

2.2.4 Extracting attributes for statistical modeling

Observations for each of the 1,714 tournaments have 20 attributes, implying that the total number of attributes available for calculating winning probabilities for a future golf tournament could be very high. The number of attributes could be high because: (1) all previous information could potentially be used in determining future winning probabilities; (2) observations for each tournament are located in different places in the space spanned by the date and pro tour dimensions. It is clear that some form of dimensionality reduction is needed in order to capture the important information in the dataset more effectively. (The methodology and techniques used for feature extraction and variable transformation in the following come from Tan et al., 2013, Chap. 2.)

I turn to the existing literature on sports betting for ideas to reduce the dimensionality of my dataset. The literature on estimating winning probabilities for golfers in golf tournaments contains, to my present knowledge, one article: Shmanske (2005) models winning probabilities for golfers based on summary statistics provided by the PGA Tour, and thus does not directly use past golf results in his model. However, many articles have been written on horse racing with this focus (see e.g. Bolton & Chapman, 1986; Lessmann et al., 2007; Sung & Johnson, 2012). Many aspects of horse racing are comparable to golf tournaments: horses compete against other horses of varying quality and form, just as golfers compete against other golfers of varying quality and form; each horse runs in many races, just as each golfer plays in many tournaments; the courses vary in length; etc.

Table 2.5 lists some of the aggregating attributes that have been used in the horse-racing literature in order to reduce dataset dimensionality.

Table 2.5: Attributes used for winning probability estimation in horse racing.

No.  Attribute description
1    Speed rating for the previous race in which the horse ran
2    The average of a horse's speed rating in its last 4 races; zero when there is no past run
3    Total prize money earnings (finishing first, second or third) to date / number of races entered
4    The percentage of the races won by the horse in its career
5    The natural logarithm of the normalised final odds probability

Only attributes deemed relevant in the golf context are included; the complete attribute list can be found in Sung & Johnson (2012). The first four attributes were proposed by Bolton & Chapman (1986); the last was proposed by Benter (1994).

Table 2.5 contains attributes whose goal it is to proxy: (1) a horse's quality, via e.g. the attributes for prize money earnings and win percentages; (2) the horse's form, via e.g. the speed rating attributes; (3) potential inside information encapsulated in the odds. The idea is to incorporate attributes which both capture the underlying, probably slowly time-varying, horse quality as well as a measure, probably more volatile, of current form.
The attributes given in the table are likely strongly correlated and thus capture aspects of both of the underlying measures. It is clear from the table that the academics who have used these attributes have made some arbitrary choices with regard to the dimensionality reduction, e.g. averaging the speed rating over the last four races. There is, to my present knowledge, no a priori reason why four is the right number. Furthermore, issues could arise due to the fact that the time dimension in the dataset has not been incorporated into the attributes. A horse could, for example, have been sick for a year, and the four previous races (averaged over in the attribute) would then have been prior to the horse's sickness. The attribute is therefore not likely to be a good estimator of the horse's current quality and form.

I propose two sets of attributes to be used in predicting the winning probabilities for golfers: (1) a set of attributes resembling the attributes used in the literature on horse-race estimation (Table 2.5); (2) a set of attributes with less arbitrary choices with regard to the dimensionality reduction.

I will introduce some notation for golfers and tournaments in order to make the following easier to read. The dataset contains n tournaments, denoted j = 1, 2, ..., n. In tournament j, m_j golfers compete against each other; these golfers are denoted i = 1, 2, ..., m_j.

Static, arbitrary dimensionality reduction

A set of attributes resembling the attributes used in the literature on horse-race winning probability estimation (see Table 2.5) is created based on the original dataset (described in subsection 2.2.1 and subsection 2.2.2). The idea from the horse-racing literature of including attributes to proxy form and quality is used. The attributes are listed in Table 2.6. The basic ideas from the economic literature on economic incentives and psychological aspects of the golf tournament (section 2.1.1) are used to create the following two attributes:

1. A substitute for speed rating (from the horse-racing literature, see Table 2.5). The feature substitute is named score rating and is given by the number of strokes used by golfer i in the first round of tournament j minus the median number of strokes used by the golfers in round 1 of tournament j. There is a difference between making a good score in a first-tier tour such as the PGA Tour and in a second-tier tour such as the Nationwide Tour. I have analyzed the difference by looking at score ratings for golfers participating in both first-tier and second-tier pro tours in the same calendar year. I find that the score rating is on average 1.43 strokes higher in first-tier than in second-tier pro tours. I compensate for this difference by adding 1.43 to all score ratings from tournaments in the Champions Tour and the Nationwide Tour.

2. An attribute to proxy a golfer's ability to perform under pressure. The attribute is named keep cool and is given by the number of wins divided by the number of top 10 positions. I assume that this attribute gives some indication of the golfer's ability not to choke under pressure. The amount of pressure golfers are under is higher in first-tier tours than in second-tier tours. I make the simplifying assumption that a victory in a first-tier tour should count four times as much as a victory in a second-tier tournament.
Table 2.6: Attribute set no. 1.

Attribute              Attribute description
avg score rating year  The average of a golfer's score rating* over the last year.
avg keep cool year     The average of a golfer's wins compared to top 10 positions over the last year.
avg purse 2years       Total prize money earnings over the last two years divided by the number of tournaments entered over the last two years.
ln odds                The natural logarithm of the normalized final odds probability.

* Score rating is given by the number of strokes used by golfer i in round 1 of tournament j minus the median number of strokes used by the golfers in round 1 of tournament j. Score ratings from second-tier tours have 1.43 added to compensate for the difference in level between first-tier and second-tier pro tours.

The list of attributes thus includes measures for previous winnings as well as winning percentages; Betfair odds are included for a part of the dataset.

Dynamic dimensionality reduction

I create a new set of attributes (listed in Table 2.7). The set contains the same sort of information as the static set (Table 2.6), but the attributes proposed in this subsection reduce the original dataset less in terms of dimensionality. This attribute set contains vectors instead of single numbers. For example, avg purse 2years (from Table 2.6) contains one number per golfer per tournament, which averages the previous years' winnings. In the attribute set in this subsection, historical purse information is instead captured in a vector, π_ij, with D elements; each element, π_ij,d, contains the purse won d days prior to the start of tournament j. The following table lists all the attribute vectors.

Table 2.7: Attribute set no. 2.

Attribute  Attribute description
π_ij       A vector containing D elements, where the dth element, π_ij,d, specifies the purse won by golfer i, d days prior to the beginning of tournament j.
γ_ij       A vector containing D elements, where the dth element, γ_ij,d, specifies the score rating of golfer i, d days prior to the beginning of tournament j.
w_ij       A vector containing D elements, where the dth element, w_ij,d, specifies whether golfer i won a tournament d days prior to the beginning of tournament j.
ψ_ij       A vector containing D elements, where element d, ψ_ij,d, is a binary attribute that specifies whether golfer i participated in a tournament d days prior to tournament start.
c_ij       A vector containing D elements, where element d, c_ij,d, is a binary attribute that specifies whether golfer i ended in the top 10 in a tournament d days prior to tournament start.
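A sketch of how the day-indexed vectors of Table 2.7 can be built for one golfer relative to one tournament; the input format (an iterable of result tuples) is an assumption made for illustration, and the score-rating vector γ_ij follows the same pattern.

```python
import numpy as np

def attribute_vectors(past_results, D=730):
    """Build the day-indexed vectors pi, w, psi and c of Table 2.7 for one
    golfer i relative to tournament j.

    `past_results` is assumed to be an iterable of tuples
    (days_before_j, purse_won, won_tournament, top10_finish).
    """
    pi  = np.zeros(D)  # purse won d days prior to tournament j
    w   = np.zeros(D)  # 1 if the golfer won a tournament on day d
    psi = np.zeros(D)  # 1 if the golfer participated on day d
    c   = np.zeros(D)  # 1 if the golfer finished top 10 on day d
    for d, purse, won, top10 in past_results:
        if 1 <= d <= D:  # only the D days prior to tournament start
            pi[d - 1] += purse
            w[d - 1] = max(w[d - 1], float(won))
            psi[d - 1] = 1.0
            c[d - 1] = max(c[d - 1], float(top10))
    return pi, w, psi, c
```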
2.2.5 Removing outliers

The major championships are by far the four most prestigious annual tournaments in professional golf (Beard, 2009). Elite players from all over the world participate in them, and the reputations of the greatest players in golf history are to a large degree based on the number and variety of major championship victories they accumulate (see subsection 2.1.1). I assume that it is very likely that another set of parameters influences the chance of winning a major championship compared to a regular non-major championship, and that a specific model should be built in order to estimate winning probabilities for major championships. I therefore choose to remove all major championships from the list of tournaments I estimate winning probabilities for. I thus estimate winning probabilities for non-major tournaments in the PGA Tour and the European Tour, based on a dataset containing both major and non-major tournaments from the four pro tours: PGA, European, Nationwide and Champions.

2.2.6 Weaknesses of the dataset

My dataset has three main weaknesses compared to studies of market efficiency in betting markets for e.g. horse races. The weaknesses are inherent in the way that professional golf tournaments are structured. Firstly, I have relatively few tournaments with odds in my dataset: even though I have odds for all tournaments in 2011 and 2012, this amounts to only 139 tournaments. Secondly, many golfers participate in each tournament, and many of these golfers have a realistic chance of winning. Thirdly, I assume that there are a lot of factors influencing golfers' performance that are very difficult to quantify; these factors could include e.g. the psychological state of the golfers.

The three weaknesses of the dataset given above lead to a couple of issues: (a) they make it harder to model the golf-tournament process and to predict future events; (b) they weaken my conclusions; and (c) they do not enable me to perform the same weak-form test as the horse-race literature, because the horse-race studies rely on larger datasets. For example, Ziemba (2008) uses data from 50,000 horse races to illustrate a clear positive relationship between the expected return and the likelihood of winning.

The literature on horse racing offers two solutions to the first weakness - that there are few tournaments in the dataset. Firstly, Bolton & Chapman (1986) describe an 'explosions' principle to double the size of the dataset. The general idea of the 'explosions' principle is to assume that the horse finishing second would have won if the horse that won had not participated. The principle works by duplicating every race in the dataset and removing the winner from the duplicated races. The idea is well suited for horse races, where getting over the line first is all that matters. I find this approach ill suited for golf tournaments because (a) there is a big difference between finishing first and second in terms of pressure (due to e.g. the risk of choking), and (b) often multiple players finish second because they have the same score.

Secondly, and probably most obviously: collect more data. I have chosen to limit my odds data collection to two years because: (a) there is a non-trivial amount of work involved in merging the odds data with the remaining dataset; (b) Betfair's golf markets only started picking up pace in the last part of 2010. By including previous years in the dataset I would face issues with e.g. illiquid markets; it would thus only be possible to include one more year (2013), and that would not solve the issues described above. I find two years to be a good cut-off point for obtaining results of a certain accuracy while still being able to focus on the analyses. More data could be added in the future to strengthen the results.

2.3 Modeling

The eventual goal is to find a model that can predict the winning probabilities for golfers in the PGA Tour and the European Tour, in order to be able to determine the degree of semi-strong form efficiency in golf betting markets. Four models are proposed, in increasing order of complexity. The starting point is the existing literature on winner probability estimation for sports events, where the predominant focus has been on horse racing. Two main approaches have been used:

1. A one-step model (Bolton & Chapman, 1986) in which either only fundamental variables (see attributes no. 1-4 in Table 2.5) or both fundamental variables and the market-generated variable (i.e. the odds) are used.
The conditional logit model has been the most widely used statistical classification model due to its ability to account for both independent variables measuring a horse's potential and within-race competition.

2. A two-step model (Benter, 1994) in which the modeling as well as the dataset is split in two parts: (1) step one is to model the fundamental winning probabilities based only on fundamental variables, using only the first part of the dataset; (2) step two is to model the winning probabilities based on the market-generated variable as well as the predicted fundamental winning probabilities (the model fitted in step one), using the second part of the dataset. The most widely used statistical classification model has been the conditional logit. However, lately new machine learning algorithms have been proposed; e.g. Lessmann et al. (2007) use least-squares support vector regression for step (1).

I create two models based on the one-step model and two models based on the two-step model. Aspects of the proposed models deviate from standard 'off the shelf' statistical models. The modeling has therefore been written in the programming language Python, with help from, among others, the open source libraries Sci-Kit learn (Pedregosa et al., 2011) and SciPy (Jones et al., 2001–).

2.3.1 One-step estimation: Conditional logit model

The aim of the model is to predict the winning probability, ρ_ij, for golfer i = 1, 2, ..., m_j in tournament j = 1, 2, ..., n, where n is the number of tournaments and m_j is the number of golfers in tournament j. Golf tournaments are, just like horse races, highly competitive; a good probability estimate of golfer i's chance of winning tournament j is thus more likely to be obtained if his chance of winning is regarded as being conditional on the information available for the other golfers in the tournament. The conditional logit model (proposed by McFadden, 1974) seems to fit the task well. The model should predict the winning probabilities based on a matrix, X_j, capturing relevant information for all the golfers in tournament j. X_j should only contain information that is publicly available prior to the start of tournament j.

A general specification of a statistical model of the golf-tournament process could be proposed as:

\rho_{ij} = \rho(X_j)    (2.1)

ρ(X_j) should satisfy the standard axioms of non-negative probabilities as well as probabilities summing to one for all golfers in the same tournament. The conditional logit model satisfies these axioms (Bolton & Chapman, 1986). The model furthermore captures the competitive nature of golf tournaments.

It is assumed that the true ability, u_ij, of golfer i in tournament j is composed of two parts: (1) a deterministic part v_ij = v(x_ij), where x_ij is vector i in the information matrix X_j, and (2) a stochastic part ε_ij which reflects measurement errors in the modeling process. To move on from here, it is assumed that the stochastic part, ε_ij, is independent of the deterministic component, v_ij, as well as identically and independently distributed according to the double exponential distribution. Under these assumptions, McFadden (1974) shows that the conditional probability, ρ_ij, is given by:

\rho_{ij} = \frac{\exp(v_{ij})}{\sum_{i=1}^{m_j} \exp(v_{ij})}    (2.2)

The predominant way to specify v_ij is via a linear-in-parameters specification:

v_{ij} = \beta x_{ij}    (2.3)

where x_ij is a column vector capturing relevant information for golfer i prior to tournament j. I normalize each attribute in the information matrix, X_j, to have zero mean and unit standard deviation (this is done by subtracting the mean and dividing by the standard deviation for each attribute). By normalizing the attributes, the row vector of coefficients, β, will measure the relative importance of the elements of x_ij in determining the winning golfer. βx_ij is the dot product of the two vectors.

The β vector is estimated on the training dataset via maximization of the log-likelihood function

L = \sum_{j=1}^{n} \sum_{i=1}^{m_j} y_{ij} \ln(\rho_{ij})

(McFadden, 1974), where y_ij ∈ {0, 1} denotes whether golfer i won tournament j. Edelman (2007), Sung & Johnson (2012) and Bolton & Chapman (1986) have all used Equation 2.2 and Equation 2.3 to model the winning probabilities of race horses.
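The model above is compact enough to sketch directly in Python. The code below is a simplified illustration of Equations 2.2 and 2.3 and the log-likelihood, with names of my own choosing; the actual implementation in this thesis relies on SciPy for the optimization.

```python
import numpy as np
from scipy.optimize import minimize

def win_probabilities(beta, X):
    """Equations 2.2/2.3: conditional logit probabilities for one tournament.
    X is the (m_j x k) matrix of normalized attributes for the m_j golfers."""
    v = X @ beta                # deterministic ability v_ij = beta . x_ij
    v = v - v.max()             # guard against overflow in exp
    e = np.exp(v)
    return e / e.sum()          # probabilities sum to one within the tournament

def negative_log_likelihood(beta, tournaments):
    """Sum of -ln(rho_ij) over the winners of all tournaments;
    `tournaments` is a list of (X_j, winner_index) pairs."""
    nll = 0.0
    for X, winner in tournaments:
        nll -= np.log(win_probabilities(beta, X)[winner])
    return nll

def fit(tournaments, k):
    """Maximum likelihood estimate of the k coefficients in beta."""
    res = minimize(negative_log_likelihood, x0=np.zeros(k), args=(tournaments,))
    return res.x
```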
Estimation and results

Table 2.8: Conditional logit - results.*

Attribute              Coefficient estimate (β̂)
avg score rating year  -0.648
avg purse 2years       0.784
avg keep cool year     0.120

Summary statistics
L(β = β̂)     -3294.32
L(β = 0)     -3887.54
McFadden R²  0.15

* Estimated on data from 795 tournaments. All attributes are significant at the 1% level of confidence. Significance is calculated based on the LR test, see Appendix A.2.

The coefficients in Table 2.8 are quite intuitive:

• The avg score rating year attribute has a negative coefficient, i.e. the more strokes a golfer used, compared to the median golfer in the tournaments he competed in, the less likely he is - ceteris paribus - to win a future tournament.

• The avg purse 2years attribute has a positive coefficient, i.e. the more money a golfer has won on average over the last couple of years, the more likely he is - ceteris paribus - to win a future tournament.

• The coefficient for avg keep cool year indicates that golfers who have a high ratio of wins to top 10 positions are likely to win future tournaments.

2.3.2 One-step estimation: Conditional logit model with variable attribute discounting

A new dimensionality-reducing function, ξ, is introduced, which aggregates the values of a fundamental attribute, Q (e.g. historic purses, score ratings or 'keep cool'), and weights the aggregate by another fundamental attribute, W (e.g. participations or top 10 positions). The intuition behind ξ is that new information should weigh more than old information in proxying a golfer's quality and form. Let the aggregated value of attribute Q for golfer i, weighted by W, in the time interval [1 : D_Q] prior to the beginning of tournament j be given by:

\xi_{ij,Q,W} = \xi(Q, W, \theta, D_Q, i, j) = \frac{\sum_{d=1}^{D_Q} \theta_Q^d \, Q_{ij,d}}{1 + \sum_{d=1}^{D_Q} \theta_Q^d \, W_{ij,d}}    (2.4)

where θ_Q is a discounting factor, and Q ∈ {π, γ, w} and W ∈ {ψ, c} are arrays of fundamental attribute values (see Table 2.7): π_ij is a vector containing D_π elements, where the dth element, π_ij,d, specifies the purse won by player i, d days prior to the beginning of tournament j; γ_ij contains D_γ elements with the historic score ratings of golfer i in the interval [1; D_γ]; w_ij contains D_w elements, where w_ij,d ∈ {0, 1} denotes whether golfer i won a tournament on day d; ψ_ij contains D_ψ elements, where ψ_ij,d ∈ {0, 1} denotes whether golfer i participated in a tournament on day d; finally, c_ij contains D_c elements, where c_ij,d ∈ {0, 1} denotes whether golfer i ended in the top 10 in a tournament on day d.

One is added to the denominator of Equation 2.4 in order to (a) make sure that single observations in the past are discounted and (b) avoid dividing by zero.
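Equation 2.4 translates directly into code. A sketch, with the attribute and weight arrays represented as day-indexed NumPy vectors as in the sketch following Table 2.7:

```python
import numpy as np

def xi(Q, W, theta):
    """Equation 2.4: discounted, weighted aggregate of attribute Q weighted by W.

    Q and W are day-indexed vectors for one golfer (element d-1 holds the value
    d days prior to tournament start); theta is the discounting factor theta_Q.
    """
    D = len(Q)
    weights = theta ** np.arange(1, D + 1)  # theta^d for d = 1..D
    return np.dot(weights, Q) / (1.0 + np.dot(weights, W))
```

Note that for theta = 1 and D = 730 this reduces to the un-discounted avg purse 2years attribute of Table 2.6, up to the one added in the denominator.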
The function ξ returns a discounted, weighted average of either purse (π), score rating (γ) or 'keep cool' (w), weighted by either participations (ψ) or top 10 positions (c), for golfer i in tournaments in the date interval [1; D_Q] prior to the start of tournament j. ξ_ij,π,ψ is a generalization of the attribute avg purse 2years from the static set (Table 2.6): avg purse 2years = ξ_ij,π,ψ for θ_π = 1, D_π = 730. In the same fashion, ξ_ij,γ,ψ = avg score rating year for θ_γ = 1, D_γ = 365, and ξ_ij,w,c = avg keep cool year for θ_w = 1, D_w = 365. ξ_j,Q,W is a vector that contains the values of the discounted average of attribute Q weighted by W for all m_j golfers in tournament j:

\xi_{j,Q} = [\xi_{1j,Q}, \xi_{2j,Q}, \ldots, \xi_{m_j j,Q}]

The discounted sum of purses/score ratings is weighted by the discounted count of tournament participations because not all golfers participate in the same number of tournaments. It seems reasonable to assume that the number of tournaments a golfer participates in is not in itself a determining factor of his ability; Tiger Woods, for example, is known for participating in relatively few golf tournaments and being extraordinarily good. A weighted average thus seems reasonable. The discounted sum of 'keep cool' is weighted by the discounted count of top 10 positions. The geometric discounting function has been chosen due to its dominant position in the economic literature.

A modification of the linear-in-parameters specification of the deterministic part, v_ij, given in Equation 2.3 is proposed:

v_{ij} = \alpha_1 \cdot z(\xi_{j,\pi}, i) + \alpha_2 \cdot z(\xi_{j,\gamma}, i) + \alpha_3 \cdot z(\xi_{j,c}, i)    (2.5)

where α is a vector of coefficients measuring the relative importance of the three new terms, and z is a function that returns the standard scores, i.e. transforms the elements of the vector ξ_j,Q to have zero mean and unit standard deviation (std). The idea of the standardization is to center the data in order to ease the optimization problem (described later) as well as parameter interpretation. z is defined as:

z(\xi_{j,Q}, i) = \frac{\xi_{ij,Q} - \mathrm{mean}(\xi_{j,Q})}{\mathrm{std}(\xi_{j,Q})}    (2.6)

The winning probability for golfer i in tournament j is still defined as in Equation 2.2:

\rho_{ij} = \frac{\exp(v_{ij})}{\sum_{k=1}^{m_j} \exp(v_{kj})}    (2.7)

But now both of the vectors α and θ - each with three elements - are to be estimated on the training dataset via maximization of the log-likelihood function L = \sum_{j=1}^{n} \sum_{i=1}^{m_j} y_{ij} \ln(\rho_{ij}) (McFadden, 1974), where y_ij ∈ {0, 1} denotes whether golfer i won tournament j. This likelihood function is not convex, and care thus has to be taken in the optimization process in order not to end up in a local maximum. A randomized grid search is performed in which the optimization algorithm is initialized with random α and θ values.

Estimation and results

To make the results comparable to the results from the simple conditional logit model (subsection 2.3.1), the values of D_Q are chosen so that both models have the same fundamental data foundation: two years of data for historic purses are included (D_π = 730); one year of score ratings and 'keep cool' is included (D_γ = D_w = 365).

Table 2.9: Conditional logit with variable discounting of historic purses and score ratings - Results*

Attribute                                         α̂        θ̂
Discounted weighted avg. score rating, ξ_ij,γ    −0.491    0.996
Discounted weighted avg. purse, ξ_ij,π            0.624    0.998
Discounted weighted avg. keep cool, ξ_ij,c        0.113    0.999

Summary statistics
L(α = α̂, θ = θ̂)    −3273.37
L(α = 0, θ = 0)     −3887.54
McFadden R²          0.16

*Estimated on data from 795 tournaments. All attributes are significant at the 1% level. Significance is calculated based on the LR test, see appendix A.2.

The α̂ coefficients in Table 2.9, which measure the relative importance of the three attributes, are quite intuitive:

• The discounted weighted avg. score rating has a negative coefficient (−0.491), i.e. the more strokes a golfer used, compared to the median golfer in the tournaments he competed in, the less likely he is - ceteris paribus - to win a future tournament.

• The discounted weighted avg. purse has a positive coefficient (0.624), i.e. the more money a golfer has won on average over the last couple of years, the more likely he is - ceteris paribus - to win a future tournament.

• The keep cool attribute has a positive coefficient, i.e. golfers with a high ratio of wins to top 10 positions are likely to win future tournaments.

Furthermore, the discounting factors, θ̂, are in line with what one would expect. The weights used when calculating the weighted averages of the historic purses, score ratings and 'keep cool' are depicted as a function of days until tournament start, d, in Figure 2.6.

[Figure 2.6: Weights as a function of days until tournament start, d, used when calculating the discounted weighted averages of the historic purses, score ratings and 'keep cool'.]

It is clear from the figure that historic score ratings are discounted much harder than purses and 'keep cool'. This is in line with the idea that the discounted weighted average score rating is a proxy for the golfer's form, while the discounted weighted average purse as well as 'keep cool' are proxies for the golfer's quality. A priori, it makes sense that a golfer's form is less persistent than his quality.

Comparing the conditional logit model with variable attribute discounting with the conditional logit model

The 'conditional logit model' is a special case of the 'conditional logit model with variable attribute discounting': if the discounting factors, θ, all equal one, the two models are identical. The fact that one model is a special case of the other implies that a likelihood ratio test can be used to compare the fit of the two models. The test is based on the likelihood ratio, which expresses how many times more likely the data is under one model than under the other. The test statistic, LR, is given by:

LR = −2 ln(likelihood for conditional logit) + 2 ln(likelihood for conditional logit with variable discounting)
   = −2 · (−3294.32) + 2 · (−3273.37) = 41.9

The test statistic is asymptotically chi-squared distributed (Verbeek, 2008) with degrees of freedom equal to df_2 − df_1 = 6 − 3 = 3, where df_1 is the number of free parameters in the 'conditional logit model' and df_2 is the number of free parameters in the 'conditional logit model with variable attribute discounting'. The p-value for the test is very low (less than 0.0001). The results show that adding the discounting functionality to the model yields a statistically significant improvement in model fit.

2.3.3 Two-step estimation

There will always be a significant amount of information about golf tournaments that cannot easily be incorporated in a statistical model. This includes inside information as well as not easily quantifiable information, e.g. new special workouts; the golfer's intention and motivation; off-course issues affecting the golfer's performance, etc.
To exemplify such not easily incorporable information, consider the 2013 edition of the RBC Canadian Open at Glen Abbey Golf Club in Oakville, Ontario. Hunter Mahan was two strokes ahead of his nearest competitor after the second round and looked like the sure winner of the prestigious tournament and close to $1 million, when he suddenly decided to withdraw from the tournament to be present at the birth of his first child.

Benter (1994) pointed out that this kind of information will always be available to certain parties, who will no doubt take advantage of it. Their betting will be reflected in the odds. Benter (1994) therefore proposed a two-step estimation approach in which the first step is the estimation of a fundamental model (such as those in subsection 2.3.1 or subsection 2.3.2). The second step is a new conditional logit model:

c_{ij} = \frac{\exp(\beta_1 \ln(\rho_{ij}) + \beta_2 \ln(o_{ij}))}{\sum_{k=1}^{m_j} \exp(\beta_1 \ln(\rho_{kj}) + \beta_2 \ln(o_{kj}))}    (2.8)

where ρ_ij is the fundamental winning probability given by either Equation 2.2 (the simple conditional logit model) or Equation 2.7 (the conditional logit model with variable discounting); o_ij is the public's implied winning probability estimate, i.e. the inverse of the odds (o_ij = 1/Betfair odds); and ln() is the natural logarithm. c_ij is thus a combined probability estimate.

Equation 2.8 should be evaluated using fundamental probability estimates predicted by one of the two previously described models, fitted on a separate sample of tournaments. The idea of using out-of-sample estimates is, as Benter points out, to avoid overestimating the fundamental model's significance as a consequence of over-fitting.

Estimation and results

One problem arises quickly: the size of the dataset. Benter points out that:

"In the author's experience the minimum amount of data needed for adequate model development and testing samples is in the range 500 to 1000 races." (Benter, 1994, p. 185)

The dataset used in this thesis only contains odds for 139 tournaments (Table 2.4). This sample has to be used for both estimation and testing, and it thus contains far less information than Benter deemed necessary. Golf tournaments are, furthermore, harder to predict than horse races because the competition is much stiffer: 6-20 horses compete against each other, whereas 80-150 golfers compete against each other. If anything, this implies the need for even more data. To illustrate the issue, standard errors for the parameters in the models are calculated via bootstrapped resampling (Fox, 2008). Standard errors are given in parentheses in Table 2.10.

Table 2.10: Two-step models - Results*

Coefficient estimates**
Model used in step 1                                   β_1        β_2
1. Conditional logit                                  −1.80      39.92
                                                      (7.28)     (9.60)
2. Conditional logit with variable historic purse     −2.94      41.13
   and score-rating discounting                      (24.87)    (16.67)

McFadden R²**
1. Conditional logit                                   0.113
                                                      (0.03)
2. Conditional logit with variable historic purse      0.114
   and score-rating discounting                       (0.06)

*Estimated on data from 139 tournaments. **Standard errors of the parameters (in parentheses) are estimated via 5-fold bootstrapped resampling.

Benter (1994) uses the difference between the McFadden R² of the one-step model and of the two-step model, ∆R², as a heuristic measure of the potential profitability increase from going from a one-step model to a two-step model. ∆R² is negative for both models and thus does not indicate that the second step has improved on the one-step models.
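For concreteness, the second-step combination in Equation 2.8 amounts to a conditional logit on two log-transformed inputs. A minimal sketch, assuming the out-of-sample fundamental probabilities and the belief probabilities (inverse odds) for one tournament are given; the numeric values are illustrative, not fitted estimates.

```python
import numpy as np

def combined_probabilities(rho, o, beta1, beta2):
    """Second-step combined probability estimate (Equation 2.8).

    rho : (m_j,) out-of-sample fundamental probabilities from the
          step-one model.
    o   : (m_j,) belief probabilities, i.e. the inverse of the Betfair odds.
    beta1, beta2 : weights on the fundamental and market estimates.
    """
    v = beta1 * np.log(rho) + beta2 * np.log(o)
    v = v - v.max()                 # subtract the max for numerical stability
    expv = np.exp(v)
    return expv / expv.sum()

# Illustrative values only; the fitted weights in Table 2.10 would be
# plugged in for beta1 and beta2.
rho = np.array([0.10, 0.05, 0.85])
o = np.array([0.12, 0.08, 0.80])
print(combined_probabilities(rho, o, beta1=1.0, beta2=2.0))
```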
In summary, it must be concluded that odds data for a significantly larger number of tournaments is most likely needed in order for the two-step model to improve on the one-step fundamental models.

2.4 Betting strategies

Just like on the stock market, betting strategies at a golf course or any other sport event are usually either primarily technical or primarily fundamental in nature:

• Technical betting strategies take the odds for the individual golfers as their starting point. The betting strategies are constructed to take advantage of simple biases such as the commonly reported favorite-long-shot bias (Hausch et al., 1994). The bias is such that favorites tend to be underbet (and thus have too high odds compared to their true winning probability) and long-shots tend to be overbet.

• Fundamental betting strategies utilize an underlying model of fundamental attributes, such as historic results, to estimate a fundamental winning probability. (The two-step model described in subsection 2.3.3 incorporates both odds and fundamental attributes and thus places itself somewhere between a technical and a fundamental model; in the following, the two-step models will be used with the fundamental betting strategies, since they are primarily fundamental.) Based on the fundamental winning probabilities and the public's belief probabilities (given by the inverse of the odds), a betting algorithm such as the Kelly criterion (Kelly, 1956) is employed to place the bet(s).

The technical/fundamental betting strategies proposed in the following have two parameters that define how they work:

1. Number of bets: A betting strategy can either advise the bettor to place one bet or several bets.

Single bet: The bettor places a bet of $1 on golfer k. If k wins the tournament (w_k = 1), the bettor receives o_k; otherwise the bettor loses the $1. The profit, m, is thus given by:

m = o_k \cdot w_k - 1    (2.9)

Betfair charges a fee of 5% if m is greater than zero.

Multiple bets: The bettor places bets on a set of golfers, S. If i ∈ S, the bettor bets $K_i on golfer i. If golfer k wins the tournament (w_k = 1) and k ∈ S (i.e. a bet has been placed on k), the bettor receives K_k · o_k. The profit, m, is thus given by:

m = \sum_{i \in S} K_i \cdot o_i \cdot w_i - \sum_{i \in S} K_i    (2.10)

Betfair charges a fee of 5% if m is greater than zero.

2. Long/short bets: Unlike most other betting markets, Betfair offers the possibility of taking both a long and a short position in any bet. You can, for example, bet either that Tiger Woods wins the tournament or that he does not win the tournament. The spread between the two bets is quite small. The mathematical foundation for short betting on many mutually exclusive events (e.g. golfers winning a golf tournament) has, to the best of my knowledge, not been developed. I therefore choose to only allow the betting strategies to place long bets.

2.4.1 Technical betting strategies

Two kinds of technical betting strategies are proposed:

1. A random strategy in which random bets are placed. The motivation for creating a random strategy is to have a benchmark against which other strategies can be measured.

2. Two strategies designed to exploit the favorite-long-shot bias.

Random betting

To have a valid benchmark against which other strategies can be compared, a random betting scheme is proposed. The strategy is basically to take a bet of a fraction, K, of total wealth on one or multiple random golfer(s). In order to minimize the variance of such a strategy, and since the dataset only contains 139 tournaments with odds, a restriction is put on the strategy: only bet if the belief probability is higher than 5% (a limit set arbitrarily before estimation of the results), i.e. do not bet on extreme long-shots, since the number of observations is so limited.
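As a small illustration of the profit accounting in Equations 2.9 and 2.10, which all of the following strategies share, the sketch below computes the net profit of a set of bets, including Betfair's 5% fee on net winnings. The function and the example numbers are mine, not from the thesis codebase.

```python
def profit(stakes, odds, winner, fee=0.05):
    """Profit of a set of bets in one tournament (Equation 2.10).

    stakes : dict mapping golfer id -> amount bet, K_i.
    odds   : dict mapping golfer id -> decimal odds, o_i.
    winner : id of the golfer who won the tournament.
    fee    : Betfair's commission, charged only if the profit is positive.
    """
    gross = stakes.get(winner, 0.0) * odds.get(winner, 0.0)
    m = gross - sum(stakes.values())
    return m * (1 - fee) if m > 0 else m

# A $1 single bet at odds 15 (Equation 2.9 is the one-bet special case)
print(profit({"Kaymer": 1.0}, {"Kaymer": 15.0}, winner="Kaymer"))  # 13.3
print(profit({"Kaymer": 1.0}, {"Kaymer": 15.0}, winner="Hanson"))  # -1.0
```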
Favorite-long-shot bias

High probability-low payoff gambles have been shown to have a higher expected return than low probability-high payoff gambles (Hausch et al., 1994, 2008). Favorites thus win more often than projected by their odds. This discrepancy between the true probability and the probability implied by the odds (the belief probability, given by the inverse of the odds) is called the favorite-long-shot bias. The first documentation of the bias is attributed to Griffith (1949). Ottaviani & Sørensen (2008) present an overview of the main theoretical explanations for the bias proposed in the literature. The bias challenges normative assumptions because it means that the expected return increases with the probability of winning.

Shmanske (2005) proposes a series of naive betting strategies, which he tests on the PGA Tour in 2002. He proposes to bet a flat amount of $1 on one of the following four options: (1) the favorite in each tournament; (2) a group of favorites in each tournament; (3) the long-shot in each tournament; (4) a group of long-shots in each tournament. All four strategies are basically attempts to profit from the favorite-long-shot bias and have also been utilized in the horse-racing literature. I follow Shmanske (2005) and create the same betting strategies. If any of these strategies produces a positive return, it is an indication of weak form inefficiency. It is furthermore evidence of the favorite-long-shot bias if the 'favorite' strategies achieve higher returns than the 'long-shot' strategies.

2.4.2 Fundamental betting strategies

To operationalize the models proposed in section 2.3, a betting strategy is needed. Two versions of essentially the same betting strategy are introduced:

1. Kelly criterion: a formula used to determine the optimal size of a series of independent bets in order to maximize the expected logarithm of wealth. This strategy has been utilized in many articles on horse racing (e.g. Lessmann et al., 2007; Sung & Johnson, 2012). The strategy has three important properties (Hausch et al., 1994, p. 87): (1) it maximizes the asymptotic growth rate of capital; (2) asymptotically, it minimizes the expected time to reach a specified goal; (3) in the long run, it outperforms any other essentially different strategy almost surely.

2. Fractional Kelly betting: The downside of betting according to the Kelly criterion is that (1) it is a rough ride, i.e. very volatile, and (2) it exposes the bettor to non-deterministic errors in the advantage calculations (the difference between the predicted odds and the market odds).

Kelly wagering strategy

The Kelly criterion was introduced by Kelly (1956) to assist AT&T with its long-distance telephone signal noise issues. The gambling community quickly got wind of the theory, and it has since been widely used as a betting strategy for horse racing (e.g. Lessmann et al., 2007; Sung & Johnson, 2012). The strategy is also referred to as the pure Kelly strategy. It is important to note that the Kelly strategy is not related in any way to e.g. a Martingale betting system (in which the bet is doubled after every loss, so that the first win recovers all previous losses).
Kelly strategies are only profitable in the long run if the bettor is able to estimate event probabilities better than the market.

Single bet: For a simple bet with two outcomes - (1) losing the entire amount of the bet; (2) winning the bet amount multiplied by the odds, o - the Kelly bet is given by:

K = \frac{(o - 1) \cdot \rho - (1 - \rho)}{o - 1}    (2.11)

where ρ is the winning probability. The fraction is equal to the expected net winnings divided by the net winnings if the bet is won. The Kelly bet, K, is the fraction of total wealth that should be placed on the bet.

Multiple bets: The Kelly criterion can be generalized to gambling on many mutually exclusive outcomes, as in horse races or golf tournaments. Smoczynski & Tomkins (2010) propose an algorithm for Kelly betting on many mutually exclusive outcomes. The algorithm is split into four steps:

1. Calculate the expected revenue rates for the m different golfers:

r_{e,i} = \rho_i \cdot o_i \cdot (1 - t)    (2.12)

where ρ_i is the probability that golfer i wins the tournament, o_i is the odds of him winning, and t is the take of the betting exchange (Betfair's take is 5%).

2. Reorder the vector of expected revenue rates, r_e, so that the new vector is non-increasing; r_{e,1} will thus be the bet with the highest expected revenue rate.

3. Set S = ∅ and R(S) = 1. S is the set of bets to be made; R(S) is the reserve rate, that is, the fraction of the gambler's wealth that is not bet on any golfer.

4. Loop through the m golfers. If r_{e,i} > R(S), then:

• insert i into the set of bets to be made: S = S ∪ {i};
• recalculate R(S) according to:

R(S) = \frac{1 - \sum_{i \in S} \rho_i}{1 - \sum_{i \in S} \frac{1 - t}{o_i}}    (2.13)

Otherwise, stop the loop, set S_opt = S and calculate the optimal Kelly bets, K_opt, according to:

K_{opt,i} = \frac{r_{e,i} - R(S_{opt})}{o_i \cdot (1 - t)}    (2.14)

Example: To exemplify the Kelly betting strategy for many mutually exclusive bets, consider the 2012 edition of the KLM Open, in which the outsider Peter Hanson won. Table 2.11 lists the ids and names of the eight golfers with the highest expected revenue rates (the model probabilities multiplied by the odds and adjusted for the track take, as given in Equation 2.12; the simple one-step conditional logit model described in subsection 2.3.1 is used to calculate the probabilities). In the table, steps 1 and 2 of the betting algorithm have already been performed (1: calculate r_{e,i}; 2: sort by expected revenue rate). According to the model described in subsection 2.3.1, Martin Kaymer is the golfer with the highest expected revenue rate; the public has thus valued Martin Kaymer's skill lower than the model has.

Table 2.11: Betting strategy example for the KLM Open, 2012*. Steps 1+2

Id    Name               Belief prob., 1/o_i   Prob., ρ_i   Exp. revenue rate, r_{e,i}
130   Martin Kaymer      0.059                 0.116        1.934
154   Peter Hanson       0.038                 0.055        1.395
155   Richie Ramsay      0.020                 0.020        0.981
140   Marcus Fraser      0.015                 0.012        0.752
119   David Lynn         0.045                 0.033        0.710
146   Anders Hansen      0.026                 0.019        0.698
125   Danny Willett      0.017                 0.011        0.667
96    Victor Dubuisson   0.012                 0.008        0.589
...   ...                ...                   ...          ...

*Only the eight golfers with the highest expected revenue rates are listed.

In Table 2.12, steps 3 and 4 of the betting algorithm are illustrated. In the first iteration of the loop, the set of bets to be made, S, is empty. Martin Kaymer is the first golfer considered for addition to S; since r_{e,i} > R(S), he is added. In the next iteration, Peter Hanson is considered and also added. The loop stops in iteration 4, after the golfers with ids 130, 154 and 155 have been added. The optimal bets are calculated according to Equation 2.14.
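A sketch of the single-bet Kelly fraction (Equation 2.11) and of the four-step algorithm above (Equations 2.12-2.14). The example reuses the KLM Open figures, so small rounding differences from Tables 2.11-2.12 (which were computed from unrounded odds) are to be expected.

```python
import numpy as np

def kelly_single(rho, o):
    """Kelly fraction for a single two-outcome bet (Equation 2.11)."""
    return ((o - 1) * rho - (1 - rho)) / (o - 1)

def kelly_multiple(rho, o, t=0.05):
    """Kelly bets on mutually exclusive outcomes (Smoczynski & Tomkins, 2010).

    rho : (m,) winning probabilities; o : (m,) decimal odds;
    t   : the exchange's take. Returns the (m,) vector of optimal
          fractions of wealth, K_opt (Equations 2.12-2.14).
    """
    r_e = rho * o * (1 - t)              # step 1: expected revenue rates
    order = np.argsort(-r_e)             # step 2: sort, non-increasing
    S, R = [], 1.0                       # step 3: empty bet set, reserve rate 1
    for i in order:                      # step 4: greedily add bets
        if r_e[i] <= R:
            break
        S.append(i)
        R = (1 - rho[S].sum()) / (1 - ((1 - t) / o[S]).sum())
    K = np.zeros_like(rho)
    K[S] = (r_e[S] - R) / (o[S] * (1 - t))
    return K

# KLM Open 2012: belief probabilities from Table 2.11, converted to odds
belief = np.array([0.059, 0.038, 0.020, 0.015, 0.045, 0.026, 0.017, 0.012])
rho = np.array([0.116, 0.055, 0.020, 0.012, 0.033, 0.019, 0.011, 0.008])
print(kelly_multiple(rho, 1 / belief))   # ~[0.06, 0.019, 0.001, 0, ...]
```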
Peter Hanson ends up winning the tournament. The profit from the bets is therefore equal to:

(1 - 0.05) \cdot \left[0.019 \cdot \frac{1}{0.038} - (0.061 + 0.019 + 0.001)\right] = 0.40    (2.15)

The bets in the tournament thus increase the bankroll by 40% (Betfair takes a 5% fee on net winnings).

Table 2.12: Betting strategy example for the KLM Open, 2012. Steps 3+4

Iteration   i     S before iteration   R(S)    r_{e,i}   Bet   K_{opt,i}
1           130   {}                   1.000   1.934     Yes   0.061
2           154   {130}                0.938   1.395     Yes   0.019
3           155   {130, 154}           0.916   0.981     Yes   0.001
4           140   {130, 154, 155}      0.914   0.752     No    -

Fractional Kelly wagering strategy

A fractional Kelly wagering strategy is derived from the pure Kelly strategy: the idea is simply to bet a fixed fraction of the amount recommended by the Kelly wagering strategy. Benter (1994) argues for fractional Kelly betting and proposes that betting 1/2 or 1/3 of the Kelly bet is a good idea. The reason for betting only a fraction of the pure Kelly bet is to be less exposed to non-deterministic errors in the advantage calculations (the difference between the predicted odds and the market odds (Hausch et al., 1994)). In the following, I exemplify the issues with the pure Kelly strategy arising from non-deterministic errors in the bettor's advantage via Monte Carlo simulations.

Monte Carlo simulations of betting strategies: To test the performance of the pure Kelly strategy and the 0.3 fractional Kelly strategy against betting randomly, a Monte Carlo simulation experiment is set up. MacLean et al. (1992) create a mathematical framework which illustrates the fractional Kelly strategies' ability to decrease the variance of returns. There is, to my present knowledge, no literature evaluating fractional Kelly strategies via Monte Carlo simulations.

A tournament simulation framework is created in which m simulated golfers compete against each other. Three sets of probabilities are drawn from random distributions: the true winning probabilities are drawn randomly, and the public as well as a 'bettor' make noisy estimates of the true winning probabilities:

1. The true winning probabilities for golfers i = 1, 2, …, m are drawn randomly from a squared Poisson distribution with λ = 1. I find that this distribution approximates the winning probabilities implied by the odds in the dataset.

2. The public's belief probabilities (1/odds) are given by the true winning probabilities with added noise: the winning probabilities are multiplied by random draws from a normal distribution with mean 1 and standard deviation 0.2.

3. The bettor uses a "statistical model" to calculate winning probabilities. Unfortunately, the model is not flawless, so noise is added: the true probabilities are multiplied by random draws from a normal distribution with mean 1 and standard deviation 0.1.

Each of the three sets of probabilities is finally divided by its sum, so that each probability set sums to one. The tournament is simulated 140 times with m = 127 golfers (my dataset contains 140 tournaments with odds, and there are on average 127 golfers in a tournament, see Table 2.4). Returns from the different betting strategies for one simulation of the experiment are depicted in a boxplot in Figure 2.7: for each of the 140 simulated tournaments, the betting strategies "place bets" on a subset of the 127 golfers, and the return for each simulated tournament is then plotted in the boxplot for each strategy.

[Figure 2.7: Boxplots of Monte Carlo experiment returns for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.]
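A condensed sketch of one run of this simulation set-up. The exact noise construction is an assumption on my part (in particular the small epsilon that keeps the squared Poisson draws away from zero probability, and the renormalization of the noisy estimates); the Kelly routine is the one sketched above.

```python
import numpy as np

rng = np.random.default_rng(42)
n_tournaments, n_golfers = 140, 127
FRACTION, TAKE = 0.3, 0.05

def kelly_multiple(rho, o, t=TAKE):
    """Smoczynski & Tomkins (2010) Kelly bets, as sketched in section 2.4.2."""
    r_e = rho * o * (1 - t)
    S, R = [], 1.0
    for i in np.argsort(-r_e):
        if r_e[i] <= R:
            break
        S.append(i)
        R = (1 - rho[S].sum()) / (1 - ((1 - t) / o[S]).sum())
    K = np.zeros_like(rho)
    K[S] = (r_e[S] - R) / (o[S] * (1 - t))
    return K

returns = []
for _ in range(n_tournaments):
    # true winning probabilities: squared Poisson draws, normalized
    # (a small epsilon avoids zero probabilities -- my assumption)
    true = rng.poisson(1.0, n_golfers) ** 2 + 1e-6
    true = true / true.sum()
    # the public's and the bettor's noisy estimates of the true probabilities
    public = np.abs(true * rng.normal(1.0, 0.2, n_golfers))
    bettor = np.abs(true * rng.normal(1.0, 0.1, n_golfers))
    public, bettor = public / public.sum(), bettor / bettor.sum()

    winner = rng.choice(n_golfers, p=true)
    odds = 1.0 / public
    K = FRACTION * kelly_multiple(bettor, odds)     # 0.3 fractional Kelly
    m = K[winner] * odds[winner] - K.sum()          # profit, Equation 2.10
    returns.append(m * (1 - TAKE) if m > 0 else m)

print("end wealth:", np.prod(1.0 + np.array(returns)))
```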
The random strategy has a well-defined minimum return, given by the fraction of wealth bet in each tournament. The pure Kelly strategy has high variance (this is what Benter (1994) referred to as a "rough ride") but the highest overall mean. The fractional Kelly strategy has less variance than the pure Kelly strategy.

The simulation of tournaments j = 1, 2, …, 140 described above is repeated 100 times, and the end wealth is calculated for each of the 100 simulations. A starting capital of $1 is chosen for simplicity, and the end wealth is calculated as \prod_{j=1}^{140} (1 + r_j), where r_j is the return in tournament j. Figure 2.8 contains boxplots of the end wealth in each of the 100 simulations for the three strategies. The fractional Kelly strategy is the only one with an expected end wealth above $1.

[Figure 2.8: Boxplots of the end wealth in the Monte Carlo experiment for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.]

However, if somebody with a perfect betting model were betting, i.e. if he could calculate the true winning probabilities, the results would be different. Figure 2.9 depicts the boxplots of end wealth for the three strategies with no noise in the bettor's probability estimates. Given this boxplot, it is clear why the Kelly strategy has received so much attention.

[Figure 2.9: Boxplots of the end wealth in the Monte Carlo experiment with no noise in the bettor's probabilities for the three betting strategies: random betting, Kelly betting and fractional Kelly using a fraction of 0.3.]

Robustness of simulation results: The general picture of Figure 2.8 and Figure 2.9 is robust to a number of different processes for generating the three sets of probabilities: (1) the true winning probabilities; (2) the public's belief probabilities; and (3) the bettor's probabilities. I have also run the Monte Carlo experiments using the t-distribution (with heavier tails than the normal distribution) as well as a skewed normal distribution to draw the noise from (to simulate the public's and the bettor's noisy probability estimates). The main conclusions from the Monte Carlo simulation experiments are as follows:

• The fractional Kelly strategy has the potential to be profitable as long as the noise in the bettor's probability estimates is in general smaller than the noise in the public's belief probabilities.

• The bigger the difference between the noise in the public's belief probabilities and the noise in the bettor's probability estimates, the larger the optimal fraction to use in the fractional Kelly strategy, i.e.:

  - if the public is bad at assessing the true winning probabilities (lots of noise) and the bettor is good at assessing them (little or no noise), then he should use a fraction close to one in his fractional Kelly strategy (this is the case in Figure 2.9);

  - if, on the other hand, the bettor is only marginally better than the public, the fraction of the Kelly bet should be closer to zero (this is the case in Figure 2.8).

2.5 Performance evaluation of the statistical models and betting strategies

The following is a race between the different models proposed in section 2.3, combined with the betting strategies proposed in section 2.4. The models are estimated on one dataset and evaluated with respect to performance on a different dataset.
Non-overlapping datasets are used in order to assess the models' ability to predict winners in out-of-sample tournaments. The strategies are ultimately compared by the end-of-period wealth they create, i.e. whether they enable a bettor to achieve a positive return. Without loss of generality, it is assumed that the bettor starts with $1. His end wealth is calculated as:

\prod_{j=1}^{140} (1 + r_j)    (2.16)

where r_j is the return of the bets in tournament j.

In the following, I evaluate the performance of three sets of strategies: (1) technical strategies that try to exploit weak form inefficiencies, which arise if market prices are not well adjusted to previous market prices; (2) fundamental strategies that try to exploit semi-strong form inefficiencies, which arise if market prices are not well adjusted to results from previous golf tournaments; (3) benchmark strategies, to have a valid comparison for the other strategies. I only consider strategies that may place multiple bets in each tournament.

1. Technical strategies:
(a) Bet 3% of wealth on the five golfers with the lowest odds (the favorites).
(b) Bet 3% of wealth on the five golfers with the highest odds (the long-shots).

2. Fundamental strategies: Bet according to the pure Kelly strategy and the 0.3 fractional Kelly strategy using the following four models to calculate probabilities:
(a) The one-step conditional logit model.
(b) The one-step conditional logit model with variable attribute discounting.
(c) The two-step model with the conditional logit model as step one.
(d) The two-step model with the conditional logit model with variable attribute discounting as step one.

3. Benchmark:
(a) Bet 3% of wealth on five randomly chosen golfers.

The numbering in the above list is used for reference in the following. Conditional logit is abbreviated CL.

2.5.1 Performance by following the strategies in the 139 tournaments

The returns for each of the 139 tournaments are depicted in boxplots in Figure 2.10 for the technical betting strategies, the fundamental betting strategies and the benchmark strategy. End wealth is calculated according to Equation 2.16 for all strategies and listed in Table 2.13. The best overall performing strategy is the '2d.fractional. Two Step Discounted CL' strategy: by following this strategy, a bettor who started with $1 would have $0.54 after betting for two years. The performance is thus not overwhelming. The fact that the expected return (the average return per tournament) of strategy '1a. Favorite' is higher than that of '1b. Longshot' can be seen as evidence of the favorite-long-shot bias. However, the evidence is weak because of the relatively small size of the dataset.

[Figure 2.10: Boxplots of returns for the technical, fundamental and benchmark betting strategies. The numbering (e.g. 1a) refers to the numbered strategy list above; pure/fractional refer to the pure and fractional Kelly strategy, respectively. CL is an abbreviation of conditional logit.]

Table 2.13: End wealth per strategy, with a start wealth of $1

    Strategy                                 End wealth   # bets
0   1a. Favorite                             0.45         695
1   1b. Longshot                             0.05         695
2   2a.fractional. CL                        0.37         781
3   2b.fractional. Discounted CL             0.42         741
4   2c.fractional. Two Step CL               0.53         111
5   2d.fractional. Two Step Discounted CL    0.54         109
6   2a.pure. CL                              0.00         781
7   2b.pure. Discounted CL                   0.00         741
8   2c.pure. Two Step CL                     0.00         111
9   2d.pure. Two Step Discounted CL          0.00         109
10  3a. Random                               0.40         695

The numbering (e.g. 1a) refers to the numbered strategy list above; pure/fractional refer to the pure and fractional Kelly strategy, respectively.
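The end-wealth figures in Table 2.13 (and Table 2.14 below) follow directly from Equation 2.16. A minimal sketch, with made-up per-tournament returns:

```python
import numpy as np

# returns r_j of a strategy in each tournament (made-up values)
r = np.array([0.12, -0.03, -0.05, 0.40, -0.02])

end_wealth = np.prod(1.0 + r)   # Equation 2.16, with a start wealth of $1
print(end_wealth)
```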
2.5.2 Performance by going against the strategies in the 139 tournaments

Since Betfair allows both long bets and short bets, I have the option of going against my own strategies: instead of backing a set of golfers, I can lay them (i.e. short the golfers). At first this idea seems very counterintuitive, but I will argue that it can be justified. When shorting the strategies, the pure Kelly strategies bet more than 100% of wealth; only the technical strategies, the fractional Kelly strategies and the benchmark strategy are therefore considered. End wealth is calculated for the strategies according to Equation 2.16 and listed in Table 2.14.

Table 2.14: End wealth per strategy when going against the strategies, with a start wealth of $1

    Strategy                                 End wealth, $   Number of bets
0   1a. Favorite                             0.89            695
1   1b. Longshot                             0.01            695
2   2a.fractional. CL                        2.16            781
3   2b.fractional. Discounted CL             2.21            741
4   2c.fractional. Two Step CL               0.07            111
5   2d.fractional. Two Step Discounted CL    0.04            109
6   3a. Random                               0.25            695

The numbering (e.g. 1a) refers to the numbered strategy list above; fractional refers to the fractional Kelly strategy.

The conditional logit strategy, as well as the strategy in which historic results are discounted with a variable discounting coefficient, produce a large positive result. By following these strategies in 2011 and 2012, a bettor would have been able to more than double his starting wealth.

Interpretation of results

The seemingly strange observation that a bettor could double his starting wealth by shorting the strategies could be explained as follows: (1) the people who are betting have a biased perception of the true winning probabilities, and this bias leads to market prices that do not reflect the true winning probabilities; (2) the model generates probability estimates which are biased in the same direction as the public's, only more extreme. That is, for each golfer, the model's probability estimate is further away from the true probability than the public's belief probability.

2.5.3 Robustness

The robustness of the end wealth calculations is assessed by evaluating the results' sensitivity to two factors: (1) which golfers are bet on; (2) which tournaments are bet on. To keep the following simple, only the two best performing strategies are considered.

(1) Which golfers are bet on: I test whether the results are driven by bets on golfers with specific values of 'odds' and 'volume matched'. I run a grid search in which the two betting strategies are evaluated for golfers in different ranges of odds and matched volume. That is, I evaluate the betting strategies on golfers with low odds (< 50) and high odds (≥ 50) as well as on liquid golfers (more than £5000 matched) and less liquid golfers (£1000 to £5000 matched). I have chosen not to look at illiquid golfers, because only very small bets can be placed on them. From Table 2.15 it is clear that the positive return is mainly driven by bets on golfers with odds below 50.

Table 2.15: End wealth by betting on select groups of golfers

             2a.fractional. CL            2b.fractional. Discounted CL
             £1000-£5000    ≥ £5000       £1000-£5000    ≥ £5000
Odds < 50    1.54           1.34          1.72           1.46
Odds ≥ 50    0.96           1.05          0.97           1.03

(2) Which tournaments are bet on: To test whether the results are driven by bets in only a few tournaments, I create 1000 bootstrapped resamples (with replacement) of the 139 tournaments with odds (see Table 2.4). For each of the 1000 bootstrapped samples of 139 tournaments, I calculate the end wealth of the strategies according to Equation 2.16. Figure 2.11 contains boxplots of the end wealth over the 1000 bootstrapped samples for the two betting strategies '2a.fractional. CL' and '2b.fractional. Discounted CL'. The results indicate that the strategies could enable a bettor to make a very large positive return, and the positive return does not appear to be driven by a few specific tournaments in the sample.

[Figure 2.11: Boxplots of end wealth for the bootstrapped resamples for the two best performing strategies. The numbering (e.g. 2a) refers to the numbered strategy list above; fractional refers to the fractional Kelly strategy.]

How much to bet? Before deciding whether to implement the above proposed betting system in the real world, it would be relevant to investigate how much money one could expect to be able to win. It is clear that the odds are going to change if you place big bets on a golfer: if you back a golfer, the odds will likely fall; if you lay a golfer, the odds will likely increase. The size of the increase/decrease is likely related to the liquidity of the golfer, but it is not possible to derive an explicit formula for how the odds will change as a function of the size of the bets placed.

Chapter 3

Conclusion

In this thesis I evaluate whether Betfair's golf prediction markets are efficient. I create a novel dataset containing: (1) winning market prices from the biggest public prediction market, Betfair, for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012, and (2) historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012. I evaluate whether the golf prediction markets are efficient by testing whether market prices are well adjusted to two sets of relevant historical data. First, I perform weak form tests to see if prices are efficiently adjusted to historical prices. Secondly, I perform semi-strong form tests to see if prices are efficiently adjusted to results from previous golf tournaments. I test for weak form and semi-strong form efficiency by building betting strategies which aim at achieving a positive return; a positive return indicates market inefficiency.

My approach to evaluating prediction market efficiency differs somewhat from other papers dealing with prediction markets. For example, the aim of Cowgill & Zitzewitz (2013), Forsythe et al. (1992) and Smith et al. (2006) is to evaluate whether prediction markets provide more precise probability estimates than corporate experts, exit polls and bookmakers, respectively. My approach is, however, in line with other studies of market efficiency in sports markets, see e.g. Bolton & Chapman (1986), Benter (1994) and Sung & Johnson (2012).

My analyses lead to the following two main findings. Firstly, I am not able to achieve a positive return using the naive betting strategies - betting on favorites and long-shots, respectively. I thus find no sign of weak form inefficiency in the market.
Secondly, by using the most profitable proposed betting strategy, which is based on results from previous golf tournaments, a bettor is able to more than double his starting wealth over the two-year period from the beginning of 2011 to the end of 2012. The robustness of this finding is evaluated via (1) bootstrapped resampling of the tournaments as well as (2) evaluations of the betting strategies for golfers in different ranges of odds and matched volume. The result stands up to the robustness tests. The fact that the betting strategy enables a bettor to achieve a positive return over the two-year period indicates that prices are not well adjusted to e.g. results from previous golf tournaments; Betfair's golf markets thus seem semi-strong form inefficient.

My conclusions are weakened by the fundamental structure of golf tournaments: I have relatively few tournaments with many golfers, much competition and possibly important attributes that are hard to quantify (e.g. the psychological state of the golfers). The demonstrated ability to achieve a positive return could have been caused by e.g. a sampling bias, whereby some tournament outcomes are overrepresented in the two-year sample I analyze compared to the 'true population of golf tournament outcomes'. More tournaments could be included in the analyses in the future to strengthen the conclusions.

The findings in this thesis lead to the following two considerations. Firstly, it would be worth investigating the possibilities of implementing the betting strategy in real life. The ideas and approaches proposed in this thesis could be further refined in order to improve performance before a real-life implementation. The existing attributes could be incorporated into the model in different ways, and new attributes could be added to the model in order to obtain a better and less biased fit. The added attributes could describe the golfers' individual weather preferences, psychological state, preferences with regard to type of course, etc. Benter (1994) writes that it took him "approximately five man-years of effort to [...] organize the database and develop a handicapping model". More time is thus likely to improve the performance of the proposed models and strategies.

Secondly, this thesis provides evidence suggesting that prediction markets can be inefficient estimators of event probabilities. I propose that more studies such as this one are needed before prediction markets can confidently be used as efficient probability estimators. When deciding which method to use in order to answer the question "What is the probability of...?", the amount of available data should play a key role. My findings indicate that prices in prediction markets do not always correspond to true event probabilities, even for very liquid markets such as Betfair's golf markets. An analytical approach could potentially provide more efficient estimates of event probabilities than prediction markets.

Bibliography

Arrow, K. J., Forsythe, R., Gorham, M., Hahn, R., Hanson, R., Ledyard, J. O., Levmore, S., Litan, R., Milgrom, P., Nelson, F. D. et al. (2008). The promise of prediction markets. Science 320(5878), 877.

Beard, H. (2009). Golf: An Unofficial and Unauthorized History of the World's Most Preposterous Sport. Simon and Schuster.

Beilock, S. L. & Carr, T. H. (2001). On the fragility of skilled performance: What governs choking under pressure? Journal of Experimental Psychology: General 130(4), 701.

Benter, B. (1994). Computer based horse race handicapping and wagering systems: a report. In: Efficiency of Racetrack Betting Markets (Hausch, D. B., Lo, V. S. & Ziemba, W. T., eds.). Academic Press, pp. 183–198.
Betfair (2014). Historical golf odds. http://data.betfair.com/. [Online; accessed 2014-01-18].

Bolton, R. N. & Chapman, R. G. (1986). Searching for positive returns at the track: A multinomial logit model for handicapping horse races. Management Science 32(8), 1040–1060.

Cowgill, B. & Zitzewitz, E. (2013). Corporate prediction markets: Evidence from Google, Ford, and Koch Industries.

Edelman, D. (2007). Adapting support vector machine methods for horserace odds prediction. Annals of Operations Research 151(1), 325–336.

Ehrenberg, R. G. & Bognanno, M. L. (1990). The incentive effects of tournaments revisited: Evidence from the European PGA Tour. Industrial and Labor Relations Review, 74S–88S.

Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance 25(2), 383–417.

Forsythe, R., Nelson, F., Neumann, G. R. & Wright, J. (1992). Anatomy of an experimental political stock market. American Economic Review 82, 1142–1142.

Fox, J. (2008). Applied Regression Analysis and Generalized Linear Models. Sage.

Franck, E., Verbeek, E. & Nüesch, S. (2010). Prediction accuracy of different market structures, bookmakers versus a betting exchange. International Journal of Forecasting 26(3), 448–459.

Griffith, R. M. (1949). Odds adjustments by American horse-race bettors. The American Journal of Psychology.

Hausch, D. B., Lo, V. S. & Ziemba, W. T. (1994). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.

Hausch, D. B., Lo, V. S., Ziemba, W. T. & Ziemba, W. (2008). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.

Jones, E., Oliphant, T., Peterson, P. et al. (2001–). SciPy: Open source scientific tools for Python. URL http://www.scipy.org/.

Kelly, J. L. (1956). A new interpretation of information rate. IRE Transactions on Information Theory 2(3), 185–189.

Lessmann, S., Sung, M.-C. & Johnson, J. E. (2007). Adapting least-square support vector regression models to forecast the outcome of horseraces. The Journal of Prediction Markets 1(3), 169–187.

MacLean, L., Ziemba, W. T. & Blazenko, G. (1992). Growth versus security in dynamic investment analysis. Management Science 38(11), 1562–1585.

Manski, C. F. (2006). Interpreting the predictions of prediction markets. Economics Letters 91(3), 425–429.

McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior.

Orszag, J. M. (1994). A new look at incentive effects and golf tournaments. Economics Letters 46(1), 77–88.

Ottaviani, M. & Sørensen, P. N. (2008). The favorite-longshot bias: an overview of the main explanations. In: Handbook of Sports and Lottery Markets (Hausch, D. B. & Ziemba, W. T., eds.). North-Holland/Elsevier, 83–102.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.

Sauer, R. D. (1998). The economics of wagering markets. Journal of Economic Literature 36(4), 2021–2064.

Schmidt, C. & Werwatz, A. (2002). How accurate do markets predict the outcome of an event?

Shmanske, S. (2005). Odds-setting efficiency in gambling markets: Evidence from the PGA Tour. Journal of Economics and Finance 29(3), 391–402.
Smith, M. A., Paton, D. & Williams, L. V. (2006). Market efficiency in person-to-person betting. Economica 73(292), 673–689.

Smith, M. A., Paton, D. & Williams, L. V. (2009). Do bookmakers possess superior skills to bettors in predicting outcomes? Journal of Economic Behavior & Organization 71(2), 539–549.

Smoczynski, P. & Tomkins, D. (2010). An explicit solution to the problem of optimizing the allocations of a bettor's wealth when wagering on horse races. Mathematical Scientist 35(1).

Stevenson, A. & Lindberg, C. (2010). New Oxford American Dictionary, Third Edition. Oxford University Press.

Sung, M. & Johnson, J. E. (2012). Comparing the effectiveness of one- and two-step conditional logit models for predicting outcomes in a speculative market. The Journal of Prediction Markets 1(1), 43–59.

Tan, P.-N., Steinbach, M. & Kumar, V. (2013). Introduction to Data Mining. Pearson Education India.

Tanaka, R. & Ishino, K. (2012). Testing the incentive effects in tournaments with a superstar. Journal of the Japanese and International Economies 26(3), 393–404. URL http://www.sciencedirect.com/science/article/pii/S0889158312000196.

Tziralis, G. & Tatsiopoulos, I. (2012). Prediction markets: An extended literature review. The Journal of Prediction Markets 1(1), 75–91.

Verbeek, M. (2008). A Guide to Modern Econometrics. John Wiley & Sons.

Wolfers, J. & Zitzewitz, E. (2006). Interpreting prediction market prices as probabilities. Tech. rep., National Bureau of Economic Research.

Yahoo (2014a). Historical golf results. http://sports.yahoo.com/golf/. [Online; accessed 2014-01-18].

Yahoo (2014b). THE PLAYERS Championship. http://sports.yahoo.com/golf/pga/leaderboard/2013/13. [Online; accessed 2014-01-18].

Yahoo (2014c). Yahoo data license. http://info.yahoo.com/guidelines/us/yahoo/ydn/ydn-3955.html. [Online; accessed 2014-01-18].

Ziemba, W. T. (2008). Chapter 10 - Efficiency of racing, sports, and lottery betting markets. In: Handbook of Sports and Lottery Markets (Hausch, D. B. & Ziemba, W. T., eds.), Handbooks in Finance. San Diego: Elsevier, pp. 183–222.

Appendix A

A.1 Golf dictionary

Table A.1: Golf terms

Term          Description
Bogey         A score of one over par.
Fairway       The area between the tee box and the putting green where the grass is cut even and short.
Golfer        A person who plays golf.
Handicap      A numerical measure of a golfer's potential playing ability based on the tees played for a given course.
Hazard        Special areas on the golf course that have additional rules for play. There are generally two types: (1) water hazards, e.g. ponds, lakes, and rivers; and (2) bunkers, which are sand traps.
Par           The pre-determined number of strokes that a scratch (or 0 handicap) golfer should require to complete a hole or a round.
Rough         A grass area on the golf course where the grass is cut higher than the grass on the fairway and the green. It is typically a disadvantageous area to hit from.
Stroke play   The most common scoring system in golf. It involves counting the total number of strokes used on each hole during a given round, or series of rounds. The winner is the player who has used the fewest strokes over the course of the round, or rounds.
Tee box       The starting point of a golf hole.

Source: Stevenson & Lindberg (2010)

A.2 LR testing of parameter significance

The likelihood-ratio test is used to assess the contribution of individual attributes to the models.
The LR test statistic is given by (Verbeek, 2008):

D = -2 \ln\left(\frac{\text{likelihood of the null model}}{\text{likelihood of the alternative model}}\right)    (A.1)

The null model is the model without the attribute whose contribution is to be assessed; the alternative model contains the attribute. The test statistic, D, is approximately chi-squared distributed (Verbeek, 2008) with 1 degree of freedom (the difference between the number of free parameters in the two models). Based on the test, it can be concluded whether there is a significant association between the attribute (the predictor) and the outcome.
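A sketch of the LR computation, here using the model-comparison numbers from section 2.3.2 (where the difference in free parameters is 3 rather than 1):

```python
from scipy.stats import chi2

# log-likelihoods of the null and alternative models (section 2.3.2)
ll_null, ll_alt = -3294.32, -3273.37
df = 6 - 3                        # difference in number of free parameters

D = -2 * ll_null + 2 * ll_alt     # LR statistic, Equation A.1 in log form
p_value = chi2.sf(D, df)          # upper tail of the chi-squared distribution
print(D, p_value)                 # 41.9, p < 0.0001
```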