Thursday, November 15, 2012

Comparing World Golf Ranking Systems


(Note: World Rankings have gained much importance over the last decade. To my knowledge, no published research has validated the methodology for ranking players. The paper below, written over ten years ago, tried to evaluate two competing ranking systems by how well each predicted performance. Neither system did very well. Rankings are not a powerful predictor for three reasons: scoring varies randomly (e.g., the best golfer will often lose over one trial), past performance is not a perfect indicator of future performance (e.g., a mutual fund that did well last year may not have similar results next year, as every prospectus will tell you), and results are affected by the attributes of the course being played (i.e., there are horses for courses). Though a ranking system is not a good predictor of performance, it does serve two important functions. First, the ranking system acts as a (hopefully) neutral arbiter in deciding who qualifies for major tournaments. Second, it feeds our desire for lists (e.g., the 10 best places to retire, the 10 best Mexican restaurants), which also makes for endless 19th Hole banter--Luke Donald, are you kidding!)
            Golf Digest printed an article (April 1999) critical of the World Golf Rankings (hereinafter termed IMG rankings).  The author, Dean Knuth, devised a new system for ranking players.  Knuth provided reasoning why the Golf Digest system would do a better job, but supplied no empirical evidence to substantiate his claim.[1]  The purpose of this paper is to evaluate the two ranking systems in their ability to explain performance at the Tournament Players Championship (TPC).
            Rankings from both systems were available for 59 players who entered the TPC.  The rankings were not the most current, since the Golf Digest rankings were only available as of January 31, 1999.
            The players were ranked from 1 to 59 consistent with the rankings of both systems.  First, the rankings were correlated with the standings after the second round, using the Spearman coefficient of rank correlation.[2] A coefficient of 1.0 would mean the rankings and the order of finish were identical.  A coefficient of zero would mean there was no relationship between the rankings and performance.  The IMG ranking had a slightly higher correlation coefficient (.56 vs. .54), as shown in the table below.
Table: Spearman Correlation Coefficient

                 2nd Round    All Players-Final    Cut Players
Golf Digest         .54              .63               .54
IMG                 .56              .63               .53
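
As a minimal sketch of the calculation behind the table, the Spearman coefficient for two untied rankings can be computed from the squared differences between each player's ranking and his finishing position. The function name and the five-player data are hypothetical illustrations, not the 59-player TPC data used in the paper.

```python
def spearman_rho(ranking, finish):
    """Spearman rank correlation for two untied rankings of the same players.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
    between a player's pre-tournament ranking and his finishing position.
    """
    if len(ranking) != len(finish):
        raise ValueError("both lists must cover the same players")
    n = len(ranking)
    if n < 2:
        raise ValueError("need at least two players")
    sum_d2 = sum((r - f) ** 2 for r, f in zip(ranking, finish))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical example: five players ranked 1-5 before the event.
pre_rank = [1, 2, 3, 4, 5]
print(spearman_rho(pre_rank, [1, 2, 3, 4, 5]))  # finish matches the ranking
print(spearman_rho(pre_rank, [5, 4, 3, 2, 1]))  # finish reverses the ranking
print(spearman_rho(pre_rank, [2, 1, 4, 3, 5]))  # two pairs of neighbors swap
```

This shortcut formula only holds when there are no tied positions, which is why the example uses strict orderings.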
 
Next, rankings were correlated with the final standings.  Those players who missed the cut or were disqualified were ranked last.  The correlation coefficients for the Golf Digest and IMG systems were the same to two decimal places (.63).
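
Ranking every non-cut player last creates a large tie in the final standings. The paper does not say how that tie was handled; one common convention, assumed here, is to give tied players the average of the positions they occupy and then take the Pearson correlation of the two sets of ranks. The function names, the sentinel value, and the six-player field below are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_with_ties(x, y):
    """Spearman coefficient computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical six-player field: the last two missed the cut and share last place.
MISSED_CUT = 99  # assumed sentinel, larger than any real finishing position
pre_rank = [1, 2, 3, 4, 5, 6]
finish = [2, 1, 3, 4, MISSED_CUT, MISSED_CUT]
print(average_ranks(finish))
print(round(spearman_with_ties(pre_rank, finish), 3))
```

With no ties this reduces to the usual Spearman coefficient, so the same function could serve all three columns of the table.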
            Last, a study was limited to only those players who made the cut. The 38 players in the study who made the cut were re-ranked and that ranking was correlated with the final standings.[3]  The Golf Digest system did not have a significantly larger correlation coefficient (.54 vs. .53) than the IMG system.  Plots of the TPC standing versus the rankings of each system (see the Appendix below) demonstrate the similarity of the results under the two systems.  They also demonstrate that neither system explains a great deal of the variance in performance.[4] 
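
The link between a correlation coefficient and "variance explained" is simply the square of the coefficient. As a rough illustration, squaring the cut-player coefficients from the table gives figures in line with the "approximately 30 percent" cited in footnote 4. The variable names are illustrative, and applying the squared Spearman coefficient this way describes variance in finishing rank rather than in strokes.

```python
# Spearman coefficients for players who made the cut, taken from the table.
cut_player_rho = {"Golf Digest": 0.54, "IMG": 0.53}

# Squaring a correlation coefficient gives the share of variance it accounts for.
for system, rho in cut_player_rho.items():
    share = rho ** 2
    print(f"{system}: rho = {rho:.2f}, share of rank variance = {share:.0%}")
```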
In summary, the Golf Digest system does not do any better in explaining performance than the IMG system.  There may be elements of the Golf Digest system worthy of adoption.  Further research should isolate the effect of each of Golf Digest’s recommended changes to determine if it advances or impedes equity in the rankings.
            As with all quasi-experiments, the results carry significant caveats. 
 
·         Only One Tournament - There is a large random component in performance among golfers from week to week.  A player’s performance will also vary with the type of course (long, tight, etc.).  Even in the best case studied here, the rankings only explained 34 percent of the variation in the order of finish.  Therefore, the superiority of a ranking system could easily be due to chance when only one tournament is considered.  The methodology demonstrated here should be repeated at the four majors and WGC tournaments to determine if the Golf Digest system (or other variation) is consistently better than the IMG system.
·         Use of Old Rankings - The only set of both rankings available was as of January 31, 1999.  Any definitive study would need rankings current with the tournament under study.
·         Missing Rankings - The study was done using only 59 players out of the 144 who played.  The research should have had rankings for all players so a more complete evaluation could have been done.  It is especially important to see how well the ranking systems do in the lower regions – 40th to 70th – because of the new criterion for admission to tournaments based on the World Rankings.  Without more rankings on players in the lower ranges, no judgment on the efficacy of the two systems over various regions could be made.
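
The first caveat can be illustrated with a small simulation. The sketch below is not part of the original study: it assumes a toy model in which a player's true level equals his ranking and each tournament adds independent Gaussian noise, then looks at how much the Spearman coefficient moves from one simulated tournament to the next. The noise level, number of tournaments, and function names are all hypothetical choices.

```python
import random


def spearman_rho(a, b):
    """Spearman coefficient for two untied rankings of equal length."""
    n = len(a)
    return 1 - 6 * sum((x - y) ** 2 for x, y in zip(a, b)) / (n * (n ** 2 - 1))


def simulate_rho(n_players=59, noise_sd=20.0, n_tournaments=1000, seed=0):
    """Simulate rho between a fixed ranking and noisy tournament finishes.

    Every number here is an illustrative assumption: a player's "true" level
    is his ranking, and each tournament adds independent Gaussian noise to it.
    """
    rng = random.Random(seed)
    ranking = list(range(1, n_players + 1))
    rhos = []
    for _ in range(n_tournaments):
        scores = [r + rng.gauss(0, noise_sd) for r in ranking]
        # Convert noisy scores to finishing positions (1 = best).
        order = sorted(range(n_players), key=lambda i: scores[i])
        finish = [0] * n_players
        for place, i in enumerate(order, start=1):
            finish[i] = place
        rhos.append(spearman_rho(ranking, finish))
    return rhos


rhos = sorted(simulate_rho())
low, high = rhos[int(0.025 * len(rhos))], rhos[int(0.975 * len(rhos))]
print(f"middle 95% of simulated rho: {low:.2f} to {high:.2f}")
```

If the spread between simulated tournaments is much wider than the .01 to .02 gaps in the table, a difference of that size at a single event says little about which system is better.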
 
            Rankings serve purposes beyond predicting performance.  The rankings clearly will be one factor in determining where and how often the best players play.  Tours will also be impacted.  Decreasing the points awarded on the Japanese tour may threaten that tour’s financial viability, and harm the popularity of the game in that country (e.g., What if no Japanese player qualified for a major event because of the ranking system?).[5]  It is important these incentives be shaped into a framework that will promote and serve the best interests of the game.
Within that framework, however, it is possible a more equitable ranking system could be built.  Many of the suggestions put forward by Golf Digest seem reasonable.  Policy should not be changed on what appears reasonable, however, but on what has been empirically tested to improve the accuracy of the rankings.

Appendix


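COMPARISON OF RANKINGS VERSUS TPC STANDINGS

[Figure: plots of TPC standing versus Golf Digest ranking and versus IMG ranking; images not available]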


[1] This is consistent with Knuth’s defense of the Slope System – all theory and no empirical verification.  In setting forth his credentials for writing this article, Knuth again referenced a 30-year-old test score.
[2] Kendall, Maurice and Jean Gibbons, Rank Correlation Methods, Oxford University Press, NY, 1990.  The Pearson correlation coefficient gave approximately the same results as shown in the table.
[3] Thirty-nine of the players made the cut (i.e., low seventy players and ties).  Nick Faldo was disqualified and was treated as a non-cut player.
[4] The two systems explain approximately 30 percent of the variance in performance.  There appears to be a large random component in determining performance.  That is why there are so many upsets in match play golf tournaments.
[5] Golf Digest’s claim of the inferiority of Japanese tour players was not borne out at the TPC.  Joe Ozaki led the tournament after two days, and finished well ahead of his ranking.  Brian Watts tied for the lead after the first day, but finished slightly lower than his IMG ranking.  Shigeki Maruyama and Carlos Franco both missed the cut, but because of their low rankings (43rd and 41st) this was not unexpected.  To assess the relative strength of the various tours, the performance of each tour’s players should be evaluated over an extended period.  It would then be possible to adjust the points for each tour to bring performance and ranking into parity.  This is a much more scientific and equitable approach than just slashing the points for the Japanese and Australian tours as Golf Digest recommends.