Friday, March 13, 2020

Why There Is No Perfect Golf Handicap System


Noted golf writer George Peper recently wrote a column in which he proposed a personal handicap system (PHS) based on both course and player characteristics. For example, he argued that if a player sprays the ball, his handicap should be adjusted upward when he ventures onto a tree-lined course with narrow fairways. Similarly, if the course is wide open, the player’s handicap should be reduced. What sounds reasonable in a weekly column, however, may prove infeasible under closer scrutiny.

Peper rejects such pessimism, however, and believes the PHS can be constructed by Big Tech applying its analytic tools to Big Data. And from where does he derive his confidence that science can solve the quest for a perfect handicap that has plagued the sport since its inception? Apparently, Peper is awed by Netflix recommending movies he might like based on his past viewing history and believes golf data can be studied to obtain similar results. He fails to mention that a handicap system only as accurate as Netflix suggestions cannot be viewed as a step forward (e.g., liking Caddyshack does not mean you will like Caddyshack II).
 
Peper goes on to argue that with “enough scores the computer knows your game, knows about your power outage, your two-way miss, your chip yips, etc., etc., etc.” Peper is mistaken. The computer knows none of this. The computer only knows your adjusted score and the Course and Slope Ratings. Peper appears to suggest a handicap should be a function of more explanatory variables, specifically the obstacle value ratings in the USGA Handicap System, now the World Handicap System (WHS).

“Can a system be devised to attain this dream handicap system?” To answer this question, the three basic elements of any handicap system are reviewed for both the WHS and the PHS. Those elements are: 1) rating the difficulty of a golf course, 2) measuring the player characteristics, and 3) a method of combining course and player ratings to determine a player’s handicap.

 Rating Course Difficulty-

WHS - The WHS uses the effective distance of a hole and the rating of ten obstacle factors (e.g., trees, bunkers, etc.) to determine the rating of a hole. Obstacle factors are rated for only two types of players: the scratch golfer and the bogey golfer. The USGA assumes the relative difficulty of a course for all other players can be measured by a linear function of the Bogey Rating minus the Course Rating (i.e., the Slope Rating).

PHS – Peper requires the PHS to describe a course by its obstacle values, but that raises the circular reasoning problem that faces rating systems. For example, what is the course rating? Historically, the course rating is the average of the ten best scores of a scratch golfer. What is a scratch golfer? It is a golfer whose ten best scores average the course rating. To escape the circle, the USGA had to define a scratch golfer without regard to a course rating. It chose competitors at the U.S. Amateur as scratch golfers.

The development of the PHS would require either the course rating or a player’s characteristics to be fixed without regard to the other.   A solution, but not one without problems, is to rate courses by their effective distance and the ten obstacle values of the USGA Course Rating System.  The USGA Rating System rates holes which are then added together to get the Course and Bogey Ratings for the Course.  For simplicity, it is assumed a course rating under the PHS is its effective length and the average value of each of the obstacle value ratings (i.e., the sum of the scratch and bogey obstacle values for the ith obstacle variable summed over 18 holes and divided by 36).  A course is then described by its effective length and the rating of ten obstacles. 

Even with this simplification, there would still be two problems to overcome. First, the definitions of some obstacle variables are so vague as to be immeasurable (e.g., what is the “psychological” factor, and isn’t it just a combination of the other nine factors?). Second, assigning values to obstacle factors is the responsibility of the rating committee, which in most cases is not highly trained. Golf associations do not sponsor seminars on how to distinguish a 5 from a 6 “green surface.” It is unlikely rating committees would be consistent in assigning values to the nebulous definitions of the obstacle factors.

Measuring Player Characteristics -

WHS - As discussed above, the WHS is not concerned with player characteristics. A player’s ability is measured only by his score, and not by how it was obtained.

PHS – The PHS gives more handicap strokes to a “wild” player on a tree-lined course. How does the PHS identify the wild player? One approach would be to estimate the effect of each obstacle variable using linear regression analysis. The estimated equation would be of the following form:

              Differential(j) = Adjusted Score(j) - Course Rating(j)
                              = a(0) + a(1)Y(j) + a(2)T(j) + a(3)F(j) + a(4)R(j) + a(5)X(j) + a(6)W(j)
                                + a(7)Tr(j) + a(8)B(j) + a(9)G(j) + a(10)S(j) + a(11)P(j)

  Where the ratings for the jth course are:

   Y(j)=Effective Playing Length, T(j)=Topography, F(j)=Fairway, R(j)=Rough,
   X(j)=Out of Bounds, W(j)=Penalty Areas, Tr(j)=Trees, B(j)=Bunkers, G(j)=Green Target,
   S(j)=Green Surface, P(j)=Psychological

The linear regression analysis will yield estimates of the coefficients (i.e., a(i)), which indicate how a player is affected by each obstacle value. The player would not be defined by his WHS Handicap Index but by the value of twelve coefficients. For example, if a player is a short hitter, the value of a(1) (i.e., the coefficient on Effective Playing Length) should be relatively high. A player’s ability would no longer be identified by his Handicap Index, but by a string of 12 numbers which will be termed his Peper Rating. For example, a player could have a PHS Index of 3,3,4,2,6,7,4,8,3,2,4,6. (Are you starting to see the problem?)

A player’s expected differential on course j would be:

              Expected Differential(j) = a(0) + a(1)c(1,j) + a(2)c(2,j) + ... + a(11)c(11,j)

Where
              a(i) = Player’s characteristic rating for the ith obstacle value,
              c(i,j) = Course characteristic rating for the ith obstacle value on course j
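
Using the a(i) and c(i,j) notation just defined, one reading of the expected differential is the intercept a(0) plus the sum of a(i) times c(i,j). The sketch below only illustrates that arithmetic. The function name is an assumption, and every coefficient and course rating in it is hypothetical, not taken from any real player or course.

```python
# Hypothetical illustration: none of these numbers come from real players or courses.

def expected_differential(intercept, player_coeffs, course_ratings):
    """Return a(0) + sum over i of a(i) * c(i, j) for one course j."""
    if len(player_coeffs) != len(course_ratings):
        raise ValueError("need one course rating per player coefficient")
    return intercept + sum(a * c for a, c in zip(player_coeffs, course_ratings))

# a(1)..a(11): made-up sensitivities to length and to each obstacle value
player = [0.5, 0.1, 0.2, 0.3, 0.0, 0.4, 0.6, 0.2, 0.1, 0.1, 0.0]
# c(1,j)..c(11,j): made-up ratings for a single course j
course = [6.0, 4.0, 5.0, 5.0, 3.0, 2.0, 7.0, 4.0, 5.0, 5.0, 6.0]

print(round(expected_differential(2.0, player, course), 1))
```

Even in this toy form, the player is described by twelve numbers and the course by eleven, which is the bookkeeping burden the PHS would impose.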

In estimating the equation, however, more problems arise. First, a general rule of thumb is that the minimum sample size should be twenty observations for each independent variable. That would mean 220 observations (i.e., courses) would be required for each player. It is reasonable to assume many players will not play that many different courses in a year. The inclusion of numerous “Home” scores would decrease the statistical significance of any estimate. For example, if only Home scores were included, the coefficient of all variables would be zero and the estimated Differential would just be the player’s average Differential. To eliminate this problem, it is assumed that all players have the same free time and access to courses as Peper, who notes he has played over 750 different courses. This assumption eliminates the sample size problem even though it is unrealistic.

The second problem is obstacle variables do not have a large impact on a player’s differential.  The total scratch obstacle value typically accounts for less than two percent of the Course Rating.[1]  Individual variables will then have an even smaller impact on scoring.  This would be like Netflix judging a viewer’s taste based on a movie’s sound editing.  It is likely the estimated coefficients of most variables will not be significantly different from zero.  

Third, it is likely the “independent” variables are not independent. Tough courses may have high scores on most of the obstacle values. For example, if courses had fast greens and numerous strategically placed bunkers, it would be difficult to estimate the separate effect of each variable on a player’s differential.

Method of Determining a Player’s Handicap -

WHS - The WHS computes a player’s Handicap Index by averaging the best 8 of his most recent 20 scoring differentials ((adjusted score – Course Rating) x 113/Slope Rating). The player’s course handicap is his Handicap Index multiplied by the Slope Rating/113 plus the (Course Rating - Par).
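
The WHS arithmetic can be sketched in a few lines of Python. This is a simplified illustration, not the official procedure: it assumes the Handicap Index is the average of the lowest 8 of the most recent 20 scoring differentials (the 20-score record referred to later in this post), it ignores the special rules for shorter scoring records, and the function names and sample numbers are hypothetical.

```python
def scoring_differential(adjusted_score, course_rating, slope_rating):
    """(adjusted score - Course Rating) x 113 / Slope Rating."""
    return (adjusted_score - course_rating) * 113 / slope_rating

def handicap_index(differentials):
    """Average of the lowest 8 of the most recent 20 differentials (oldest first)."""
    recent = differentials[-20:]
    if len(recent) < 20:
        raise ValueError("this sketch handles only a full 20-score record")
    best_eight = sorted(recent)[:8]
    return round(sum(best_eight) / 8, 1)

def course_handicap(index, course_rating, slope_rating, par):
    """Handicap Index x Slope Rating / 113 + (Course Rating - Par), rounded."""
    return round(index * slope_rating / 113 + (course_rating - par))

# Hypothetical record: twenty differentials running from 8.0 to 27.0
record = [float(d) for d in range(8, 28)]
index = handicap_index(record)
print(index, course_handicap(index, 73.7, 134, 72))  # par 72 is an assumption
```

Note that the only inputs are scores, ratings, and par; nothing about how the player produced those scores enters the calculation.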


PHS - A player’s PHS at this course could be some percentage (e.g. 90%) of his Expected Course Differential that would reflect a player’s potential ability and not his average ability.

Major operational problems are inherent in the PHS. For example, how is the PHS updated? The present system is based on 20 scores, and the oldest is eliminated when a new score is posted. For most players, the present handicap system provides an acceptable estimate of current ability, though there is some lag. Peper argues the PHS should be capable of accessing a lifetime of rounds. If it is necessary to go back years to get enough data to satisfy the data requirements of the PHS, the player’s PHS may be a function of how he played years ago rather than how he is playing this month. Therefore, if the PHS cannot reflect a player’s current ability, it fails an important criterion for an equitable handicap system.

Since a player’s handicap is now defined by 12 different coefficients, the process of determining course handicap would need a computer.   It’s possible an app could be constructed that would embed a player’s twelve-digit characteristic rating and apply it to a directory containing the obstacle ratings for each course to be played.  A handicap system should produce easily understood results.  The PHS would not provide such clarity.  

Conclusion – Thirty years ago the Handicap Research Team (HRT) of the USGA wrote:[2]

The HRT is considering a solution of adopting a normal model handicap formula which would mean a two dimensional handicap to the Slope System. The solution could result in a Steady Eddy receiving more strokes on a high Slope Rated Course than a Wild Willy of equal Handicap Index would receive.

The HRT never developed such a handicap system probably because of the problems outlined above.  Or perhaps the HRT realized such an advance was not important.  Handicaps should be used to measure improvement and in competitions with reasonable stakes.  To seek perfect equity in every handicap match is a fool's errand.  As Peper has written elsewhere, golf is not all about winning.  It is about camaraderie.  It is about testing yourself under pressure.   And, it is about the beauty a round of golf can present.  So, if you find yourself on a course that does not fit your game, consider yourself lucky and suck it up!




























[1] Dougharty, Laurence, “Is Your Course Overrated,” www.golfhandicaps.com
[2] Knuth, D. A two parameter golf course rating system, Science and Golf, E & FN Spon, London, 1990, p. 146.





Friday, January 31, 2020

World Handicap System Adopts a Few Recommendations from this Blog


Over the years, this blog has pointed out some deficiencies and oddities of the USGA’s Handicap System. The new World Handicap System (WHS) has addressed some of the points made in the blog and instituted minor improvements. This post reviews what the blog recommended and how the World Handicap Committee (WHC) responded.

Bonus for Excellence - The Bonus for Excellence (BFE), .96 in 2019, is multiplied by the average of a player’s 10 best scoring differentials to calculate a player’s Handicap Index. Dean Knuth, former Senior Director of Handicapping for the USGA, described the purpose of the BFE when he wrote in Golf Digest:

 “Historically, the USGA wanted to reward the accomplishments of better players…For a six-stroke difference in handicaps the better player gains a one-shot advantage (due to the BFE) and should win 60 percent of the matches.”[1]

In a post, “The USGA’s Bonus for Excellence Ruse, January 15, 2013,” it was shown the BFE is neither an effective incentive to improve nor a reward for superior performance and should be eliminated from the USGA Handicap System.  The WHC, perhaps heeding the wisdom of the post, eliminated the BFE.
   
Treatment of Women – The USGA’s treatment of women was analyzed in “Why Does the USGA Treat Women Differently, October 2, 2014.” Before the WHS, the USGA recommended different handicap allowances for men and women. For example, in four-ball stroke play men are allowed 90 percent of their handicap while women are allowed 95 percent of their handicap. Why are women treated differently? Much of the USGA’s research on multi-team events was done over 35 years ago and there appears to be no mention of any differences due to the gender of the player.[2] It is likely the USGA had no empirical evidence for the women’s allocation, and the percentage was just a consensus guess by members of the Handicap Procedure Committee. If women were studied, it is probable any difference in the estimated optimal allowance for men and women would not be statistically significant. Remember, all the studies used to justify four-ball allowances were completed long before the introduction of the Slope System. With this error and others, it is likely any difference as small as five percent was not significant. Since the USGA does not release its research for peer review, the accuracy and validity of the USGA’s allowance may never be known.
The following recommendation was made in the post: To make a small step toward the equal treatment of women, however, the USGA could keep the hallowed men’s allowances and simply eliminate any allowance specific to women.  The WHC has followed this recommendation and eliminated separate allocations for women.

Sec. 10-3 - Index Reduction for Exceptional Tournament Performance – Section 10-3 supposedly cracks down on Sandbaggers by reducing their Handicap Index based on exceptional tournament performance.  A post, “The Truth About Section 10-3, April 15, 2014,” made the following observation:
“The USGA has never published any research on the effectiveness of Sec. 10-3.  When asked recently how many players receive a reduced index, the USGA replied “We do not keep such statistics.”[3] Apparently the USGA does not want to know the effectiveness of this section.  Sec. 10-3 lives on since it: 1) gives the illusion of curing the sandbagging problem, 2) does not generate negative feedback since so few are affected, and 3) relieves the indolent handicap committee of the responsibility for rooting out the unethical player.  In essence, Sec. 10-3 is the perfect bureaucratic solution.”   
The WHS has eliminated Sec. 10-3 and placed more responsibility on the Handicap Committee to monitor tournament performance.   Such an approach has not been successful in the past and is not likely to be successful in the future.   But the WHS did put the onus on the Handicap Committee where it belongs and not on a statistical formula that was never effective. 

Four-ball Stroke Play and Four-ball Match Play Allowances – Under the USGA Handicap System, players are assigned their full handicap (Sec 9-4aiii) in four-ball match play.  In four-ball stroke play, men are assigned 90 percent of their course handicap (Sec. 9-4bii). 
A post, “Chapman Handicaps and Sec. 3-5: Proposed Changes in Allowances, August 19, 2013,” questioned the different treatment of the two types of play. If high handicap teams have an edge in stroke play, why don’t they also have an edge in match play? And why does the USGA recommend a maximum difference in handicaps for four-ball stroke play, but not four-ball match play? The USGA is of no help in answering these questions. As mentioned above, the USGA’s research on the equity of multi-ball competitions (e.g., four-ball match play and four-ball stroke play) is clearly out-of-date.
The WHS includes several changes.  First, it now recommends a 90 percent allowance for four-ball match play.  Second, it reduces the allowance for four-ball stroke play from 90 percent to 85 percent.  Third, it omits any mention of a restriction of the difference in handicaps between partners.  It is difficult to describe these changes as an improvement.  The WHC has not presented any evidence the new allowances provide more equitable competition than the old allowances.  The new allowances do have one thing in common, however. All changes in the allowances favor the low-handicap player. This suggests the changes were based more on politics than statistics. 
  
Stroke Allocation - Under the USGA recommended stroke allocation procedure, holes were ranked by the difference in average score by low and high handicap players. The USGA argued this allocation would produce the most halved holes but never explained why this should be a criterion for choosing an allocation procedure. In a post, “Problems with the USGA Stroke Allocation Procedure, January 17, 2015,” defects in the USGA’s method were exposed. The USGA gave an example of where strokes should be given. In the example, however, the high handicap player lost 5 and 3, hardly an equitable competition. The post recommended holes should be ranked by difficulty subject to certain guidelines such as spreading low stroke holes evenly over the 18 holes.
The WHS adopted this recommendation as presented in Appendix E: Stroke Index Allocation.  Holes are now ranked on playing difficulty relative to par subject to the same conditions mentioned in the blog post.  It is not known why Stroke Allocation was changed to Stroke Index Allocation. Index plays no part in the allocation procedure. But the change is gratefully received, and the superfluous language is overlooked.

Summary and Conclusion - The minor improvements discussed above do not impact on the efficacy of the WHS.  The evaluation of the WHS requires an experimental design that measures performance against various criteria (cost, equity, consumer satisfaction, etc.).  Does the World Handicap Committee have such an evaluation plan?  Probably not.  Bureaucracies rarely fund evaluations whose results could prove embarrassing.  




[1] Knuth, Dean, “Handicaps,” Golf Digest, September 2008, as reprinted at www.popeofslope.com.
[2] Ewen, Gordan, What the Multi-ball Allowances Mean to You, www.usga.org, Far Hills NJ, 1978. The USGA has not released the original research for peer review.
[3] E-mail to author from Annie Pollock, USGA, November 20, 2013.

Friday, January 17, 2020

Dean Knuth: Is the Pope of the Slope Fallible?


The World Handicap System (WHS) has a feature called the par adjustment. Basically, a player’s course handicap is just his former USGA handicap plus the difference between the course rating and par, which is usually a negative number. In an article (The flaw in the new World Handicap System, Golfdigest.com, January 1, 2020) Dean Knuth, who has promoted himself as the “Pope of the Slope,” made arguments against the par adjustment. He relies on his credentials (e.g., former Director of Handicapping at the USGA, etc.), and forgoes any reliance on theoretical or empirical evidence to make his case. His arguments against the par adjustment are either specious, untrue, or unsubstantiated. Unlike his namesake, the Pope of the Slope is clearly fallible. Golf Digest did Knuth a disservice by printing his article without the proper vetting.

Excerpts from Knuth’s article are presented below in italics.  After each excerpt, a brief analysis demonstrating the “flaws” in Knuth’s arguments is shown.

Knuth: Let’s start with the fact that par is hardly the most reliable measure of course difficulty (that would be course rating). Almost any golfer can list two courses that are both par 72s but vary greatly in how tough they play. Differences in length, in obstacles, in penalty areas, make one drastically harder than another even when they have the same par. Par as a metric, then, is somewhat arbitrary…. Maybe you don’t want to go that far, but calculating a handicap around a less reliable measure of difficulty inherently makes for a less equitable system.

Knuth is correct that “par” is not an accurate measure of course difficulty, but that claim is irrelevant to the equity of the WHS. For stroke play (i.e., not Stableford) competitions, the WHS could have picked any number to subtract from the Course Rating and competitive results and handicap differentials would remain the same. The par adjustment simply adds or subtracts a fixed number from a player’s handicap under the expired USGA Handicap System. If players are competing from the same tees, differences in handicaps among players remain the same. There might be small changes in handicaps due to rounding, but they are random and would not affect equity. When players compete from different tees, the course handicap is calculated with all players playing to the highest (or lowest) par. Again, what particular par number is used will not affect the equity of competition. Knuth’s conclusion that the par adjustment makes for a less equitable system is not substantiated.
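
The same-tees argument can be checked numerically. The sketch below compares the stroke gap between two players under the old formula (index x Slope/113, rounded) and under the par-adjusted formula. The tee (Slope 130, Course Rating 70.4, par 72), the indexes, and the function names are all hypothetical choices made for illustration.

```python
def usga_course_handicap(index, slope):
    """Pre-2020 USGA course handicap: index scaled by slope, rounded."""
    return round(index * slope / 113)

def whs_course_handicap(index, slope, course_rating, par):
    """WHS course handicap: the same scaling plus (Course Rating - Par)."""
    return round(index * slope / 113 + (course_rating - par))

def handicap_gaps(low_index, high_index, slope, course_rating, par):
    """Strokes given between two players from the same tees, old vs. new."""
    old_gap = usga_course_handicap(high_index, slope) - usga_course_handicap(low_index, slope)
    new_gap = (whs_course_handicap(high_index, slope, course_rating, par)
               - whs_course_handicap(low_index, slope, course_rating, par))
    return old_gap, new_gap

# Hypothetical tee: Slope 130, Course Rating 70.4, par 72
for low, high in [(8.3, 21.7), (5.0, 15.0)]:
    print(low, high, handicap_gaps(low, high, 130, 70.4, 72))
```

For the first pair the gap is the same under both formulas; for the second it moves by one stroke, purely because the fixed offset changes where the rounding falls.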

Knuth: …the new formula changes course handicap values from tee to tee as you compare the WHS to the USGA system at any course. For example, where once a course handicap was a 12 from the back and middle tees, and an 11 from the front, under the new WHS calculations there will be much larger variations, as many as 18 shots in some instances, between tees. Part of the reason for this is that during the calculation, an approximation is being approximated again by adding Course Rating minus Par creating an imperfect “over-spreading” of the course handicaps. Knuth adds, “It’s why, to me, the WHS produces an unacceptably large course handicap variation for the same ability player.”


Knuth never explains why a large course handicap variation for the same player is unacceptable. The reader just has to take his word for it. It is not the variation of handicaps, but the difference in handicaps among competitors that determines fairness, as discussed above. Knuth states the par adjustment creates an imperfect “over-spreading” of the course handicap. He never sets forth the criterion for “perfect spreading.” He simply hopes his readers will assume “over-spreading,” whatever it is, must be bad.

Knuth:  Golfers competing from more forward tees will be receiving fewer strokes than is truly equitable.  And if you want to follow the USGA’s “Tee it Forward” initiative, there is a disincentive because playing from shorter tees more drastically lowers your course handicap.


This is where an editor at Golf Digest should have interceded and asked for evidence. Instead, Golf Digest was complacent and just assumed Knuth, with all of his credentials, must know what he is talking about. If Knuth’s claim that the forward tee player is treated unfairly were true, it would drive a stake through the heart of the WHS. Of course, it is not true. A player competing from the forward tees will receive a reduction in handicap, but so will his fellow competitors. There is no change in equity due to the par adjustment. Even one with a rudimentary knowledge of the WHS (e.g., Jerry Tarde, Editor of Golf Digest) should know this.



There is a lot not to like about the WHS and its par adjustment. The USGA simplified the Rules of Golf to make them more understandable to new and casual players. Then, paradoxically, the USGA adopted an arcane handicap system that baffles and discourages these same players. This blog has openly opposed the WHS, but firmly believes its efficacy should be determined on empirical evidence and not on conjecture by so-called experts. Knuth’s article makes no contribution to that end. If Golf Digest had any integrity, Knuth’s flawed article would be taken down from its website.

Thursday, December 19, 2019

Course and Slope Rating Uncertainties Continued


The USGA estimates Course Ratings to the nearest tenth and Slope Ratings to the nearest point.  As pointed out in a previous post (How Accurate is the Slope System?, 10/8/2012), there is larger uncertainty in the Ratings estimates than the USGA cares to admit.  A course that has recently been re-rated by the Southern California Golf Association (see Appendix) demonstrates the spurious accuracy of ratings.  This note is not written to complain about the Ratings, but to illustrate the problems of making accurate ratings.   Ratings are not a science and only an art in the same sense that finger painting is an art form.  This can be demonstrated by examining the new and old ratings at the re-rated course.

Course Set-Up – A major rating problem occurred when the course was shortened for men (green tees). The green tees, according to the scorecard, are a combination of white and red tee placements. (Note: There are no green tee markers.) The actual placement of the green tees, however, is different. On 15 of the holes where the green tees and red tees should be of the same length, the green tees are typically set at least 10 yards behind the red tees. This is probably done so men can retain the illusion they are not playing from the red tees. Similarly, when white and black tees are supposedly shared, the black tees are placed 10-15 yards back from the white tees. Then there are outright errors in tee placement. The green tees are supposed to be set alongside the white tees on one hole. Instead, they are set alongside the red tees, making for an error of some 60 yards.

So what course did the SCGA rate? Apparently, it rated the course shown on the scorecard, since those are the distances it reported. In essence, the SCGA has rated a course that does not exist.

Ratings Changes – Sometimes Rating Committees make small changes just to justify their existence.  The small changes in the ratings at the course (e.g., one point change in the Slope Rating) are not due to changes in the courses, but changes in the Rating Committee. The Committee does not have to explain the ratings, but only send them along to the Club as if they were inscribed in stone.  Could the Committee actually state a physical reason for a Course Rating increasing by 0.1 or the Slope Rating by one point?

Posting 9-holes or 18-holes – Courses are rated by each 9-holes. The 18-hole Course Rating is the sum of the two 9-hole Course Ratings. The 18-hole Slope Rating is the average of the two 9-hole Slope Ratings rounded to the nearest integer. The rounding error in the 18-hole Course Rating could be as much as 0.1. The 18-hole Slope Rating will be the same whether the sum of the two 9-hole Slope Ratings is some odd number or that odd number plus one. For example, the Course Rating from the Gold Tees is 73.7 and the Bogey Rating is 98.5. The Slope Rating for 18-holes should be 133 (5.381(98.5-73.7)). The two 9-hole Slope Ratings are 130 and 137. When the Slope Ratings are averaged and rounded, the 18-hole Slope Rating is 134.
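
The two routes to an 18-hole Slope Rating can be compared directly. The sketch below uses the Gold Tee figures quoted above. The function names are illustrative, and rounding halves upward is an assumption about how the average of the two 9-hole ratings is rounded.

```python
import math

def round_half_up(x):
    """Round to the nearest integer, with .5 rounding up (an assumed convention)."""
    return math.floor(x + 0.5)

def slope_from_ratings(bogey_rating, course_rating, factor=5.381):
    """Slope Rating computed directly from the 18-hole Bogey and Course Ratings."""
    return round_half_up(factor * (bogey_rating - course_rating))

def slope_from_nines(front_slope, back_slope):
    """Slope Rating computed by averaging the two 9-hole Slope Ratings."""
    return round_half_up((front_slope + back_slope) / 2)

print(slope_from_ratings(98.5, 73.7))   # direct calculation
print(slope_from_nines(130, 137))       # averaged nines
print(slope_from_nines(130, 138))       # one point higher on the back nine
```

The last two calls return the same value, which shows how a one-point change in a 9-hole Slope Rating can vanish in the 18-hole figure.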

Whether a player posts an 18-hole score or two 9-hole scores can make a difference. Suppose a player shoots a 90 from the gold tees. His differential is 13.7 ((90-73.7)113/134). The table below shows that if he posts various combinations of 90 as 9-hole scores, his differential can be as high as 14.0 and as low as 13.6. In essence, a player who shoots 40-50 is considered a better player than one who shoots 50-40.

Table

Combined Differential for 9-hole Scores

| Front Nine Score (CR = 36.5, SR = 130) | Back Nine Score (CR = 37.2, SR = 137) | Combined Differential |
| --- | --- | --- |
| 50 | 40 | 14.0 |
| 49 | 41 | 14.0 |
| 48 | 42 | 14.0 |
| 47 | 43 | 13.9 |
| 46 | 44 | 13.9 |
| 45 | 45 | 13.8 |
| 44 | 46 | 13.8 |
| 43 | 47 | 13.8 |
| 42 | 48 | 13.7 |
| 41 | 49 | 13.6 |
| 40 | 50 | 13.6 |
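
The combined differentials can be recomputed with a short script. The sketch assumes each 9-hole differential is rounded to the nearest tenth (halves rounded up) before the two are added; that rounding rule is an assumption, chosen because it appears consistent with the 14.0 and 13.6 extremes quoted above. Decimal arithmetic is used so that exact halves such as 5.65 round predictably.

```python
from decimal import Decimal, ROUND_HALF_UP

TENTH = Decimal("0.1")

def differential(score, course_rating, slope_rating):
    """(score - Course Rating) x 113 / Slope Rating, rounded to one decimal."""
    raw = (Decimal(score) - Decimal(course_rating)) * 113 / Decimal(slope_rating)
    return raw.quantize(TENTH, rounding=ROUND_HALF_UP)

def combined_differential(front_score, back_score):
    """Sum of the two rounded 9-hole differentials from the gold tees."""
    front = differential(front_score, "36.5", 130)   # front nine: CR 36.5, SR 130
    back = differential(back_score, "37.2", 137)     # back nine: CR 37.2, SR 137
    return front + back

# Every way of splitting a 90 between 50-40 and 40-50
for front in range(50, 39, -1):
    back = 90 - front
    print(front, back, combined_differential(front, back))

# The same 90 posted as a single 18-hole score (CR 73.7, SR 134)
print(differential(90, "73.7", 134))
```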



World Handicap System (WHS) – The USGA Course Rating System is based on taking a player’s best 10 out of 20 differentials. The WHS will only use a player’s best 8 differentials in calculating his Index. Therefore, as of January 1, 2020 every Course and Slope Rating will be in error. Rather than adjust the Ratings, the USGA will just let the Indexes of every player drop by approximately 0.5.

Lessons Learned – This post continues the blog’s efforts to document the uncertainty surrounding the accuracy of Course and Slope Ratings.  Recent changes in the Handicap System only introduce another layer of complexity without an accompanying benefit.  The best example is the Daily Course Rating (DCR) to correct for bad weather now part of the World Handicap System (WHS).  Below is the equation used by Golf Australia to make that adjustment. 

              DCR = SR + [SUM((36 + Par - SR - CPA - mh - b - S)/(m’h + b’)^2)] / [SUM(1/(m’h + b’)^2) + 1/CSD^2]

It is assumed the WHS has a similar equation. Any regulation that is not understood by those being ruled is not a good one. Moreover, the Handicap System is marked by rounding errors, measurement errors (see above), random errors, and systematic errors (i.e., sandbaggers). To believe a quadratic equation can make a significant advance in the equity of competition is myopic. Sadly, such claims are often made by “quants” and adopted by administrators who are dazzled by the mathematics. This fulfills the bureaucrat’s need to do something even though it is of little or negative value.

The major lesson in all of this is to not take handicap ratings too seriously. They are not precise, but “good enough” and probably as good as can be done. Errors in ratings can cause you to lose or win a match. Things should even out in the end.



Appendix

Course and Slope Ratings


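| Tees | Old Course Rating | New Course Rating | Old Slope Rating | New Slope Rating | Old Yardage | New Yardage |
| --- | --- | --- | --- | --- | --- | --- |
| Gold | 73.5 | 73.7 | 133 | 134 | 6972 | 6972 |
| Gold/Black | 72.2 | 72.3 | 129 | 130 | 6689 | 6689 |
| Black | 71.0 | 71.2 | 126 | 127 | 6445 | 6445 |
| Tournament | 70.0 | 70.0 | 124 | 124 | 6195 | 6195 |
| White | 68.4 | 68.4 | 119 | 120 | 5851 | 5870 |
| Green | 66.0 | 65.7 | 111 | 112 | 5365 | 5204 |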


Tuesday, October 22, 2019

Eliminating the Blind Draw


(Note: This is a corrected version of a post of the same name from 2012. The previous post omitted the Appendix. The Appendix is shown in this version.)
Introduction - Many tournaments consist of a format where foursomes compete against other foursomes in the field. When the field cannot be divided evenly into foursomes, threesomes are created. The threesome is then allowed a “blind draw” for the fourth player (i.e., the score of another player in the field is drawn and his score becomes that of the missing fourth player).

While the “blind draw” is equitable, it has several problems. First, a team’s performance is determined in part by luck rather than by how well the team played. Second, if the blind draw played well, his performance can help the threesome and therefore hurt the chances of his own team. Third, it is more difficult for the players in a threesome to evaluate risk/reward decisions when the performance of the fourth player is unknown.

This paper evaluates two methods for getting around this problem:

·         Method 1: The threesome is allowed to use one player’s score twice on a hole.  The chosen player is rotated each hole so that each player’s score can be used twice on six holes.  A typical rotation would have the lowest handicap player take the first hole, the second lowest handicap the second hole, and the third lowest handicap player the third hole.  This rotation would be repeated every three holes.

·         Method 2: The threesome is assigned a player who always has a net par on each hole.


The evaluation proceeds in four steps.  First, the basic probability model for the evaluation is described.  Second, probability values are estimated using data from two courses.  Expected hole scores for various methods are then computed to determine the preferred method for threesome competition.  Third, a sensitivity analysis is performed to see over what range one method is preferred over the other.  Fourth, conclusions are drawn as to the best method for achieving equitable competition.   



1. The Probability Model - Assume a player has three different outcomes when playing a hole.  A net birdie is assigned the value of 0, a net par is assigned the value of 1, and a net bogey is assigned the value of 2.  For demonstration purposes, probabilities are assigned to each outcome as shown in Table 1:

Table 1

Probability of Scoring

Score    Probability
0        .25
1        .50
2        .25


The criterion for measuring equity is the expected hole score for each team.  The method that yields an expected score for the threesome closest to that of the foursome would be preferred.  

The foursome has 81 different scoring combinations as shown in Table A-1 of the Appendix.  Each combination has a team score and a probability of occurrence.  The expected score is the product of the team score and the probability of occurrence summed over all outcomes.  The expected two-best ball score of the foursome is 1.11.

For Method 1, where the threesome can use one ball twice, there are 27 different scoring combinations.  Those combinations and their associated probabilities of occurrence are shown in Table A-2 of the Appendix.  The expected two-best ball score on each hole for the threesome would be 1.25.  In an eighteen-hole competition, the foursome would have a two-and-a-half-stroke advantage ((1.25-1.11)·18=2.52) over the threesome.

Under Method 2, the probabilities of each outcome for the three players are the same as in Method 1.  The value of the outcomes may differ, however, as shown in Table A-3.  The expected hole score under Method 2 is 1.28.  Over eighteen holes, the foursome has roughly a three-stroke advantage ((1.28-1.11)·18=3.06) over a threesome competing with Method 2.
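The three expected hole scores above can be checked by brute-force enumeration.  The following is a minimal sketch, not the spreadsheet behind the Appendix tables: it assumes every player scores independently with the Table 1 probabilities, and the function names are illustrative only.

```python
from itertools import product

# Table 1 probabilities: 0 = net birdie, 1 = net par, 2 = net bogey.
PROBS = {0: 0.25, 1: 0.50, 2: 0.25}


def two_best(scores):
    """Sum of the two lowest scores among the balls that count."""
    return sum(sorted(scores)[:2])


def expected_score(probs, n_players, team_score):
    """Probability-weighted team score over every scoring combination.

    Assumes players score independently of one another.
    """
    total = 0.0
    for combo in product(probs, repeat=n_players):
        p = 1.0
        for s in combo:
            p *= probs[s]
        total += p * team_score(combo)
    return total


def foursome(combo):
    # Four real players: take the two best balls.
    return two_best(combo)


def method1(combo):
    # Each of the three players is equally likely to be the one counted twice.
    return sum(two_best(combo + (s,)) for s in combo) / len(combo)


def method2(combo):
    # The added fourth player always scores a net par (value 1).
    return two_best(combo + (1,))


print(f"Foursome: {expected_score(PROBS, 4, foursome):.2f}")  # 1.11
print(f"Method 1: {expected_score(PROBS, 3, method1):.2f}")   # 1.25
print(f"Method 2: {expected_score(PROBS, 3, method2):.2f}")   # 1.28
```

Enumerating all 81 (or 27) combinations directly avoids having to count how often each combination occurs, at the cost of repeating equivalent orderings.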


2. An Empirical Test - The selection of the best method will depend upon the players’ probability function at a course.  The probability function was estimated for two courses using the same 88 players.  Each player’s net hole scores were sorted into five categories.  The estimated probability for each category is the number of hole scores in that category divided by the total number of hole scores.  These probabilities are presented in Table 2.

Table 2

Estimated Probability Functions

                       Probability
Score                  Course 1 (CR=71.2)    Course 2 (CR=71.7)
2 or More Under Par    .024                  .027
1 Under Par            .191                  .178
Even Par               .333                  .319
1 Over Par             .307                  .308
2 or More Over Par     .145                  .168



Table 2 shows there is a significant probability that a player will have a net score of 2 over par or more.  The three-score model (0,1,2) used here does not take such high scores into account.  For a score of two over par to count in a foursome event, however, at least three of the four players must score that high.  The probability of that outcome is small, so the bias introduced by the three-score model should not be large.

To evaluate the expected scores under each scoring alternative, the probabilities of 2 or more under par and 2 or more over par are combined with the probabilities for 1 under and 1 over, respectively, as shown in Table 3.  (Note: Par is considered “1” in the three-score model.)

Table 3

Estimated Probabilities

         Probability
Score    Course 1    Course 2
P(0)     .215        .205
P(1)     .333        .319
P(2)     .452        .476
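The step from five score categories to the three-score model is simple addition.  As a small sketch, using the Table 2 values and a hypothetical helper named collapse:

```python
# Table 2 probabilities, ordered from "2 or more under par"
# to "2 or more over par".
TABLE_2 = {
    "Course 1": [0.024, 0.191, 0.333, 0.307, 0.145],
    "Course 2": [0.027, 0.178, 0.319, 0.308, 0.168],
}


def collapse(five):
    """Fold the two extreme categories into their neighbors.

    Returns probabilities keyed by the three-score model values
    (0 = under par, 1 = par, 2 = over par).
    """
    two_under, one_under, par, one_over, two_over = five
    return {0: two_under + one_under, 1: par, 2: one_over + two_over}


for course, five in TABLE_2.items():
    three = collapse(five)
    print(course, {score: round(p, 3) for score, p in three.items()})
```

For Course 1 this gives P(0)=.215, P(1)=.333, and P(2)=.452, and for Course 2 it gives .205, .319, and .476.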


These probabilities result in the expected hole scores shown in Table 4 for each method.


Table 4

Expected Hole Scores

Course      Foursome    Method 1    Method 2
Course 1    1.48        1.64        1.46
Course 2    1.55        1.72        1.50



The table demonstrates Method 2 is the preferred format at these courses.  The expected difference in hole scores is .02 for Course 1 and .05 for Course 2.  For an 18-hole competition, a threesome would have a small edge of less than one stroke (.02·18=.36 at Course 1 and .05·18=.90 at Course 2).  Under Method 1, the threesome has an expected 18-hole score approximately three strokes higher than that of a foursome.
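The 18-hole differences follow directly from the per-hole values in Table 4.  A short sketch of that arithmetic, with the table copied in by hand and an illustrative function name:

```python
HOLES = 18

# Expected hole scores as reported in Table 4.
TABLE_4 = {
    "Course 1": {"Foursome": 1.48, "Method 1": 1.64, "Method 2": 1.46},
    "Course 2": {"Foursome": 1.55, "Method 1": 1.72, "Method 2": 1.50},
}


def round_gap(course, method):
    """Threesome minus foursome expected score over 18 holes.

    Positive means the threesome is expected to score higher (worse).
    """
    row = TABLE_4[course]
    return (row[method] - row["Foursome"]) * HOLES


for course in TABLE_4:
    for method in ("Method 1", "Method 2"):
        print(f"{course}, {method}: {round_gap(course, method):+.2f} strokes")
```

A negative gap under Method 2 corresponds to the small threesome edge described above.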


3. Sensitivity Analysis - The expected value of the score will depend on the probability distribution of individual hole scores by a player.  Table 5 below shows the expected team scores for alternative  probability distributions.


Table 5

Alternative Probability Distribution

               Probabilities            Expected Hole Score
Alternative    P(0)    P(1)    P(2)     Foursome    Method 1    Method 2
1              .1      .5      .4       1.85        1.94        1.77
2              .2      .5      .3       1.38        1.46        1.44
3              .3      .5      .2       0.95        1.06        1.14
4              .4      .5      .1       0.62        0.74        0.86

The table demonstrates the preferred method depends on whether a course is relatively easy or difficult.[1]  When net bogeys are likely (i.e., P(2)=.4 or .3), Method 2 is the most equitable format for threesomes.  On an easier course (i.e., P(2)=.2 or .1), Method 1 yields an expected score closer to the foursome expected score and would be the preferred format.
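The equity criterion (the threesome method whose expected score is closest to the foursome's) can be applied mechanically to the reported values.  A minimal sketch, taking the Table 5 expected hole scores as given rather than recomputing them:

```python
# Expected hole scores as reported in Table 5, keyed by alternative.
TABLE_5 = {
    1: {"Foursome": 1.85, "Method 1": 1.94, "Method 2": 1.77},
    2: {"Foursome": 1.38, "Method 1": 1.46, "Method 2": 1.44},
    3: {"Foursome": 0.95, "Method 1": 1.06, "Method 2": 1.14},
    4: {"Foursome": 0.62, "Method 1": 0.74, "Method 2": 0.86},
}


def preferred_method(row):
    """Return the method whose expected score is closest to the foursome's."""
    gaps = {m: abs(row[m] - row["Foursome"]) for m in ("Method 1", "Method 2")}
    return min(gaps, key=gaps.get)


for alternative, row in TABLE_5.items():
    print(f"Alternative {alternative}: {preferred_method(row)}")
```

Under this criterion Alternatives 1 and 2 favor Method 2, while Alternatives 3 and 4 favor Method 1.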

Realistically, courses where Method 1 is preferred are rare.  A player with the fourth probability distribution, for example, has an expected hole score of 0.7 (.5·1+.1·2), or 0.3 under net par per hole, so his expected net score for 18 holes would be 5.4 under par.  This would imply that the course rating is approximately 9 under par.[2]  A review of the golf courses in Southern California found no golf course with such a wide disparity between par and the course rating.[3]
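The expected net score implied by each alternative distribution can be worked out the same way.  A brief sketch, assuming 18 holes scored independently under the three-score model, where par has the value 1:

```python
HOLES = 18
PAR_VALUE = 1  # net par is coded as 1 in the three-score model

# (P(0), P(1), P(2)) for each alternative in Table 5.
ALTERNATIVES = {
    1: (0.1, 0.5, 0.4),
    2: (0.2, 0.5, 0.3),
    3: (0.3, 0.5, 0.2),
    4: (0.4, 0.5, 0.1),
}


def expected_net_vs_par(probs):
    """Expected 18-hole net score relative to par (negative = under par)."""
    expected_hole = sum(score * p for score, p in enumerate(probs))
    return (expected_hole - PAR_VALUE) * HOLES


for alternative, probs in ALTERNATIVES.items():
    print(f"Alternative {alternative}: {expected_net_vs_par(probs):+.1f}")
```

Alternative 4 comes out at 5.4 under par, and Alternative 3 at 1.8 under par.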


4. Conclusion - The research found that Method 1, in which one player’s ball counts twice, is not an equitable format.  This method was found to be marginally superior only on courses that do not seem to exist.  On most courses, a threesome playing under Method 1 would have an expected score some three strokes higher than a foursome (e.g., on Course 1 the difference would be (1.64-1.48)·18=2.88).  Method 2 appears to ensure equitable competition on courses where the course rating is around par.[4]  Since most courses fall in this category, Method 2 is the recommended format.



Appendix A


Table A-1 presents the possible combinations of scores for a foursome (0 = Birdie, 1 = Par, 2 = Bogey).  Column 2 shows the probability of each combination.  Column 3 presents the frequency of each combination.  That is, how many different ways can a foursome make two bogeys and two birdies, for example?  As shown in the table, there are 6 ways that combination can occur.  The probability of any one particular arrangement of two birdies and two bogeys is 0.003906 (.25 raised to the fourth power).  Since this combination can occur in six different ways, the probability of this outcome is .0234375 as shown in column 4.  The 2-best ball score for each combination is shown in column 5.  In the example there are two birdies, so the two-best ball score is zero.  The expected team score is the product of the Probability of Occurrence and the 2-Best Score summed over all combinations.  In this case, the expected team score for a foursome is 1.11.
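The frequency and probability columns for a given combination can be reproduced with a multinomial count.  A small sketch under the Table 1 probabilities, with illustrative function names:

```python
from math import factorial

# Table 1 probabilities for a birdie (0), par (1), and bogey (2).
P_BIRDIE, P_PAR, P_BOGEY = 0.25, 0.50, 0.25


def arrangements(birdies, pars, bogeys):
    """Number of ways the players can produce this mix of scores."""
    n = birdies + pars + bogeys
    return factorial(n) // (
        factorial(birdies) * factorial(pars) * factorial(bogeys)
    )


def single_arrangement_probability(birdies, pars, bogeys):
    """Probability of one specific player-by-player arrangement."""
    return P_BIRDIE**birdies * P_PAR**pars * P_BOGEY**bogeys


ways = arrangements(2, 0, 2)                    # 6
each = single_arrangement_probability(2, 0, 2)  # 0.00390625
print(ways, each, ways * each)                  # 6 0.00390625 0.0234375
```

Summing the arrangement counts over every possible mix of birdies, pars, and bogeys for four players recovers the 81 scoring combinations.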



The expected score of a threesome under Method 1 is derived using the same methodology as shown above.  The 2-best score is found by taking the expected value for each combination.  For example, assume a team has scores of 2,1,0.  If the player scoring a 2 could be used twice, the 2-best score would be 1.  If the player scoring 1 could be used twice, the 2-best score would be 1.  If the player scoring 0 could be used twice, the 2-best score would be 0.  Since each player is equally likely to be able to use his score twice, the expected 2-best score is .67 (1/3∙1 + 1/3∙1 + 1/3∙0).  The expected team score under Method 1 is 1.25, as shown in Table A-2.
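The 2,1,0 example can be written out as a short calculation.  This is a sketch of the averaging step only, assuming each of the three players is equally likely to be the one whose score counts twice; the function name is hypothetical.

```python
def method1_expected_two_best(scores):
    """Average 2-best score over which player's ball is counted twice."""
    totals = []
    for doubled in scores:
        counted = sorted(list(scores) + [doubled])
        totals.append(counted[0] + counted[1])
    return sum(totals) / len(totals)


print(round(method1_expected_two_best((2, 1, 0)), 2))  # 0.67
```

Doubling the 2 or the 1 leaves the two best balls at 0 and 1, while doubling the 0 yields two zeros, which is where the 1, 1, and 0 in the average come from.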





Under Method 2 the probabilities stay the same but the 2-Best Scores are slightly different.  Having a guaranteed par on a hole reduces the size of a bad hole score.  The expected score under Method 2 is 1.28.







[1] The best measure of difficulty is the difference between the course rating and par.  If the course rating is much lower than par (e.g., 67 versus 72), the player would be expected to have fewer net bogeys than on a course with a course rating of 73.0. 

[2] A player’s index is determined by the average of his ten best scores out of the last twenty scores.  Depending on the variance in the player’s scoring distribution, the average used for his handicap will be around 3-5 strokes lower than his average for all scores (i.e., the course rating must be 3-5 strokes lower than his expected score).    

[3] Southern California Directory of Golf, Southern California Golf Association, North Hollywood, CA, 2006.

[4] On courses where the course rating is much higher than par, Method 2 may yield too large an advantage to the threesome.  When adopting any method, records should be kept so that the equity of competition can be empirically tested.  That is, do threesomes or foursomes win more than their fair share of competitions?