Introduction  The introduction of the Slope System
to golf handicapping has given the illusion of scientific accuracy. Players have an index calculated to the first
decimal. Rather than being a 10-handicap
as in the old days, one is now a 10.4 index.
The decimal gives the impression of increased accuracy which may not be
deserved.
When
a player finds his or her handicap on a slope conversion table, another level
of deception occurs. The table emits an
aura of authority with its rows and columns of neatly printed numbers. It may not occur to the player with a 10.4
index why he should get a ten percent increase in handicap when the slope of a
course goes from 114 to 115. The player
may not ask if a course can be rated with such precision to distinguish between
a 114 and a 115 Slope Rating. Has the
Slope System contributed to the precision of golf handicaps or merely added
another layer of mathematical obfuscation?
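The jump from 114 to 115 can be checked with a short Python sketch. It assumes the customary conversion, Course Handicap = Index × Slope Rating / 113, rounded to the nearest whole number. That formula is not stated in this paper, and the function name is illustrative only.

```python
import math

STANDARD_SLOPE = 113  # assumed slope of a course of standard difficulty

def course_handicap(index, slope_rating):
    """Convert a handicap index to a course handicap, rounding half up."""
    return math.floor(index * slope_rating / STANDARD_SLOPE + 0.5)

for slope in (114, 115):
    print(slope, course_handicap(10.4, slope))
```

Under this assumption the 10.4 index converts to 10 at a slope of 114 and to 11 at a slope of 115, which is the one-stroke (ten percent) increase referred to above.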
The
research presented here attempts to measure any contribution the Slope System
has made in decreasing the error in estimating golf handicaps. The Slope Rating of a course is estimated by
combining estimates of the Course Rating and the Bogey Rating. To understand the errors in the Slope Rating,
the errors in estimating the Course and Bogey Ratings are examined in turn.
Error in the Course Rating  To
estimate the uncertainty stemming from errors in estimating the Course Rating,
it is necessary to examine the USGA course rating model. Since the USGA has steadfastly refused to
release either statistical information on model estimation or the raw data,[1]
much of what follows has to be based on informed guesswork.
By
definition, the Course Rating should be the average of the better half of 20
scores submitted by a “scratch” golfer.
This suggests that if a sufficiently large number of rounds by
scratch golfers were submitted, a very good estimate of the Course Rating could
be made. Unfortunately, you cannot tell
if a player is scratch without knowing the Course Rating. And you cannot determine the Course Rating
without knowing if a player is scratch.
The USGA gets around the “chicken
and the egg” problem by assuming that players in the United States Amateur
Championship are scratch players. The model
is estimated by using the average of the better half of their scores.[2] This methodology underestimates the Course
Rating. The USGA is taking the low 144
scores out of 288 competitors. But
handicaps are based on the best 10 out of 20 scores. To more accurately reflect the handicapping
system, the USGA should have randomly selected groups of twenty players, and then
taken the lowest ten scores from each group.
This sample selection method should have been used to estimate the
coefficients in the USGA Course Rating Model discussed below.[3]
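The difference between the two sampling methods can be sketched in Python. The scores below are simulated (a normal distribution with an assumed mean of 72 and standard deviation of 3, neither taken from USGA data), the field size of 280 is chosen only because it divides evenly into groups of 20, and the function names are hypothetical.

```python
import random

def pooled_better_half(scores):
    """Average of the lowest half of all scores taken together."""
    ordered = sorted(scores)
    half = ordered[: len(ordered) // 2]
    return sum(half) / len(half)

def grouped_best_ten(scores, group_size=20, keep=10):
    """Average of the lowest `keep` scores within each consecutive group."""
    kept = []
    for start in range(0, len(scores) - group_size + 1, group_size):
        group = sorted(scores[start:start + group_size])
        kept.extend(group[:keep])
    return sum(kept) / len(kept)

random.seed(1)  # fixed seed so the sketch is repeatable
# Hypothetical field: 280 simulated scores (assumed mean 72, SD 3)
scores = [random.gauss(72, 3) for _ in range(280)]

pooled = pooled_better_half(scores)
grouped = grouped_best_ten(scores)
print(f"pooled better half: {pooled:.2f}")
print(f"best 10 of each 20: {grouped:.2f}")
```

Because the pooled method keeps the lowest 140 scores overall, it can never produce a higher average than the grouped method when both keep the same number of scores. That is the direction of the bias argued above; the size of the gap depends on the spread of the scores.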
The USGA course rating system is based on the following model:
Scratch Average Score = a + b·Yardage + c_{1}(R_{1} - S_{1}) + c_{2}(R_{2} - S_{2}) + … + c_{10}(R_{10} - S_{10})
Where,
Scratch Average Score = Average of the better half of the scores of 288 players in the U.S. Amateur
Yardage = Hole length
R_{i} = ith obstacle factor
S_{i} = Reference value for the ith obstacle factor
The USGA used a form of least squares regression
analysis to estimate the coefficients of the Course Rating Model.[4]
The
USGA model assumes yardage makes the same contribution to scoring over all
ranges of hole length. This implies that
adding 50 yards to a 180-yard hole would lead to the same marginal increase in
the average hole score as adding 50 yards to a 500-yard hole. An alternative hypothesis is that yardage has
a different effect depending on hole length.
One test of this hypothesis would be to take the USGA data and estimate
separate equations for par 3, par 4, and par 5 holes. If the coefficient of the yardage variable
was approximately the same in all three equations, then the USGA model would be
validated. If the coefficients were
different, then a specification error of unknown proportion has been introduced
into the estimate of the course rating.
The USGA claims an unbiased estimate of the hole-by-hole error of prediction is:[5]
s = (SSE/(N - 12))^{.5}
s = .067
Where,
s = Hole-by-hole standard error of prediction
SSE = Sum of squares of the difference between the average hole score and the estimate of the average hole score
The USGA claims the “root-mean-square error” in rating a golf course is only .285 strokes (s·(18)^{.5}). The USGA does not report either the value of the model coefficients or whether they are statistically significant. The USGA may not have even tested for statistical significance, since multiple regression programs were not readily available at the time the model was estimated. There is no public record of the USGA revisiting the Course Rating Model now that more data and powerful analytic techniques are easily accessible. Interestingly, the yardage variable has not changed in the past 20 years though players are hitting the ball much farther. It is also doubtful that all ten obstacle coefficients are significant. The inclusion of so many variables has given an illusion of precision that probably cannot be justified by the data.[6]
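The step from the hole-by-hole error to the course-level error can be reproduced in a few lines of Python. The sketch assumes, as the s·(18)^{.5} expression implies, that the errors on the 18 holes are independent so that their variances add. The variable names are illustrative.

```python
import math

hole_se = 0.067   # hole-by-hole standard error reported by the USGA
holes = 18

# Assumes the 18 hole errors are independent, so their variances add
course_se = hole_se * math.sqrt(holes)
print(round(course_se, 3))
```

The result is about .28 strokes, in line with the figure the USGA quotes. If the hole errors were positively correlated, the course-level error would be larger than this calculation suggests.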
The variance is also underestimated
because of the colinearity among the yardage and obstacle values. This colinearity stems from both design
(architects tend to design more obstacles in long championship courses) and
definition (the “green target” obstacle, for example, is related to the length of
the hole). This implies the variables are
not independent, although independence is one of the key assumptions of regression analysis.
In summary, the course rating model
may have specification, measurement, and colinearity errors. The size of these errors is unknown. To be conservative, these errors are
neglected, and the USGA error estimate of ±.3 strokes will be used in
estimating possible errors in the Slope Rating.
Error in the Bogey Rating  The Bogey Rating is the expected
average score of a bogey golfer who is defined as a player with a USGA Handicap
Index of between 17.5 and 22.4. The USGA
has taken a group of 60 golfers with indices within the required range to
create a norm reference group. How
golfers with these indices were found before the slope methodology was
developed has never been documented by the USGA. In
correspondence, however, a USGA official stated that the standard deviation of
the bogey rating estimate was ±.5 strokes.[7] The estimate of the standard deviation is
subject to all of the errors discussed previously for the Course Rating. The USGA is given the benefit of the doubt,
however, and this estimate of the standard deviation is accepted here for the
purposes of this paper.
Error in the Slope Rating  The Slope Rating is a measure of the
course difficulty for a bogey golfer. It
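is calculated by the equation:
Slope Rating = 5.381(BR - CR)
Where,
BR = Bogey Rating
CR = Course Rating
The obstacle values contribute the following amount to the Slope Rating:
5.381(CBOV - CSOV)
Where,
CBOV = Bogey Obstacle Value for 18 holes
CSOV = Scratch Obstacle Value for 18 holes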
An example of two courses will
demonstrate the impact of the obstacle values on the variation in slope rating
among courses. The contribution to the
Slope Rating of yardage and obstacle values for the Stadium Course at PGA West
and for Rancho Park, a municipal course of moderate difficulty, are shown in
Table 1.
Table 1
Contribution to the Slope Rating

| Course | Constant | Yardage | Obstacle Value | Total |
|---|---|---|---|---|
| PGA West (7265 Yards) | 52.7 | 66.6 | 31.7 | 151 |
| Rancho Park (6271 Yards) | 52.7 | 57.5 | 6.8 | 117 |
When
only the constant and yardage contributions are considered, the slope rating
for PGA West is only 8 percent higher than for Rancho Park. When all contributions are considered,
however, the slope rating for PGA West is 29 percent higher. The point is that the Slope Rating is heavily
impacted by the most subjective, and hence the most error-prone, part of the
rating system.
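The two percentages can be recomputed directly from the Table 1 figures. The Python sketch below uses only the numbers in the table; the dictionary layout and helper names are illustrative.

```python
# Contributions to the Slope Rating, taken from Table 1
courses = {
    "PGA West":    {"constant": 52.7, "yardage": 66.6, "obstacle": 31.7},
    "Rancho Park": {"constant": 52.7, "yardage": 57.5, "obstacle": 6.8},
}

def partial(c):
    """Slope contribution from the constant and yardage only."""
    return c["constant"] + c["yardage"]

def total(c):
    """Slope contribution including the obstacle value."""
    return partial(c) + c["obstacle"]

pga, rancho = courses["PGA West"], courses["Rancho Park"]
pct_without_obstacles = 100 * (partial(pga) / partial(rancho) - 1)
pct_with_obstacles = 100 * (total(pga) / total(rancho) - 1)
print(round(pct_without_obstacles), round(pct_with_obstacles))
```

The obstacle values account for roughly 25 of the 34 rating points that separate the two courses, which is why errors in that component matter so much.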
The variance in the estimate of the Slope Rating, treating the errors in the two ratings as independent, is:
Variance(Slope) = (5.381)^{2}(BRV + CRV)
Where,
BRV = Variance in the estimate of the Bogey Rating
CRV = Variance in the estimate of the Course Rating
Variance(Slope) = 28.96(.25 + .09) = 9.85
The standard deviation of the
estimate of the Slope Rating would be the square root of the variance or
approximately 3.1 rating points.
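The arithmetic can be verified with a short Python sketch. It uses the ±.5 and ±.3 stroke figures accepted above and assumes the errors in the Bogey Rating and Course Rating are independent, so that no covariance term appears. The variable names are illustrative.

```python
import math

SLOPE_FACTOR = 5.381   # multiplier applied to (Bogey Rating - Course Rating)
bogey_sd = 0.5         # standard deviation of the Bogey Rating estimate
course_sd = 0.3        # standard deviation of the Course Rating estimate

# Assumes the two rating errors are independent (no covariance term)
slope_variance = SLOPE_FACTOR ** 2 * (bogey_sd ** 2 + course_sd ** 2)
slope_sd = math.sqrt(slope_variance)
print(round(slope_variance, 2), round(slope_sd, 1))
```

If the two rating errors were positively correlated, as they might be when the same team rates both, the variance of the difference would be smaller than this figure; if negatively correlated, larger.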
Impact of the Slope System on the Accuracy of Handicaps  Does the Slope System enhance or
degrade the accuracy of handicaps? The
answer is “It depends.” Let’s examine
three cases to judge the efficacy of the Slope System.
Case 1: No
Slope Effect  The existence of the slope effect (i.e., bogey golfers doing
relatively worse on a course with a high slope rating) has never been
empirically demonstrated. In
this case where there is no slope effect, possible errors associated with the
estimating methodology are not a key concern.
It is not the errors in estimating the slope, but the entire model that
needs to be reexamined. This is not the
purpose here, however, so we go on to other cases where the validity of the
Slope System is not challenged.
Case 2: Small
Range in Slope Ratings  In this case, it is assumed the slope effect does
exist but the range of true slope ratings is quite small. For our purposes, it is assumed that courses have a
range of ten Slope Rating points. Such a small
range is not as unlikely as it may appear.
In a survey in Southern California, 48 percent of all courses with a
Slope Rating over 100 (i.e., a rough measure of being a regulation length golf
course) had Slope Ratings from the regular tees between 114 and 123.
When courses have about the same
true slope, measurement errors become more important. Assume for example, that two courses have the
same true slope. The probability of
estimating the true Slope Rating at any one course is only around .13, i.e., the USGA estimated Slope Rating will only equal the true Slope Rating in about 1 in 8 tries.[8]
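The .13 figure follows from the reasoning in note 8 and can be reproduced with the Python standard library. The sketch assumes a normally distributed, zero-mean error with a standard deviation of 3.1 rating points, and it counts an estimate as correct when it falls within half a rating point of the true value. The function name is illustrative.

```python
import math

def prob_within(half_width, sd):
    """P(|error| < half_width) for a zero-mean normal error with the given SD."""
    return math.erf(half_width / (sd * math.sqrt(2)))

p_exact = prob_within(0.5, 3.1)
print(round(p_exact, 2))
```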
More
important, however, is the difference in the estimate of the two slopes since
this will partially determine the size of any portability error. The standard error of the estimate of the
difference in two slope ratings is approximately 4.4 rating points. The probability that the difference in two
slopes will be estimated correctly is about .09.
The probability of making errors of various size in the estimate of the
difference in two slope ratings is shown in Table 2.
Table 2
Probability of an Error in Estimating the Difference in Two Slope Ratings

| N = Difference in Slope Ratings | Probability of an Error of More than N Rating Points |
|---|---|
| 0 | .91 |
| 1 | .73 |
| 2 | .57 |
| 3 | .42 |
| 4 | .31 |
| 5 | .21 |
| 6 | .14 |
| 7 | .09 |
| 8 | .05 |
| 9 | .03 |
| 10 | .02 |
The table indicates there is a one in five chance the difference in Slope Ratings will be off by more than 5 rating points. In this example, the Slope System can actually lessen the equity of competition when the true slopes of courses are closely clustered.
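The probabilities in Table 2 can be approximated with a short Python sketch. It assumes a normally distributed error in the difference with a standard error of 4.4 rating points, and it treats "an error of more than N rating points" as an error larger than N + .5, which is what rounding to whole rating points implies. These are assumptions of the sketch, and the function name is illustrative.

```python
import math

def prob_error_exceeds(n, sd):
    """P(|error| > n + 0.5) for a zero-mean normal error with the given SD."""
    z = (n + 0.5) / sd
    return 1 - math.erf(z / math.sqrt(2))

diff_sd = 4.4  # standard error of the difference, as stated in the text
table = {n: prob_error_exceeds(n, diff_sd) for n in range(11)}

for n, p in table.items():
    print(f"{n:2d}  {p:.2f}")
```

Under these assumptions the computed values agree with Table 2 to within about .01. The small discrepancies are consistent with rounding of the standard error or with values read from printed normal tables.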
Case 3: Large
Range in Slope Ratings  The value of the Slope System is more evident
where there is a large difference in the true slope among courses. If the true difference were 10 rating points,
for example, the more difficult course would almost always have a higher Slope
Rating. That is, a player’s handicap (as long as it is not below scratch) will
stay the same or increase while going to the course that is tougher for the
bogey golfer. The increase may not
reflect the true difference in difficulty because of measurement errors. Measurement errors, however, may be small in
comparison to the true difference in the Slope Rating.
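How often the harder course would actually receive the higher rating can be sketched in Python. The example assumes a true difference of 10 rating points, a normally distributed error in the estimated difference with a standard error of 4.4 rating points, and it ignores the rounding of ratings to whole numbers. The names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

true_diff = 10.0   # assumed true difference in Slope Ratings
diff_sd = 4.4      # standard error of the estimated difference

# Probability that the estimated difference has the correct sign
p_correct_order = normal_cdf(true_diff / diff_sd)
print(round(p_correct_order, 3))
```

Under these assumptions the ordering is correct in roughly 99 rounds of rating out of 100, which is the sense in which the harder course "almost always" receives the higher Slope Rating.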
Summary  The overall assessment of the
efficacy of the Slope System rests in part on the range of slope ratings a
player encounters. Are the errors
involved in rating similar courses outweighed by the more accurate handicap at
courses with a substantially higher or lower slope rating? The answer to this question lies within data
controlled by the USGA. It is only with
this data that the size of other possible errors can be estimated. Remember, this analysis used the best
possible case for the Slope System. And
most importantly, USGA data could reveal whether there is indeed a slope effect
large enough to justify the Slope System.
The USGA Responds
The USGA was not pleased with this
paper. They responded with little understanding of the statistics underlying their own model.[9] The USGA denied there could be such
large errors in the Slope Ratings:
In
the USGA Rating System we admit that scratch rating are only accurate to within
±.3 strokes and bogey rating to within ±.5 strokes. Assuming the widest spread… the resulting
error in the slope rating is .8•(5.381) or 4.3.
So the certainty that 130 is more difficult than 125 is a little better
than the certainty we assign to course ratings.
First,
the USGA makes an error in assuming the widest spread can be .8. The standard error does not mean that the estimate
cannot be off by more than ±.3. The
standard error only means that the course rating methodology has at best only a
68 percent chance of bracketing the true mean to within ±.3 strokes. This is an elementary mistake in statistics
and reflects the level of sophistication behind the USGA ratings model.
Second, the USGA response only deals
with the error at one course. What is
important to the success of the Slope System is the error in the difference in
Slope Ratings between courses. If you
only play one course, any error in the slope rating will have no effect on your
handicap. The errors between courses,
however, can be substantial and distort the computation of a player’s handicap
and lead to inequitable competition in interclub matches.
[1]
Letter from David Fay, Executive Director of the USGA to the author, May 31,
1991
[2]
It is assumed the USGA used the better half of the 18-hole scores rather than
the better half of the scores on each hole.
Using the latter scores would lead to an underestimate of the course
rating.
[3]
In order to estimate the magnitude of the difference in the two different
sampling methods, 280 scores from the 1991 Atlanta Open were examined. Using the better half of all scores, the
average score was 68.3. Taking the 10
best scores out of sets of 20 scores (players were grouped alphabetically into
groups of 20), the average was 68.5 strokes.
The method of selecting the sample led to a difference in the course
rating estimate of .2 strokes. The
difference is likely to be larger in the USGA estimate of the course rating
since there is probably a greater variance in scores at the USGA Amateur than
there is at most professional tournaments.
[4]
One of the major assumptions of this type of regression analysis is not
met. Least squares regression techniques
assume there are no errors in measuring the independent variables (i.e., the
yardage and obstacle values.) This is
not the case with obstacle values, however.
For example, with what level of precision can a water hazard be rated a “2”? Would other raters put the value at 3? Is the hazard really a 2.45, but rated at 2
because the rating methodology does not allow for such fine distinctions? The model proposed by the USGA belongs to the
class of problems termed “errors in both variables.” The USGA has made no attempt to estimate the
size of this error. Estimating
techniques for such models can be found in Wonnacott and Wonnacott, Econometrics,
Wiley and Sons, New York, 1970, p. 164.
[5]
Knuth, D., “A Two Parameter Golf Course Rating System,” Science and Golf: the
Proceedings of the First World Scientific Congress, Rutledge, Chapman and Hall,
London, 1990
[6]
Another problem with the methodology employed by the USGA is the sample of
holes selected. The USGA had data for
126 holes, but chose to use data from only 74 holes. In essence over 40 percent of the sample was
not used. The criteria for the selection
of holes have not been given. The
omission of so much data, however, could lead to very different estimates of
the coefficients or much larger error estimates than the USGA has presented.
[7]
Letter from Warren Simmons, Executive Director of the Colorado Golf Association
to Dean Knuth, Director of Handicapping of the USGA. February 14, 1991
[8]
To estimate true slope, the estimate must be off by less than .5 of a rating
point. The standard deviation of the
estimate of the mean is 3.1 rating points.
Therefore, the estimate must be within .16 standard errors of the true
mean. Assuming a normal distribution of
the error, such an estimate should occur only 13 percent of the time.
[9]
Letter from Warren Simmons, op. cit.
http://www.golfdigest.com/story/howfardoaveragegolfersreallyhititnewdistancedatawillsurpriseyou
Let me know if you think the data makes sense. Equipment is helpful but not as much as you would think. Still have to hit on the face, most golfers can barely accomplish that.
I believe the general argument of the article is the average player has not gained much from technical advancements in clubs and balls. While one might quibble with the methodology, the finding is consistent with my experience. The back tees at most clubs are rarely used. The purchase of new clubs is essentially a case of hope overcoming experience. I'd write more but I am off to demo day...