Dec 11 2009

The Relationship Between Age, Usage, and 3FG%

In my last two posts I have taken a look at modeling a player’s 3FG% based on their age, and I’ve also estimated the relationship between usage and 3FG%. In this post I would like to bring these two topics together and estimate the relationship between age, usage, and 3FG%.

The Data

To put this data set together I again used Basketball-Reference.com’s Player Season Finder, but this time I collected the advanced statistics to go along with the player’s 3pt makes and attempts. Also, I used more years of data, as this data set is from the 1989-90 to 2008-09 seasons.

My original threshold for including a player season in the data set was at least two 3pt shot attempts during the season. I have, however, raised this threshold to 82 3pt shot attempts (an average of one per game over a full season) to restrict the data set to players that we expect to shoot 3pt shots. In other words, I'm trying to model the "regular" or "semi-regular" 3pt shooters and disregard players who do not consider the 3pt shot a part of their game.

The Model

To estimate the relationship between age, usage, and 3FG%, I’ve fit the following model:

Pr({\tt 3FG make}) = {\tt logit}^{-1}(\alpha + \beta_{1}({\tt USG\%}) + \beta_{2}({\tt age}) + \beta_{3}({\tt age}^{2}))

I fit this logistic regression as a multilevel model to allow the intercept and coefficients for USG% and the age quadratic to all vary by player. This type of model allows us to estimate the player ability while allowing us to estimate individual USG% lines and individual player aging curves.
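
As a rough illustration of how the fitted equation turns a usage level and an age into a predicted 3FG%, here is a minimal Python sketch. It plugs in the average-player coefficients reported in the results of this post; the function and variable names are my own for illustration, and a player-specific prediction would swap in that player's own intercept and slopes from the multilevel fit, which are not reproduced here.

```python
import math

# Average-player coefficients as reported in the results of this post.
ALPHA = -1.62           # intercept
BETA_USG = -0.0061      # coefficient on USG% (in percentage points)
BETA_AGE = 0.081        # coefficient on age (in years)
BETA_AGE_SQ = -0.00136  # coefficient on age squared


def inv_logit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_3fg_pct(usg_pct, age):
    """Predicted probability that an average player makes a 3pt attempt."""
    log_odds = ALPHA + BETA_USG * usg_pct + BETA_AGE * age + BETA_AGE_SQ * age ** 2
    return inv_logit(log_odds)


# Example: an average 30-year-old at 10%, 20%, and 30% usage.
for usg in (10, 20, 30):
    print(usg, round(predicted_3fg_pct(usg, 30), 3))
```

The three usage levels in the loop mirror the blue, black, and red lines drawn in the player graphs.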

The Results

The average player results are as follows:

  • Coefficients: \alpha = -1.62, \beta_{1} = -0.0061, \beta_{2} = 0.081, \beta_{3} = -0.00136. The p-values for testing if the true values of these parameters are equal to zero are all less than 0.01.
  • USG%: The coefficient for usage, \beta_{1} = -0.0061, suggests that each additional percentage point of an individual's USG% lowers the odds that the individual makes a 3FG attempt by about 0.6%. For example, a player that increases their usage from 20% to 21% would expect to see their odds of making a 3pt FG attempt decrease by 0.6%, which is the direction we would expect.
  • Age: The coefficients for the aging curve, \beta_{2} = 0.081 and \beta_{3} = -0.00136, suggest that the average player’s peak in 3pt shooting ability occurs when they are 30 years old.
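
The arithmetic behind the usage and age bullets can be checked directly from the reported coefficients. The sketch below (variable names are mine) converts \beta_{1} into a percent change in the odds, finds the vertex of the age quadratic, and, as an extra illustration not stated above, shows what the odds change means in percentage points of 3FG% for an average 30-year-old.

```python
import math

# Average-player coefficients as reported above.
ALPHA, BETA_USG, BETA_AGE, BETA_AGE_SQ = -1.62, -0.0061, 0.081, -0.00136

# A one-point increase in USG% multiplies the odds of a make by exp(beta_1).
odds_ratio = math.exp(BETA_USG)
pct_change_in_odds = (odds_ratio - 1.0) * 100.0  # about -0.6 (percent)

# The age curve is a quadratic in the log-odds, so it peaks at -beta_2 / (2 * beta_3).
peak_age = -BETA_AGE / (2.0 * BETA_AGE_SQ)  # just under 30 years


def prob(usg_pct, age):
    """Average-player make probability from the fitted equation."""
    x = ALPHA + BETA_USG * usg_pct + BETA_AGE * age + BETA_AGE_SQ * age ** 2
    return 1.0 / (1.0 + math.exp(-x))


# Change in make probability (not odds) when usage goes from 20% to 21% at age 30.
prob_change = prob(21, 30) - prob(20, 30)

print(round(pct_change_in_odds, 2), round(peak_age, 1), round(prob_change * 100, 2))
```

Note that a 0.6% drop in the odds is not a 0.6 percentage point drop in 3FG%; near a 37% shooter it works out to roughly 0.14 percentage points per point of usage.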

Trevor Ariza

Trevor Ariza was the source of the original motivation for looking at the relationship between usage and 3FG%, so I thought it would be appropriate to present a graph of his estimated aging curve at usage levels of 10% (blue), 20% (black), and 30% (red). The dots represent the sample 3FG% for Trevor at the specified age:

Trevor Ariza: Estimated Usage% and Aging Curve

One thing you’ll notice is that we only have one data point on this graph. This is because Trevor did not shoot many 3pt shots until last season with the Lakers.

That said, using just last year’s data for Ariza we would predict him to shoot 34% this year with the Rockets at age 24 using 23.6% of his lineup’s possessions. Thus far this year he’s shooting 34.3%. Don’t read too much into the closeness of this predicted% to his actual%, as a 95% confidence interval for his 3FG% this year is (26.6%, 42.8%).

One thing to note is that this model suggests that last year's 31.9% performance isn't a fair representation of his true ability. The model estimates his true probability of making a 3pt FG attempt to be 34.3% last year with the Lakers at age 23 using 16.7% of his lineup's possessions.

Other Players

Here are some other player graphs that have more than a single season’s data, where lines for the estimated aging curve at usage levels of 10% (blue), 20% (black), and 30% (red) are shown. The dots represent the sample 3FG% for the players at the specified age:

Ray Allen: Estimated Usage% and Aging Curve

Steve Nash: Estimated Usage% and Aging Curve

Dirk Nowitzki: Estimated Usage% and Aging Curve

Robert Horry: Estimated Usage% and Aging Curve

More Work…

The next step is to try to validate these models using out-of-sample data. One thing I would like to do is use cross-validation to measure the expected prediction error of this model. I would also like to quantify the uncertainty around these estimates. My efforts so far have left me unsatisfied, but there are certainly some confidence bounds we could generate for these estimates, and they should prove worthwhile to create.
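
To make the cross-validation idea concrete, here is a minimal sketch of k-fold cross-validation scored by log loss per shot attempt. Everything in it is an assumption for illustration: the player seasons are simulated rather than real, the "model" is a stand-in that just predicts the pooled 3FG% of the training folds, and the function names are hypothetical. The multilevel model would take the place of `fit_pooled_rate`.

```python
import math
import random


def log_loss(makes, attempts, p):
    """Average negative log-likelihood per attempt for predicted make probability p."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
    return -(makes * math.log(p) + (attempts - makes) * math.log(1.0 - p)) / attempts


def fit_pooled_rate(train_rows):
    """Stand-in model: predict the pooled 3FG% of the training rows for every season."""
    rate = sum(r["makes"] for r in train_rows) / sum(r["attempts"] for r in train_rows)
    return lambda row: rate


def cross_validate(rows, fit, k=5, seed=0):
    """K-fold CV over player seasons; returns attempt-weighted held-out log loss."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    total_loss, total_attempts = 0.0, 0
    for fold in range(k):
        held_out = set(order[fold::k])
        train = [r for i, r in enumerate(rows) if i not in held_out]
        predict = fit(train)  # the model never sees the held-out seasons
        for i in held_out:
            r = rows[i]
            total_loss += r["attempts"] * log_loss(r["makes"], r["attempts"], predict(r))
            total_attempts += r["attempts"]
    return total_loss / total_attempts


# Simulated player seasons (NOT real data): 50 seasons from a 36% shooter.
rng = random.Random(1)
rows = []
for _ in range(50):
    attempts = rng.randint(82, 400)
    makes = sum(rng.random() < 0.36 for _ in range(attempts))
    rows.append({"makes": makes, "attempts": attempts})

cv_loss = cross_validate(rows, fit_pooled_rate)
print(round(cv_loss, 3))
```

Splitting by player season tests how well the model predicts new seasons in general; holding out whole players, or only each player's most recent season, would be reasonable alternatives depending on what kind of prediction we care about.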

I’ll have to wait to do this, as my final exams start tomorrow, and I’ve blown off studying for them about as long as I possibly can. 8)

15 Comments on this post

Trackbacks

  1. College Basketball: Rating Individual 3FG% wrote:

    [...] last post presented a model for predicting 3FG% based on a player’s ability, age, and role in the offense. A comment by DSMok1 inspired the [...]

    December 29th, 2009 at 2:32 am

  1. DSMok1 said:

    Good work, Ryan!

    I was considering how best to create an “equalized” measure of 3pt and 2pt % for college players, based on the opposition played and the usage percentage. In other words, I would create a notional percentage for each player based on a usage rate of 20%, playing NCAA-average opposition.

    Do you think that you could do a similar regression for 2Pt%, and post it? Do you think that using league-average values for beta-1 would be reasonable?

    December 17th, 2009 at 5:15 pm
  2. Ryan said:

    Thanks!

    In reference to college hoops, I certainly wouldn't apply any of this to that arena. Opponent strength is going to be the most important thing to control for, so modeling that and taking usage into account shouldn't be too tough if you've got the data.

    December 17th, 2009 at 6:13 pm
  3. DSMok1 said:

    What I’m honestly looking for is the definitive usage vs. 2pt% and usage vs. 3pt% curves. Is this study not the best available for that? I can control for opponent strength just fine; I can use the same system I used to control for opposition in football (see the link in my name). It is very hard to derive usage vs. % curves for college due to extra variables and very few data points.

    December 17th, 2009 at 6:31 pm
  4. Ryan said:

    I wouldn’t have a lot of confidence applying this to the college game. I’d first try and fit lines for usage and age while taking opponent strength into account to see what the data says with respect to the college game. I think that given the lack of data issue you mention, this type of model would be perfect to apply to the college data.

    December 17th, 2009 at 6:44 pm
  5. DSMok1 said:

    If one had the data set and knew how to do that… I’m not a mathematician, just an engineer! I like to segment the problems and solve them incrementally–get an age curve, solve the opponent issue, and get the usage/% curve separately.

    December 17th, 2009 at 7:51 pm
  6. DSMok1 said:

    There is no boxscore data readily available as far as I know (NCAA makes this year’s data available, but not previous years…)

    December 17th, 2009 at 8:00 pm
  7. Ryan said:

    I think I’ve found a source, but one issue will be identifying how much experience the players have. At the very least we can create a usage curve of some sort. Would like to take experience into account, but I’m pretty sure I’ll need to find another source for this info.

    December 17th, 2009 at 8:12 pm
  8. DSMok1 said:

    http://web1.ncaa.org/stats/StatsSrv/careersearch has lineups for the last few years, but I don’t know how to scrape it. Their football stats site is a lot more scrape-able!

    December 17th, 2009 at 8:15 pm
  9. Ryan said:

    Thanks, that should help, even if I have to get some of that stuff by hand.

    December 17th, 2009 at 9:12 pm
  10. DSMok1 said:

    I like your Twitter statement! I really hope to see some good, hard numbers for this. I’m preparing to assemble statistical plus/minus for the last 5 or 6 years, using the DX data and KenPom for the team efficiencies…

    December 23rd, 2009 at 2:16 pm
  11. Ryan said:

    Yeah I’m working on getting the data in a format I can use for an analysis, but hopefully I’ll have something ready in a few days.

    As for statistical plus/minus, do you plan on using college play-by-play to estimate the coefficients? Not sure the NBA ones are what you’d want to use for the college game.

    December 23rd, 2009 at 5:29 pm
  12. DSMok1 said:

    No option on what coefficients to use, honestly–there is not (yet) any set of NCAA adjusted plus/minus ratings to regress box score stats onto. That is an issue; the fact that there is a squared term for shot attempts skews exceptional players playing in lower conferences that shoot a ton (Steph Curry, Lester Hudson last year). There isn’t a way to translate box score statistics into a NCAA average environment… yet…

    December 24th, 2009 at 4:47 pm
  13. Ryan said:

    Honestly I don’t think it will be that tough. We should be able to get some play-by-play data from Jon Nichols for the 2009-10 season that has a decent sample of NCAA players that can be used to get the coefficients. The box scores I’ve got will allow us to rate statistics to league average competition, and we can combine the two to get some real deal statistical +/- of NCAA players. :)

    December 24th, 2009 at 4:52 pm
  14. DSMok1 said:

    That would be wonderful! I would note, though, that the current SPM numbers come off 6 years of APMs. That's a much bigger and better sample size for something as unstable as APM.

    You might drop J Givony a line at DX… he’s looking for more college statistical input…

    December 24th, 2009 at 9:22 pm