Measuring the Relationship Between Players and their Lineup’s Shot Distribution
In my last post I looked at how we might rate a player’s impact on their lineup’s FG% in the low paint. With this came the obvious question of: “What about shot distribution?”
With this question in mind, I’ve finally started trying to make sense of how players fit together. In this case I’m simply trying to figure out the relationship between players and shot distribution. Coaching certainly matters, but I’ve gotta start somewhere!
The Model
Similar to my last post, I’ve fit an adjusted plus/minus-like logistic regression to approximate the shot distribution across these 5 locations on the court:
 Low Paint – The area in the paint within 6 feet of the basket
 High Paint – All other shots in the paint
 Mid-Range – All 2pt shots outside of the paint
 Corner 3s – 3pt shots on the sidelines up to 14 feet from the baseline
 Other 3s – All other 3pt shots
Also, like before, I’ve used data from the ’07-’08 season and accounted for all players that took part in at least 1600 shots.
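To make the setup concrete, here is a minimal sketch in R of what an adjusted plus/minus-like logistic regression for a single location could look like. This is not the code behind this post: the data are simulated, and every name in it (shots, off_A, def_C, low_paint, and so on) is made up for illustration. Each row is one shot, the response is whether that shot came from the low paint, and each indicator column is 1 when that hypothetical player was on the floor.

```r
# Simulated data: one row per shot, with 0/1 indicators for whether each
# (hypothetical) player was on the floor on offense or on defense.
set.seed(1)
n <- 5000
shots <- data.frame(
  off_A = rbinom(n, 1, 0.5),  # player A on offense
  off_B = rbinom(n, 1, 0.5),  # player B on offense
  def_C = rbinom(n, 1, 0.5)   # player C on defense
)

# Invented effects on the log-odds scale, used only to generate fake outcomes.
log_odds <- -0.75 + 0.30 * shots$off_A - 0.20 * shots$off_B - 0.40 * shots$def_C
shots$low_paint <- rbinom(n, 1, plogis(log_odds))

# Logistic regression: was the shot taken from the low paint?
fit <- glm(low_paint ~ off_A + off_B + def_C, data = shots, family = binomial)

# Coefficient and standard error for each term.
print(round(summary(fit)$coefficients[, c("Estimate", "Std. Error")], 3))

# Predicted low-paint share with none of the three players on the floor,
# and with player A added on offense.
baseline <- plogis(coef(fit)[["(Intercept)"]])
with_A   <- plogis(coef(fit)[["(Intercept)"]] + coef(fit)[["off_A"]])
print(round(100 * c(baseline = baseline, with_A = with_A), 1))
```

The same kind of fit would be repeated once per location, which is why the results end up as one set of offensive and defensive numbers per player per location.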
The Results
For this model I feel the best way to present the results is by spreadsheet:
http://spreadsheets.google.com/ccc?key=pLJimPjd7oqtOTygaQH2dWw
In this spreadsheet you will find the relationship between each player and their lineup’s offensive and defensive shot distributions with respect to the average player at each specific position.
Take Steve Nash as an example (line 9).
On offense, Nash is associated with: a 0.5% increase in shots from the low paint; a 0.6% decrease in shots from the high paint; a 5.8% decrease in shots from mid-range; a 1.3% increase in corner 3pt shots; and a 4.5% increase in all other 3pt shots.
On defense, Nash is associated with: a 0.4% increase in shots from the low paint; a 1.4% increase in shots from the high paint; a 1% increase in shots from mid-range; a 1% decrease in corner 3pt shots; and a 1.8% decrease in all other 3pt shots.
Again, these numbers are with respect to the average point guard in this data set.
Combining Players
Based on the construction of this model, we can combine players to get an approximation of what their lineup’s offensive and defensive shot distributions would look like.
Clearly this is not without error. From an offensive standpoint, a coach can in some ways control each individual player’s shot distribution, which affects the lineup’s overall shot distribution. From a defensive standpoint, a commonly held belief is that a “system” can play a big role (see the Celtics last year).
Keeping in mind these obvious realities that the model does not take into account, we can take a peek at what the predictions would be.
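Here is a rough sketch in R of how such a combination could work. It rests on assumptions that this post does not spell out: that each location has its own logistic regression, that a lineup’s log-odds are the average-lineup log-odds plus one coefficient per player, and that the five predicted probabilities are rescaled so they sum to 100%. The baseline uses the league-average shares reported in this post; the player names and coefficients are invented, and combine_players() is a hypothetical helper, not the function shipped in dist.R.

```r
# League-average shot shares from this post, converted to log-odds.
avg_share <- c(low_paint = 0.323, high_paint = 0.114, mid_range = 0.371,
               corner_3 = 0.045, other_3 = 0.145)
intercepts <- qlogis(avg_share)

# Invented offensive coefficients on the log-odds scale, one row per player,
# expressed relative to an average player at the same position.
player_coefs <- rbind(
  guard_X  = c(0.05, -0.05, -0.25,  0.25,  0.30),
  center_Y = c(0.30,  0.10, -0.20, -0.40, -0.50)
)
colnames(player_coefs) <- names(avg_share)

# Hypothetical helper: add the chosen players' coefficients to the baseline
# log-odds for each location, convert back to probabilities, and rescale
# so the five shares sum to 1.
combine_players <- function(players) {
  log_odds <- intercepts + colSums(player_coefs[players, , drop = FALSE])
  raw <- plogis(log_odds)
  raw / sum(raw)
}

# Shares (in percent) for a lineup with both made-up players and three
# average teammates.
print(round(100 * combine_players(c("guard_X", "center_Y")), 1))
```

Calling combine_players(character(0)) returns the average shares themselves (rescaled to sum to 1), which is a handy sanity check on the baseline.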
Average Offense and Defense
We can use the average offensive and defensive lineup as a starting point. The average offensive and defensive shot distributions would look something like this:
 Low Paint – 32.3%
 High Paint – 11.4%
 Mid-Range – 37.1%
 Corner 3s – 4.5%
 Other 3s – 14.5%
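As a rough illustration of how the per-player numbers relate to this baseline, the R snippet below adds Steve Nash’s offensive differences from the spreadsheet to the average distribution. It assumes the differences are percentage points that can simply be added to the baseline, which is only an approximation of what the fitted model does.

```r
# League-average shot distribution from this post (percent of shots).
avg <- c(low_paint = 32.3, high_paint = 11.4, mid_range = 37.1,
         corner_3 = 4.5, other_3 = 14.5)

# Steve Nash's offensive differences relative to the average point guard.
nash_off <- c(low_paint = 0.5, high_paint = -0.6, mid_range = -5.8,
              corner_3 = 1.3, other_3 = 4.5)

# Approximate distribution for Nash plus four average teammates
# (assumes simple addition of percentage points).
nash_lineup <- avg + nash_off
print(round(nash_lineup, 1))

# Change in the overall share of 3pt shots (corner plus other).
threes <- c("corner_3", "other_3")
print(round(sum(nash_lineup[threes]) - sum(avg[threes]), 1))
```

Under that assumption, nearly all of the mid-range shots that disappear with Nash on the floor show up as 3pt attempts.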
The ’07-’08 Boston Celtics
We’ll first take a look at the ’07-’08 champion Boston Celtics’ most-used lineup of Rajon Rondo, Ray Allen, Paul Pierce, Kevin Garnett, and Kendrick Perkins. Against the average lineup from this data set, their offensive shot distribution would look something like:
 Low Paint – 38%
 High Paint – 10.5%
 Mid-Range – 31.9%
 Corner 3s – 4%
 Other 3s – 15.5%
Their defensive shot distribution would look something like:
 Low Paint – 31.2%
 High Paint – 10.6%
 Mid-Range – 37.9%
 Corner 3s – 4.9%
 Other 3s – 15.3%
Let the Fun Begin
Let’s imagine a world in which the Rockets loaned the Celtics Yao Ming in exchange for some time with Kendrick Perkins. What sort of shot distribution would this lineup of Rajon Rondo, Ray Allen, Paul Pierce, Kevin Garnett, and Yao Ming have?
Based on this model, their offensive shot distribution would look something like:
 Low Paint – 32%
 High Paint – 13.5%
 Mid-Range – 36.4%
 Corner 3s – 4.6%
 Other 3s – 13.4%
Their defensive shot distribution would look something like:
 Low Paint – 24.3%
 High Paint – 11.9%
 Mid-Range – 49%
 Corner 3s – 3.3%
 Other 3s – 11.5%
Since this model doesn’t account for coaching effects, we’d naturally assume there is some extra error involved with taking a player from another team, in this case Yao Ming, and placing him with this new lineup.
What I find most interesting, however, are the defensive aspects of this. I think it is fair to say that this lineup with Yao Ming would do a better job of keeping shots out of the low paint. This shouldn’t surprise anyone, but it is nice to be able to put some numbers to this.
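To put those numbers side by side, this short R snippet subtracts the predicted defensive distribution of the Perkins lineup from that of the Yao Ming lineup, using only the values listed in this post. The variable names are just for illustration.

```r
zones <- c("low_paint", "high_paint", "mid_range", "corner_3", "other_3")

# Predicted defensive shot distributions from this post (percent of shots).
def_perkins <- c(31.2, 10.6, 37.9, 4.9, 15.3)
def_yao     <- c(24.3, 11.9, 49.0, 3.3, 11.5)

# Change in opponent shot share, in percentage points, when Yao Ming
# replaces Kendrick Perkins.
change <- setNames(def_yao - def_perkins, zones)
print(round(change, 1))
# low_paint -6.9, high_paint 1.3, mid_range 11.1, corner_3 -1.6, other_3 -3.8
```

By these predictions, opponents lose about 7 percentage points of low-paint shots and about 5 points of 3pt attempts, and most of those shots move to mid-range.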
Shooting Percentages
The shot distribution is just part of the picture. The next step is to look at shooting percentages from all locations on the court. I’ve already looked at the low paint, but by examining the other areas I believe we can come up with a model that would allow us to attach an offensive and defensive eFG% to a given lineup.
If we were able to do this, then we’d have a metric that would allow us to gauge the effectiveness of a lineup with respect to shooting. Getting that far would allow us to look at doing similar things for the other three of the four factors: turnovers, rebounding, and free throws. Controlling the ball, getting boards, getting to the line, and keeping your opponent off the line are all important parts of the game.
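As a sketch of how a shot distribution and per-location shooting percentages could be folded into a single eFG%, the R snippet below weights each location’s FG% by its share of shots and counts made 3pt shots as 1.5 made field goals, which is the standard eFG% definition. The shot shares are the Celtics’ predicted offensive distribution from this post; the FG% values are invented placeholders, not model output.

```r
# Predicted offensive shot distribution for the Celtics lineup (from this post).
share <- c(low_paint = 0.380, high_paint = 0.105, mid_range = 0.319,
           corner_3 = 0.040, other_3 = 0.155)

# Hypothetical FG% by location; placeholders only.
fg_pct <- c(low_paint = 0.60, high_paint = 0.40, mid_range = 0.40,
            corner_3 = 0.40, other_3 = 0.35)

# A made 3pt shot counts as 1.5 made field goals in eFG%.
weight <- c(low_paint = 1, high_paint = 1, mid_range = 1,
            corner_3 = 1.5, other_3 = 1.5)

# Rescale the rounded shares so they sum to exactly 1, then combine.
share <- share / sum(share)
efg <- sum(share * fg_pct * weight)
print(round(100 * efg, 1))
```

The defensive version would be the same calculation with the lineup’s defensive shot distribution and the FG% it allows from each location.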
Replicate these Results
First, you’ll need to download the dist.zip archive (4MB).
The first thing you might want to replicate is the regressions. In the *.dist directories you will find an associated R file that will run the logistic regression for that location on the court. Simply source() these files from R and everything should run without issue.
The other area of interest is the lineup combinations. Inside of the dist.results directory, you will find a dist.R file that contains functions to obtain results from the fitted models. The function of most interest will be the dist.combine_players() function. To use this function, you’ll first need to run source("dist.R"). Note: You do not need to run the regressions to use this function.
Without arguments, dist.combine_players() displays results for league average players at each position. This function, however, takes 5 arguments: PG, SG, SF, PF, and C. These arguments allow you to specify players at each position. So to see results for a lineup of Allen Iverson, Dwyane Wade, Paul Pierce, LaMarcus Aldridge, and Tim Duncan, run:
 dist.combine_players(PG="Allen Iverson", SG="Dwyane Wade", SF="Paul Pierce", PF="LaMarcus Aldridge", C="Tim Duncan")
If you run this, you should get the following offensive shot distribution:
 Low Paint – 38%
 High Paint – 23%
 Mid-Range – 23.8%
 Corner 3s – 5.5%
 Other 3s – 9.5%
Also, you should get the following defensive shot distribution:
 Low Paint – 30.1%
 High Paint – 10.3%
 Mid-Range – 42.5%
 Corner 3s – 4.7%
 Other 3s – 12.3%
Summary
As mentioned before, the shot distribution is just half the battle. We need to attach shooting percentages to each of these locations, as that will allow us to truly determine the effectiveness of a lineup with respect to shooting.
One area of concern is predictability. Would this model (or one like it) do a better job of fitting players together than a coach or GM? Once I look at shooting percentages, the next step will be to look at data for historical seasons to see what sort of year-to-year relationships exist, and to see how well predictions can be made from one year to the next.
As of now, that’s a complete unknown. But clearly that will determine just how effective this type of model is.
10 Comments on this post
Neil Paine said:
This is great stuff, Ryan. I mean, really fantastic work.
March 4th, 2009 at 1:21 pm 
Ryan said:
Thanks Neil.
I like the view this gives us, but I’m even more interested in seeing the impact on FG% from the shot locations for creating offensive and defensive eFG%.
Of course I also would like to look at how players change from year to year (aging curves a possibility?). Just need to parse the data.
So, back to work! haha
March 4th, 2009 at 3:02 pm 
Kevin F said:
Hi Ryan,
I know with regular adjusted plus minus rankings there is quite a bit of noise in the estimates that emerge. Is that the case here as well? If so, how big are the standard errors?
March 5th, 2009 at 2:01 am 
Ryan said:
Kevin, there is some noise, certainly, although less so than adjusted +/-. There are some players (those that are, say, close to average compared to others at their position) who have larger standard errors and don’t show a tendency to have higher or lower shot distributions from any one area.
For the full details, see the *.o.csv and *.d.csv files inside of the dist.zip archive linked above. The 2nd column is the coefficient and the 3rd column is the standard error.
March 5th, 2009 at 2:03 am 
BEsmirched said:
EAsports should give you a contract.
March 5th, 2009 at 4:19 am 
mitchell_m said:
This is absolutely some of the most inspired stats work I’ve seen in a while. So simple, yet so elegant.
BEsmirched is absolutely right: if an NBA team won’t hire you, a video game company certainly should.
March 5th, 2009 at 8:35 am 
Coach Perez said:
Keep ’em coming…very interesting article. It challenges the way I approach the game as a coach.
March 9th, 2009 at 2:31 pm 
Raffi said:
Does anyone know where to find the fg% from corner threes vs. the fg% from all other threes? Would be interesting to see the difference
May 20th, 2009 at 2:34 pm 
Mountain said:
This is good stuff. There are a ton of other variations on lineup shot distributions that teams should do. How many have explored them in detail? 4? Less?
May 25th, 2009 at 7:04 pm