Power Rankings, SSR Updates, and More
Over the past few days I’ve done a little housekeeping and updated various sections of the website. Here is a quick list of the updates:
 NBA Power Rankings: @kpelton’s recent Basketball Prospectus article inspired me to rank and rate NBA teams using some of the more recent tools I’ve added to my toolbox. You can find these power rankings at the new NBA Power Rankings section of the website, and they will be updated daily.
 Statistical Scouting Reports: After initially releasing the 2008-09 offensive statistical scouting reports, I’ve added 2009-10 data to go along with the 2007-08 and 2006-07 data. One word of caution about the 2009-10 data: if you see range values that have the same number on the left and right (e.g. 50%-50%), the models do not detect a difference between players at that position for the given statistic. This means we don’t have enough data yet, so don’t read too much into these kinds of numbers. I’ve also added some defensive measures to these reports. I’m not satisfied with the current groupings of some of the stats, so if you have any suggestions please voice them.
 Play-by-Play Data: I’ve created an archive for last year’s 2008-09 play-by-play data, and you can now download play-by-play data from the 2009-10 regular season. The 2009-10 play-by-play data should update daily. Please let me know if you find any errors, and enjoy!
I hope you find these updates useful!
21 Comments on this post

Deepak said:
Thanks for the updates. Those parsed play by plays you put together are a great resource.
November 29th, 2009 at 7:57 pm 
Deepak said:
There looks to be an inconsistency between your statistical report and the stats on HoopData. Specifically, for Trevor Ariza, HoopData shows his 2-point shooting to be only 41%, while your report suggests it is at 48%.
November 30th, 2009 at 6:21 pm 
Ryan said:
Even if we’re using the exact same data, my estimates will be different due to the way I model the statistics.
The data is noisy, especially early in the season, so for shooting percentages and shot distribution I use multilevel models to shrink player estimates toward the league-wide mean for each position.
November 30th, 2009 at 6:29 pm 
Deepak said:
My mistake. I thought the numbers represent what has actually happened up to this point. Is the idea to project what the player will do over the course of the season? How do you go about estimating shot distribution, for example?
November 30th, 2009 at 6:39 pm 
Ryan said:
Yeah, the goal is to best estimate each player’s true (underlying) shooting percentages, shot distribution, etc.
For the shot distribution, we can think of it in a binary case: shoot from corner 3, or don’t shoot from corner 3. This is done for all locations, and the results can be merged to estimate the shot distribution. If you’re wondering how the underlying model works, each player’s estimate is considered to come from a similar distribution (a normal distribution with a fixed mean). This is what is also referred to as a random effect.
November 30th, 2009 at 6:48 pm 
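A minimal sketch of the shrinkage idea discussed above, not Ryan’s actual multilevel model: each location is treated as a binary outcome (shot from that spot, or not), the player’s raw share is pulled toward the position-wide share, and the results are merged back into one distribution. It swaps the normal random effect for a simple pseudo-count blend; `prior_weight`, the location names, and all of the numbers are assumptions made up for illustration.

```python
def shrink_rate(successes, attempts, group_rate, prior_weight=50.0):
    """Blend a raw rate with the group rate.

    prior_weight is an assumed number of pseudo-attempts at the group rate:
    with few real attempts the estimate stays near group_rate, and with many
    it approaches successes / attempts.
    """
    return (successes + prior_weight * group_rate) / (attempts + prior_weight)


def shrunken_shot_distribution(player_counts, position_shares, prior_weight=50.0):
    """Shrink each location's share separately, then renormalize to sum to 1."""
    total = sum(player_counts.values())
    shrunk = {
        loc: shrink_rate(player_counts.get(loc, 0), total, share, prior_weight)
        for loc, share in position_shares.items()
    }
    norm = sum(shrunk.values())
    return {loc: value / norm for loc, value in shrunk.items()}


# Hypothetical early-season shot counts for one player
counts = {"corner 3": 2, "above-break 3": 8, "rim": 10}
# Hypothetical position-wide shot shares
shares = {"corner 3": 0.10, "above-break 3": 0.25, "mid-range": 0.25, "rim": 0.40}
estimate = shrunken_shot_distribution(counts, shares)
```

With only 20 shots on record, the estimate sits closer to the position-wide shares than to the player’s raw shares, which is why an early-season report can differ from a site that shows what has actually happened so far.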
Deepak said:
Could you explain why the estimated mean would not simply be what the players have done up to this point in the season? Are you also factoring in what they’ve done in past seasons, or what the league average player at his position does?
November 30th, 2009 at 6:54 pm 
Deepak said:
Never mind ... I think you answered that question.
“The data is noisy, especially early in the season, so I use multilevel models to shrink player estimates to the league wide mean for each position for shooting percentages and shot distribution.”
November 30th, 2009 at 6:55 pm 
Ryan said:
I think this brings up two good points of clarification.
This estimated mean is only based on what this player and other players in the league have done this year. No previous season information is included.
So this means that only data about the league average for the player’s position is used. This has given me some good ideas for what to show next, such as the variability we expect at each position.
November 30th, 2009 at 6:57 pm 
DSMok1 said:
Ryan, could you explain more about the Power Rankings system you use, and why it could be so different from (for instance) the SRS values that Basketball Reference has? SRS, I know, uses point margin instead of efficiency margin, but both systems adjust for opponents, so it seems unusual that (for instance) the Lakers should be 6th in your ranking and 3rd in theirs.
Your system is a true adjusted efficiency margin system, right?
The issue is this: I intend to compile statistical +/- data for the last 10 years or so, and I need true efficiency margin rankings to sum to, but I don’t know where to find them. Any ideas?
January 15th, 2010 at 10:51 am 
Ryan said:
I model every possession. This means that I estimate the probability of there being zero points, one point, two points, etc. scored on any possession.
These are hard to interpret, so I transform them to the efficiency ratings shown where every team “plays” every other team once at home and once on the road. In other words, these are the efficiency ratings if each team played a “fair” schedule against all other teams in the NBA.
So to answer your question, I would consider these to be fair estimates of a team’s efficiency rating against the entire league.
January 15th, 2010 at 11:11 am 
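A rough sketch of the transformation described above, under stated assumptions: given the probability of 0, 1, 2, ... points on a possession, take the expected points, then average over a schedule in which every team plays every other team once at home and once on the road. The `possession_probs` callable is hypothetical; in the model described it would come from the fitted possession-level regression, which is not reproduced here. The toy probabilities are invented.

```python
def expected_points(probs):
    """probs[k] is the probability of scoring exactly k points on a possession."""
    return sum(points * p for points, p in enumerate(probs))


def fair_schedule_ratings(teams, possession_probs):
    """Return {team: (offense, defense, margin)} per 100 possessions.

    possession_probs(offense, defense, offense_is_home) is an assumed
    interface that returns a list of scoring probabilities.
    """
    ratings = {}
    for team in teams:
        scored, allowed = [], []
        for opp in teams:
            if opp == team:
                continue
            for team_is_home in (True, False):
                scored.append(expected_points(possession_probs(team, opp, team_is_home)))
                allowed.append(expected_points(possession_probs(opp, team, not team_is_home)))
        offense = 100.0 * sum(scored) / len(scored)
        defense = 100.0 * sum(allowed) / len(allowed)
        ratings[team] = (offense, defense, offense - defense)
    return ratings


def toy_probs(offense, defense, offense_is_home):
    """Invented example: identical teams, with a small home-court bump."""
    bump = 0.01 if offense_is_home else -0.01
    return [0.50 - bump, 0.05, 0.35 + bump, 0.10]


ratings = fair_schedule_ratings(["A", "B", "C"], toy_probs)
```

Because each team appears at home and on the road against every opponent, home court and schedule strength wash out of the final numbers; in this toy case every team comes out even.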
DSMok1 said:
Why do you think the Celtics get such a huge boost? That doesn’t quite make sense. I have them down for an efficiency margin of 6.8 (best in the league) but also the easiest schedule in the league–while your numbers imply about the hardest schedule in the league.
January 15th, 2010 at 11:29 am 
DSMok1 said:
I ran a pure power rating for this season: each game counting evenly, using pure efficiency margins, and adjusting for location and opponent. The Celtics show an efficiency margin of 7.68 and a SoS of -1.22, for a rating of 6.46 (#2 in the league). Here are these “pure” rankings (should be analogous to Pomeroy’s NCAA ratings).
Team    AdjMar    Basketball Geek
CLE       7.0       7.8
BOS       6.5       8.3
SAS       6.1       6.1
ATL       5.8       6.1
LAL       5.8       5.5
ORL       5.1       5.7
DEN       4.2       3.9
UTA       4.1       3.9
POR       3.4       4.2
PHO       3.3       3.7
DAL       3.3       3.3
OKC       2.9       3.0
HOU       1.6       1.4
MIA       0.3       0.5
CHA      -0.3      -0.8
MEM      -0.8      -0.7
TOR      -1.4      -1.2
MIL      -1.6      -2.1
NYK      -1.6      -1.7
NOH      -1.8      -1.9
SAC      -2.5      -2.7
LAC      -2.7      -2.4
CHI      -3.3      -4.4
PHI      -3.8      -4.1
GSW      -4.1      -4.1
WAS      -4.5      -3.8
DET      -5.5      -5.5
IND      -5.7      -5.9
MIN      -9.5      -9.3
NJN     -11.9     -12.5
I’m a little puzzled by the wide variance for Cleveland, Boston, and the Bulls.
January 15th, 2010 at 12:58 pm 
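For readers who want to try a “pure” power rating of this kind, here is a generic SRS-style sketch: every game counts equally, and a team’s rating is its average location-adjusted margin plus the average rating of its opponents (its strength of schedule). This is an assumed procedure, not necessarily the exact one used in the thread. The `home_edge` value and the game results are invented, and the damped update is just one simple way to make the iteration settle.

```python
def power_ratings(games, home_edge=3.0, iterations=200):
    """games: list of (home_team, away_team, home_margin). Returns {team: rating}."""
    teams = sorted({team for game in games for team in game[:2]})
    # For each team, collect (opponent, margin with home court removed).
    results = {team: [] for team in teams}
    for home, away, margin in games:
        adjusted = margin - home_edge
        results[home].append((away, adjusted))
        results[away].append((home, -adjusted))

    ratings = {team: 0.0 for team in teams}
    for _ in range(iterations):
        updated = {}
        for team, played in results.items():
            avg_margin = sum(m for _, m in played) / len(played)
            sos = sum(ratings[opp] for opp, _ in played) / len(played)
            # Damped step toward "average margin + strength of schedule"
            updated[team] = 0.5 * ratings[team] + 0.5 * (avg_margin + sos)
        # Re-center so the ratings sum to zero
        mean = sum(updated.values()) / len(updated)
        ratings = {team: r - mean for team, r in updated.items()}
    return ratings


# Invented double round-robin built from true ratings A=+5, B=0, C=-5
games = [
    ("A", "B", 8), ("B", "A", -2),
    ("A", "C", 13), ("C", "A", -7),
    ("B", "C", 8), ("C", "B", -2),
]
ratings = power_ratings(games)
```

The re-centering step keeps the ratings summing to zero, which is the property needed when player-level numbers are later forced to add up to team ratings.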
Ryan said:
Their current mark of 7.5 isn’t far from your 6.8, so I don’t figure it’s a huge boost compared to what you have. Also, my data is missing the 76ers game they lost which could explain a small part of the difference. The goal of my model is to predict the number of points scored on each possession, and I use this to then create the efficiencies.
January 15th, 2010 at 12:59 pm 
Ryan said:
I’m confident we can’t say the numbers in the ratings you posted are significantly different from each other. Again, make sure you have the latest ratings, which updated earlier. Cleveland is now #1 at 7.7 and Boston #2 at 7.5.
January 15th, 2010 at 1:02 pm 
DSMok1 said:
When do they update? I thought I got them this morning… oh well. That does look better!
I’ll probably just create a power ranking for each season I want to create SPM for… it’s not as hard as I thought.
January 15th, 2010 at 1:10 pm 
DSMok1 said:
I’m continuing to notice a big discrepancy in the ratings between mine and yours with the Boston Celtics. Remember, I’m weighting each game equally and calculating the adjusted efficiency margins for each team.
I corrected one flaw in my ratings (they didn’t sum to 0), but I still see the big Boston discrepancy. Do your ratings somehow favor the first part of the season (using a faulty updating algorithm)? I show Boston as #10 with a margin of only 2.8; you have them at #4 with a margin of 5.6. I’m running a simple minimization of residuals (not of squared residuals).
February 12th, 2010 at 12:39 pm 
Ryan said:
I’d say there are a few possible reasons why we’re not matching up. You weight by game, whereas I weight by possession. I control for home court, whereas you may not. I’m using a multinomial logistic regression, so I’m minimizing something a little different from what you’re minimizing.
February 12th, 2010 at 1:52 pm 
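A toy calculation shows how weighting by game versus weighting by possession can move a team’s efficiency margin. The two games below are invented: a fast-paced win and a slow-paced loss with the same per-possession margin in opposite directions.

```python
def margin_by_game(games):
    """Average the per-100-possession margin of each game; every game counts equally."""
    per_game = [100.0 * (scored - allowed) / poss for scored, allowed, poss in games]
    return sum(per_game) / len(per_game)


def margin_by_possession(games):
    """Pool all possessions; games with more possessions count for more."""
    total_diff = sum(scored - allowed for scored, allowed, _ in games)
    total_poss = sum(poss for _, _, poss in games)
    return 100.0 * total_diff / total_poss


# (points scored, points allowed, possessions), made-up numbers
games = [(110, 100, 100), (80, 88, 80)]
by_game = margin_by_game(games)          # +10 and -10 average out to 0
by_poss = margin_by_possession(games)    # +2 points over 180 possessions
```

The gap only appears when a team’s pace is related to how well it plays; if pace and margin are unrelated, the two weightings agree on average.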
DSMok1 said:
I most definitely do control for home court, but it is home court league-wide rather than team-specific. Other than the “per game” vs. “per possession” weighting (which I could change rather easily), I’m not exactly sure why it would make such a big difference. I’ll run the “per possession” version to check what the difference would be.
February 12th, 2010 at 3:27 pm 
DSMok1 said:
Weighting “per possession” means that teams with a correlation between pace and how well they play will be skewed. That doesn’t account for enough of the difference, though.
How does your “home court” factor in? Is it league-wide or for each team?
February 12th, 2010 at 4:34 pm 
DSMok1 said:
I looked at calculating individual home court advantages; my ratings match yours more now. But is that reasonable?
February 12th, 2010 at 5:31 pm 
Ryan said:
Home court is estimated for the entire league.
February 12th, 2010 at 5:51 pm