Dec 5 2009

Estimating Individual Impact on Defensive Efficiency

Earlier this week in @johnschuhmann‘s excellent column The Numbers Game, John looks at how Andre Miller impacts defensive efficiency. This, along with @kpelton‘s note that “the Blazers have been more effective defensively when Joel Przybilla is playing than when Greg Oden is playing” (from Blazersedge) has helped motivate me to look at what estimates we can draw about the impact these players have on defensive efficiency. The goal of this study is to do two things: 1) estimate the impact these players have on defensive efficiency, and 2) quantify the uncertainty we have about these estimates.

Constructing the Model

I have been doing some research to figure out how to best model the number of points teams score and allow on an individual possession, so I will be using that type of model for creating these estimates of an individual’s impact on defensive efficiency. The biggest difference between what I’m doing and adjusted plus/minus is that I consider what happens on each possession rather than what happens over a span of possessions for each combination of players. This means that I can better estimate the individual impact on allowing points to be scored on an individual possession rather than simply estimating the individual impact on the mean number of points allowed per hundred possessions.

There are many different ways to construct this model for estimating individual player impact on defensive efficiency, but I’ve chosen the modeling option I feel most comfortable with:

  • Fitting the model using all NBA possessions, where individual teams, Blazers’ players, home court advantage, number of offensive reserves, and being in the penalty are considered as predictors.

This modeling option controls for things we know to be important like opposing team strength, home court advantage, number of offensive reserves, and being in the penalty. The number of offensive reserves is intended to be a proxy for individual opponent strength. Although that was the intention, it is also true that the number of offensive reserves is correlated with game situation, like blowouts. Thus this is certainly one area of the model that can be improved on in the future.
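
As a rough illustration of how a model of this type turns predictors into an efficiency number, here is a minimal Python sketch. It assumes a baseline-category logit with zero points as the baseline, and it counts the "at least three" category as exactly three points. The coefficient values and the names `COEFS`, `linear_predictors`, `category_probs`, and `points_per_100` are hypothetical, invented for this example rather than taken from the fitted model. Team and player indicators would enter the sum the same way; they are left out here for brevity.

```python
import math

# Hypothetical coefficients (NOT fitted values). Each tuple holds the effect on
# the log-odds of allowing 1, 2, and 3+ points versus the baseline of 0 points.
COEFS = {
    "intercept": (-2.2, -0.7, -2.6),
    "home_defense": (-0.02, -0.03, -0.02),
    "offensive_reserves": (-0.03, -0.04, -0.03),
    "in_penalty": (0.40, -0.05, -0.05),
}

def linear_predictors(features, coefs=COEFS):
    """Return eta_i = log(pi_i / pi_0) for i = 1, 2, 3 as intercept + sum(beta * x)."""
    etas = list(coefs["intercept"])
    for name, value in features.items():
        for i, beta in enumerate(coefs[name]):
            etas[i] += beta * value
    return etas

def category_probs(etas):
    """Invert the log-odds: returns [pi_0, pi_1, pi_2, pi_3], which sum to one."""
    weights = [1.0] + [math.exp(e) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

def points_per_100(probs, points=(0, 1, 2, 3)):
    """Expected points per hundred possessions (3+ counted as 3, an assumption)."""
    return 100.0 * sum(p * v for p, v in zip(probs, points))

# A defense playing at home, facing zero opposing reserves, not in the penalty.
features = {"home_defense": 1, "offensive_reserves": 0, "in_penalty": 0}
probs = category_probs(linear_predictors(features))
print([round(p, 3) for p in probs], round(points_per_100(probs), 1))
```

Because the output is a probability for each scoring outcome rather than a single mean, the same fit can also say whether a change shows up as fewer twos, fewer threes, or fewer free-throw trips.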

Examining Andre Miller’s Defense

In The Numbers Game, John first writes about what Andre did with Philadelphia last year, so I’ll start there: what was Andre’s impact on Philadelphia’s defense? To measure this, we need a player to compare him against. Like John, I will compare Andre to Lou Williams.

To do a comparison in terms of efficiency, I must select teammates for these players. In this case, I have chosen the players from Philadelphia’s most used lineup last year: Andre Iguodala, Samuel Dalembert, Thaddeus Young, and Willie Green. Also, these estimates of defensive efficiency come from assuming there are zero opponent reserve players and that the lineup is not in the penalty.

Under these assumptions, the model estimates that this lineup with Andre performs 0.18 points per hundred possessions worse than with Lou. A 95% confidence interval for this estimate is (-9.4, 9.2). The estimated difference is small, and there is a lot of uncertainty around it. Even after a full season we cannot say with much confidence that either player has a better impact on defensive efficiency in the context of this lineup. Strictly in terms of defensive efficiency, this model suggests we could plausibly get by with either player. Defense is only half of the game, but for our purposes of evaluating defense we wouldn’t prefer one player over the other.

Thus this analysis doesn’t agree with John’s conclusion that “… Miller’s -3.2 differential was aided by the amount of time he spent on the floor next to Iguodala and Thaddeus Young, but Lou Williams’ +5.9 differential last season makes it pretty clear that he’s not the defender that Miller is.” This model suggests that Andre Iguodala and Thaddeus Young were Philadelphia’s best defenders last year, so perhaps this means that John isn’t giving them enough credit for what they’re doing on defense.

Andre Miller versus Steve Blake

Looking at last year is fun, but what we’re most interested in right now is comparing Andre’s defensive impact to the defensive impact of one of his current teammates, Steve Blake. To estimate the difference between Andre’s and Steve’s impact on defensive efficiency, I’ve selected Brandon Roy, Greg Oden, LaMarcus Aldridge, and Martell Webster to be their teammates.

Under these conditions, the model estimates that the lineup with Andre performs 0.05 points per hundred possessions worse than the lineup with Steve Blake. A 95% confidence interval for this difference is (-15.3, 13.8), and this means that similar to Andre versus Lou, the model suggests that we shouldn’t prefer either player in terms of their defensive impact.

Greg Oden versus Joel Przybilla

Although Kevin Pelton pointed out the difference between Greg’s and Joel’s defensive play this year, I want to first look at what conclusion we’d draw about the defensive play of these players at the end of last year. To do this, I’ve selected Brandon Roy, LaMarcus Aldridge, Nicolas Batum, and Steve Blake to be their teammates.

Under these conditions, the model estimates that the lineup with Greg performs 6.6 points per hundred possessions worse than the lineup with Joel. A 95% confidence interval for this difference is (0.65, 13.1), suggesting that we can be confident that in 2008-09 Joel’s defensive impact with this lineup was better than Greg’s defensive impact with this lineup.

For this season’s estimate I have selected Andre Miller, Brandon Roy, LaMarcus Aldridge, and Steve Blake to be their teammates. Under these conditions, the model estimates that the lineup with Greg Oden performs 5.9 points per hundred possessions worse than the lineup with Joel Przybilla.

A 95% confidence interval for this difference is (-7.0, 17.1), and this means that even though we estimate Joel’s defensive impact with this lineup to be better than Greg’s impact with this lineup, we need more data before we can make this statement as confidently as we could for 2008-09. Because the estimate is practically significant, I’d still prefer Joel over Greg if forced to make a choice strictly in terms of defensive impact.
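
To see the four comparisons side by side, here is a short Python sketch that checks which of the reported 95% intervals exclude zero and how wide each one is. The interval bounds are copied from this post; the helper names `excludes_zero` and `width` are made up for this example, and treating "excludes zero" as the bar for a confident statement is the usual convention rather than anything specific to this model.

```python
# Reported 95% intervals from the post: (label, lower, upper) for the difference
# in points allowed per hundred possessions between the two lineups.
INTERVALS = [
    ("Miller vs. Williams, 2008-09", -9.4, 9.2),
    ("Miller vs. Blake, this season", -15.3, 13.8),
    ("Oden vs. Przybilla, 2008-09", 0.65, 13.1),
    ("Oden vs. Przybilla, this season", -7.0, 17.1),
]

def excludes_zero(lower, upper):
    """True when the whole interval sits on one side of zero."""
    return lower > 0 or upper < 0

def width(lower, upper):
    """Interval width, a rough gauge of how much uncertainty remains."""
    return upper - lower

for label, lower, upper in INTERVALS:
    verdict = "excludes zero" if excludes_zero(lower, upper) else "includes zero"
    print(f"{label}: ({lower}, {upper}), width {width(lower, upper):.1f}, {verdict}")
```

Only the 2008-09 Oden versus Przybilla interval sits entirely above zero, and the two intervals from this season are the widest, which is what we would expect from a partial season of possessions.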

But…

This model isn’t perfect. The way I control for individual opponent strength could be improved. And even though this type of model has the best of intentions, it will not tell us why players are having the impacts we estimate. It gives us more information than adjusted +/-, such as the impact an individual has on the specific number of points given up on defense, but we can still make use of other statistics for trying to dig into the why. Even these other statistics don’t tell us everything, so it is not surprising to me that coaches prefer video to statistics.

Lastly, this analysis doesn’t exactly clear up any debates Blazers’ fans may be having, like should Andre or Steve be starting, or should Greg or Joel get more playing time? This model is just one way of looking at the data, and defense counts for just half of what teams do to win games. Thus I’ll leave it up to rabid Blazers’ fans to weigh the deficiencies of this model and to figure out which players are better on offense. :)

7 Comments on this post

  1. Ryan said:

    In light of last night’s injury to Oden that happened literally minutes after publishing this post, I figure I should provide details on what this model suggests in terms of Greg versus Joel on offense. The model estimates that the lineup with Greg performs 12 points per hundred possessions better than with Joel, and a 95% confidence interval for this difference is (0.27, 23.9). Based on Joel’s estimated difference on defense, the model estimates that the Blazers are a net +6 points per hundred possessions better with Greg than with Joel. Clearly not having both Greg and Joel hurts, but it appears as if Greg was likely the player that should be getting the most minutes of the two, which he was.

    Also, for the curious, this year’s model estimates that the lineup with Andre performs 0.6 points per hundred possessions worse on offense than the lineup with Blake. The 95% confidence interval is (-15.5, 14.3), so we don’t really prefer either on offense based on this model. The styles are different, though: with Andre we expect more twos to be scored, but with Blake we expect more threes.

    December 6th, 2009 at 12:24 pm
  2. Ziller said:

    Great stuff as always, Ryan. This is a great way of looking at it. Have you looked at supplemental lineups for the players you’re comparing? In other words, since the Miller-Williams question is resolved such that neither is preferable over the other for defensive purposes based on your model with the Young-Iguodala-Dalembert-Green line-up, would including other shared line-ups (such as Young-Iguodala-Dalembert-Brand) decrease the uncertainty? Perhaps not in this example due to Brand’s lack of minutes. But that’s the idea. Is that sensible?

    December 7th, 2009 at 12:17 pm
  3. Ryan said:

    It is plausible, and in this case I only considered the uncertainty between the two players. Thinking about it more, I think a more correct analysis would be to consider the uncertainty with all player coefficients, as there is likely some correlation between Miller and this more used lineup that would perhaps make him the preferred choice, or at least give a tighter bound on that lineup’s estimate that would make him look better.

    December 7th, 2009 at 1:41 pm
  4. Deepak said:

    I’m trying to understand your multinomial logistic model. What does the log( pi_i / pi_0) represent from your poster?

    Is it correct to say that your approach is similar to adjusted +/-, except you’re using individual possession outcomes as your observations rather than point differential over “time segments” with no player substitutions?

    Thanks.

    December 7th, 2009 at 2:50 pm
  5. Ryan said:

    Deepak, this model considers the possibility of scoring zero, one, two, or at least three points. Those are the “categories” of the model, so pi_i is the probability that i number of points are scored (or allowed), for i = 1, 2, 3. Zero is the “baseline” category, and that’s where pi_0 comes from.

    pi_i / pi_0 are the odds that points were scored in category i versus category zero, and log(pi_i / pi_0), the log of these odds, is the “link” function for this generalized linear model.

    I think that your description comparing what I’m doing to adjusted +/- is correct. That said, I’m not considering a model with every possible player in the league, rather I only estimate coefficients for players on a specific team of interest.

    December 7th, 2009 at 3:15 pm
  6. Deepak said:

    Like adjusted +/-, your model will predict the same defensive impact of one player with respect to another, regardless of the other players on the floor. Is that right? So when you determine, for instance, that Przybilla was 6 points better than Oden for the lineup you chose, will that hold regardless of the other Blazers on the floor?

    December 7th, 2009 at 3:36 pm
  7. Ryan said:

    This model is only linear in terms of the log of the odds (the log(pi_i / pi_0) piece). The most likely effect of this is that the magnitude of the impact could change based on the other players, but the end result should be similar unless there is a player with some extreme coefficients. These extreme coefficients aren’t reasonable for any player, but they do exist for players who play few minutes and whom we’re going to disregard anyway.

    December 7th, 2009 at 3:55 pm
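
The point in the last reply, that the model is linear only in the log of the odds, can be illustrated with a small Python sketch. Every number below is hypothetical and chosen only for illustration: `PLAYER_A` and `PLAYER_B` stand in for two players' coefficients, and the two "context" tuples stand in for the summed contribution of everything else on the floor. As before, the 3+ category is counted as three points.

```python
import math

def points_per_100(etas):
    """Expected points per hundred possessions from log-odds for 1, 2, 3+ points."""
    weights = [1.0] + [math.exp(e) for e in etas]  # baseline (0 points) has weight 1
    total = sum(weights)
    return 100.0 * sum(v * w / total for v, w in zip((0, 1, 2, 3), weights))

def swap_difference(context, coef_a, coef_b):
    """Points allowed per 100 with player A minus with player B, context held fixed."""
    with_a = [e + c for e, c in zip(context, coef_a)]
    with_b = [e + c for e, c in zip(context, coef_b)]
    return points_per_100(with_a) - points_per_100(with_b)

# Hypothetical player coefficients on the log-odds scale (NOT fitted values).
PLAYER_A = (0.05, 0.10, 0.05)
PLAYER_B = (0.00, 0.00, 0.00)

# Hypothetical summed log-odds from the other four defenders, opponent, and so on.
STRONG_TEAMMATES = (-2.4, -1.0, -2.9)
WEAK_TEAMMATES = (-2.0, -0.4, -2.3)

d_strong = swap_difference(STRONG_TEAMMATES, PLAYER_A, PLAYER_B)
d_weak = swap_difference(WEAK_TEAMMATES, PLAYER_A, PLAYER_B)
print(round(d_strong, 2), round(d_weak, 2))
```

The gap between the two players is identical on the log-odds scale in both contexts, yet the gap in points per hundred possessions differs. With moderate coefficients like these the sign stays the same and only the magnitude moves.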