Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Sunday, September 11, 2016

WPA in NBA

I'm quite enjoying this article by Shane Jensen (of SAFE fielding metric in baseball) et al.

The intro had the effect of making me think he was being "analytic political".  You can tell that because you can make virtually the same argument and come to the opposite conclusion.  But by the end of the intro he was quite clear and honest, so he redeemed himself.

I'm reading the part with the three images, and I think figure 1A is so completely outlandish as a comparison point that it should never have been introduced.  I mean NO ONE thinks like this, so why treat it as some sort of benchmark?  Anyway, if we just focus on 1B and 1C, that's where the payoff is.  Except there's no reason to do the smoothing by proximity points like he's doing.  When you have something complicated (like wOBA by exit speed, launch angle, spray angle), then yes, that becomes almost necessary (until we can create a function, which is my plan).  And that is how I have been doing it in Statcast Lab.
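
If you haven't seen this kind of proximity smoothing, here's a minimal sketch of the idea, with made-up batted-ball data (exit speed, launch angle, spray angle, and a wOBA value for each ball).  The choice of k=100 neighbors and the scaling are purely illustrative, not anything from Statcast Lab or from the paper.

import numpy as np

# Hypothetical batted-ball data: one row per ball in play.
# Columns: exit speed (mph), launch angle (deg), spray angle (deg), wOBA value.
rng = np.random.default_rng(0)
n = 5000
exit_speed = rng.uniform(40, 115, n)
launch_angle = rng.uniform(-60, 60, n)
spray_angle = rng.uniform(-45, 45, n)
woba_value = rng.uniform(0, 2, n)          # stand-in for the observed outcome

# Scale each input so "proximity" treats the three dimensions comparably.
X = np.column_stack([exit_speed / 10.0, launch_angle / 10.0, spray_angle / 10.0])

def smoothed_woba(speed, angle, spray, k=100):
    """Average the wOBA of the k nearest batted balls to the query point."""
    q = np.array([speed / 10.0, angle / 10.0, spray / 10.0])
    dist = np.linalg.norm(X - q, axis=1)
    nearest = np.argsort(dist)[:k]
    return woba_value[nearest].mean()

print(smoothed_woba(103, 27, -5))   # e.g. a well-struck pulled fly ball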

But in this particular case, with just two simple parameters, time and lead, you can simply draw a smooth line for each lead, and the only thing you need to make sure of is that the higher the lead, the better the chance of winning.  Plotting it as a heat map like they did is what I call "mathematical gyrations".  Whenever you can, make the solution as simple as possible... but no simpler.
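
Here's a rough sketch of what I mean, using made-up play-by-play data: bucket by time and lead, take the raw win rate in each cell, and then enforce the one constraint that matters by taking a running maximum across leads within each time bucket.  This is a crude stand-in for illustration, not the paper's method.

import numpy as np

# Hypothetical play-by-play rows: minutes remaining, home lead, and whether home won.
rng = np.random.default_rng(1)
n = 20000
minutes_left = rng.uniform(0, 48, n)
lead = rng.integers(-20, 21, n)
p_true = 1 / (1 + np.exp(-lead / np.sqrt(minutes_left + 0.5)))   # toy generating model
won = rng.random(n) < p_true

time_bins = np.arange(0, 49, 4)          # 4-minute buckets
leads = np.arange(-20, 21)

# Raw win rate in each (time bucket, lead) cell.
win_rate = np.full((len(time_bins) - 1, len(leads)), np.nan)
for i in range(len(time_bins) - 1):
    in_bin = (minutes_left >= time_bins[i]) & (minutes_left < time_bins[i + 1])
    for j, L in enumerate(leads):
        cell = in_bin & (lead == L)
        if cell.sum() >= 25:                  # require a minimum sample per cell
            win_rate[i, j] = won[cell].mean()

# Enforce the one constraint that matters: at a fixed time, a bigger lead
# can never mean a lower chance of winning (running maximum across leads).
smoothed = np.where(np.isnan(win_rate), 0.0, win_rate)
smoothed = np.maximum.accumulate(smoothed, axis=1)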

I consider my comments more of a technical nitpick, because the overall conclusions won't change much, if at all.  I'm now on section 2.2 and will update this blog post after I finish that.

***

Ok, this hasn't been addressed, but this is, to me, the main problem with all these plus/minus metrics.  There is a HUGE amount of "sharing" in hockey and basketball, unlike baseball, so the "opportunity space" isn't so clear cut.  If you put the top 5 NBA players on the same team, their "sum of the parts" will be less than the whole.  There is a diminishing returns aspect.

But, then we get back to "retrospective" v "predictive".  The title of the paper itself is a problem, because it talks about "chances": Estimating an NBA player’s impact on his team’s chances of winning.  You can read that both retrospectively and predictively.  Since in the intro they are quite clear that this is retrospective, they then don't need to consider the issue of diminishing returns.  They are taking, as a given, the context.  And if a top NBA player plays with great players, then it's possible that his impact is being muted, compared to a top NBA player who plays with nobodies.  OR, it's possible there is a leverage aspect: a top NBA player NEEDS at least one other top player in order to fully shine.

I don't know the answer.  And this issue is also beside the point of this paper.  I just want the Straight Arrow readers to be aware that this issue, seemingly semantic, would have required a (potentially) drastically different approach if the topic were predictive.

Onto section 3.

***

Tremendous work on the "leverage profiles".  This was exactly what I was thinking about with my note above.  I love research papers that anticipate my questions and answer them.

***

[quote]A natural question to ask about any player evaluation metric is how stable it is year-to-year.[/quote]

And this is my fear.  Since the thrust of their evaluation was retrospective, turning to the predictive WHILE NOT CHANGING THEIR METHODOLOGY is a curious choice, and the semantic point I just made now becomes a central one.  Let's see how they handle it, if at all.

In many ways, this is like FIP and FutureFIP.  FIP is retrospective.  FutureFIP is predictive.  If you wanted me to talk about the past, I'd use FIP.  If you wanted me to talk about the future, I'd use FutureFIP.  It would therefore be foolish to compare a predictive metric to FIP.
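
For anyone who doesn't know FIP, the standard formula is below (the constant of about 3.10 is just whatever makes league FIP match league ERA).  A predictive version like FutureFIP reweights and regresses the same components; I'm not showing those weights here, and the pitcher line is made up.

# FIP: a retrospective, defense-independent look at what a pitcher allowed.
def fip(hr, bb, hbp, so, ip, constant=3.10):
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant

# Example with a hypothetical season line.
print(round(fip(hr=22, bb=45, hbp=5, so=210, ip=200.0), 2))   # ~3.18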

Back to reading...

***

[quote]We observe that the correlation between 2012–2013 and 2013–2014 Impact Score is 0.242, indicating a rather moderate positive trend.... Because it is context-dependent, we would not expect the year-to-year correlation for Impact Scores to be nearly as high as the year-to-year correlation for PER (correlation of 0.75), which attempts to provide a context-agnostic assessment of player contribution.[/quote]

Right, exactly.  So, in order to compare apples to apples, the score/lead should be removed from consideration in order to compare to something like PER.  The context-neutral Impact Score can actually be better than PER.  But, we don't know that!  Which is why this paper is screaming for that work to be done.  Well, maybe they do it.  Let me keep reading...
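
The check I'm asking for is simple enough: pair up each player's metric in back-to-back seasons and look at the correlation, for the context-dependent and the context-neutral versions side by side.  The numbers below are made up, just to show the calculation.

import numpy as np

# Hypothetical paired player-seasons: the same metric in year 1 and year 2 for the same players.
year1 = np.array([3.1, -1.2, 0.4, 2.2, -0.8, 1.5, 0.0, -2.1])
year2 = np.array([2.4, -0.5, 1.1, 1.8, -1.3, 0.9, 0.6, -1.4])

# Year-to-year correlation, the same quantity behind the paper's 0.242 (Impact Score)
# and 0.75 (PER) comparison.
r = np.corrcoef(year1, year2)[0, 1]
print(round(r, 3))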

***

The paper is beautiful in asking the right questions.  If the authors of the paper provide some of the aggregated data, I'd love to put my little spin on it.

One of the things that is intriguing is the 5-man combo.  If I'm reading it right, the Curry-led combo has a net impact of +.183 wins per 48 minutes, or a .683 win%.  By the way, this is what they should be showing.  They show an "impact score" and minutes, but the scale that everyone cares about is win%.  I guess it's the difference between what you see in a research paper and what you see in a blog.

They talk about that unit facing the unit with the worst impact score.  Except I can't tell WHAT their score is.  And I can't tell what the final matchup win expectancy is, nor what log5 would suggest.  They do show the results as a distribution (not the mean).  But even that is problematic, because their scale is so wide that when you see Figure 14, you hardly see it as a huge difference.
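
To put numbers on it: +.183 wins per 48 minutes against an average (.500) opponent is a .683 win%, and if we knew the worst unit's number we could run the matchup through log5.  Since the paper doesn't give that number, the -.150 below is strictly a placeholder.

def log5(p_a, p_b):
    """Bill James's log5: chance A beats B, given each side's win% against a .500 opponent."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))

curry_unit = 0.500 + 0.183      # +.183 wins per 48 minutes -> .683 win%
worst_unit = 0.500 - 0.150      # placeholder: the paper doesn't give this number

print(round(log5(curry_unit, worst_unit), 3))   # ~.800 with these made-up inputs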

***

So, this paper has all the ingredients you need to have one of the best saber-level research pieces of the year.  We just need to change the recipe a bit.
