Thursday, September 27, 2018
Uncertainty… of what exactly?
There are three kinds of uncertainties:
- Uncertainty due to random variation
- Uncertainty in understanding the impact
- Uncertainty in measuring the context
1.
Let's take the first one. This uncertainty is the random variation, which you can think of simply as a binomial event (like OBP) or a multinomial event (HR, BB, SO, 1B, etc). In EITHER case, the random variation is proportionate, to a GREAT degree, to the square root of the playing time (or opportunity) of the events. In baseball, that means IP or PA. And it virtually doesn't matter if you hit or give up a lot of HR or few. The uncertainty has a correlation of r=0.9+ (likely r=0.99) to the root of playing time.
And what does this uncertainty imply? It implies that even if you observe 30 HR and 100 walks and 200 strikeouts in 600 PA, we might well have observed something else if the player were given a new set of 600 PA. This uncertainty is a big "what if we replayed the season". And from that standpoint, I think very, very few people would even entertain this as the kind of uncertainty they would want.
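To make the "replay the season" idea concrete, here is a rough sketch in Python. It treats each PA as an independent binomial trial and, as a simplifying assumption, treats the observed 30 HR in 600 PA as the true rate. The function names and the number of replayed seasons are made up for illustration.

```python
import math
import random

def binomial_sd(pa, rate):
    """Standard deviation of the event count over `pa` independent trials."""
    return math.sqrt(pa * rate * (1.0 - rate))

def replay_seasons(pa, rate, n_seasons, seed=0):
    """Replay a season `n_seasons` times; return the event total of each replay."""
    rng = random.Random(seed)
    return [sum(rng.random() < rate for _ in range(pa)) for _ in range(n_seasons)]

# Assumption: the observed 30 HR in 600 PA is treated as the true HR rate.
HR_RATE = 30 / 600

totals = replay_seasons(pa=600, rate=HR_RATE, n_seasons=2000)
mean = sum(totals) / len(totals)
sd = math.sqrt(sum((t - mean) ** 2 for t in totals) / len(totals))
print(f"replayed HR totals: mean={mean:.1f}, sd={sd:.2f}, "
      f"min={min(totals)}, max={max(totals)}")

# For a fixed rate, the spread of the count grows with the square root of PA:
# four times the playing time gives twice the standard deviation.
for pa in (150, 600, 2400):
    print(f"PA={pa:5d}  binomial SD of HR count={binomial_sd(pa, HR_RATE):.2f}")
```

The replayed totals scatter around 30, and the size of that scatter is the first kind of uncertainty.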
2.
How about the second one? That one is easiest to describe with the HR, which has a run value of 1.40... on average. Solo HR are worth 1 run, and the more runners, the more the HR is worth. And even then, events are worth more in close games than in blowouts. Or maybe not. That's our uncertainty. We observe 30 HR, we count 30 HR, we do NOT play a what-if he would have hit 22 or 41 HR. What we are asking is, with the 30 observed HR, what does that mean in terms of runs (or wins).
So a guy with 30 HR is going to be worth say 35 to 50 runs, all depending on how you treat the base-out scenarios, or even the inning-score.
And this is really more about assumptions than uncertainty. We do have Linear Weights (the metric that assigns a fixed 1.40 runs for each HR) and we do have RE24 (the metric that assigns a run value to each HR based on the base-out states). If we say 30 HR is 42 runs using Linear Weights, there's no uncertainty there. And neither is there uncertainty if we say 30 HR is 38 runs or 46 runs using RE24. It's simply an assumption as to what you want the HR to represent in terms of its impact.
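The two accounting choices can be sketched side by side. This is only an illustration: the run expectancy numbers and the breakdown of the 30 HR by situation are hypothetical placeholders, not taken from any published table or any real player. The RE24-style value of a HR is computed here as runs scored on the play plus the change in run expectancy.

```python
HR_LINEAR_WEIGHT = 1.40  # fixed run value per HR, as quoted in the post

def linear_weights_runs(hr_count, weight=HR_LINEAR_WEIGHT):
    """Every HR gets the same fixed run value, whatever the situation."""
    return hr_count * weight

def hr_run_value(runners_on, re_before, re_after):
    """RE24-style value: runs scored on the play plus change in run expectancy."""
    runs_scored = 1 + runners_on  # the batter plus everyone on base
    return runs_scored + re_after - re_before

# Hypothetical season of 30 HR: (count, runners_on, re_before, re_after).
# The run expectancy values are illustrative placeholders only.
HYPOTHETICAL_HRS = [
    (17, 0, 0.50, 0.50),  # solo HR: base-out state is unchanged, so worth 1 run
    (9, 1, 0.90, 0.50),
    (3, 2, 1.45, 0.50),
    (1, 3, 2.30, 0.50),
]

hr_count = sum(count for count, _, _, _ in HYPOTHETICAL_HRS)
re24_runs = sum(
    count * hr_run_value(runners, before, after)
    for count, runners, before, after in HYPOTHETICAL_HRS
)

print(f"HR counted: {hr_count}")
print(f"Linear Weights runs: {linear_weights_runs(hr_count):.1f}")
print(f"RE24-style runs:     {re24_runs:.2f}")
```

The same 30 HR produce two different run totals, and neither carries any uncertainty: each is exact given the assumption behind it.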
3.
So, is it the last one? Yes, yes, it is. It may not be what people are saying is the uncertainty, but this is exactly the uncertainty we are talking about. If we say that a hitter at Coors created 90 runs, and that is below league average, we are making that conclusion based on our opinion as to how Coors impacts run creation. And we are uncertain about that. Indeed, we are uncertain as to the quality of opponents faced. We are uncertain with just about everything about the context the player has faced over 600 PA.
Our observations are our observations. He hit 30 HR. We don't take those away. And neither do we adjust those out. 30 HR is 30 HR.
What saberists do is put those observations IN CONTEXT. And THAT is what we are estimating. We are estimating the level of difficulty of those observations or the level of impact.
And that, very much, is where an uncertainty level would be helpful. When we look at Aaron Nola's context or Jacob deGrom's context, so that we can evaluate their performance, it's not a given that we can establish their context. No. We are estimating it. And anything we estimate is a candidate for an uncertainty level.
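As a rough sketch of what an uncertainty level on context could look like, take the Coors hitter with 90 runs created. The 90 is fixed; what moves is our estimate of the park. Everything else here is an assumption for illustration: the low, central and high park factors are hypothetical, and dividing by the park factor is just one simple way to adjust, not a claim about how any particular metric does it.

```python
def park_adjusted_runs(runs_created, park_factor):
    """Scale observed runs to a neutral park (assumed simple adjustment)."""
    return runs_created / park_factor

OBSERVED_RUNS = 90  # the observation itself is not in question

# Hypothetical low / central / high estimates of how much the park inflates runs.
PARK_FACTOR_ESTIMATES = {"low": 1.10, "central": 1.20, "high": 1.30}

adjusted = {
    label: park_adjusted_runs(OBSERVED_RUNS, factor)
    for label, factor in PARK_FACTOR_ESTIMATES.items()
}

for label, runs in adjusted.items():
    print(f"park factor ({label}) = {PARK_FACTOR_ESTIMATES[label]:.2f} "
          f"-> context-adjusted runs = {runs:.1f}")
```

The spread across the three adjusted totals comes entirely from the context estimate, which is the uncertainty we are talking about.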