Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Thursday, September 27, 2018

Uncertainty… of what exactly?

There are three kinds of uncertainties:
  1. Uncertainty due to random variation
  2. Uncertainty in understanding the impact
  3. Uncertainty in measuring the context

1. 

Let's take the first one. This uncertainty is the random variation, which you can think of simply as a binomial event (like OBP) or a multinomial event (HR, BB, SO, 1B, etc). In EITHER case, the random variation in the event counts is proportionate, to a GREAT degree, to the square root of the playing time (or opportunity) for the events. In baseball, that means IP or PA. And it virtually doesn't matter if you hit or give up a lot of HR or few. The uncertainty has a correlation of r=0.9+ (likely r=0.99) to the root of playing time.
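To make the square-root relationship concrete, here is a minimal sketch using the textbook binomial formulas. The function names and the .330 rate are my own illustrative choices, not anything from a real dataset.

```python
import math

def sd_of_count(p, n):
    """Standard deviation of the number of successes in n binomial trials."""
    return math.sqrt(n * p * (1 - p))

def sd_of_rate(p, n):
    """Standard deviation of the observed rate (successes / n)."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative on-base rate (an assumption for this example only).
p = 0.330
for pa in (150, 600, 2400):
    print(pa, round(sd_of_count(p, pa), 2), round(sd_of_rate(p, pa), 4))
```

Quadruple the PA and the spread in the count doubles, while the spread in the rate is cut in half. The p(1-p) term barely moves across realistic rates compared to how much playing time varies, which is why the root of playing time does most of the work.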

And what does this uncertainty imply? It implies that even if you observe 30 HR and 100 walks and 200 strikeouts in 600 PA, we might have observed something else if the player were given a new set of 600 PA. This uncertainty is a big "what if we replayed the season". And from that standpoint, I think very, very few people would even entertain this as the kind of uncertainty they would want.
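Here is a rough sketch of that "replay the season" idea as a multinomial draw. It assumes, purely for illustration, that the observed rates (30 HR, 100 BB, 200 SO in 600 PA) are the player's true rates, and it lumps everything else into an "other" bucket. The function name and the seed are hypothetical.

```python
import random
from collections import Counter

# Observed line from the text: 30 HR, 100 BB, 200 SO in 600 PA.
# ASSUMPTION: treat these observed counts as the true underlying rates.
OBSERVED = {"HR": 30, "BB": 100, "SO": 200, "other": 270}

def replay_season(seed, pa=600):
    """Draw one alternate season of `pa` plate appearances."""
    rng = random.Random(seed)
    events = list(OBSERVED)
    weights = [OBSERVED[e] for e in events]
    return Counter(rng.choices(events, weights=weights, k=pa))

# A handful of alternate seasons for the same player.
for seed in range(5):
    season = replay_season(seed)
    print(seed, season["HR"], season["BB"], season["SO"])
```

Each replay gives a somewhat different HR, BB, and SO line for the same player with the same assumed talent. That spread is the first kind of uncertainty.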

2. 

How about the second one? That one is easiest to describe with the HR, which has a run value of 1.40... on average. A solo HR is worth 1 run, and the more runners on base, the more the HR is worth. And even then, events are worth more in close games than in blowouts. Or maybe not. That's our uncertainty. We observe 30 HR, we count 30 HR, we do NOT play a what-if in which he would have hit 22 or 41 HR. What we are asking is, with the 30 observed HR, what does that mean in terms of runs (or wins)?

So a guy with 30 HR is going to be worth, say, 35 to 50 runs, all depending on how you treat the base-out scenarios, or even the inning-score.

And this is really more about assumptions than uncertainty. We do have Linear Weights (the metric that assigns a fixed 1.40 runs for each HR) and we do have RE24 (the metric that assigns a run value to each HR based on the base-out states). If we say 30 HR is 42 runs using Linear Weights, there's no uncertainty there. And neither is there uncertainty if we say 30 HR is 38 runs or 46 runs using RE24. It's simply an assumption as to what you want the HR to represent in terms of its impact.
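A small sketch of the two accounting choices. The Linear Weights side uses the 1.40 figure from above. The context-based side is only a stand-in for RE24: apart from the solo HR being worth 1 run, the per-situation values and the split of the 30 HR are placeholders I made up, not real run expectancy numbers.

```python
def linear_weights_runs(hr, runs_per_hr=1.40):
    """Every HR gets the same fixed run value."""
    return hr * runs_per_hr

def context_runs(hr_by_runners_on, value_by_runners_on):
    """Each HR gets a run value that depends on how many runners were on."""
    return sum(count * value_by_runners_on[runners]
               for runners, count in hr_by_runners_on.items())

# HYPOTHETICAL per-HR values keyed by runners on base (not real RE24 values).
PLACEHOLDER_VALUES = {0: 1.0, 1: 1.6, 2: 2.2, 3: 2.8}

# HYPOTHETICAL split of the same 30 HR across situations.
hr_split = {0: 18, 1: 9, 2: 3, 3: 0}

print(linear_weights_runs(30))
print(context_runs(hr_split, PLACEHOLDER_VALUES))
```

Same 30 HR, two different run totals, and neither one carries any uncertainty. The gap between them is the assumption you chose, nothing more.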

3.

So, is it the last one? Yes, yes, it is. It may not be what people are saying is the uncertainty, but this is exactly the uncertainty we are talking about. If we say that a hitter at Coors created 90 runs, and that is below league average, we are making that conclusion based on our opinion as to how Coors impacts run creation. And we are uncertain about that. Indeed, we are uncertain as to the quality of opponents faced. We are uncertain with just about everything about the context the player has faced over 600 PA.

Our observations are our observations. He hit 30 HR. We don't take those away. And neither do we adjust those out. 30 HR is 30 HR.

What saberists do is put those observations IN CONTEXT. And THAT is what we are estimating. We are estimating the level of difficulty of those observations or the level of impact.

And that, very much, is where an uncertainty level would be helpful. When we look at Aaron Nola's context or Jacob deGrom's context, so that we can evaluate their performance, it's not a given that we can establish their context. No. We are estimating it. And anything we estimate is a candidate for an uncertainty level.
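As a rough illustration of what an uncertainty level on context could look like, take the 90 runs created at Coors from earlier. The sketch below divides the observed runs by a park factor, and instead of one park factor it carries a low and a high estimate. The park factor values and the function names are hypothetical, chosen only to show the mechanics.

```python
def context_neutral_runs(observed_runs, park_factor):
    """Convert observed runs to a neutral-park equivalent."""
    return observed_runs / park_factor

def context_neutral_range(observed_runs, pf_low, pf_high):
    """Range of neutral-park runs implied by a range of park factors."""
    low = context_neutral_runs(observed_runs, pf_high)
    high = context_neutral_runs(observed_runs, pf_low)
    return low, high

# HYPOTHETICAL park factor estimates for a hitter-friendly park.
low, high = context_neutral_range(90, pf_low=1.10, pf_high=1.20)
print(round(low, 1), round(high, 1))
```

The 90 runs never change. What moves is the estimate of how hard or easy it was to create them, and that is the number that deserves the error bar.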

(3) Comments • 2018/09/27 • Statistical_Theory
