Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Sunday, December 20, 2020

Verifying our Baseball Guts

Fifteen years ago, I tried to guesstimate the quality of plays.

Notably, I presumed 60% of the plays were "easy outs" with an average out rate of 98.3%.  And another 20% that were "easy hits" with an average out rate of 5%.  The other 20% of the plays were uniformly distributed between 10 and 95%, for an average of 50%.

Well, now with Statcast, we can come up with more precise numbers. 43% were easy outs at 98.4% and another 22% were almost-easy outs at 91.4%.  The two combined are 65% of the plays at an out rate of 96.0%.  So, not bad in terms of my baseball guts.

There's another 21% that were auto-hits (0% out rate), which compares very favorably to what I presumed.

The other 14% ranged from a 0% to 85% out rate, for an average of 47.1%.  That's reasonably in line as well.

Why was I able to come up with reasonable estimates 15 years ago with no granular data to speak of?  There are two reasons:

  1. I knew that the out rate had to average to 70% since ~70% of batted balls are outs
  2. We've all watched enough baseball to realize that there are a good many gimme hits and gimme outs
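
As a quick check on reason 1, here is a minimal Python sketch that takes the weighted average of the buckets quoted above, for both the original guess and the Statcast numbers. The function and variable names are mine, for illustration only; the shares and out rates are the ones given in this post.

```python
def overall_out_rate(buckets):
    """Weighted average out rate.

    buckets: list of (share_of_plays, out_rate) pairs.
    """
    total_share = sum(share for share, _ in buckets)
    return sum(share * rate for share, rate in buckets) / total_share


# The original guess: easy outs, easy hits, everything else.
original_guess = [(0.60, 0.983), (0.20, 0.05), (0.20, 0.50)]

# Statcast: easy outs, almost-easy outs, auto-hits, everything else.
statcast = [(0.43, 0.984), (0.22, 0.914), (0.21, 0.0), (0.14, 0.471)]

print(f"original guess: {overall_out_rate(original_guess):.1%}")
print(f"statcast:       {overall_out_rate(statcast):.1%}")

# The two easy-out buckets on their own (65% of plays).
print(f"easy outs only: {overall_out_rate(statcast[:2]):.1%}")
```

Both distributions come out at roughly 70% overall, and the two easy-out buckets combine to about 96%, which is the constraint doing most of the work.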

Given that, anyone would have been able to come up with a similar distribution. That's how you need to approach analysis, not just in sports.  You have to have some level of understanding of what data to expect.  Without a prior distribution, you are really not going to be able to have much confidence in what you are doing.

(1) Comments • 2020/12/21 • Statistical_Theory
