Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Thursday, May 20, 2021

Math behind no-hitters

The Math

In the 10 years from 2010-2019, there was an average of 8.7 hits per 9 IP. The hit rate (meaning the batting average if we include SF in the denominator) was .252, which means the out rate is one minus .252, or .748. In order to get a no-hitter, you need 27 outs, and the probability of that is .748 to the power of 27. That works out to 0.0004 per game, or 4 per 10,000 games, or 1.9 per full season. There have actually been an average of 3.4 no-hitters per season in that time span. Why is the math off?
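As a quick check on the arithmetic above, here is a minimal sketch. The per-season conversion assumes 30 teams times 162 games, or 4,860 chances per season (each game gives both teams a chance at a no-hitter); that count is not stated in the post, but it reproduces the 1.9 figure.

```python
# Probability of a no-hitter if every out is an independent trial
# with the same league-wide out rate (the simplification used in the post).
OUT_RATE = 0.748      # one minus the .252 hit rate
OUTS_NEEDED = 27      # outs in a nine-inning game

# Assumption (not stated in the post): 30 teams x 162 games each.
TEAM_GAMES_PER_SEASON = 30 * 162


def no_hitter_probability(out_rate: float, outs: int = OUTS_NEEDED) -> float:
    """Chance of recording `outs` consecutive outs without allowing a hit."""
    return out_rate ** outs


p_game = no_hitter_probability(OUT_RATE)
per_season = p_game * TEAM_GAMES_PER_SEASON

print(f"per game:   {p_game:.4f}")
print(f"per season: {per_season:.1f}")
```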

The Assumptions

Well, the math is not off. The assumption is off. We are assuming that each pitcher has a .748 out rate. But some pitchers are much better and some are much worse, and some teams are a bit better and some are a bit worse. And when you raise that number to the power of 27, you get an exponential difference, not a simple difference: the better-than-average pitchers add more no-hitter chances than the worse-than-average pitchers take away. In order to "adjust" for the distribution of players, we can modify the mean out rate of .748 upwards by .016 to .764. So that becomes our "effective" out rate for the population. Raise that to the power of 27 and you get 0.0007 per game, or 7 per 10,000 games, or 3.4 per full season. And that's what we've witnessed: 34 no-hitters over the ten years of 2010-2019.
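To see why a spread of talent pushes the "effective" out rate above the mean, here is a small sketch. The five out rates are hypothetical, evenly weighted values centered on .748, chosen only for illustration; they are not the distribution behind the .016 adjustment. The last lines simply re-check the .764 figure, again assuming 4,860 team-games per season.

```python
from statistics import mean

OUTS_NEEDED = 27

# Hypothetical out rates centered on .748 (illustration only, not real data).
out_rates = [0.708, 0.728, 0.748, 0.768, 0.788]

mean_rate = mean(out_rates)

# Naive approach: raise the average out rate to the 27th power.
naive = mean_rate ** OUTS_NEEDED

# Better approach: compute each pitcher's no-hitter chance, then average.
actual = mean([p ** OUTS_NEEDED for p in out_rates])

# The single out rate that would reproduce the averaged probability.
effective_rate = actual ** (1 / OUTS_NEEDED)

print(f"naive: {naive:.5f}  actual: {actual:.5f}")
print(f"mean rate: {mean_rate:.3f}  effective rate: {effective_rate:.3f}")

# Re-check the post's adjusted figure (assumes 30 * 162 team-games per season).
adjusted_per_season = 0.764 ** OUTS_NEEDED * 30 * 162
print(f"per season at .764: {adjusted_per_season:.1f}")
```

Because raising to the 27th power is a convex operation, the averaged probability always comes out at least as large as the naive one, whatever spread you plug in.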

So, with our model in place, how did 2020 compare and how does 2021 compare? For 2020, we would have expected 1.72, and we got 2. So, that's reasonable enough.

The 2021

In 2021, however, given about a quarter of the season, and even with the reduced hit rate this season down all the way to 8.0 per 9 IP, we'd have expected 1.65 so far. We instead have 6. That difference, +4.35 no-hitters above expected, is 3.4 standard deviations from the mean. A z-score of 3.4 is not something you expect to find unless you are looking at hundreds or thousands of scenarios. From 2010-2019, for example, the z-scores ranged from -1.25 to +2.02, with a standard deviation of 1.12. We typically expect to see the range at -2 to +2 with a standard deviation of 1. So, that's why we didn't think twice in 2015, when we had 7 no-hitters compared to the 3.3 we expected. Being at +3.7 no-hitters above expected is a z-score of 2.0. That's not a story.
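The z-scores above can be reproduced with a short sketch. It assumes no-hitter counts behave like a Poisson variable, so the standard deviation is the square root of the expected count; the post does not say this explicitly, but that assumption matches both the 3.4 and the 2.0.

```python
from math import sqrt


def z_score(observed: float, expected: float) -> float:
    """Standard deviations above expectation.

    Assumption: counts are Poisson, so the standard deviation
    is sqrt(expected).
    """
    return (observed - expected) / sqrt(expected)


z_2021 = z_score(6, 1.65)   # six no-hitters against 1.65 expected so far
z_2015 = z_score(7, 3.3)    # seven no-hitters against 3.3 expected

print(f"2021 so far: {z_2021:.1f}")
print(f"2015:        {z_2015:.1f}")
```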

The Story

A 3.4 z-score is a story. Seeing 6 no-hitters already would argue in favor of a league allowing 7.5 hits per 9 IP, not the 8.0 we're actually seeing. So we have a conflict here. We do see 8 hits per 9 IP, but we also see 6 no-hitters. Now, while a 3.4 z-score is high, it's not equivalent to a z-score of 5 or 10. In other words, it's not astronomical.

The benefit of the season being 1/4th over is we have another 3/4ths of the season to go. To maintain a z-score of 3.4, we'd have to end the season with ~15 no-hitters. So, if what we are seeing is in fact real, then we should see another 9 no-hitters. On the other hand, if there's nothing extra-special happening, and if we just rely on the 8 hits per 9 IP, then we should see another 4-5 no-hitters this season.
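The two end-of-season scenarios can be sketched the same way. This scales the 1.65 expectation from a quarter season to a full one and again assumes a Poisson standard deviation of sqrt(expected); both are simplifications on my part rather than a method confirmed in the post.

```python
from math import sqrt

EXPECTED_SO_FAR = 1.65     # expected no-hitters through a quarter season
OBSERVED_SO_FAR = 6
SEASON_FRACTION = 0.25
TARGET_Z = 3.4

# Scale the expectation up to a full season.
expected_full = EXPECTED_SO_FAR / SEASON_FRACTION

# Scenario 1: the effect is real and the z-score holds at 3.4.
# Assumption: Poisson counts, so the standard deviation is sqrt(expected).
total_if_real = expected_full + TARGET_Z * sqrt(expected_full)
more_if_real = total_if_real - OBSERVED_SO_FAR

# Scenario 2: nothing special, the rest of the season runs at expectation.
more_if_random = expected_full - EXPECTED_SO_FAR

print(f"season total if real:  {total_if_real:.1f}")
print(f"additional if real:    {more_if_real:.1f}")
print(f"additional if random:  {more_if_random:.1f}")
```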

And that's what the argument is going to boil down to: 4-5 more no-hitters, and it's just an early-season story. 9 more no-hitters, and we've witnessed something extra special beyond whatever Random Variation would explain.

