Tangotiger Blog

Sunday, July 26, 2020

Statcast Lab: No Nulls Update

By Tangotiger

Back in 2016, we noticed that about 10% of batted balls had no tracking. If the lack of tracking were random, the missing data would be unbiased (*). But most of the reasons we weren't tracking were biased ones: high popups or sharp grounders. And so if you looked at data like launch angles, having a bunch of low-angle or high-angle balls missing would lead to wrong conclusions, both at the player and at the league level. So, we implemented a reasonable stopgap solution: no nulls. Every untracked batted ball was given a speed and angle based on (a) whether the cause was biased or unbiased, (b) whether the stringer marked it as a GB, LD, FB, or PU, and (c) the actual outcome: single, double, triple, HR, error, or out. We were going to enhance that process based on (d) who caught or fielded the ball and, even more exciting, (e) how the players moved, to determine where the ball landed. (We tracked the players 98% of the time. It's the ball that was at 90%.) We only did a-c because it satisfied the immediate need, and working on d-e was pushed back in favor of other higher priority items.

(*) Not totally unbiased. For example, we have no tracking for the London Games, but the London Games had tons of scoring. So, we lack tracking for unbiased reasons, but because of the outcomes, the lack of tracking ends up biasing the data. Technically. But, it's only two games.
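To make the no-nulls idea concrete, here is a minimal sketch in Python. It only covers (b) and (c): it takes the tracked balls, groups them by stringer type and outcome, and hands each untracked ball the median speed and angle of its group. Using the median of tracked balls is an assumption for illustration, not a description of how the actual fill-in values were produced, and the field names (`bb_type`, `outcome`, `speed`, `angle`, `imputed`) and function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def build_fill_table(tracked):
    """Median (speed, angle) of tracked balls per (bb_type, outcome) bucket.

    Assumption: the median of tracked balls is a reasonable stand-in value.
    """
    buckets = defaultdict(list)
    for ball in tracked:
        key = (ball["bb_type"], ball["outcome"])
        buckets[key].append((ball["speed"], ball["angle"]))
    return {
        key: (
            median(speed for speed, _ in values),
            median(angle for _, angle in values),
        )
        for key, values in buckets.items()
    }


def fill_nulls(balls, table):
    """Return copies of the balls with untracked speed/angle filled in.

    A ball counts as untracked when its speed is None. Balls whose bucket
    has no tracked examples are left as nulls.
    """
    filled = []
    for ball in balls:
        ball = dict(ball)  # copy so the caller's data is not modified
        if ball["speed"] is None:
            key = (ball["bb_type"], ball["outcome"])
            if key in table:
                ball["speed"], ball["angle"] = table[key]
                ball["imputed"] = True  # hypothetical flag marking fill-in data
        filled.append(ball)
    return filled
```

Flagging the filled rows matters: anyone who wants the measured-only view can drop the rows where `imputed` is set.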

Now we're in 2020, and the results of Hawk-Eye testing in 2019 showed that untracked batted balls were rare and that the lack of tracking was mostly unbiased. So, we decided for 2020 not to apply the no-nulls solution. If we track it, we report it. If we don't track it, then we don't. This is the same solution we have always had for pitch tracking. It makes as much sense for 2020 hit tracking as it does for pitch tracking: low frequency and unbiased.

Since we've done that, we are revisiting the handling of 2015-2019. To be consistent with 2020, batted balls in 2015-19 that went untracked for unbiased reasons won't get the no-nulls solution. And so, the no-nulls solution (i.e., generated fill-in data) will apply only to 2015-19 batted balls that went untracked for biased reasons.
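The revised rule boils down to a small decision function. Here it is as a sketch, assuming each batted ball carries a season, a tracked flag, and a cause label for why tracking was lost. The function name and the "biased" / "unbiased" labels are hypothetical, chosen only for illustration.

```python
def use_fill_in(season, tracked, cause):
    """Decide whether a batted ball gets fill-in (no-nulls) data.

    season:  year of the game, e.g. 2018
    tracked: True if the ball's speed and angle were measured
    cause:   "biased" or "unbiased" reason for the missing tracking
             (hypothetical labels); ignored when the ball was tracked
    """
    if tracked:
        return False  # measured data is always reported as-is
    if season >= 2020:
        return False  # 2020 onward: untracked balls stay null
    # 2015-19: fill in only when tracking was lost for a biased reason
    return 2015 <= season <= 2019 and cause == "biased"
```

So an untracked 2018 popup (biased cause) still gets fill-in data, while an untracked 2018 ball from a game with no tracking at all (unbiased cause) is left null, same as any untracked ball in 2020.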

We're figuring out how to present this on Savant, so stay tuned for how we'll handle it.


