Tangotiger Blog

Thursday, June 06, 2024

Bias in the x-stats?  Yes!

By Tangotiger 10:09 AM

The idea that spray direction is the missing ingredient in the x-stats has been thoroughly refuted several times, both by me and by other independent researchers. So the question remains: what are the missing ingredients?

Someone brought up the case of Isaac Paredes, who is a heavy pull batter. However, there is another attribute of Paredes: he does not hit the ball hard. Now, you may think that the x-stats ALREADY account for the exit velocity. After all, the two main ingredients are launch angle and speed. We account for the launch speed. Don't we? Well, once again, I must talk about the difference between modeling a PLAY and modeling a PLAYER. The x-stats, traditionally, evaluate PLAYS. But, since we are interested in PLAYERS, we limit the variables so that we focus on the PLAYERS. In other words, yes, we evaluate each play, one at a time. But instead of considering AS MANY variables as we can that went into that play, we consider AS FEW variables as we can: only those over which the player themselves has a strong influence.

Launch speed is an easy one to include on an event-by-event level. Launch angle as well (it's the easiest one, as it separates groundballs from home runs). The Spray Direction is one that is needed on the play, but is not needed for the player (as we've learned many times). So, we ignore that one. We include the Seasonal Sprint Speed of the runner, as that's important on groundballs.

Which gets us back to Launch Speed. Remember last night, I created a profile of each batter, to establish their Spray Tendency? Well, what if we do the same thing, but with Launch Speed? That is, let's create a profile of a batter based on how hard they hit the ball.

Now, you may think: we ALREADY account for this on a play level, right? Yes, we do. But, what if a 100mph batted ball by Isaac Paredes is different from a 100mph batted ball by Giancarlo Stanton, even when both are hit at 20 degrees of launch? In other words, we want Launch Speed to pull double-duty: we want to know the launch speed on that play, but we also want to know the batter's seasonal launch speed.

So, do we see a bias based on a batter's seasonal launch speed? Yes. Yes, we do.

Here's what I did, so you can feel free to replicate. I treat the 2016-2019 years as one season and the 2020-present years (thru June 5, 2024) as a second season. I do this on the idea that a player has a general speed tendency that spans multiple years. This lets me increase my sample size for each season. I also make sure that a batter who hits from both sides is considered two distinct players.

The speed tendency follows the Escape Velocity method for Adjusted speed: greatest(88, h_launch_speed). For every batted ball, I take the greater of the launch speed and 88. And I average that.
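To make the recipe concrete, here is a minimal sketch of that speed tendency in Python. It is not the actual query behind the numbers. The row layout (batter, bat side, year, launch speed), the function names, and the sample rows are all made up for illustration. Only the 88 mph floor, the two multi-year seasons, and the treatment of each bat side as its own player come from the description above.

```python
from collections import defaultdict

SPEED_FLOOR = 88.0  # mph, from greatest(88, h_launch_speed)


def era_of(year):
    """Map a calendar year to one of the two multi-year 'seasons'."""
    if 2016 <= year <= 2019:
        return "2016-2019"
    if year >= 2020:
        return "2020-present"
    return None  # outside the study window


def speed_tendency(batted_balls):
    """Average floored launch speed per (batter, bat side, era).

    batted_balls: iterable of (batter, bat_side, year, launch_speed_mph).
    This row layout is an assumption, not a real Savant schema.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of adjusted speed, count]
    for batter, bat_side, year, launch_speed in batted_balls:
        era = era_of(year)
        if era is None:
            continue
        key = (batter, bat_side, era)  # each bat side is its own "player"
        totals[key][0] += max(SPEED_FLOOR, launch_speed)
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up rows, purely illustrative.
rows = [
    ("batter_a", "R", 2018, 70.0),   # counted as 88 because of the floor
    ("batter_a", "R", 2019, 100.0),
    ("batter_a", "L", 2018, 95.0),   # other bat side: separate profile
    ("batter_a", "R", 2021, 104.0),  # second multi-year season
]
profiles = speed_tendency(rows)
```

The floor means a 70 mph dribbler and an 88 mph grounder count the same toward the profile, so the average is driven by how often and how far the batter gets above 88.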

Anyway, I use the same Pascal method of binning I did last night, the 10/20/40/20/10 split.
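A rough sketch of one way to do a 10/20/40/20/10 split follows. It assumes the split is by rank of batters, with each batter counting once. The actual binning may instead be weighted by batted balls or plate appearances, which this sketch does not attempt. The function name and the toy speed values are hypothetical.

```python
from collections import Counter


def pascal_bins(tendency, shares=(10, 20, 40, 20, 10)):
    """Assign each key to bin 0 (weakest) through 4 (strongest) by rank.

    tendency: dict mapping a batter key to that batter's speed tendency.
    shares: percentage of batters in each bin, weakest first.
    Assumption: every batter counts once (unweighted).
    """
    ranked = sorted(tendency, key=tendency.get)
    n = len(ranked)

    # Cumulative upper cut points in percent: 10, 30, 70, 90, 100.
    cuts = []
    running = 0
    for share in shares:
        running += share
        cuts.append(running)

    bins = {}
    for i, key in enumerate(ranked):
        # Midpoint rank percentile is (i + 0.5) / n; compare it to cut / 100
        # using integers only, to avoid floating-point edge cases.
        bins[key] = next(b for b, cut in enumerate(cuts)
                         if (2 * i + 1) * 50 <= cut * n)
    return bins


# Ten made-up batters with evenly spaced tendencies.
toy = {f"batter_{i}": 88.0 + i for i in range(10)}
toy_bins = pascal_bins(toy)
bin_sizes = [Counter(toy_bins.values())[b] for b in range(5)]
```

With ten batters, the bins hold 1, 2, 4, 2 and 1 batters, matching the 10/20/40/20/10 shares.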

So, on to the fascinating results. For the weakest batters, the Paredes and Arraez and so on, their xwOBA was .306, while their actual wOBA was .318. That is an enormous bias of 12 wOBA points. The next weakest batters had .339 xwOBA and .345 actual wOBA for a bias of 6 points.

The strongest batters had an xwOBA of .452 and a wOBA of .442, for a 10 point shortfall. The next set of strongest batters had an xwOBA of .411 and a wOBA of .402 for a 9 point shortfall. The middle group were pretty much even.
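The bias here is simply actual wOBA minus xwOBA, expressed in points (thousandths). The short sketch below redoes that arithmetic for the four groups whose numbers are quoted above. The middle group is left out because no figures are given for it. The variable and function names are illustrative only.

```python
# (group, xwOBA, actual wOBA), as quoted in the post.
groups = [
    ("weakest", 0.306, 0.318),
    ("next weakest", 0.339, 0.345),
    ("next strongest", 0.411, 0.402),
    ("strongest", 0.452, 0.442),
]


def bias_points(xwoba, woba):
    """Actual minus expected, in wOBA points (1 point = .001)."""
    return round((woba - xwoba) * 1000)


biases = {name: bias_points(xwoba, woba) for name, xwoba, woba in groups}
for name, points in biases.items():
    print(f"{name:>15}: {points:+d} points")
```

A positive value means the group outperformed its xwOBA, and a negative value is a shortfall. The sign flips from the weak end to the strong end, which is the pattern being described.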

Now, before we get TOO excited, what else could cause this? I have a few thoughts, but let me just leave this here for now.


#1    Pat Senechal 2024/06/06 (Thu) @ 11:55

The simplest explanation seems to be fielder positioning. A powerful hitter will be played deeper, changing the results of plays. But xwOBA should already use fielder positioning to determine odds of catching, and expected wOBA for a play.

So assuming that’s being done correctly, I wonder why the weak hitters get better results than expected and vice versa. Maybe the positioning is suboptimal, or our approach is misjudging things on the whole.

We’d have to look at plays that are undercounted, and see if there’s a pattern.

(Also, as someone who is toying with things on the play side, the fact that I can’t get spray angle in Savant data is kind of annoying.)


#2    Tangotiger 2024/06/07 (Fri) @ 12:10

No, xwOBA does NOT use fielder positioning.

Catch Probability DOES.


#3    Pat Senechal 2024/06/08 (Sat) @ 18:44

Aah. So maybe it is a positioning issue then, as defense will definitely play to a hitter’s power level, but xwOBA will not consider that. Interesting!


#4    Tangotiger 2024/06/08 (Sat) @ 22:12

Right, agreed.  When a power hitter hits it 380 feet to CF, it’s more likely to be caught than when a weak hitter hits it to the exact same place.

Now, we DO make it different for GB, based on the batter’s running speed.

So, we SHOULD make it different for FB based on the batter’s power.

If I could figure out the impact of the Spray tendency of a batter, I’d be happy to include that.  But so far, I don’t see it.


#5    WanderingWinder 2024/06/14 (Fri) @ 10:20

It would be interesting to see if these batters are getting more singles than expected, or if it’s more xbh. The above hypothesis would suggest xbh. Could probably also check based on how hard hit the ball is - of the guys who don’t hit it hard often, is their overperformance in the cases where they do hit it hard, or when they hit it more softly?

My alternative hypothesis is that the softer-hitting guys have higher skill at controlling the location where they’re hitting it, so as to “hit it where they ain’t”. This is a difficult skill with relatively low upside, so most guys are advised to go more for a power approach. But someone like Arraez (admittedly, there aren’t many of those anymore) makes his living off the bloop single.

And if you are getting a lot of Major League AB despite hitting this softly, you’ve got to have some skill going for you, or you’d lose your playing time. (Although to be fair, this is probably often fielding skill).

Part of me also wonders how count (2 strike approach) plays into this kind of question, but that’s obviously an entire other can of worms to open.


#6    Tangotiger 2024/06/14 (Fri) @ 10:40

That is an excellent point.

So, in this case, exit velocity may be a proxy for bat speed (aka “entrance” velocity).

