As those who follow me know, the Cy Young Predictor has worked spectacularly well. Until Corbin Burnes won it in his FIP year. I created a FIP-enhanced version as well, given that we may be in a paradigm shift.
Chris Sale (and Tarik Skubal) are running away with the predictor using the FIP-enhanced version. Skubal is ALSO running away with it with the classic predictor. So, we won't learn anything there.
However, Sale is barely holding back Wheeler with the Classic Predictor. This means that Wheeler has a chance for an upset here... as long as there are enough old-school voters whose behaviour is being captured by the Classic Predictor.
How many of the 30 voters are Classic voters? I don't know, but let's say that there are 20 Classic voters and 10 FIP-enhanced voters. This means that Wheeler is already 0-10, and he needs to perform well enough over his next 5 starts (and/or Sale pitch poorly enough) that Wheeler can get 16 of the 20 Classic voters. Wheeler and Sale are going to get all 1st and 2nd place votes, regardless of mindset.
In order for Wheeler to get 16 of 20 votes, he probably has to lead with the Classic Predictor by about 5 points. Right now, Sale is ahead in the Classic scoring by 1.4 points. So over the next 5 starts (assuming they each get 5 more starts), Wheeler needs to get about 6 or 7 more points than Sale.
How doable is that?
Sale is averaging 13.5 points per 5 starts with a standard deviation of 4.9 points per 5 starts. Wheeler is at 12.25 points and 6.3, respectively. For the difference of two distributions, the standard deviation is the root sum of squares (RSS), so one standard deviation is about 8 points.
With Wheeler 1.4 points behind already, and 1.25 points expected behind over the next 5 starts, he's 2.65 points behind and he needs to be about 5 points ahead, or a swing of almost 8 points.
In other words: one standard deviation. Which will happen about 16% of the time.
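To make that arithmetic easy to replicate, here is a minimal Python sketch. It assumes the two pitchers' point totals are independent and roughly normal, which the post implies by taking the RSS but does not state outright. The 5-point required lead is the rough guess from above, not a measured quantity.

```python
from math import erf, sqrt

def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Figures quoted in the post (Classic Predictor points per 5 starts).
sale_mean, sale_sd = 13.5, 4.9
wheeler_mean, wheeler_sd = 12.25, 6.3
current_deficit = 1.4   # Sale's present lead in Classic points
required_lead = 5.0     # assumed lead Wheeler needs for 16 of 20 votes

# SD of the difference of two independent variables: root sum of squares.
diff_sd = sqrt(sale_sd**2 + wheeler_sd**2)

# Where Wheeler is expected to stand after 5 more starts, plus the lead he needs.
expected_deficit = current_deficit + (sale_mean - wheeler_mean)
swing_needed = expected_deficit + required_lead

z = swing_needed / diff_sd
upset_probability = normal_tail(z)
print(round(diff_sd, 2), round(swing_needed, 2), round(upset_probability, 3))
```

Change required_lead to 4 or 6 and you can see how quickly the answer moves, which is why I wouldn't hang my hat on any single percentage.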
Of course, all this is pretty rough, and if you want to say 10% or 15% or 20% or 25%, that's fine. I can't really give you that precision.
I can tell you the current market is at 82% for Sale and 18% for Wheeler. So, it seems that the market is basically in line with the Predictor.
Fedde is at 3.9 wins above average (WAA), which is the same as the eventual NL Cy Young winner Chris Sale, and 0.1 wins below the eventual AL Cy Young winner Tarik Skubal. Hunter Greene leads at 4.3.
His WAR also follows similarly: 5.5 for Greene, 5.4 Skubal, 5.2 Fedde and Sale.
Fangraphs has Fedde at 2.9 WAR, tied for 24th, with the eventual Cy Young winners as 1-2: Sale 5.7 and Skubal 4.8.
So, what is going on here, how does Reference have Fedde squeezed in between Sale and Skubal?
For that, we have to give thanks to Sean Forman and his team for being ridiculously transparent about it all. Not only do they give you the step by step explainer for WAR, but then they present it component by component so we can understand what is going on.
The first thing to know is that Reference doesn't care about SO and BB and HBP and HR. What they principally care about is Runs Allowed. Not ERA, but RA/9.
Let's compare Fedde to Sale directly. Fedde has 1 more IP than Sale, while giving up 15 more runs (and 15 more ER for that matter). Right off the bat, we start with Fedde 15 runs behind Sale.
So how does he make up that difference? That 1 more IP gives him a 0.5 run advantage.
The first thing that jumps out here is the fielding support: Sale is being charged with 0.11 runs per 9 IP of fielding support, while Fedde is supposedly hurt with -0.41 r/9 of fielding support. That is a gap of 0.52 runs per 9 IP. And since they've each pitched the equivalent of 17 9-inning games, then 17 x .52 = 9 runs.
Is it possible for two pitchers to have a gap of 9 runs in fielding support? I actually track that right here:
Eovaldi and Bassitt have benefitted from 9 runs of fielding support above average when they were on the mound (that last part is key). Stroman and Spence have been hurt by 9 runs. So, comparing these pitchers specifically, we have an 18 run gap, which is huge. Therefore, a 9 run gap between two pitchers, while noteworthy, is reasonable.
The gap between Fedde and Sale however is only 4 runs... and it is FEDDE that has been getting better fielding support.
See, the difference in the two approaches is that on Savant we track the fielding support while that pitcher is on the mound. On Reference, it is a team-level adjustment. So, regardless of how the Braves fielders did with Sale on the mound, what matters is what the Braves fielders did for ALL their pitchers. Then that is proportioned out to each Braves pitcher. This is akin to a great hitting team counting as the same offensive support, even if in games pitched by one pitcher they only scored 3 runs per game and they scored 6 runs for another pitcher. When you make an overall team-level adjustment, ALL the pitchers are treated with the same run support. And that's what's happening here with the fielding support.
Indeed, Fedde has a .263 BABIP, while Sale is at .317. While not dispositive, it certainly argues in favor that Fedde has not been hurt by his fielders, while Sale has been. Which is what the Savant play-by-play evaluation supports (not to THIS extent, but to some extent).
Anyway, let's keep going.
Fedde is treated as pitching more in batter's parks, while Sale is neutral. I won't look into it any further, but let's assume this is accurate. The net impact is about 3% of runs, and so that's about 2 runs.
Fedde also faced tougher competition. Again, let's assume this is accurate. Reference shows an advantage of 0.13 runs per game, which works out to another 2 runs.
Let's add it up:
0.5 runs: IP advantage to Fedde
9 runs: fielding support to Fedde
2 runs: park support to Fedde
2 runs: opponent quality to Fedde
Add it up and it's 13.5 runs. That's close enough to the 15 runs that we've pretty much explained why Reference loves Fedde.
But, that fielding support number is what is carrying all the weight here. As I said, it should go 4 runs the other way. And once you do that, then all those components end up cancelling out down to .... 0 runs.
And we are left with Sale being 15 runs ahead of Fedde.
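The ledger above can be tallied in a few lines of Python. This is only a sketch using the rounded figures from the post. The dictionary keys are my own labels, and the -4 fielding figure is the pitcher-level Savant estimate mentioned earlier, dropped in place of the team-level number.

```python
# Per-9-inning fielding adjustments as quoted from Reference.
sale_fielding_per9 = 0.11
fedde_fielding_per9 = -0.41
games_equivalent = 17  # each has pitched roughly 17 nine-inning games

fielding_gap = (sale_fielding_per9 - fedde_fielding_per9) * games_equivalent

# Runs credited to Fedde relative to Sale, per the team-level approach.
reference_view = {
    "innings": 0.5,
    "fielding": float(round(fielding_gap)),
    "park": 2.0,
    "opponents": 2.0,
}

# Same ledger, with fielding swapped for the pitcher-level estimate,
# which runs about 4 runs in the other direction.
savant_view = dict(reference_view, fielding=-4.0)

reference_total = sum(reference_view.values())
savant_total = sum(savant_view.values())
print(reference_total, savant_total)
```

With the team-level fielding number the adjustments nearly erase the 15-run deficit. With the pitcher-level number they net out to about half a run, which is the "0 runs" conclusion above.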
In order to buy into the Reference WAR, you have to buy into two things:
1. The overall fielding evaluations at the team level are correct
2. The partitioning of these evaluations at the pitcher level is fair
Unfortunately, there is no uncertainty level in these adjustments. And so, you end up with isolated issues like Fedde v Sale every year.
As a result, single-season WAR may be 90% reliable, but you have some one-offs like these that are off-putting.
That said, things like this work themselves out over a period of years, to the point that while being off by 1 or 2 wins here or there might be bothersome at the seasonal level, it ends up not really mattering at the career level.
I should also mention that I love Reference. It is an indispensable site for both me and the industry.
I introduced Leverage Index about twenty years ago. One of the early things I did with it back then, which has not really been followed-thru by anyone, is Re-Leveraging the data. I will explain what that means, using Aaron Judge as the example.
Leverage Index (LI) is simply a measure of how much impact that particular moment has on the game, in real-time. The average moment is 1.000. The highest leveraged moment (think bottom of the 9th, bases loaded, down by a run or two) will be around 10. Naturally, you can have an LI approach 0 in a blowout.
The top ace reliever will average an LI of 2.0, basically saying that the moment they come into the game has twice the impact as a random moment in a game.
AARON JUDGE
Aaron Judge, because he plays for the Yankees, and because games seem to be decided one way or the other earlier than normal, has an LI of only 0.9. That's not a reflection of HIM, but rather his circumstances. Right away, we can see that whatever he does, on average, it will be depressed by 10%. We'll take care of that in a moment.
The most crucial moment that Aaron Judge hit a HR is with an LI of almost 4, which is quite high. He has three more HR with an LI of around 2. Another 13 HR with an LI above 1. Another 13 with an LI above 0.5. Then 21 more HR with an LI of under 0.5. The average LI of when he hits a HR is only 0.78. This is much lower than his average circumstance of an LI of 0.9. When folks say that Judge hits a lot of useless HR, this is what they are actually saying. How many useless HR is he hitting? I'll get back to that in a moment.
He has 31 doubles and triples. The average LI of those is 0.77, pretty much the same as his HR. This is not looking good for Judge. So far, his extra base hits are coming in substantially lower-leverage situations, even accounting for his overall low-LI to begin with.
His singles have an LI of 0.91, which is almost the same as his overall average LI. His unintentional walks and HBP are at 0.84. His outs are also at an LI of 0.91.
Ok, so we have our evidence that Judge is actually not rising to the occasion. How can we measure that?
RE-LEVERAGING
When Judge hit that high-LI HR, the one with the LI of almost 4, that in essence meant that this plate appearance will swing the outcome of the game 4X as much as a random plate appearance. In other words, it's practically as if he had a 4-PA game in one PA. And so when he hit the HR in this situation, it is essentially as if he went 4-4 with 4 HR. And that's what we'll do: we will leverage this single PA and single HR as a 4PA event, counting it as 4 HR.
Of course, when he hits a HR in a 0.01 LI circumstance, that will count as 0.01 PA and 0.01 HR.
When we apply this to all his plate appearances, we end up with 491 plate appearances (instead of his actual 561, sans IBB). In order to properly re-leverage, we will bump up all his leveraged-stats by ~10%, so that we end up with 561 re-leveraged PA.
And when we do that, what happens? His actual 51 HR are re-leveraged as 45.4 HR. In other words, he loses 5.6 HR. And so we can say 5.6 of his HR are useless.
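For those who want to try Re-Leveraging on their own data, here is a minimal Python sketch of the two steps described above: weight each PA by its LI, then scale everything back up to the actual PA count. The function name and the toy events are made up for illustration. They are not Judge's actual plate appearances.

```python
from collections import defaultdict

def releverage(plate_appearances):
    """Weight each PA by its LI, then rescale to the actual PA count.

    plate_appearances: list of (event, leverage_index) tuples.
    Returns a dict of event -> re-leveraged count.
    """
    weighted = defaultdict(float)
    for event, li in plate_appearances:
        weighted[event] += li          # a PA at LI 4 counts as 4 PA

    actual_pa = len(plate_appearances)
    leveraged_pa = sum(weighted.values())
    scale = actual_pa / leveraged_pa   # restores the original PA total
    return {event: count * scale for event, count in weighted.items()}

# Toy data (made up): 2 HR and 3 outs.
sample = [("HR", 4.0), ("HR", 0.01), ("OUT", 1.0), ("OUT", 0.9), ("OUT", 0.59)]
result = releverage(sample)
print(result)
```

In the toy data the batter's 2 actual HR become about 3 re-leveraged HR, because one of them came at an LI of 4. Judge goes the other way: his HR came in lower-leverage spots than his outs did.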
His 31 2B+3B become 27 when re-leveraged. So he loses 4 more extrabase hits. He gains 4 singles, loses 3 walks+HBP. And gets an extra 11 outs.
In the end, his actual wOBA of .497 ends up being re-leveraged as .467. This is a 30 point drop in wOBA, which we can easily convert to runs: divide by 1.2 and multiply by his PA of 561 to give us a loss of 14 runs.
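The wOBA-to-runs conversion is simple enough to write down. A sketch in Python, using the 1.2 divisor from above and assuming the usual 10 runs per win for the last step (that is what turns 14 runs into 1.4 wins):

```python
WOBA_SCALE = 1.2     # divisor used in the post
RUNS_PER_WIN = 10.0  # assumed rule-of-thumb conversion

def woba_to_runs(woba_diff, plate_appearances, scale=WOBA_SCALE):
    """Convert a wOBA difference over a number of PA into runs."""
    return woba_diff / scale * plate_appearances

runs_lost = woba_to_runs(0.497 - 0.467, 561)
wins_lost = runs_lost / RUNS_PER_WIN
print(round(runs_lost, 1), round(wins_lost, 1))
```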
IMPACT
In other words, whatever context-neutral value you may have as his run production, you need to drop it by 14 runs in order to properly account for the game situation. These are 14 runs that Aaron Judge did contribute to, but that the Yankees did not benefit from. So, when you translate his performance into wins, via WAR, you can consider removing 1.4 wins from his total. It all depends on whether you think it matters if his performance impacts a game in real-time or whether the circumstances are irrelevant. If the impact matters, then remove 1.4 wins. If the circumstances are irrelevant, then keep those rose-colored glasses on, I don't want to keep you from enjoying your own reality.
I will say this: the choice usually depends on how it affects your player. Had his re-leveraged performance gained him 1.4 wins, I am sure his legion of fans would accept the premise of Re-Leveraging.
"A walk is as good as a hit" is essentially a true statement when the bases are empty. That has been true for most of baseball history (with the exception being the extra inning placed runner, the XIPR).
In a Markov chain, the presumption is that how you entered a state is immaterial. Being in a state is the information you need in order to know what's to come. So, if you have a runner on 1B with 0 outs, does it matter HOW you got there? If it doesn't, then that's your Markov state: runner on 1B, 0 outs. If it DOES matter, then your Markov state has to include how you got there, so that your actual Markov state is 1B-or-BB-or-HBP-or-Err, and the runner on 1B and 0 outs.
In an award-winning presentation at SABR52, Bailey Hall tackled that issue. The main overall point is that the number of runs that followed the runner on 1B, 0 outs state was essentially the same, regardless of how the state was entered (0.94 to 0.93 runs following a leadoff BB or single respectively). But, Bailey did note that there may be a pitcher-by-pitcher effect, that maybe some pitchers are more affected by one or the other, and maybe even at the inning-level.
Most important to all this is that the question was asked, a solution has been offered, and the presentation is beyond outstanding (with pure baseball themes wherever you look). This is what an #AspiringSaberist should do: ask the question, roll up their sleeves, and show off the work. Because others will be watching, and they will remember any good work.
I am not looking at EVERY bases empty scenario. In the bottom of the 2nd, with Yanks ahead by 3, Aaron Judge was IBB with the bases empty (!). There were two outs, so maybe it's not so bad?
Let's go to the tape!
Aaron Judge is worth about 0.13 runs above average in a random PA, which means he's worth about 0.013 wins per PA. In this particular instance, the leverage index is 0.22, so his leveraged-wins impact is .013 x .22 = .003 wins
The win expectancy for an average batter in this situation is .810. With Judge batting, that goes up a bit by .003 to .813.
An IBB puts the win expectancy at .817.
So, it's still a bad call to IBB Judge in this situation.
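The comparison above is just two numbers, but it helps to see where they come from. A minimal Python sketch, assuming 10 runs per win (which is what takes 0.13 runs to 0.013 wins) and treating all win expectancies as the batting team's:

```python
RUNS_PER_WIN = 10.0  # implied by 0.13 runs -> 0.013 wins

def leveraged_wins(runs_above_average_per_pa, leverage_index):
    """Wins a batter adds over an average batter in one PA at this leverage."""
    return runs_above_average_per_pa / RUNS_PER_WIN * leverage_index

we_average_batter = 0.810  # batting team's win expectancy, average batter up
we_after_ibb = 0.817       # batting team's win expectancy after the IBB

judge_bump = leveraged_wins(0.13, 0.22)
we_pitch_to_judge = we_average_batter + judge_bump

# The defense wants the batting team's win expectancy as low as possible.
walk_is_better = we_after_ibb < we_pitch_to_judge
print(round(we_pitch_to_judge, 3), walk_is_better)
```

Pitching to Judge leaves the batting team at about .813, the walk puts them at .817, so the defense gives away about .004 wins by walking him.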
If instead of being a .475 wOBA batter he was a .630 wOBA batter, then that's the breakeven point to walk him. Bonds at his best was around .540. So, no, you can't walk him here, which is why he doesn't get walked here.
Nothing bothers me more than people trying to report running speed in terms of Miles Per Hour. Actually, one thing bothers me more: trying to report it as some sort of instantaneous speed.
Let's take it one step (no pun intended) at a time. In a 100 m race that typically lasts close to 10 seconds, an Olympian will take some 40-50 steps. For the sake of ease of illustration, we'll say 100 m takes 50 steps, or 2 m per step.
You get maximum acceleration when your foot leaves the ground, while you get maximum deceleration when both feet are in the air. Usain Bolt for example would peak at 13 m / sec and bottom out at 11 m / sec, when he is in the middle of the race. That is a HUGE difference.
So, your window of measurement is critical here. If you take the instantaneous maximum speed, naturally it will be that blink-of-an-eye moment as the foot leaves the ground. That particular speed, on its own, is really irrelevant.
What you do want is a full cycle, a full step, at a minimum. And so, your measurement window will capture both the acceleration and deceleration phase.
Now, one step, 2 m in this illustration, will still give you some sort of measurement error. First, not every runner will take 2 m for one step. Some might be 2.5 m or 2.3 m. So, you are not capturing a full cycle here. It's going to be a mix-and-match of the acceleration-deceleration phase, where, depending on your start/stop, some runners will have a bit more of the accel-phase, while others will have a bit more of the decel-phase.
This is why we report running times based on the 10m split times. A 10 m window will give you 4 to 5 steps. Let's say 10 m is 4.5 steps. That gives you 4 steps in the accel-phase, 4 steps in the decel-phase, and then that leaves half-a-step which will be in the accel or the decel or in-between phase. What kind of uncertainty does this give us?
Let's go back to Bolt: with 4 full steps covering the accel-decel phases, he's running at 12 m / sec. The other half-step is either 11 m / sec at worst or 13 m / sec at best.
The weighted average therefore is 11.9 m/sec at worst and 12.1 m/sec at best. Therefore, our uncertainty range in terms of picking the "perfect 10m window" is 0.1 / 12 or 1%. That's our error range. That's probably what we can accept.
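You can rerun that calculation for any sport or window size. Here is a minimal Python sketch that follows the same step-weighted average as above: full steps run at the mean speed, and the leftover partial step runs at either the slowest or the fastest speed. The function name is mine, and weighting by steps rather than by time is a simplification carried over from the illustration.

```python
def window_speed_range(steps_in_window, mean_speed, min_speed, max_speed):
    """Worst and best average speed over a window that ends mid-step."""
    full_steps = int(steps_in_window)
    partial = steps_in_window - full_steps
    low = (full_steps * mean_speed + partial * min_speed) / steps_in_window
    high = (full_steps * mean_speed + partial * max_speed) / steps_in_window
    return low, high

# Bolt illustration: 4.5 steps per 10 m, cycling between 11 and 13 m/sec.
low, high = window_speed_range(4.5, 12.0, 11.0, 13.0)
relative_error = (high - low) / 2 / 12.0
print(round(low, 1), round(high, 1), round(relative_error, 3))
```

Try it with 1.5 steps in the window instead of 4.5 and the error range balloons, which is the whole argument against tiny measurement windows.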
So, when you look at swimmers or runners or skaters, figure out the accel-decel phase for each step or stroke or cycle. Figure out what the speed is for each accel and decel. And figure out what uncertainty level you can accept. Once you do that, then you can figure out what window your distance and time will be measured against.
And please, report it in terms of seconds and metres (or feet or yards as your sport needs). Don't do MPH or KPH.
Having discussed Presence or Attentiveness plays, as well as Timing plays, we now turn our attention to the third kind of HR Saving plays: Speed.
Setting aside the fence, Catch Probability is largely focused on how much Distance an outfielder has to run (from his starting point) and how much Time the ball is in the air for the outfielder to catch. Distance over Time is Speed. This is how we evaluate outfield defense. We intuitively understand this, even if we don't explicitly say it. That's because we don't have any easy reference points to say how many feet and how much time the play is. Until Statcast.
With Statcast, we know the Opportunity Time, and we know the Opportunity Distance. And so, we know the Opportunity Space.
The wall presents an extra challenge for us. The outfielder sees the fence as an impediment because in these particular HR saving plays, they are about to crash into a wall. This is unlike the Presence and Timing plays where the outfielder won't crash into the wall.
Even within the Speed plays involving the wall, there's a subset of plays as to whether the outfielder has to run-and-jump into the wall, or run-thru the wall. Each presents its own challenge. When it comes to tracking a ball 400 feet away, measuring how high the ball is up the wall is sometimes difficult. Each foot matters a great deal more vertically than it does horizontally. A ball measured 5 feet closer or deeper has a much smaller impact in our evaluation than a ball measured 5 feet higher or lower.
These speed plays are analogous to the 1-run save for a relief pitcher who comes into the game with the bases loaded and 0 outs being much different than a 3-run save with the bases empty.
There is probably no play more at odds between the eye test and the value conclusion than the HR saving play. And it all comes down to distinguishing between the different kinds of HR Saving plays.
The second kind of HR-saving play is based on timing. Similar to the Presence play, the outfielder has plenty of time to camp themselves under the ball. However, the wall is a bit higher, and the ball is a bit higher at the fence clearing point. And so, the fielder will need to jump. And because they need to jump in order to get to the ball, this will be based on timing the jump just right.
How hard is this to do? I don't know. But I would think such a play likely results in an out at least 70% of the time, and maybe as high as 90 or 95% of the time. For purposes of illustration, I'll say 80%. Don't forget that not all fielders are the same height or can jump just as high. So, in timing a play, some fielders may have a larger margin of error than others.
In terms of evaluating the play, it works the same way as we do everything else: we compare to the average. We ALWAYS compare to the average. In everything. This does not mean that being 0 OAA means you have no value. This is probably the single-worst fallacy that is spewed by folks. 0 OAA just means you have AVERAGE value. And average value has... value.
Suppose you have a .500 starting pitcher, whose ERA and FIP and xERA and component ERA and whatever else you want to consider is exactly league average. Let's use the W-L record as representative of their performance. So, this average pitcher might be 14-14, and so is 0 Wins Above Average (WAA). These SP actually are in demand. Even if their WAA is 0.
See, the problem is that we've filtered 14-14, a two-dimensional value, down to the one dimension of 0. This is a problem in presentation. This is why WAR (wins above replacement) took hold as well as it did: it keeps it one-dimensional, but it merges quantity with quality. Such a 0 WAA pitcher would be something like a 2.5 WAR pitcher.
Getting back to our outfielder who made the HR-saving timing play: if they made 4 such plays and mistimed one play, that's 80% out rate, which in this illustration is league average and so is 0 OAA. If they made all 5, they'd be +1 OAA. If they mistimed ALL five, they'd be -4 OAA. On average, league-wide, these outfielders are 0 OAA. That 0 OAA still has some value.
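The bookkeeping in that example is the same for any kind of play: each made play is credited with one minus the league out rate, each missed play is debited the league out rate. A minimal Python sketch, using the illustrative 80% rate from above (not a measured number):

```python
def outs_above_average(outcomes, league_out_rate):
    """Sum of (result - expectation) over plays; True means the out was made."""
    return sum((1.0 if made else 0.0) - league_out_rate for made in outcomes)

league_rate = 0.80  # illustrative out rate for the timing play

four_of_five = outs_above_average([True, True, True, True, False], league_rate)
five_of_five = outs_above_average([True] * 5, league_rate)
none_of_five = outs_above_average([False] * 5, league_rate)
print(round(four_of_five, 2), round(five_of_five, 2), round(none_of_five, 2))
```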
There are three types of HR saving plays. I will go thru each one, then give you the analogies in the infield, and in other parts of baseball.
The first is the easy one: the presence play. The wall is low enough, say under 8 feet. The ball is in the air long enough. And the outfielder is playing deep enough that they can lightly jog to the spot they need to be. All they really have to do at that point is lift their arm. This is a play that the best or worst outfielder is going to make, whether you are Kevin Kiermaier or not.
The infield equivalent is a 4-3, 5-3, or 6-3 play. All that is needed of the 1B is to get to the bag, and wait for the throw. While the 1B is obviously critical in the play (he is after all getting the putout), the skill required is one that does not differentiate itself among the best and worst 1B. Every 1B will be there.
If the 1B is not there, it was probably because he counted on the pitcher to be there, almost always because the 1B is the one who fielded the ball to begin with. This kind of misplay, the pitcher not covering, happens often enough that it is both embarrassing and not newsworthy.
The way we handle the evaluation of the pitcher not covering is to figure out how often the AVERAGE pitcher does not cover. For the sake of illustration, let's say that the average pitcher fails to cover 10% of the time, and so does cover 90% of the time. (I don't know what the actual number is, maybe it's 5% or 1%, I'm using 10% for illustrative purposes only.) So, when a pitcher DOES cover in scenarios where the out is otherwise assured, he would get +.10 outs. When the pitcher does NOT cover, he would get -.90 outs. The average pitcher in this illustration would get some combination of +.10 and -.90 such that the total Outs Above Average (OAA) is exactly 0. A pitcher who ALWAYS covers would get +.10 for each putout, and if they had say 100 putouts, would have +10 OAA... just by being attentive. Again, this is based on the 90% coverage scenario. If the average were 95%, then the fully attentive pitcher would get +.05 OAA times 100 plays, or +5 OAA.
The attentiveness or presence play happens elsewhere on the field. The 3B failing to cover on a steal of 3B, or a pitcher not backing up the catcher on a throw from the outfield. So, you can figure out the OAA of a fielder's attentiveness simply by counting. We don't do this, but we should. That's a gap in the fielding recording.
You can see it elsewhere as well, say the 3-run save. A 3-run save is VERY different from a 1-run save. But, a save is a save in the record books. This is why we don't like saves as a category, because we know a save is NOT a save all the time. A 3-run save is going to be saved say 96% of the time (or whatever the number is). So, a 3-run save is actually going to be worth +.04 "Saves Above Average", while not holding that lead is worth -.96 SAA.
Back to the outfielder: how often will an outfielder fail to camp themselves at the warning track for what would otherwise have been an easy out? Let's say that's 1%. So, if an outfielder in these kinds of HR saving plays simply does their job and holds up their arm without jumping, they will get +.01 OAA. If they don't get there in time or they don't make the out, that's -.99 OAA. On average, this will be 0 OAA league-wide.
Suppose you disagree. Suppose instead you want to give out +.50 OAA on these kinds of plays (high fly, just clearing a low fence). Well, for every 100 plays, you'd give out +.50 OAA 99 times and -.50 OAA 1 time. That's a total of +49 OAA for every 100 such plays. Does that make sense? No. You can't make every outfielder above average.
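Here is that sanity check as a minimal Python sketch: for any crediting scheme, add up what the league as a whole would receive over 100 such plays. The 99% out rate is the illustrative figure from above, and the function name is mine.

```python
def league_total(credit_made, credit_missed, out_rate, plays=100):
    """Total credit handed out league-wide over a number of plays."""
    made = out_rate * plays
    missed = plays - made
    return made * credit_made + missed * credit_missed

# Crediting relative to the average: the league nets out to zero.
balanced = league_total(+0.01, -0.99, out_rate=0.99)

# Crediting +.50 / -.50 regardless of difficulty: everyone is "above average".
inflated = league_total(+0.50, -0.50, out_rate=0.99)
print(round(balanced, 6), round(inflated, 6))
```

Any scheme where the league total is not 0 is not measuring outs above average, whatever else it may be measuring.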
So, this was a lot longer than I thought I'd write, so I'll break this up into multiple parts. See you tonight for part 2.
Having thoroughly refuted several times, both by myself and other independent researchers, that the spray direction is the missing ingredient in the x-stats, the question remains: what are the missing ingredients?
Someone brought up the case of Isaac Paredes, who is a heavy pull batter. However, there is another attribute of Paredes: he does not hit the ball hard. Now, you may think that the x-stats ALREADY account for the exit velocity. After all, the two main ingredients are launch angle and speed. We account for the launch speed. Don't we? Well, once again, I must talk about the difference between modeling a PLAY and modeling a PLAYER. The x-stats, traditionally, evaluate PLAYS. But, since we are interested in PLAYERS, we limit the variables so that we focus on the PLAYERS. In other words, yes, we evaluate each play, one at a time. But instead of considering AS MANY variables as we can that went into that play, we consider AS FEW variables as we can: only those over which the player themselves has a strong influence.
Launch speed is an easy one to include on an event by event level. Launch angle as well (the easiest one that separates groundballs from home runs). The Spray Direction is one that is needed on the play, but is not needed for the player (as we've learned many times). So, we ignore that one. We include the Seasonal Sprint Speed of the runner, as that's important on groundballs.
Which gets us back to Launch Speed. Remember last night, I created a profile of each batter, to establish their Spray Tendency? Well, what if we do the same thing, but with Launch Speed? That is, let's create a profile of a batter based on how hard they hit the ball.
Now, you may think: we ALREADY account for this on a play level right? Yes, we do. But, what if a 100mph batted ball by Isaac Paredes is different from a 100mph batted ball by Giancarlo Stanton, even when both are hit at 20 degrees of launch? In other words, we want Launch Speed to pull double-duty: we want to know the launch speed on that play, but we also want to know the batter's seasonal launch speed.
So, do we see a bias based on a batter's seasonal launch speed? Yes. Yes, we do.
Here's what I did, so you can feel free to replicate. I'm focused on the 2016-2019 years as one season and the 2020-present (thru June 5, 2024) years as a second season. I do this on the idea that a player has a general speed tendency that spans multiple years. This lets me increase my sample size for each season. I also make sure that a batter that hits on both sides is considered two distinct players.
The speed tendency follows the Escape Velocity method for Adjusted speed: greatest(88, h_launch_speed). For every batted ball, I take the greater of the launch speed and 88. And I average that.
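If you are not working in SQL, the same calculation is a couple of lines of Python. A minimal sketch: floor every launch speed at 88 mph, then average. The sample speeds are made up.

```python
SPEED_FLOOR = 88.0  # mph floor, same as greatest(88, h_launch_speed)

def speed_tendency(launch_speeds, floor=SPEED_FLOOR):
    """Average of launch speeds after flooring each one at 88 mph."""
    adjusted = [max(floor, speed) for speed in launch_speeds]
    return sum(adjusted) / len(adjusted)

sample_speeds = [72.0, 85.5, 95.0, 101.0, 110.0]  # made-up batted balls
tendency = speed_tendency(sample_speeds)
print(round(tendency, 1))
```

The two weakly hit balls both count as 88, so the batter is not penalized further for how weak his weak contact is.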
Anyway, I use the same Pascal method of binning I did last night, the 10/20/40/20/10 split.
So, on to the fascinating results. For the weakest batters, the Paredes and Arraez and so on, their xwOBA was .306, while their actual wOBA was .318. That is an enormous bias of 12 wOBA points. The next weakest batters had .339 xwOBA and .345 actual wOBA for a bias of 6 points.
The strongest batters had an xwOBA of .452 and a wOBA of .442, for a 10 point shortfall. The next set of strongest batters had an xwOBA of .411 and a wOBA of .402 for a 9 point shortfall. The middle group were pretty much even.
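Laid out as a quick Python sketch, with the bias expressed as actual minus expected, in wOBA points. The group labels are mine, and the middle group is left out since I only described it as pretty much even.

```python
# (group, xwOBA, actual wOBA) as quoted above.
groups = [
    ("weakest", 0.306, 0.318),
    ("next weakest", 0.339, 0.345),
    ("next strongest", 0.411, 0.402),
    ("strongest", 0.452, 0.442),
]

# Positive means the batters beat their xwOBA, negative means a shortfall.
bias_points = {
    name: round((actual - expected) * 1000)
    for name, expected, actual in groups
}
print(bias_points)
```

The sign flips as you go from the weakest to the strongest hitters, which is what a real bias tied to speed tendency would look like.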
Now, before we get TOO excited, what else could cause this? I have a few thoughts, but let me just leave this here for now.
I have to write one of these blog posts every year because folks are so disbelieving. And it's not just my research. I MUCH prefer when others do this research so that there's no conflict of interest.
I'll lay out my method, and you can feel free to reproduce it however you can. There's some data you may not necessarily have, but you'd be able to estimate it.
Anyway, here we go. Again.
First the data: 2016 to present (thru Jun 4, 2024), regular season and playoffs. Only hit-into-play. We want actual wOBA and xwOBA. Minimum 500 batted balls for each batter over the entire time period. This gives me 593 batters. Hopefully you get something pretty close to that.
Next, we create a spray tendency for each batter. In the past, I would just take all their batted balls to create their spray tendency. But, inspired by the point Ben Clemens recently made (who was studying a similar issue), this time I've focused on batted balls with a launch angle of 4 to 36 degrees, for balls hit 200+ feet. This is basically line drives and flyballs, and the type of batted balls that folks talk about when they talk about pull hitters and spray hitters.
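For replication purposes, here is a minimal Python sketch of that filter. It assumes the spray tendency is the plain average of the spray angle over the qualifying batted balls, with negative meaning pull. The field names (batter, launch_angle, distance, spray_angle) are hypothetical, so map them to whatever your data source uses.

```python
from collections import defaultdict

def spray_tendencies(batted_balls):
    """Average spray angle per batter over line drives and flyballs.

    Keeps only balls with a launch angle of 4 to 36 degrees hit 200+ feet.
    """
    totals = defaultdict(lambda: [0.0, 0])  # batter -> [angle sum, count]
    for ball in batted_balls:
        if 4 <= ball["launch_angle"] <= 36 and ball["distance"] >= 200:
            entry = totals[ball["batter"]]
            entry[0] += ball["spray_angle"]
            entry[1] += 1
    return {batter: s / n for batter, (s, n) in totals.items()}

# Made-up rows: only the first two qualify.
sample = [
    {"batter": "A", "launch_angle": 20, "distance": 300, "spray_angle": -10.0},
    {"batter": "A", "launch_angle": 25, "distance": 250, "spray_angle": -6.0},
    {"batter": "A", "launch_angle": -5, "distance": 50, "spray_angle": 30.0},
    {"batter": "B", "launch_angle": 10, "distance": 150, "spray_angle": 5.0},
]
tendencies = spray_tendencies(sample)
print(tendencies)
```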
But, just for completeness, I'll also do it my usual way of looking at all batted balls to establish the spray tendency. I'll do that at the end. For now, we'll follow the Clemens-inspired approach.
For the 593 batters, I take the 10% most extreme pull hitters. There's 59 of them. Their spray tendency is a pull of 9.5 degrees. Then I take the 20% next most extreme pull hitters. That's 119 batters with a spray tendency of -7.0 degrees.
I take the 10% most extreme spray hitters. There's also 59 of them, with a spray tendency of +1.6 degrees. The next 20% are at -1.6 degrees. Finally, the middle 40%, 237 batters, have a spray tendency of -4.3 degrees.
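The 10/20/40/20/10 split can be sketched in Python as below. The rounding rule (round the cumulative share times the number of batters) is my assumption. It happens to reproduce the 59/119/237/119/59 group sizes for 593 batters, but your own rounding may land a batter or two differently.

```python
def pascal_bins(values, shares=(0.10, 0.20, 0.40, 0.20, 0.10)):
    """Sort values and split them into consecutive bins of the given shares."""
    ordered = sorted(values)
    n = len(ordered)
    bins, start, cumulative = [], 0, 0.0
    for i, share in enumerate(shares):
        cumulative += share
        # The last bin always runs to the end, whatever the rounding did.
        end = n if i == len(shares) - 1 else round(cumulative * n)
        bins.append(ordered[start:end])
        start = end
    return bins

# Stand-in for 593 spray tendencies, most pull (most negative) first.
sizes = [len(b) for b in pascal_bins(range(593))]
print(sizes)
```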
Next, for each group, we look at their actual wOBA and their xwOBA. Now remember, the xwOBA does NOT look at a batter's spray direction, whether at a single play level, or at a player tendency level. It is simply ignored. So, if we find that there is a difference between actual wOBA and xwOBA then this is evidence that the spray variable needs to be added to the model. If they are a match, then we don't need it (or at least, we haven't found any evidence with this method that it is needed).
What do we find with the most extreme pull hitters, those at -9.5 degrees of spray? Actual wOBA of .386, xwOBA of .385. How about going the other way, the most extreme spray batters, those at +1.6 degrees of spray? They have a .362 actual wOBA... and .362 xwOBA. Identical.
How about the other three bins? Bin 2 is .379 actual and .379 xwOBA. Identical. Middle bin is .370 actual and .370 xwOBA. Identical. Bin 4 is .363 actual and .365 xwOBA.
So... yeah... we don't need to consider the spray tendency of the batter to model their effectiveness.
***
I said I would rerun everything doing it my usual way of using all batted balls to establish spray tendency. The results are almost as boring, but I'll lay it out, from bin 1 (most pull tendency) to bin 5 (most spray tendency). Actual wOBA first, xwOBA second, difference third. Ready?
-12 degrees, .382, .385, -.002 (rounding)
-10 degrees, .372, .371, +.001
-8 degrees, .378, .379, -.001
-6 degrees, .364, .363, +.001
-3 degrees, .348, .345, +.003
So... yeah... as it turns out, it doesn't really matter how I establish the spray tendency. We just get similar conclusions.
***
Now... How? HOW? HOW is it possible to ignore spray tendency and still be able to get the player wOBA to match to their xwOBA? Simply put: opposing teams know the pull/spray tendency of the batters and position their fielders accordingly. How about the HR? Well, that's true, but if you miss the HR, guess what, there's an outfielder who was positioned close by to turn that almost-HR into an out.
The reality that we found in 2016, when we had so very little data, such limited data, that allowed us to ignore the spray variable is being upheld with tons more data. And this conclusion has been reinforced by other researchers who also found the same thing.
Long story short: while you need the spray angle to describe the PLAY, you do NOT need the spray angle to describe the (effectiveness of the) player. You can use the spray angle to show the PROFILE of the player, but it won't alter our opinion as to their overall performance.
I'll see you again in six months, where I'll do similar research in different ways. Again.
The above shows three different angles, all related to this HR by Adolis Garcia.
Let's start with the blue line on top. That is what we call the Vertical Bat Angle. We compare the position of the head of the bat to the position of the handle of the bat. If the head is above the handle, then it has a positive vertical angle. The head below the handle has a negative vertical angle. Naturally, when the head and handle are both parallel to the ground, then the vertical angle is zero. If you watch the video, you can see that the bat is parallel to the ground a little bit before contact and a little bit after contact.
The green line in the middle is the Vertical Attack Angle. Whereas the blue line measures bat position in 2D space, the green line measures bat velocity in 2D. In other words, the green line measures the direction of the bat. You can see that at the point of impact, the bat has its velocity moving in an upward direction.
Finally, the orange line is the Vertical Path Angle, the Swing Plane. Once the bat approaches the intercept point, the bat is essentially moving in a single plane. If you can imagine a (tilted by 30 degrees) sheet of paper, the bat is passing through that sheet of paper, and it does so from about 30 msec prior to the Intercept Point, and onwards beyond the intercept point.
I know that all this is not very obvious. The analogy I make is to consider a golf swing. The Vertical Path Angle, the Swing Plane, approaches closer to 90 degrees (maybe it's 70, I don't know, someone out there can tell us). The Vertical Attack Angle is similar to a baseball swing, eventually going to a small positive angle. And naturally the Vertical Golf Angle starts off at a huge positive angle on the backswing, down to a huge negative angle (approaching that 70 or 90 degree angle of that Swing Plane), and continuing back on a huge positive angle.
Anyway, I hope some of that made sense. We'll make it make more sense next time.
I took all of Stanton's swings with a launch speed of 95+ and determined when he reached his maximum acceleration. In his case, he reached max-accel from 7 frames prior to contact back to 13 frames prior to contact. At 300 frames per second, that translates to 23 to 43 msec prior to contact. But I'll just talk about frames here. You can see that basically all his swings are the same, and they are just either early or late. If you shift each curve they'd basically all overlap. (click to embiggen)
Here is what those swings look like based on the swing speed. For the red swing, the early swing where he reaches max accel early, his swing speed reaches its max at 6 frames prior to contact (20 msec), and essentially stays there for the duration. As to whether any of this is good or bad, well, we'd have to see his performance for each of these seven groups of swings. At this very moment, I have no idea. But, I'll do that in the comments later today.
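For anyone who prefers msec to frames, the conversion is just the frame count divided by the 300 frames per second capture rate. A minimal sketch (the function name is mine, nothing official):

```python
FRAMES_PER_SECOND = 300  # capture rate stated above

def frames_to_msec(frames, fps=FRAMES_PER_SECOND):
    """Convert a number of frames into elapsed milliseconds."""
    return frames * 1000.0 / fps

# Frame counts mentioned in the text: the max-accel window (7 to 13 frames
# prior to contact) and the early swing's max speed point (6 frames prior).
for frames in (6, 7, 13):
    print(f"{frames} frames prior = {frames_to_msec(frames):.1f} msec prior")
```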
I'll show you two charts, both very similar. (click to embiggen)
The first is looking only at batted balls that were hit 400+ feet. As the average HR is about 400 feet, we're essentially treating these as the perfect hits.
The second takes, for each batter, the swings that produced the top 50% of his exit velocities. This ensures proportionate representation (as opposed to the above which is biased toward batters who can hit it deep).
Either way, this shows that the acceleration is maximized from 65 msec prior to the impact time to 25 msec prior. The swing speed at these two points is about 30-35 mph at the start to about 65 mph at the end.
This is sort of the reason in my prior article that I was using 0 to 30 for the initial acceleration and 30 to 60 for the main acceleration. While we may be tempted to change that to something like 0-35 and 35-65, this will throw out all of those swings where the batter did not even reach 65 mph. Take a look at Arraez for example. Even limiting swings to those that reached 60 mph will remove a decent portion of his swings. At 65 mph we'll be removing most of his swings.
In any case, we can report all the acceleration values. The 0-30mph, 30-60mph, as well as the point for each swing where acceleration was maximized.
Here is how Arraez and Giancarlo Stanton look in terms of their acceleration. Stanton reaches his peak acceleration much earlier than Arraez (about 10 msec). Though they both start the ramp up at around the same level of speed (29 mph for Arraez, 36 mph for Stanton), Stanton gets to a much higher level (70 mph) than Arraez (57 mph) in the same amount of elapsed time (40 msec).
If you are trying to imagine what 40 msec represents: a typical fastball will reach home plate in about 400 msec. So, the acceleration phase of the swing is about one-tenth the time it takes for the ball to reach home plate from the pitcher's hand.
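To put the Arraez and Stanton ramp-ups on a common scale, here is a rough sketch that turns the speed gains quoted above (29 to 57 mph and 36 to 70 mph, each over 40 msec) into an average acceleration. The conversion to g's is my own illustration, and it treats the 40 msec as one constant-acceleration stretch, which the curves say is only approximately true.

```python
MPH_TO_MPS = 0.44704   # metres per second in one mile per hour
G = 9.80665            # standard gravity, m/s^2

def average_accel_g(start_mph, end_mph, elapsed_msec):
    """Average acceleration, in g's, over the given speed change and time."""
    delta_mps = (end_mph - start_mph) * MPH_TO_MPS
    return delta_mps / (elapsed_msec / 1000.0) / G

arraez_g = average_accel_g(29, 57, 40)
stanton_g = average_accel_g(36, 70, 40)
print(f"Arraez: about {arraez_g:.0f} g, Stanton: about {stanton_g:.0f} g")
```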
Suppose we decide that the start time of a swing is when the pitcher releases the ball. That seems a natural point to choose. A pitch however can be thrown from 105 mph all the way down to 35 mph. Even if you take a less exaggerated range, we're still talking about a pitch that will reach the front of home plate between 375 msec and 525 msec. That's a range of 150 msec, which basically (almost) allows a batter to check his swing and restart his swing! Clearly, only focusing on a common distance (~53 feet of release) is not going to work.
How about we focus on a common time, say 250 msec from plate crossing? So, regardless of the speed of the pitch, we're saying the batter's swing is dependent on the same amount of time. However, choosing the plate crossing presumes that all batters are trying to make contact at the plate crossing. But batters stand at different parts of the box, facing different sided pitchers with pitches thrown at different trajectories that will reach home plate inside or outside or up or down. Not to mention the ball-strike count affects expectations as well.
What if we take the actual point of intercept (meaning the actual impact point on contacted balls or the point where the ball and bat are closest for whiffs)? This presumes the batter knows what the actual point of intercept will end up being. Working backwards from a known quantity comes with its own issues.
Finally, what is the right way? Well, the closest we can come is try to determine an expected intercept point. Using the identity of the pitcher, the identity of the batter, the ball-strike count, and the batter's location in the batter's box, we can try to predict where the intercept point will end up being for any particular pitch.
Can we come close to the right way in a simple manner? On an aggregate level, the pitcher and ball-strike count will not matter much for any particular batter. So, we can establish a batter's intercept point by looking at all his swings, as LHH or RHH, over the course of the season.
So, in terms of which method to use, I would suggest the most preferred to least preferred is this:
Predict the intercept point by using the variables in play for that particular pitch
Presume the intercept point by using that batter's seasonal average
Treat the actual intercept point on that pitch as the presumed intercept point
Use the front of plate as the intercept point
Use the pitcher release time plus some constant as the intercept point
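As a minimal sketch of the second option, the seasonal average, here is one way to do it. The field names (`batter`, `stance`, `intercept_y`) and the sample swings are hypothetical, made up purely to show the grouping. When a batter has no seasonal figure, the sketch falls back to the front of the plate (the fourth option), which I'm treating as zero.

```python
from collections import defaultdict

def seasonal_intercepts(swings):
    """Average intercept point per (batter, stance).

    `swings` is an iterable of dicts with hypothetical keys 'batter',
    'stance' ('L' or 'R') and 'intercept_y' (feet in front of the plate
    at contact, or at closest approach on a whiff).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for s in swings:
        key = (s["batter"], s["stance"])
        totals[key][0] += s["intercept_y"]
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}

def presumed_intercept(swing, seasonal, front_of_plate=0.0):
    """Seasonal average if we have one, otherwise the front of the plate."""
    return seasonal.get((swing["batter"], swing["stance"]), front_of_plate)

# Made-up swings, for illustration only.
swings = [
    {"batter": "Batter A", "stance": "R", "intercept_y": 1.0},
    {"batter": "Batter A", "stance": "R", "intercept_y": 2.0},
    {"batter": "Batter B", "stance": "L", "intercept_y": -0.5},
]
seasonal = seasonal_intercepts(swings)
print(presumed_intercept({"batter": "Batter A", "stance": "R"}, seasonal))
print(presumed_intercept({"batter": "Batter C", "stance": "R"}, seasonal))
```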
Acceleration is a tough nut to crack. Not so much in terms of finding the acceleration curve, which you can see here as an example (and its derivative, the confusingly named jerk). But rather, in how to present acceleration.
For a pitch, it's quite straightforward: once the ball is released, the ball is, essentially, traveling at a constant deceleration. Not exactly but close enough. That allows us to create metrics quite easily from it, like the break of a pitch.
A bat is different, because it is constantly increasing in speed, and doing so at a non-constant rate. In other words, the acceleration is constantly going up... until it invariably starts to be reduced (though still positive). The tangent of the acceleration, the jerk, is somewhat constant. But by that point, most people are not going to understand what that even means.
So, we turn to runners. We can think of out-of-the-block, we can think of burst, we can think of cruising speed. We can also turn our attention to cars, going 0 to 60 in X.Y number of seconds.
Here's ONE approach. Suppose the first critical speed point is 30 mph. Maybe it's 40 mph, since that's closer to the max speed of a successful checked swing. But, let's go with 30 mph for now. Ideally, a batter is going to take as long as possible before they even get to 30 mph and rely on their acceleration. Or maybe, ideally a batter is going to want to get to 30 mph as fast as possible because they don't have the acceleration. Or, well, who knows right now. Let's take a look at the data. (Click to embiggen)
I always look for Giancarlo Stanton first, so I can understand what I am seeing. And there he is, in 2023 and 2024, with Jason Heyward. This is what I call the Lambo Swing: they immediately ramp up to 30 mph as fast as they can, and they go 30 to 60 as fast as anyone. In the case of Stanton, he just keeps going to 80+.
The next name I look for is Luis Arraez. And there he is, a Slow and Steady Swing: takes his time to get to 30 mph, and then a slow acceleration to 60 mph. And that's pretty close to his final speed.
In the top left quadrant is the Kokomo Swing: they get to 30 mph as fast as possible, but are pretty slow to 60 mph. Altuve, JD Martinez, Arenado are all there. Maybe it's batters that are just getting old, and so, are relying on their experience to start their swing as early as possible, because they don't have the acceleration to sustain it? We'll see.
Finally, the bottom right quadrant is the Pants on Fire Swing. They start their swing as late as possible and then explode from 30 mph to 60 mph as quickly as possible. Jo Adell and Corbin Carroll are representative here. So, this is probably what I think batters are after: taking as much time as possible to size up the pitch, then relying on their explosiveness to get to 60 mph as fast as they can.
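The four swing types boil down to two yes/no questions: did the batter get to 30 mph quickly, and did he go from 30 to 60 quickly? Here is a minimal sketch of that classification. The cutoffs are assumptions (something like the league median would be a natural choice), and the times in the example are invented.

```python
def swing_type(time_to_30, time_30_to_60, cutoff_to_30, cutoff_30_to_60):
    """Label a swing by its two elapsed times (msec) against two cutoffs.

    The cutoffs are hypothetical; the text does not specify where the
    quadrant boundaries sit.
    """
    quick_to_30 = time_to_30 < cutoff_to_30
    quick_to_60 = time_30_to_60 < cutoff_30_to_60
    if quick_to_30 and quick_to_60:
        return "Lambo"
    if quick_to_30:
        return "Kokomo"
    if quick_to_60:
        return "Pants on Fire"
    return "Slow and Steady"

# Invented times, with both cutoffs set at 40 msec for illustration.
print(swing_type(30, 30, 40, 40))
print(swing_type(50, 50, 40, 40))
```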
Are there other ways to describe acceleration? Sure. Instead of measuring elapsed time between two fixed speeds, we can instead measure change in speed between two fixed timestamps. For example, maybe we look for the change in speed from 70 msec prior to the intercept point to 40 msec prior.
Or, instead of two fixed points in time, maybe it's any 30 msec window where we can find the maximum increase in speed.
We'll try different methods to see what we can learn.
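Here is a minimal sketch of those two alternatives: the speed gain between two fixed timestamps, and the best 30 msec window anywhere in the swing. The speed trace is invented and sampled every 10 msec just to keep the example small; real tracking data would be sampled much more finely.

```python
def speed_gain_fixed(trace, start_ms=70, end_ms=40):
    """Speed gained (mph) between two fixed times before the intercept point.

    `trace` maps msec-before-intercept to bat speed in mph.
    """
    return trace[end_ms] - trace[start_ms]

def max_gain_window(trace, window_ms=30):
    """Return (gain, start_ms) for the window with the largest speed gain."""
    best = None
    for start in sorted(trace, reverse=True):
        end = start - window_ms
        if end in trace:
            gain = trace[end] - trace[start]
            if best is None or gain > best[0]:
                best = (gain, start)
    return best

# Invented trace: msec before intercept -> bat speed (mph).
trace = {100: 10, 90: 14, 80: 20, 70: 28, 60: 38, 50: 49,
         40: 58, 30: 64, 20: 67, 10: 68, 0: 68}

fixed_gain = speed_gain_fixed(trace)
best_gain, best_start = max_gain_window(trace)
print(f"70 to 40 msec prior: +{fixed_gain} mph")
print(f"best 30 msec window starts {best_start} msec prior: +{best_gain} mph")
```

In this made-up trace the two methods happen to agree. For a batter who is early or late, the best window would slide away from the fixed 70-to-40 stretch, and that difference is exactly what the second method is meant to pick up.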
According to Baseball Savant, Salvador Perez has been minus 89 runs in Catcher Framing since 2016. Fangraphs has him at a similar -80 runs. Baseball Prospectus at -82 runs. DRS has a more compressed scale, still with Salvador as 2nd worst at -46 runs (compared to the low of -48 runs).
So, it is clear, Salvador Perez is a disaster at Pitch Presentation. And his other skills, his throwing (+17 runs) and blocking (+1 run) just can't compensate for that. Overall, he's a big net negative at -72 runs. But, is that all there is to being a catcher? Surely there's more to it. The calling of the pitches, the confidence the pitcher has.
See, when we look at Pitch Presentation, that's basically a segment of the pitch. Everything else that goes into it is not considered. Maybe he is so good at everything else about being a catcher that it not only overcomes such a huge deficit, but he may in fact even be a net positive. Is that possible?
Well, let me give you a number to blow your mind. Since 2016, the Royals with Salvador Perez have given up 3079 runs while making 17,369 outs. That is a rate of 4.79 runs allowed per 27 outs, over the equivalent of 643 9-inning games (or just about exactly 4.0 full 162 game seasons). The Royals without Salvador as catcher have played the equivalent of 3.6 full seasons, while allowing 5.28 runs per 27 outs.
In other words, the Royals, with Salvador, have given up nearly 0.5 fewer runs per 9 innings. And since we just said he played the equivalent of 643 9-inning games, that's 316 fewer runs allowed with Salvador. This stands in STARK contrast to the 72 MORE runs that Salvador allows when we consider Framing, Throwing, and Blocking. We have a 388 run gap here to bridge.
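The arithmetic behind those figures is easy to reproduce. This is a minimal sketch using only the numbers quoted above; since the 5.28 is already rounded, the result lands within a run or two of the 316, not exactly on it.

```python
def runs_per_27_outs(runs, outs):
    """Runs allowed per 27 outs (one 9-inning game)."""
    return 27.0 * runs / outs

outs_with = 17369
runs_with = 3079

games_with = outs_with / 27.0             # equivalent 9-inning games
seasons_with = games_with / 162.0         # equivalent 162-game seasons
rate_with = runs_per_27_outs(runs_with, outs_with)
rate_without = 5.28                       # quoted above, already rounded

fewer_runs = (rate_without - rate_with) * games_with
gap_to_bridge = fewer_runs + 72           # add back the -72 from Framing, Throwing, Blocking

print(f"{rate_with:.2f} runs per 27 outs over {games_with:.0f} games "
      f"({seasons_with:.1f} seasons)")
print(f"about {fewer_runs:.0f} fewer runs allowed, gap of about {gap_to_bridge:.0f}")
```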
Now, you may be saying: is Salvador catching with the better pitchers maybe? Well, that's the question I ask. As you know, I pioneered the WOWY method (With or Without You), which really at its heart is a simplified mixed-effects model. WOWY has the advantage of being completely transparent, easy to explain, and with a great theme song.
Let's start with Danny Duffy. With Duffy and Salvador, the Royals gave up 152 runs on 1222 outs. With Duffy and without Salvador, the Royals gave up 178 runs on 965 outs. So, that's more runs allowed and fewer outs. This is a feather for Salvador. Pro-rating the 178 runs on 965 outs to the 1222 outs that Salvador caught, that gives us a weighted runs allowed of 225. As you can see, Duffy loves Salvador: he gave up 152 runs instead of the expected 225 runs, or 73 fewer runs.
However, repeating this process for the next most popular pitcher, Ian Kennedy, shows that he gave up 27 more runs with Salvador than without Salvador. Brad Keller also gave up more, at 13 more runs. As did Brady Singer at 46 more runs. And Jakob Junis at 20 more runs.
The Duffy pitchers, those that gave up fewer runs with Salvador, gave up a total of 721 fewer runs. The Singer/Kennedy pitchers, those that gave up more runs with Salvador than without, gave up 417 more runs. Add it up, and Salvador still allowed 305 fewer runs. This is after controlling for the quality of pitchers.
Remember, we had him at 316 fewer runs allowed without any controls at all. With this level of control, the identity of the pitcher, all we did was reduce that to 305 fewer runs allowed.
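The pro-rating step used for each pitcher can be sketched as below, using the Danny Duffy figures quoted earlier as the worked case. The function name is mine. Summing the result across every pitcher is what produces the overall WOWY total.

```python
def wowy_runs_saved(runs_with, outs_with, runs_without, outs_without):
    """Runs saved with the catcher, relative to the same pitcher without him.

    The 'without' runs are pro-rated to the number of outs recorded
    'with', so both sides are compared over the same workload.
    """
    expected_runs = runs_without * outs_with / outs_without
    return expected_runs - runs_with

# Danny Duffy: 152 runs on 1222 outs with Salvador,
# 178 runs on 965 outs without him.
duffy_expected = 178 * 1222 / 965
duffy_saved = wowy_runs_saved(152, 1222, 178, 965)
print(f"expected {duffy_expected:.0f} runs, saved about {duffy_saved:.0f}")
```

A positive number is a feather for the catcher, a negative number means the pitcher did better without him.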
Can we do more here? Yeah, we can look at it year by year. Maybe we can focus only on strikeouts and walks, or at least separate the K, BB numbers from the other numbers. We can do a lot.
But, the main point here is to say that Pitch Presentation, as real as it is, and as large of an effect as it has, still may pale in comparison to everything else that a catcher does beyond Throwing and Blocking. There's Game Calling. And maybe even just an overarching skill that we can call Presence if we want.
Whatever it is, it's important that focusing on a very specific skill in a very myopic way doesn't keep us from looking at the overall impact of the catcher.
As you know, I have a win expectancy chart using the inning, score differential, runners on base, and outs. That's posted online, here and here, as well as in The Book.
Did I ever tell you I have a win expectancy chart that ALSO includes the ball-strike count? I might have, I don't remember. I rarely if ever use it. I think it's time to break it out for the Ron Washington called, batter not-executed bases loaded squeeze play. Wash noted it wasn't that hard a play to make, but it sure seemed incredibly hard.
Anyway, on to the data. The scenario is this. It's the bottom of the 8th, 1 out, bases loaded, Angels down by 1, with a 1-1 count. The win expectancy is .539 with a 1-1 count (it was .549 at the start of the PA).
Now, let's go thru some possibilities.
First, what actually happened: runner is out, batter gets a strike, but both runners advance. In effect, the runner on 1B was put out. The win expectancy plummeted to .284.
Next, what did Wash hope to happen? Runner scores to tie it up, other runners advance, the batter is put out. In that case, the win expectancy jumps to .611.
There are of course other possibilities. Batter could be safe (win expectancy goes to .774), there could have been a double-play, as we nearly saw (win expectancy goes to .158). The batter could have bunted foul (win expectancy goes to .501). I won't go thru all those scenarios, unless someone really wants to know.
Anyway, so, let's recap:
what could have happened was: .611 - .539 = +.072 wins
what did happen was: .284 - .539 = -.255 wins
Well, that makes the breakeven point 78% under these two possibilities. Maybe it goes down to 70-75% if we consider all the ways it could have turned out. If Wash thinks that the batter had a more than 3/4 chance of pulling this off, then it's a good call. If the batter had less than a 3/4 chance of pulling this off, then it's a bad call. Notwithstanding the half-dozen other ways this could have played out.
Also note: I did not consider the lefty/lefty/sinkerballer note that Wash suggested, nor any quality of batter. There are of course many considerations, which is why this is a starting point to the discussion.
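For those who want to see where the 78% comes from: with only two outcomes, the breakeven success rate is the size of the loss divided by the sum of the gain and the loss. A minimal sketch using the three win expectancy values from above (the function name is just for illustration):

```python
def breakeven_probability(we_now, we_success, we_failure):
    """Success rate at which attempting the play neither gains nor loses wins.

    Assumes only two outcomes: the play works or it fails.
    """
    gain = we_success - we_now    # +.072 in this case
    loss = we_now - we_failure    # .255 in this case
    return loss / (gain + loss)

p = breakeven_probability(we_now=0.539, we_success=0.611, we_failure=0.284)
print(f"breakeven success rate: {p:.0%}")
```

Adding the other outcomes (batter safe, double play, foul bunt) would mean weighting each win expectancy by its chance of happening, which is why the real breakeven could drift down toward the 70-75% range.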