I am not looking at EVERY bases empty scenario. In the bottom of the 2nd, with Yanks ahead by 3, Aaron Judge was IBB with the bases empty (!). There were two outs, so maybe it's not so bad?
Let's go to the tape!
Aaron Judge is worth about 0.13 runs above average in a random PA, which means he's worth about 0.013 wins per PA. In this particular instance, the leverage index is 0.22, so his leveraged-wins impact is .013 x .22 = .003 wins.
The win expectancy for an average batter in this situation is .810. With Judge batting, that goes up a bit by .003 to .813.
An IBB puts the win expectancy at .817.
So, it's still a bad call to IBB Judge in this situation.
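If you want to retrace that arithmetic yourself, here is a minimal sketch. The 10-runs-per-win conversion is my assumption for how 0.13 runs becomes 0.013 wins, and the variable names are made up for illustration; the other figures are the ones quoted above.

```python
# Retrace the Judge IBB arithmetic using the figures quoted in the post.
RUNS_PER_WIN = 10.0          # assumption: the usual rough runs-to-wins conversion

runs_above_avg_per_pa = 0.13
leverage_index = 0.22
we_average_batter = 0.810    # batting team's win expectancy, average batter up
we_after_ibb = 0.817         # batting team's win expectancy after the walk

wins_per_pa = runs_above_avg_per_pa / RUNS_PER_WIN
leveraged_wins = wins_per_pa * leverage_index
we_with_judge = we_average_batter + leveraged_wins

# The walk is a bad call for the defense if it leaves the batting team
# better off than simply pitching to Judge would.
ibb_is_bad_call = we_after_ibb > we_with_judge

print(f"leveraged wins: {leveraged_wins:.3f}")
print(f"pitch to Judge: {we_with_judge:.3f}, after IBB: {we_after_ibb:.3f}")
print("IBB is a bad call" if ibb_is_bad_call else "IBB is defensible")
```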
If he were a .630 wOBA batter instead of a .475 wOBA batter, then that's the breakeven point to walk him. Bonds at his best was around .540. So, no, you can't walk him here, which is why he doesn't get walked here.
Nothing bothers me more than people trying to report running speed in terms of Miles Per Hour. Actually, one thing bothers me more: trying to report it as some sort of instantaneous speed.
Let's take it one step (no pun intended) at a time. In a 100 m race that typically lasts close to 10 seconds, an Olympian will take some 40-50 steps. For the sake of ease of illustration, we'll say 100 m takes 50 steps, or 2 m per step.
You get maximum acceleration when your foot leaves the ground, while you get maximum deceleration when both feet are in the air. Usain Bolt for example would peak at 13 m / sec and bottom out at 11 m / sec, when he is in the middle of the race. That is a HUGE difference.
So, your window of measurement is critical here. If you take the instantaneous maximum speed, naturally it will be that blink-of-an-eye moment as the foot leaves the ground. That particular speed, on its own, is really irrelevant.
What you do want is a full cycle, a full step, at a minimum. And so, your measurement window will capture both the acceleration and deceleration phase.
Now, one step, 2 m in this illustration, will still give you some sort of measurement error. First, not every runner will take 2 m for one step. Some might be 2.5 m or 2.3 m. So, you are not capturing a full cycle here. It's going to be a mix-and-match of the acceleration-deceleration phase, where, depending on your start/stop, some runners will have a bit more of the accel-phase, while others will have a bit more of the decel-phase.
This is why we report running times based on the 10 m split times. A 10 m window will give you 4 to 5 steps. Let's say 10 m is 4.5 steps. That gives you 4 full steps, each covering both the accel-phase and the decel-phase, and then that leaves half-a-step which will be in the accel or the decel or in-between phase. What kind of uncertainty does this give us?
Let's go back to Bolt: with 4 full steps covering the accel-decel phases, he's running at 12 m / sec. The other half-step is either 11 m / sec at worst or 13 m / sec at best.
The weighted average therefore is 11.9 m/sec at worst and 12.1 m/sec at best. Therefore, our uncertainty range in terms of picking the "perfect 10 m window" is 0.1 / 12 or about 1%. That's our error range. That's probably what we can accept.
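Here is that weighted average as a short sketch, so you can plug in your own step counts and speeds. The function name and its parameters are mine, purely for illustration; the inputs are the Bolt numbers from above.

```python
def window_speed_range(full_steps, partial_step, avg_speed, min_speed, max_speed):
    """Return (worst, best) average speed over a measurement window made of
    whole steps plus a leftover fraction of a step that may land entirely in
    the slow phase (worst case) or the fast phase (best case)."""
    total = full_steps + partial_step
    worst = (full_steps * avg_speed + partial_step * min_speed) / total
    best = (full_steps * avg_speed + partial_step * max_speed) / total
    return worst, best

# Bolt mid-race: 4 full steps at 12 m/s, half a step at 11 or 13 m/s.
worst, best = window_speed_range(4, 0.5, 12.0, 11.0, 13.0)
uncertainty = (best - worst) / 2 / 12.0   # half-width relative to 12 m/s

print(f"worst {worst:.1f} m/s, best {best:.1f} m/s, uncertainty {uncertainty:.1%}")
```

A longer window (more full steps) shrinks the uncertainty, which is the whole point of choosing the window deliberately.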
So, when you look at swimmers or runners or skaters, figure out the accel-decel phase for each step or stroke or cycle. Figure out what the speed is for each accel and decel. And figure out what uncertainty level you can accept. Once you do that, then you can figure out what window your distance and time will be measured against.
And please, report it in terms of seconds and metres (or feet or yards as your sport needs). Don't do MPH or KPH.
Having discussed Presence or Attentiveness plays, as well as Timing plays, we now turn our attention to the third kind of HR Saving plays: Speed.
Setting aside the fence, Catch Probability is largely focused on how much Distance an outfielder has to run (from his starting point) and how much Time the ball is in the air for the outfielder to catch. Distance over Time is Speed. This is how we evaluate outfield defense. We intuitively understand this, even if we don't explicitly say it. That's because we didn't have any easy reference points to say how many feet and how much time the play is. Until Statcast.
With Statcast, we know the Opportunity Time, and we know the Opportunity Distance. And so, we know the Opportunity Space.
The wall presents an extra challenge for us. The outfielder sees the fence as an impediment because in these particular HR saving plays, they are about to crash into a wall. This is unlike the Presence and Timing plays where the outfielder won't crash into the wall.
Even within the Speed plays involving the wall, there's a further split as to whether the outfielder has to run-and-jump into the wall, or run-thru the wall. Each presents its own challenge. When it comes to tracking a ball 400 feet away, how high the ball is up the wall is sometimes difficult to measure. Each foot matters a great deal more vertically than it does horizontally. A ball measured 5 feet closer or deeper has a much smaller impact in our evaluation than a ball measured 5 feet higher or lower.
These speed plays are analogous to the 1-run save: a relief pitcher who comes into a 1-run game with the bases loaded and 0 outs faces a much different task than one protecting a 3-run lead with the bases empty.
There is probably no play more at odds between the eye test and the value conclusion than the HR saving play. And it all comes down to distinguishing among the different kinds of HR Saving plays.
The second kind of HR-saving play is based on timing. Similar to the Presence play, the outfielder has plenty of time to camp themselves under the ball. However, the wall is a bit higher, and the ball is a bit higher at the fence clearing point. And so, the fielder will need to jump. And because they need to jump in order to get to the ball, this will be based on timing the jump just right.
How hard is this to do? I don't know. But I would think such a play likely results in an out at least 70% of the time, and maybe as high as 90 or 95% of the time. For purposes of illustration, I'll say 80%. Don't forget that not all fielders are the same height or can jump just as high. So, when timing a play, some fielders may have a larger margin of error than others.
In terms of evaluating the play, it works the same way as we do everything else: we compare to the average. We ALWAYS compare to the average. In everything. This does not mean that being 0 OAA means you have no value. This is probably the single-worst fallacy that is spewed by folks. 0 OAA just means you have AVERAGE value. And average value has... value.
Suppose you have a .500 starting pitcher, whose ERA and FIP and xERA and component ERA and whatever else you want to consider is exactly league average. Let's use the W-L record as representative of their performance. So, this average pitcher might be 14-14, and so is 0 Wins Above Average (WAA). These SP actually are in demand. Even if their WAA is 0.
See, the problem is that we've collapsed the two-dimensional 14-14 down to the one dimension of 0. This is a problem in presentation. This is why WAR (wins above replacement) took hold as well as it did: it keeps it one-dimensional, but it merges quantity with quality. Such a 0 WAA pitcher would be something like a 2.5 WAR pitcher.
Getting back to our outfielder who made the HR-saving timing play: if they made 4 such plays and mistimed one play, that's 80% out rate, which in this illustration is league average and so is 0 OAA. If they made all 5, they'd be +1 OAA. If they mistimed ALL five, they'd be -4 OAA. On average, league-wide, these outfielders are 0 OAA. That 0 OAA still has some value.
There are three types of HR saving plays. I will go thru each one, then give you the analogies in the infield, and in other parts of baseball.
The first is the easy one: the presence play. The wall is low enough, say under 8 feet. The ball is in the air long enough. And the outfielder is playing deep enough that they can lightly jog to the spot they need to be. All they really have to do at that point is lift their arm. This is a play that the best or worst outfielder is going to make, whether you are Kevin Kiermaier or not.
The infield equivalent is a 4-3, 5-3, or 6-3 play. All that is needed of the 1B is to get to the bag, and wait for the throw. While the 1B is obviously critical in the play (he is after all getting the putout), the skill required is one that does not differentiate itself among the best and worst 1B. Every 1B will be there.
If the 1B is not there, it was probably because he counted on the pitcher to be there, almost always because the 1B is the one who fielded the ball to begin with. This kind of misplay, the pitcher not covering, happens often enough that it is both embarrassing and not newsworthy.
The way we handle the evaluation of the pitcher not covering is to figure out how often the AVERAGE pitcher does not cover. For the sake of illustration, let's say that the average pitcher fails to cover 10% of the time, and so does cover 90% of the time. (I don't know what the actual number is, maybe it's 5% or 1%, I'm using 10% for illustrative purposes only.) So, when a pitcher DOES cover in scenarios where the out is otherwise assured, he would get +.10 outs. When the pitcher does NOT cover, he would get -.90 outs. The average pitcher in this illustration would get some combination of +.10 and -.90 such that the total Outs Above Average (OAA) is exactly 0. A pitcher who ALWAYS covers would get +.10 for each putout, and if they had say 100 putouts, would have +10 OAA... just by being attentive. Again, this is based on the 90% coverage scenario. If 95% were the average, then the fully attentive pitcher would get +.05 OAA times 100 plays, or +5 OAA.
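The counting is simple enough to sketch. The function names here are hypothetical, and the 90% and 95% rates are the illustrative ones from above, not measured values.

```python
def oaa_credit(made_play, avg_success_rate):
    """Outs Above Average for one play: the outcome (1 if the out was made,
    0 if not) minus how often the average fielder makes that play."""
    return (1.0 if made_play else 0.0) - avg_success_rate

def total_oaa(outcomes, avg_success_rate):
    """Sum the per-play credits over a list of True/False outcomes."""
    return sum(oaa_credit(made, avg_success_rate) for made in outcomes)

always_covers = [True] * 100                   # never misses a cover
average_pitcher = [True] * 90 + [False] * 10   # covers 90% of the time

print(round(total_oaa(always_covers, 0.90), 2))    # fully attentive, 90% baseline
print(round(total_oaa(average_pitcher, 0.90), 2))  # the average pitcher
print(round(total_oaa(always_covers, 0.95), 2))    # fully attentive, 95% baseline
```

The same two functions cover any play type: only the league-average success rate changes.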
The attentiveness or presence play happens elsewhere on the field. The 3B failing to cover on a steal of 3B, or a pitcher not backing up the catcher on a throw from the outfield. So, you can figure out the OAA of a fielder's attentiveness simply by counting. We don't do this, but we should. That's a gap in the fielding recording.
You can see it elsewhere as well, say the 3-run save. A 3-run save is VERY different from a 1-run save. But, a save is a save in the record books. This is why we don't like saves as a category, because we know a save is NOT a save all the time. A 3-run save is going to be saved say 96% of the time (or whatever the number is). So, a 3-run save is actually going to be worth +.04 "Saves Above Average", while not holding that lead is worth -.96 SAA.
Back to the outfielder: how often will an outfielder fail to camp themselves at the warning track for what would otherwise have been an easy out? Let's say that's 1%. So, if an outfielder in these kinds of HR saving plays simply does their job and holds up their arm without jumping, they will get +.01 OAA. If they don't get there in time or they don't make the out, that's -.99 OAA. On average, this will be 0 OAA league-wide.
Suppose you disagree. Suppose instead you want to give out +.50 OAA on these kinds of plays (high fly, just clearing a low fence). Well, for every 100 plays, you'd give out +.50 OAA 99 times and -.50 OAA 1 time. That's a total of +49 OAA for every 100 such plays. Does that make sense? No. You can't make every outfielder above average.
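To see why the flat credit breaks, compare the league-wide totals under the two schemes. This is a small sketch with a made-up helper name, using the illustrative 99-out-of-100 rate from above.

```python
def league_total(n_plays, n_made, credit_made, credit_missed):
    """Total credit handed out league-wide for one type of play."""
    return n_made * credit_made + (n_plays - n_made) * credit_missed

# Credit relative to the 99% average: the league nets out to zero.
average_based = league_total(100, 99, +0.01, -0.99)

# Flat +/-.50 credit: almost everyone comes out "above average".
flat_credit = league_total(100, 99, +0.50, -0.50)

print(round(average_based, 2), flat_credit)
```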
So, this was a lot longer than I thought I'd write, so I'll break this up into multiple parts. See you tonight for part 2.
It has been thoroughly refuted several times, both by myself and by other independent researchers, that the spray direction is the missing ingredient in the x-stats. The question remains: what are the missing ingredients?
Someone brought up the case of Isaac Paredes, who is a heavy pull batter. However, there is another attribute of Paredes: he does not hit the ball hard. Now, you may think that the x-stats ALREADY account for the exit velocity. After all, the two main ingredients are launch angle and speed. We account for the launch speed. Don't we? Well, once again, I must talk about the difference between modeling a PLAY and modeling a PLAYER. The x-stats, traditionally, evaluate PLAYS. But, since we are interested in PLAYERS, we limit the variables so that we focus on the PLAYERS. In other words, yes, we evaluate each play, one at a time. But instead of considering AS MANY variables as we can that went into that play, we consider AS FEW variables as we can: only those over which the player themselves has a strong influence.
Launch speed is an easy one to include on an event by event level. Launch angle as well (the easiest one that separates groundballs from home runs). The Spray Direction is one that is needed on the play, but is not needed for the player (as we've learned many times). So, we ignore that one. We include the Seasonal Sprint Speed of the runner, as that's important on groundballs.
Which gets us back to Launch Speed. Remember last night, I created a profile of each batter, to establish their Spray Tendency? Well, what if we do the same thing, but with Launch Speed? That is, let's create a profile of a batter based on how hard they hit the ball.
Now, you may think: we ALREADY account for this on a play level right? Yes, we do. But, what if a 100mph batted ball by Isaac Paredes is different from a 100mph batted ball by Giancarlo Stanton, even when both are hit at 20 degrees of launch? In other words, we want Launch Speed to pull double-duty: we want to know the launch speed on that play, but we also want to know the batter's seasonal launch speed.
So, do we see a bias based on a batter's seasonal launch speed? Yes. Yes, we do.
Here's what I did, so you can feel free to replicate. I'm treating the 2016-2019 years as one season and the 2020-present (thru June 5, 2024) years as a second season. I do this on the idea that a player has a general speed tendency that spans multiple years. This lets me increase my sample size for each season. I also make sure that a batter that hits on both sides is considered two distinct players.
The speed tendency follows the Escape Velocity method for Adjusted speed: greatest(88, h_launch_speed). For every batted ball, I take the greater of the launch speed and 88. And I average that.
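In code, that floor-and-average step might look like the sketch below. The function names and the sample launch speeds are invented for illustration; only the 88 mph floor comes from the description above.

```python
def adjusted_speed(launch_speed_mph, floor_mph=88.0):
    """The greater of the launch speed and the floor, i.e. greatest(88, speed)."""
    return max(floor_mph, launch_speed_mph)

def speed_tendency(launch_speeds_mph):
    """Average adjusted speed over all of a batter's batted balls."""
    adjusted = [adjusted_speed(s) for s in launch_speeds_mph]
    return sum(adjusted) / len(adjusted)

# Made-up batted balls: the 70 mph ball counts as 88, the rest count as-is.
sample_launch_speeds = [70.0, 95.0, 101.0, 88.0, 110.0]
print(round(speed_tendency(sample_launch_speeds), 1))
```

The floor means weakly hit balls all count the same, so the tendency is driven by how hard a batter hits the ball when he does square it up.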
Anyway, I use the same Pascal method of binning I did last night, the 10/20/40/20/10 split.
So, on to the fascinating results. For the weakest batters, the Paredes and Arraez and so on, their xwOBA was .306, while their actual wOBA was .318. That is an enormous bias of 12 wOBA points. The next weakest batters had .339 xwOBA and .345 actual wOBA for a bias of 6 points.
The strongest batters had an xwOBA of .452 and a wOBA of .442, for a 10 point shortfall. The next set of strongest batters had an xwOBA of .411 and a wOBA of .402 for a 9 point shortfall. The middle group were pretty much even.
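For anyone replicating this, the bias is just actual wOBA minus xwOBA, expressed in points. A quick sketch using the four groups quoted above (the middle group is left out since I only described it as pretty much even):

```python
# (group label, xwOBA, actual wOBA) as quoted in the post.
groups = [
    ("weakest", 0.306, 0.318),
    ("next weakest", 0.339, 0.345),
    ("next strongest", 0.411, 0.402),
    ("strongest", 0.452, 0.442),
]

# Positive means the batters beat their xwOBA, negative means a shortfall.
bias_points = {
    label: round((woba - xwoba) * 1000) for label, xwoba, woba in groups
}

for label, points in bias_points.items():
    print(f"{label}: {points:+d} points")
```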
Now, before we get TOO excited, what else could cause this? I have a few thoughts, but let me just leave this here for now.
I have to write one of these blog posts every year because folks are so disbelieving. And it's not just my research. I MUCH prefer when others do this research so that there's no conflict of interest.
I'll lay out my method, and you can feel free to reproduce it however you can. There's some data you may not necessarily have, but you'd be able to estimate it.
Anyway, here we go. Again.
First the data: 2016 to present (thru Jun 4, 2024), regular season and playoffs. Only hit-into-play. We want actual wOBA and xwOBA. Minimum 500 batted balls for each batter over the entire time period. This gives me 593 batters. Hopefully you get something pretty close to that.
Next, we create a spray tendency for each batter. In the past, I would just take all their batted balls to create their spray tendency. But, inspired by the point Ben Clemens recently made (who was studying a similar issue), this time I've focused on batted balls with a launch angle of 4 to 36 degrees, for balls hit 200+ feet. This is basically line drives and flyballs, and the type of batted balls that folks talk about when they talk about pull hitters and spray hitters.
But, just for completeness, I'll also do it my usual way of looking at all batted balls to establish the spray tendency. I'll do that at the end. For now, we'll follow the Clemens-inspired approach.
For the 593 batters, I take the 10% most extreme pull hitters. There's 59 of them. Their spray tendency is a pull of 9.5 degrees. Then I take the 20% next most extreme pull hitters. That's 119 batters with a spray tendency of -7.0 degrees.
I take the 10% most extreme spray hitters. There's also 59 of them, with a spray tendency of +1.6 degrees. The next 20% are at -1.6 degrees. Finally, the middle 40%, 237 batters, have a spray tendency of -4.3 degrees.
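The 10/20/40/20/10 split can be sketched as below. This is one way to do it, not necessarily how I coded it: the function name is made up, and rounding the cumulative cut points is an assumption about how ties at the boundaries get resolved.

```python
def split_into_bins(values, shares=(0.10, 0.20, 0.40, 0.20, 0.10)):
    """Sort values ascending (most pull, i.e. most negative spray, first)
    and cut them into consecutive bins sized by the given shares."""
    ordered = sorted(values)
    n = len(ordered)
    bins, start, cumulative = [], 0, 0.0
    for share in shares[:-1]:
        cumulative += share
        end = round(n * cumulative)   # cut point by cumulative share
        bins.append(ordered[start:end])
        start = end
    bins.append(ordered[start:])      # the last bin takes whatever is left
    return bins

# Stand-in data: 593 placeholder spray tendencies, one per batter.
placeholder_tendencies = list(range(593))
bin_sizes = [len(b) for b in split_into_bins(placeholder_tendencies)]
print(bin_sizes)
```

With 593 batters this reproduces the group sizes described above: 59, 119, 237, 119, and 59.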
Next, for each group, we look at their actual wOBA and their xwOBA. Now remember, the xwOBA does NOT look at a batter's spray direction, whether at a single play level, or at a player tendency level. It is simply ignored. So, if we find that there is a difference between actual wOBA and xwOBA then this is evidence that the spray variable needs to be added to the model. If they are a match, then we don't need it (or at least, we haven't found any evidence with this method that it is needed).
What do we find with the most extreme pull hitters, those at -9.5 degrees of spray? Actual wOBA of .386, xwOBA of .385. How about going the other way, the most extreme spray batters, those at +1.6 degrees of spray? They have a .362 actual wOBA... and .362 xwOBA. Identical.
How about the rest of the three bins? Bin 2 is .379 actual and .379 xwOBA. Identical. Middle bin is .370 actual and .370 xwOBA. Identical. Bin 4 is .363 actual and .365 xwOBA.
So... yeah... we don't need to consider the spray tendency of the batter to model their effectiveness.
***
I said I would rerun everything doing it my usual way of using all batted balls to establish spray tendency. The results are almost as boring, but I'll lay it out, from bin 1 (most pull tendency) to bin 5 (most spray tendency). Actual wOBA first, xwOBA second, difference third. Ready?
-12 degrees, .382, .385, -.002 (rounding)
-10 degrees, .372, .371, +.001
-8 degrees, .378, .379, -.001
-6 degrees, .364, .363, +.001
-3 degrees, .348, .345, +.003
So... yeah... as it turns out, it doesn't really matter how I establish the spray tendency. We just get similar conclusions.
***
Now... How? HOW? HOW is it possible to ignore spray tendency and still be able to get the player wOBA to match to their xwOBA? Simply put: opposing teams know the pull/spray tendency of the batters and position their fielders accordingly. How about the HR? Well, that's true, but if you miss the HR, guess what, there's an outfielder who was positioned close by to turn that almost-HR into an out.
The reality that we found in 2016, when we had so very little data, such limited data, that allowed us to ignore the spray variable is being upheld with tons more data. And this conclusion has been reinforced by other researchers who also found the same thing.
Long story short: while you need the spray angle to describe the PLAY, you do NOT need the spray angle to describe the (effectiveness of the) player. You can use the spray angle to show the PROFILE of the player, but it won't alter our opinion as to their overall performance.
I'll see you again in six months, where I'll do similar research in different ways. Again.
The above shows three different angles, all related to this HR by Adolis Garcia.
Let's start with the blue line on top. That is what we call the Vertical Bat Angle. We compare the position of the head of the bat to the position of the handle of the bat. If the head is above the handle, then it has a positive vertical angle. The head below the handle has a negative vertical angle. Naturally, when the head and handle are both parallel to the ground, then the vertical angle is zero. If you watch the video, you can see that the bat is parallel to the ground a little bit before contact and a little bit after contact.
The green line in the middle is the Vertical Attack Angle. Whereas the blue line measures bat position in 2D space, the green line measures bat velocity in 2D. In other words, the green line measures the direction of the bat. You can see that at the point of impact, the bat has its velocity moving in an upward direction.
Finally, the orange line is the Vertical Path Angle, the Swing Plane. Once the bat approaches the intercept point, the bat is essentially moving in a single plane. If you can imagine a (tilted by 30 degrees) sheet of paper, the bat is passing through that sheet of paper, and it does so from about 30 msec prior to the Intercept Point, and onwards beyond the intercept point.
I know that all this is not very obvious. The analogy I make is to consider a golf swing. The Vertical Path Angle, the Swing Plane, approaches closer to 90 degrees (maybe it's 70, I don't know, someone out there can tell us). The Vertical Attack Angle is similar to a baseball swing, eventually going to a small positive angle. And naturally the Vertical Bat Angle (of the club, in this case) starts off at a huge positive angle on the backswing, down to a huge negative angle (approaching that 70 or 90 degree angle of that Swing Plane), and continuing back on a huge positive angle.
Anyway, I hope some of that made sense. We'll make it make more sense next time.
I took all of Stanton's swings with a launch speed of 95+ and determined when he reached his maximum acceleration. In his case, he reached max-accel from 7 frames prior to contact back to 13 frames prior to contact. At 300 frames per second, that translates to 23 to 43 msec prior to contact. But I'll just talk about frames here. You can see that basically all his swings are the same, and they are just either early or late. If you shift each curve they'd basically all overlap.
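Since I'm going to keep talking in frames, here is the conversion as a tiny sketch (the helper name is mine, the 300 frames per second is from above):

```python
FRAMES_PER_SECOND = 300

def frames_to_msec(frames, fps=FRAMES_PER_SECOND):
    """Convert a count of camera frames into milliseconds."""
    return frames / fps * 1000.0

for frames in (6, 7, 13):
    print(f"{frames} frames = {frames_to_msec(frames):.0f} msec")
```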
Here is what those swings look like based on the swing speed. That red-swing, the early-swing, when he reaches max accel early, his swing speed reaches its max at 6 frames prior to contact (20 msec), and essentially stays there for the duration. As to whether any of this is good or bad, well, we'd have to see his performance for each of these seven groups of swings. At this very moment, I have no idea. But, I'll do that in the comments later today.
I'll show you two charts, both very similar.
The first is looking only at batted balls that were hit 400+ feet. As the average HR is about 400 feet, we're essentially treating these as the perfect hits.
The second takes all the swings for each batter's 50% fastest exit velocity. This ensures proportionate representation (as opposed to the above which is biased toward batters who can hit it deep).
Either way, this shows that the acceleration is maximized from 65 msec prior to the impact time to 25 msec prior. The swing speed at these two points is about 30-35 mph at the start to about 65 mph at the end.
This is sort of the reason in my prior article that I was using 0 to 30 for the initial acceleration and 30 to 60 for the main acceleration. While we may be tempted to change that to something like 0-35 and 35-65, this will throw out all of those swings where the batter did not even reach 65mph. Take a look at Arraez for example. Even limiting swings that reached 60mph will remove a decent portion of his swings. At 65mph we'll be removing most of his swings.
In any case, we can report all the acceleration values. The 0-30mph, 30-60mph, as well as the point for each swing where acceleration was maximized.
Here is how Arraez and Giancarlo Stanton look in terms of their acceleration. Stanton reaches his peak acceleration much earlier than Arraez (about 10 msec). Though they both start the ramp up at around the same level of speed (29 mph for Arraez, 36 mph for Stanton), Stanton gets to a much higher level (70 mph) than Arraez (57 mph) in the same amount of elapsed time (40 msec).
If you are trying to imagine what 40 msec represents: a typical fastball will reach home plate in about 400 msec. So, the acceleration phase of the swing is about one-tenth the time it takes for the ball to reach home plate from the pitcher's hand.
Suppose we decide that the start time of a swing is when the pitcher releases the ball. That seems a natural point to choose. A pitch however can be thrown from 105 mph all the way down to 35 mph. Even if you take a less exaggerated range, we're still talking about a pitch that will reach the front of home plate between 375 msec and 525 msec. That's a range of 150 msec, which basically (almost) allows a batter to check his swing and restart his swing! Clearly, only focusing on a common distance (~53 feet of release) is not going to work.
How about we focus on a common time, say 250 msec from plate crossing? So, regardless of the speed of the pitch, we're saying the batter's swing is dependent on the same amount of time. However, choosing the plate crossing presumes that all batters are trying to make contact at the plate crossing. But batters stand at different parts of the box, facing different sided pitchers with pitches thrown at different trajectories that will reach home plate inside or outside or up or down. Not to mention the ball-strike count affects expectations as well.
What if we take the actual point of intercept (meaning the actual impact point on contacted balls or the point where the ball and bat are closest for whiffs)? This presumes the batter knows what the actual point of intercept will end up being. Working backwards from a known quantity comes with its own issues.
Finally, what is the right way? Well, the closest we can come is try to determine an expected intercept point. Using the identity of the pitcher, the identity of the batter, the ball-strike count, and the batter's location in the batter's box, we can try to predict where the intercept point will end up being for any particular pitch.
Can we come close to the right way in a simple manner? On an aggregate level, the pitcher and ball-strike count will not matter much for any particular batter. So, we can establish a batter's intercept point by looking at all his swings, as LHH or RHH, over the course of the season.
So, in terms of which method to use, I would suggest the most preferred to least preferred is this:
1. Predict the intercept point by using the variables in play for that particular pitch
2. Presume the intercept point by using that batter's seasonal average
3. Treat the actual intercept point on that pitch as the presumed intercept point
4. Use the front of plate as the intercept point
5. Use the pitcher release time plus some constant as the intercept point
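In practice, that ranking works as a fallback chain: use the best estimate you have, and drop down the list when one is missing. Here is a rough sketch. The method labels, the dictionary layout, and the sample values are all hypothetical; nothing here reflects an actual data feed.

```python
# Most preferred first, mirroring the ranking above (labels are made up).
PREFERENCE_ORDER = [
    "predicted_for_pitch",
    "batter_season_average",
    "actual_on_pitch",
    "front_of_plate",
    "release_plus_constant",
]

def pick_intercept(estimates):
    """Return (method, value) for the most preferred estimate available.
    `estimates` maps a method label to a value, or to None if unavailable."""
    for method in PREFERENCE_ORDER:
        value = estimates.get(method)
        if value is not None:
            return method, value
    raise ValueError("no intercept estimate available")

# Example: no pitch-level prediction, so the seasonal average wins.
available = {
    "predicted_for_pitch": None,
    "batter_season_average": 1.5,   # made-up value
    "front_of_plate": 0.0,
}
print(pick_intercept(available))
```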
Acceleration is a tough nut to crack. Not so much in terms of finding the acceleration curve, which you can see here as an example (and its derivative, the confusingly named jerk). But rather, in how to present acceleration.
For a pitch, it's quite straightforward: once the ball is released, the ball is, essentially, traveling at a constant deceleration. Not exactly but close enough. That allows us to create metrics quite easily from it, like the break of a pitch.
A bat is different, because it is constantly increasing in speed, and doing so at a non-constant rate. In other words, the acceleration is constantly going up... until it invariably starts to be reduced (though still positive). The tangent of the acceleration, the jerk, is somewhat constant. But by that point, most people are not going to understand what that even means.
So, we turn to runners. We can think of out-of-the-block, we can think of burst, we can think of cruising speed. We can also turn our attention to cars, going 0 to 60 in X.Y number of seconds.
Here's ONE approach. Suppose the first critical speed point is 30 mph. Maybe it's 40 mph, since that's closer to the max speed of a successful checked swing. But, let's go with 30 mph for now. Ideally, a batter is going to take as long as possible before they even get to 30 mph and rely on their acceleration. Or maybe, ideally a batter is going to want to get to 30 mph as fast as possible because they don't have the acceleration. Or, well, who knows right now. Let's take a look at the data.
I always look for Giancarlo Stanton first, so I can understand what I am seeing. And there he is, in 2023 and 2024, with Jason Heyward. This is what I call the Lambo Swing: they immediately ramp up to 30mph as fast as they can, and they go 30 to 60 as fast as anyone. In the case of Stanton, he just keeps going to 80+.
The next name I look for is Luis Arraez. And there he is, a Slow and Steady Swing: takes his time to get to 30 mph, and then a slow acceleration to 60 mph. And that's pretty close to his final speed.
In the top left quadrant is the Kokomo Swing: they get to 30 mph as fast as possible, but are pretty slow to 60 mph. Altuve, JD Martinez, Arenado are all there. Maybe it's batters that are just getting old, and so, are relying on their experience to start their swing as early as possible, because they don't have the acceleration to sustain it? We'll see.
Finally, the bottom right quadrant is the Pants on Fire Swing. They start their swing as late as possible and then explode 30mph to 60mph as quick as possible. Jo Adell, Corbin Carroll are representative here. So, this is probably what I think batters are after, taking as much time as possible to size up the pitch, then rely on their explosiveness to get to 60mph as fast as they can.
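The four quadrants boil down to two yes/no questions: does the batter get to 30 mph quickly, and does he go from 30 to 60 quickly? Here is a sketch of that labeling. The cutoffs separating quick from slow are assumptions (say, league medians), and the input numbers in the example are invented.

```python
def classify_swing(msec_to_30, msec_30_to_60, cutoff_to_30, cutoff_30_to_60):
    """Label a swing by quadrant. Times are in msec; shorter means quicker.
    The two cutoffs are whatever baseline you pick, e.g. league medians."""
    quick_start = msec_to_30 < cutoff_to_30
    quick_accel = msec_30_to_60 < cutoff_30_to_60
    if quick_start and quick_accel:
        return "Lambo"
    if quick_start:
        return "Kokomo"           # quick to 30, slow from 30 to 60
    if quick_accel:
        return "Pants on Fire"    # late to 30, then explosive
    return "Slow and Steady"

# Invented numbers, with invented cutoffs of 60 and 30 msec.
print(classify_swing(50, 25, 60, 30))
print(classify_swing(70, 25, 60, 30))
```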
Are there other ways to describe acceleration? Sure. Instead of measuring elapsed time between two fixed speeds, we can instead measure change in speed between two fixed timestamps. For example, maybe we look for the change in speed from 70 msec prior to the intercept point to 40 msec prior.
Or, instead of two fixed points in time, maybe it's any 30 msec window where we can find the maximum increase in speed.
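Here is a rough sketch of those three ways of measuring acceleration, working from a swing-speed trace. The trace format (a list of time and speed pairs), the function names, and the sample swings are all assumptions for illustration; the sample times count up from an arbitrary start rather than back from the intercept point.

```python
def time_to_reach(trace, target_mph):
    """First time (msec) the speed reaches target_mph, linearly interpolated.
    trace: list of (time_msec, speed_mph) pairs sorted by time, starting
    below the target."""
    for (t0, s0), (t1, s1) in zip(trace, trace[1:]):
        if s0 < target_mph <= s1:
            return t0 + (target_mph - s0) * (t1 - t0) / (s1 - s0)
    return None  # never reached, e.g. a swing that tops out below the target

def elapsed_between_speeds(trace, low_mph=30.0, high_mph=60.0):
    """Elapsed msec between two fixed speeds, or None if either is not reached."""
    t_low = time_to_reach(trace, low_mph)
    t_high = time_to_reach(trace, high_mph)
    if t_low is None or t_high is None:
        return None
    return t_high - t_low

def gain_between_times(trace, t_start, t_end):
    """Change in speed between two fixed timestamps present in the trace."""
    speed_at = dict(trace)
    return speed_at[t_end] - speed_at[t_start]

def max_gain_in_window(trace, window_samples):
    """Largest speed increase across any span of window_samples samples."""
    speeds = [speed for _, speed in trace]
    return max(b - a for a, b in zip(speeds, speeds[window_samples:]))

# Invented traces sampled every 10 msec.
fast_swing = [(0, 10.0), (10, 20.0), (20, 30.0), (30, 45.0),
              (40, 60.0), (50, 70.0), (60, 72.0)]
slow_swing = [(0, 10.0), (10, 25.0), (20, 40.0), (30, 52.0), (40, 57.0)]

print(elapsed_between_speeds(fast_swing))   # fixed speeds: 30 to 60 mph
print(elapsed_between_speeds(slow_swing))   # never gets to 60 mph
print(gain_between_times(fast_swing, 10, 40))
print(max_gain_in_window(fast_swing, 3))    # any 30 msec window
```

Note how the fixed-speed version returns nothing for the swing that never reaches 60 mph, while the window-based versions still produce a number for every swing.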
We'll try different methods to see what we can learn.
According to Baseball Savant, Salvador Perez has been minus 89 runs in Catcher Framing since 2016. Fangraphs has him at a similar -80 runs. Baseball Prospectus at -82 runs. DRS has a more compressed scale, still with Salvador as 2nd worst at -46 runs (compared to the low of -48 runs).
So, it is clear, Salvador Perez is a disaster at Pitch Presentation. And his other skills, his throwing (+17 runs) and blocking (+1 run) just can't compensate for that. Overall, he's a big net negative at -72 runs. But, is that all there is to being a catcher? Surely there's more to it. The calling of the pitches, the confidence the pitcher has.
See, when we look at Pitch Presentation, that's basically a segment of the pitch. Everything else that goes into it is not considered. Maybe he is so good at everything else about being a catcher that it not only overcomes such a huge deficit, but he may in fact even be a net positive. Is that possible?
Well, let me give you a number to blow your mind. Since 2016, the Royals with Salvador Perez have given up 3079 runs while making 17,369 outs. That is a rate of 4.79 runs allowed per 27 outs, over the equivalent of 643 9-inning games (or just about exactly 4.0 full 162 game seasons). The Royals without Salvador as catcher have played the equivalent of 3.6 full seasons, while allowing 5.28 runs per 27 outs.
In other words, the Royals, with Salvador, have given up nearly 0.5 fewer runs per 9 innings. And since we just said he played the equivalent of 643 9-inning games, that's 316 fewer runs allowed with Salvador. This stands in STARK contrast to the 72 MORE runs that Salvador allows when we consider Framing, Throwing, and Blocking. We have a 388 run gap here to bridge.
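You can check the with-Salvador side of this from the raw runs and outs. A sketch follows, with one caveat: the without-Salvador rate is only available here as the rounded 5.28, so the runs-saved figure this produces will land within a run or two of the number quoted above rather than matching it exactly.

```python
OUTS_PER_GAME = 27
GAMES_PER_SEASON = 162

runs_with, outs_with = 3079, 17369
rate_without = 5.28   # runs per 27 outs without Salvador (rounded, as quoted)

rate_with = runs_with / outs_with * OUTS_PER_GAME   # runs per 27 outs
games_with = outs_with / OUTS_PER_GAME              # 9-inning game equivalents
seasons_with = games_with / GAMES_PER_SEASON

# Fewer runs allowed with Salvador, over the games he caught.
runs_saved = (rate_without - rate_with) * games_with

print(f"{rate_with:.2f} runs per 27 outs over {games_with:.0f} games "
      f"({seasons_with:.1f} seasons)")
print(f"about {runs_saved:.0f} fewer runs allowed")
```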
Now, you may be saying: is Salvador catching with the better pitchers maybe? Well, that's the question I ask. As you know, I pioneered the WOWY method (With or Without You), which really at its heart is a simplified mixed-effects model. WOWY has the advantage of being completely transparent, easy to explain, and with a great theme song.
Let's start with Danny Duffy. With Duffy and Salvador, the Royals gave up 152 runs on 1222 outs. With Duffy and without Salvador, the Royals gave up 178 runs on 965 outs. So, that's more runs allowed and fewer outs. This is a feather for Salvador. Pro-rating the 178 runs on 965 outs to the 1222 outs that Salvador caught, that gives us a weighted runs allowed of 225. As you can see, Duffy loves Salvador: he gave up 152 runs instead of the expected 225 runs, or 73 fewer runs.
However, repeating this process for the next most popular pitcher, Ian Kennedy, he gave up 27 more runs with Salvador than without Salvador. Brad Keller also gave up more, at 13 more runs. As did Brady Singer at 46 more runs. And Jakob Junis at 20 more runs.
The Duffy pitchers, those that gave up fewer runs with Salvador, gave up a total of 721 fewer runs. The Singer/Kennedy pitchers, those that gave up more runs with Salvador than without, gave up 417 more runs. Add it up, and Salvador still allowed 305 fewer runs. This is after controlling for the quality of pitchers.
Remember, we had him at 316 fewer runs allowed without any controls at all. With this level of control, the identity of the pitcher, all we did was reduce that to 305 fewer runs allowed.
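The per-pitcher step of WOWY is simple enough to sketch in a few lines of Python. This is only an illustration of the pro-rating described above, not the actual code behind these numbers. The function name and the tuple layout are made up for the example, and Duffy is the only row because his are the only raw totals given here.

```python
def wowy_runs_saved(runs_with, outs_with, runs_without, outs_without):
    """Pro-rate the pitcher's 'without' run rate to his 'with' outs.

    Returns (expected_runs, runs_saved). A positive runs_saved means
    the pitcher allowed fewer runs with this catcher than expected.
    """
    expected_runs = runs_without / outs_without * outs_with
    return expected_runs, expected_runs - runs_with


# (pitcher, runs_with, outs_with, runs_without, outs_without)
pitchers = [
    ("Danny Duffy", 152, 1222, 178, 965),
    # other pitchers would be added here in the same layout
]

total_saved = 0.0
for name, r_with, o_with, r_without, o_without in pitchers:
    expected, saved = wowy_runs_saved(r_with, o_with, r_without, o_without)
    total_saved += saved
    print(f"{name}: expected {expected:.0f}, actual {r_with}, saved {saved:+.0f}")

print(f"total across pitchers: {total_saved:+.0f}")
```

Weighting to the outs Salvador caught is what keeps a pitcher with a tiny "without" sample from swamping the total.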
Can we do more here? Yeah, we can look at it year by year. Maybe we can focus only on strikeouts and walks, or at least separate the K, BB numbers from the other numbers. We can do a lot.
But, the main point here is to say that Pitch Presentation, as real as it is, and as large of an effect as it has, still may pale in comparison to everything else that a catcher does beyond Throwing and Blocking. There's Game Calling. And maybe even just an overarching skill that we can call Presence if we want.
Whatever it is, it's important that focusing on a very specific skill in a very myopic way doesn't keep us from looking at the overall impact of the catcher.
As you know, I have a win expectancy chart using the inning, score differential, runners on base, and outs. That's posted online, here and here, as well as in The Book.
Did I ever tell you I have a win expectancy chart that ALSO includes the ball-strike count? I might have, I don't remember. I rarely if ever use it. I think it's time to break it out for the Ron Washington called, batter not-executed bases loaded squeeze play. Wash noted it wasn't that hard a play to make, but it sure seemed incredibly hard.
Anyway, on to the data. The scenario is this. It's the bottom of the 8th, 1 out, bases loaded, Angels down by 1, with a 1-1 count. The win expectancy is .539 with a 1-1 count (it was .549 at the start of the PA).
Now, let's go thru some possibilities.
First, what actually happened: runner is out, batter gets a strike, but both runners advance. In effect, the runner on 1B was put out. The win expectancy plummeted to .284.
Next, what did Wash hope to happen? In that case, runner scores to tie it up, other runners advance, the batter is put out. In that case, the win expectancy jumps to .611.
There are of course other possibilities. Batter could be safe (win expectancy goes to .774), there could have been a double-play, as we nearly saw (win expectancy goes to .158). The batter could have bunted foul (win expectancy goes to .501). I won't go thru all those scenarios, unless someone really wants to know.
Anyway, so, let's recap:
what could have happened was: .611 - .539 = +.072 wins
what did happen was: .284 - .539 = -.255 wins
Well, that makes the breakeven point 78% under these two possibilities (.255 at risk against .072 to gain, so .255 / .327). Maybe it goes down to 70-75% if we consider all the ways it could have turned out. If Wash thinks that the batter had a more than 3/4 chance of pulling this off, then it's a good call. If the batter had a less than 3/4 chance of pulling this off, then it's a bad call. Notwithstanding the half-dozen other scenarios this could have played out.
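Here is the breakeven calculation as a short Python sketch, using only the two outcomes recapped above. The function name is just for illustration. It finds the success rate p where p times the gain exactly offsets (1 - p) times the loss.

```python
def breakeven_probability(gain, loss):
    """Success rate at which expected win change is zero.

    gain and loss are both positive win amounts:
    p * gain = (1 - p) * loss  ->  p = loss / (gain + loss)
    """
    return loss / (gain + loss)


we_before = 0.539   # bases loaded, 1 out, down by 1, 1-1 count
we_success = 0.611  # run scores, runners advance, batter out
we_failure = 0.284  # what actually happened

gain = we_success - we_before
loss = we_before - we_failure

p = breakeven_probability(gain, loss)
print(f"gain {gain:+.3f}, loss {-loss:+.3f}, breakeven {p:.0%}")
```

Folding in the other outcomes (batter safe, double play, foul bunt) would mean replacing the two-outcome formula with a probability-weighted sum over all of them, which needs probabilities not given here.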
Also note: I did not consider the lefty/lefty/sinkerballer note that Wash suggested, nor any quality of batter. There are of course many considerations, which is why this is a starting point to the discussion.
Theo Epstein had a great line some 15 or 20 years ago, paraphrasing: in order to see better, he needs glasses with one lens focused on performance analysis and the other lens on scouting. He needed both to see clearly.
One of the things that scouting focuses on (among many other things) is tools: how fast someone runs, or throws, or in the case of batters, swings. (Again for you speed readers: I'm only talking about one small facet of scouting.) Most of us are focused on the end results (in the form of say wOBA), and then reverse-engineering, or inferring, what that means. Someone hits 50 HR, we infer they have a high bat speed. Someone hits 0 HR but has a .350 batting average, and we infer they have a low bat speed, but they square up the ball a lot. 50 HR with 200 strikeouts and maybe they swing hard all the time. 50 HR with 50 strikeouts and maybe they swing hard and make great contact.
Instead of inferring what the batter might be doing that leads to those results, we can now use a new data point: bat speed. We no longer need to know if they swing hard or not. We now know. And by this time next year, we will know if their year to year bat speed went up or down.
For this little study, I'm going to presume that the batter's bat speed applies to his whole career (since 2015). I am going to use three data points: wOBA, xwOBA, and bat speed. I will be correlating each to next season's target, which is wOBA. It's important to note that wOBA includes walks and strikeouts.
First off, xwOBA does the best, with a correlation of r=0.446. wOBA comes next at r=0.407, while bat speed comes in last at r=0.224. On the one hand: that seems low. On the other hand: that seems high. After all, wOBA and xwOBA use the combination of everything the batter does (his swing, his approach, his results, etc), and so, we'd expect them to correlate well with next season's wOBA. But bat speed is just... bat speed. To just be given that number, and get to an r=0.224 is actually pretty impressive.
What really matters though is if bat speed gives us EXTRA information, beyond what we already know in wOBA and xwOBA. First, when we use both wOBA and xwOBA, our correlation goes up to r=0.450. Remember, we got 0.446 with just xwOBA on its own. Including wOBA barely moves us forward. In other words, xwOBA, which focuses on launch speed and angle, already does such a great job of describing the batter that we don't really need their result in the form of wOBA.
See, what happens is that xwOBA removes a layer from wOBA: it removes all the parks and fielders and Random Variation that comes with that. Most of that is really noise and so, adding wOBA to xwOBA doesn't really help us. We just needed xwOBA.
Now, what about bat speed? What if we look at xwOBA and bat speed? Well, in that case, our correlation goes to r=0.455. That is higher than xwOBA + wOBA. That's right, given the choice between xwOBA and wOBA, or between xwOBA and bat speed, it's the latter that is preferred. (Insofar as this little test suggests.)
Remember, think of it in terms of layers. One layer removed from wOBA is xwOBA. Then, one layer removed from xwOBA is bat speed. Bat speed leads to launch speed, which is the key ingredient of xwOBA. And xwOBA leads to wOBA. The more layers you peel back, the more you get to the core of the batter themselves.
And to finish off this little study: wOBA and bat speed give us an r=0.429, which is even less than xwOBA on its own.
All three give us an r=0.460.
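For anyone wanting to repeat this kind of test on their own data, here is a minimal Python sketch of how a two-predictor correlation can be built from the three pairwise correlations. This is the standard textbook formula for two predictors, not necessarily how the numbers above were produced, and the sample arrays are made-up placeholders, not real player data.

```python
from math import sqrt


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def multiple_r(x1, x2, y):
    """Correlation of y with the best linear combination of x1 and x2."""
    r1, r2, r12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


# Placeholder values only, to show the call pattern
xwoba = [0.300, 0.320, 0.340, 0.360, 0.380]
bat_speed = [70.0, 73.0, 69.0, 75.0, 72.0]
next_woba = [0.310, 0.315, 0.330, 0.370, 0.365]

print(f"xwOBA alone:       r={pearson_r(xwoba, next_woba):.3f}")
print(f"bat speed alone:   r={pearson_r(bat_speed, next_woba):.3f}")
print(f"xwOBA + bat speed: r={multiple_r(xwoba, bat_speed, next_woba):.3f}")
```

The combined value can never come out lower than the better of the two single correlations, so what matters is how much it rises above it.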
One word about xwOBA: it is a descriptive metric, not predictive. If I wanted to make it predictive, I would have done so. I would have given a POSITIVE weight to a high launch speed, high launch angle popout. In reality, xwOBA, a descriptive metric, gives this a very NEGATIVE value. As it should. But, as a PREDICTIVE metric, this would get a very positive value. Why? Because hitting a 100 mph, 60 degree popup takes A LOT of power. It's a sign that the batter has... high bat speed. That's the inference we can make. Of course, now that we have bat speed, we no longer need to make that inference.
Next time, I will look to see how much squaring up the ball helps to predict wOBA. I don't know the answer yet.
Using the sublime Free Agency Tracker at Fangraphs, I added up all the years and dollars that each club spent on free agency, 2020-2024. To no big surprise, the Mets led at 1.001 billion dollars, at 95 years. What I ALSO looked at is the players they LOST to free agency, and how much those players signed for with OTHER teams. The former-Mets players for example signed elsewhere for 973 million dollars, at 87 years. So, the NET in free agency for the Mets ended up at +27 million dollars and +8 years, putting them right in the middle. In other words, the Mets are basically "trading" one free agent for another free agent. I'll leave it to Aspiring Saberists to decide if these are good trades or not.
Anyway, the team that spent the most on free agents in terms of these NET dollars was the Rangers, at +781 million dollars. In 4th place, shocking to me at first, was the Royals, a net +188 million dollars. But, this had more to do with the Royals not losing any talent to free agency. The most valuable player they lost to free agency was Greinke in 2023 at 8.5 million $. At the other end, the team that LEAST relies on free agency was the Braves: they signed 243 million dollars in free agents, while they let walk 573 million $ to other teams. Incredibly to me, 3rd place was the Dodgers, who are a net negative 251 million $: they signed a ton, at 989 million, 2nd most, but they let other teams sign the most of their former players, at 1.24 billion dollars.
This shows it league-wide, for the last (up to) 10,000 pitches of each pitch type. Use this as a reference.
You can see how this could form the basis for an objective standard. The main reason it looks more splattered than we'd like is because if a pitcher says his pitch is a slider, but splatters more like a cutter, we still call it a slider. The good news is that at least we made headway in getting pitchers to split their sliders into gyro-slider and sweeper-slider. One day, the objective standards will eventually take hold, but that day is not yet upon us.
The x-axis shows the difference in swing speed for switch hitters. Players on the far right, like Jose Ramirez, swing much harder as a RHH than LHH. Players on the far left, like EDLC, swing harder as a LHH than RHH.
The y-axis shows the difference in wOBA, translated to Runs per 700 PA. Players on top, like Robbie Grossman, perform much better as a RHH. Players on bottom, like EDLC, perform much (much much) better as a LHH.
In the red box are players with reverse-splits: they perform better from one side, though swing harder on the other side. As you can see, these are unusual players. Robbie Grossman hits much better as a RHH, even though he swings harder as a LHH.
In the blue box are players with matching-splits and have extreme gaps in swing speeds: EDLC for example performs far far better as a LHH. And, not coincidentally, he swings harder as a LHH. As you can see, there are many more switch hitters who both perform much better as a RHH and swing harder as a RHH. The players in the blue box are candidates to stop switch hitting.
Batters in the middle across have a gap in the swing speeds, but no gap in performance. They may have figured out how to compensate their game. Tommy Edman is on the cusp here. He swings far harder as a RHH, and just has a modestly higher performance as a RHH.
Batters in the center down have a gap in performance, but no gap in swing speed. If there is a reason that Ozzie Albies performs much better as a RHH, it's not tied to his swing speed as LHH and RHH.
With one out in the fifth inning of San Diego’s 6-4 victory over the Reds at Petco Park, Padres first baseman Jake Cronenworth hit what appeared to be an RBI groundout to second base. Tyler Wade scored from third. Fernando Tatis Jr. advanced to second. Machado was due up. Ho hum.
But Cronenworth made a signal toward the plate for a catcher’s interference ruling. Sure enough, home-plate ump Cory Blaser made the call, leaving Shildt with an intriguing decision.
On catcher’s interference, the batting team is allowed to accept the result of the play rather than the base that the batter would otherwise have been awarded. Almost always, when catcher’s interference is declined, it’s because the play resulted in a hit, anyway.
But on this occasion, Cronenworth made the inning's second out. He also plated a run. It left Shildt with these two options:
1. Decline the interference, take the out and the run and a 2-0 lead
2. Accept the interference, with Wade returning to third, leaving the bases loaded with one out for Machado and a 1-0 Padres lead
So, what to do? (For all the charts below, click to embiggen)
From a run expectancy standpoint, we see that bases loaded, one out is worth 1.590 runs, while runner on 2B with 2 outs, and banking the run, is worth 1 plus 0.325 runs. The difference is what we care about, and the difference is a whopping 0.265 runs in favor of taking the run, and out, off the board, and putting those two runners back on the bases.
Adding .265 runs is equivalent to adding almost .03 wins. However, run expectancy is a proxy for win expectancy. And as a proxy, it works well. Until it doesn't.
Next, we consult our win expectancy, specific for that half-inning, the bottom of the fifth. And, well, we get something different. The typical home team, up by 1, with the bases loaded and 1 out, has a .799 chance of winning. But, give up the runner on 1B to an out, and plate the runner on 3B for a run, and the win expectancy will go up to .811. In other words, we lose .012 wins by loading the bases (at least with an average batter batting).
So, quite the turnaround here. In a random situation, which is what the Run Expectancy is talking about, we gain almost .03 wins by keeping all our runners on base. But from a Win Expectancy scenario specific to the bottom of the 5th, home team up by 1, we lose .01 wins. That's a .04 win difference, all depending on which method you use.
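To see the two methods side by side, here is a small Python sketch using the chart values quoted above. The 10-runs-per-win conversion is an assumption on my part (a common rule of thumb, and consistent with .265 runs being "almost .03 wins"); the variable names are just for illustration.

```python
RUNS_PER_WIN = 10  # assumed rule-of-thumb conversion

# Run expectancy (average inning and score)
re_bases_loaded_1_out = 1.590
re_runner_on_2b_2_out = 0.325
run_banked = 1

re_gain_runs = re_bases_loaded_1_out - (run_banked + re_runner_on_2b_2_out)
re_gain_wins = re_gain_runs / RUNS_PER_WIN

# Win expectancy (bottom of the 5th, home team up by 1 before the play)
we_bases_loaded_1_out = 0.799
we_run_banked_2_out = 0.811

we_gain_wins = we_bases_loaded_1_out - we_run_banked_2_out

print(f"run expectancy says loading the bases is worth {re_gain_wins:+.3f} wins")
print(f"win expectancy says loading the bases is worth {we_gain_wins:+.3f} wins")
print(f"gap between the two methods: {re_gain_wins - we_gain_wins:.3f} wins")
```

The sign flip is the whole story: positive from the run expectancy side, negative from the win expectancy side.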
Now, the next step should be to consider who is batting, future HOF Manny Machado. All these charts, they all work nice when everything is average. Run Expectancy assumes an average inning and an average difference in score and average batters and runners and pitchers. While Win Expectancy uses the specific inning and score, it still assumes average players all-around. I will leave this particular step to an Aspiring Saberist.
What I want to do instead is look at EVERY half-inning and score, to see when bases loaded is preferred, and when trading two runners for an out and a run is preferred. Here is that chart, and we'll have a lot to talk about here.
The top line is the batting team score. Plus means they are ahead and minus means they are trailing. The left column is the inning, split by top/bottom.
The darker the orange (the more negative the number), the more it favors putting the run and out on the board. The darker the purple (the more positive the number), the more it favors keeping the runners on the bases.
Purple = bases
Orange = out + run
In the play in question, bottom of 5th, batting team ahead by 1, we can see the value of -0.012, which means it favors putting the run and out on the board, to the tune of .012 wins. This is the Machado scenario, but with an average batter, not Machado, batting.
Ok, now let's look at specific overwhelming choices, where the identity of the players doesn't matter. The biggest is bottom of the 9th, game tied. In this case it should be obvious: plate the run for the walkoff win. That's a .171 win difference over loading the bases. But that was too obvious.
A high one is bottom of the 9th, batting team down by 1 run. In this case, it should be almost as obvious that you put the run and out on the board for a .078 win gain. When you are down by 1 run in the bottom of the 9th, the most important runner is the one that ties it, and the next most important runner is the one that wins it. That third runner, the one on first base, that runner is irrelevant. So in this case, you are gaining the run and losing an out, but you are (effectively) only removing one runner from the bases.
On the flip side is bottom of the 9th, batting team down by 2 runs. Suddenly, all three runners are important. That first runner, he's really irrelevant on his own. He's only important if the second runner can also score. Scoring one, without the other, is irrelevant, since in that case, instead of losing by 2 runs, you lose by 1 run. Therefore, in this scenario, it should be quite obvious: load the bases, keep the extra out. This is a .182 win gain.
If you need rules of thumb, here they are:
when batting team is down by at least two runs, take the interference and load the bases
when the batting team is ahead by at least two runs, decline the interference and plate the run
in-between: consult the chart, but if it's close, consider the identity of the batter (like in the play in question, with Machado)
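The rules of thumb above fit in a few lines of Python. This is only a sketch of those three rules for this specific choice (bases loaded with one out, versus a run in with a runner on 2B and two outs). The function name and the returned strings are made up for the example, and the score difference is from the batting team's point of view before the play.

```python
def interference_rule_of_thumb(score_diff):
    """Apply the three rules of thumb.

    score_diff: batting team's runs minus fielding team's runs,
    before the run in question is counted (negative = trailing).
    """
    if score_diff <= -2:
        return "take the interference and load the bases"
    if score_diff >= 2:
        return "decline the interference and plate the run"
    return "consult the chart; if it's close, consider the batter"


for diff in (-3, -2, -1, 0, 1, 2, 3):
    print(f"{diff:+d}: {interference_rule_of_thumb(diff)}")
```

The Machado play, with the batting team ahead by 1, falls in the middle branch, which is exactly where the identity of the batter starts to matter.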