I Cut, You Choose? It's not exactly that, but it's close to that.
I'm going to come up with some random numbers. I don't follow football enough to give you good numbers, so I'll just try some random numbers.
In this iteration, I'll assume the chance of NOT scoring is 60%. And when you score, it's just as likely you will score a TD as a FG.
So, let's start. Team 1 has the ball, and 20% of the time has 3 points on the board, and 20% of the time they put 7. Now, let's follow each of those three branches, starting with the scoreless one.
If Team 2 is also scoreless, it goes into sudden death. We'll assume Team 2 is more likely to score, so let's make it scoreless 55%, and scoring 45%.
With the FG branch: we'll assume here Team 2 is more likely to try for the TD. So, scoreless 65% of the time, FG 10%, TD 25%.
Finally the TD branch: Team 2 has to be more aggressive, so chance of scoreless is 70%, with 0% for FG, 15% for 6 points (and a loss) and 15% for 8 points (and a win).
The sudden death calculation is a simple calculation. At a 60% scoreless chance for both teams, then it's 62.5% chance for Team 1 to win their sudden death.
All of this now becomes a straightforward probability distribution calculation. And in this illustration, the win% is 52% for Team 1.
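For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. It assumes a few things not spelled out above: that Team 1 gets the ball first in sudden death, that a FG answered by a FG also goes to sudden death, and that every sudden-death drive scores 40% of the time. The function names are mine, for illustration only.

```python
def first_possession_win_prob(p_score: float) -> float:
    """Chance the team with the ball first wins sudden death,
    when every drive scores with probability p_score."""
    p_scoreless = 1.0 - p_score
    # Geometric series: score on this drive, or both teams come up
    # empty and the same situation repeats.
    return p_score / (1.0 - p_scoreless ** 2)


def team1_win_prob() -> float:
    # Assumption: Team 1 possesses first in sudden death.
    sudden_death = first_possession_win_prob(0.40)

    # Team 1 win chance in each branch, given Team 2's response.
    after_scoreless = 0.55 * sudden_death   # Team 2 scores 45% of the time: loss
    after_fg = 0.65 + 0.10 * sudden_death   # Team 2 TD 25% of the time: loss
    after_td = 0.70 + 0.15                  # Team 2 gets 8 points 15% of the time: loss

    # Team 1's opening drive: scoreless 60%, FG 20%, TD 20%.
    return 0.60 * after_scoreless + 0.20 * after_fg + 0.20 * after_td


print(round(first_possession_win_prob(0.40), 3))  # 0.625
print(round(team1_win_prob(), 3))                 # 0.519
```

Under those assumptions the sketch lands on the 62.5% sudden-death figure and the 52% overall figure.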
Now, what happens if I change the chance of scoreless down to 50%, and adjust everything off that? Now the chance of Team 1 winning is 51%.
If chance of scoreless is down to 40% for any drive, then team 1 winning is 49%.
Indeed, this is how it looks based on the scoreless rate, from 10% to 90%:
So, it is easy enough to see that when you have to input two specific teams, things can change from this baseline, and so what may show here as 47% can in reality be 52%.
That's the baseline. Now, all we need is for someone to come up with something a bit more intricate, and we'll see... probably the same thing.
So, whoever over at NFL ops came up with this scheme likely proposed this setup because it's around 50/50, all depending on whatever actual teams are involved.
Everyone has their own VOZ method, the Value over Zero. The zero-point is the point at which that thing has no value. This is most clearly demonstrated with Fantasy Leagues. If you play Fantasy sports games, congratulations, you have a VOZ method. In a world where you have several hundred players, but only a few hundred will get selected, all the unselected players have a value of zero. You are only going to spend money on players who have value above the zero-baseline.
That zero-baseline is different for every position. A below league-average batter at catcher has value, while the same batting line for a 1B has almost no value. This concept is quite clear in Fantasy sports. It's a little murkier with real baseball players, but it's real nonetheless. All we need to do is establish what that zero-baseline is.
On Twitter, I asked what a 200 IP, 11-11 pitcher was equal to in value, and the most popular response was a 100 IP 8-3 pitcher. Now, follow me here, this is the important part. 11 wins and 11 losses has the exact same value, according to the voters, as someone with 8 wins and 3 losses. (In this illustration, the W/L record is a proxy for a pitcher's overall performance.) Again 11-11 = 8-3. If the two pitchers are equal, then the difference between the two pitchers is zero. In other words, this is what the voters are saying:
11-11 = 8-3 + 3-8
This is obvious, right? 8 wins and 3 losses, plus 3 wins and 8 losses is 11 wins and 11 losses. And since 11-11 = 8-3, that implies that 3-8 = 0
In other words, a pitcher who has 3 wins and 8 losses, or a win% of 3/11, or .273, is worth zero. That is the zero-baseline: .273, at least in this illustration.
A fairly high number actually chose 7-4 as being equal to 11-11. This implies the zero-baseline for this group of folks was 4-7, or a .364 win%.
The smallest group chose 9-2 as being equal to 11-11, which implies a .182 win%.
To summarize: 51% implied .273, 34.5% implied .364, and 14.5% implied .182. Collectively that comes out to .291 win%. In other words, the zero-baseline level, the point at which a player has no value, is a win% of .291. This is what is commonly called the replacement level, but my preferred term is the Readily Available Talent level. And so, value over zero, or in this case Wins Over Zero (WOZ) is set so that we subtract .291 wins per game for every player.
An 11-11 pitcher is compared to a .291 pitcher given 22 decisions. And .291 x 22 is 6.4 wins and 15.6 losses. So, 11 wins minus 6.4 wins is +4.6 wins, or 4.6 WOZ.
And that 8-3 pitcher? Well, .291 given 11 decisions is 3.2 wins and 7.8 losses. And 8 wins minus 3.2 wins is 4.8 wins, or 4.8 WOZ. The 7-4 pitcher has 3.8 WOZ. So, somewhere between 8-3 and 7-4, but closer to 8-3, is where you find your pitcher equivalent to 11-11.
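The whole chain, from poll results to WOZ, can be sketched in a few lines. This is only an illustration: the function names are mine, and it uses the exact fractions (3/11, 4/11, 2/11) rather than the rounded win% values, which makes no difference at the precision shown here.

```python
# Poll results: share of voters, and the record they said equals 11-11.
poll = [
    (0.510, (8, 3)),
    (0.345, (7, 4)),
    (0.145, (9, 2)),
]


def implied_zero_win_pct(wins: int, losses: int) -> float:
    """If W-L is worth the same as 11-11, the leftover record
    (11-W wins, 11-L losses) is worth zero. Return its win%."""
    zero_wins = 11 - wins
    zero_losses = 11 - losses
    return zero_wins / (zero_wins + zero_losses)


# Weight each implied baseline by the share of voters behind it.
zero_baseline = sum(share * implied_zero_win_pct(w, l) for share, (w, l) in poll)


def wins_over_zero(wins: int, losses: int, baseline: float = zero_baseline) -> float:
    """Wins minus what a zero-baseline pitcher wins in the same decisions."""
    decisions = wins + losses
    return wins - baseline * decisions


print(round(zero_baseline, 3))           # 0.291
print(round(wins_over_zero(11, 11), 1))  # 4.6
print(round(wins_over_zero(8, 3), 1))    # 4.8
print(round(wins_over_zero(7, 4), 1))    # 3.8
```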
So Ben Clemens did terrific research on something that comes up every now and every then. And everyone that looks at it comes away with the same conclusion. So, it's good that Ben does this work, but after I comment on this, I'll show you something that is even more important.
The issue is: can't we include the Spray Direction with xwOBA, and not just rely on Launch Speed and Angle? The issue comes down to whether we want to explain the PLAY or the PLAYER. If you want to explain the PLAY, then naturally you need to know the spray direction, since a 370-foot pop fly down the line is a HR while a 370-foot pop fly straightaway is an easy out.
But do you know why we remove BABIP from a pitcher, and use only FIP? Right, because by and large we care about PLAYERS not individual PLAYS. BABIP contains far more noise than signal, which is why in an all-or-nothing situation, you want to use nothing of BABIP. If you want to weight it, you'd want maybe 20% of BABIP, but that removes the cleanliness of FIP. This is why FIP exists, to provide that clean break. If someone wanted to merge FIP and BABIP, they can do so, giving full weight to FIP and say 20% weight to BABIP.
So, about that spray angle: obviously we have pull hitters and spray hitters. They must have different value right? An xwOBA metric that totally ignores the spray direction must have some bias?
Well, sorta, kinda, if you look at it myopically, and not at all if you look at it holistically.
In Ben's article, he did something very smart, which is break up players into 4 groups based on their spray tendency, from heavy pull to heavy spray. And he did it even smarter by focusing on airballs. A pulled groundball for example is not what we are talking about in terms of xwOBA missing out on HR down the line.
I asked him for two pieces of information. The first is a summary chart of his last chart for all batters, not just the group he noted. And, you can see a bias here. The pull hitters, when we look at their Air balls, have a .487 wOBA, while the xwOBA was only .473. That's a 15 point shortfall. And we see a larger effect for spray hitters, who, on air balls have a .474 wOBA, while their xwOBA was .492.
So, yes, he did find something. Myopically. Remember, we focused on airballs here. What we care about however is ALL batted balls. Are pull-hitters being biased against by xwOBA because we ignore their spray pattern?
I asked Ben for a chart for ALL batted balls as well. Well, here you go (looks like this is all their plate appearances, but no matter, since the K and BB values are equal in both). That bias shrinks all the way down to 2 or 3 points of wOBA, which is 1 or 2 runs. In other words, this is the FIP/BABIP story. BABIP has so much noise that in an all-or-nothing choice, you want to know none of it. And if you want to know some of it, it has really a small weight. And the same applies here: the spray direction has far more noise than signal, and so, you do not want to use it to evaluate players. Unless you severely underweight that data. And that's why xwOBA doesn't need the spray direction to evaluate PLAYERS.
This is a mostly math post, and I'll be using draft data. If you don't care about either, you won't like this post.
I needed some data. It wasn't important for the purpose of this post what that data is, I just needed to convey the general point that the earlier the round the more value. Anyway, so this was total future WAR by draft round. Again, not important whether this is career WAR, or WAR through age 30, or WAR before reaching free agency, or whatnot. Y'all can do that heavy lifting after I go thru what I want to show.
Ok, no surprise in terms of the general shape, but maybe there's surprise in the steepness? I dunno. Anyway, so the objective is to create a function to connect all those points.
What helps is if we turn all those values into a "share" of the total WAR. In this data, we have 5261 total WAR. Players in the first round have a total of 2613 WAR, which conveniently is almost exactly 50%. Round 2 players have 11%, and it goes down from there. The total is obviously 100%. This is how it looks.
We instinctively knew that a 2nd and 3rd round pick is worth less than a 1st and 4th. Given the choice, we'd take 1+4 over 2+3. This is a good example of where 1+4 <> 2+3. You get a similar thing with exit velocity, where 110+60 is worth more than 90+80.
Indeed, given that the 1st round pick has 50% of all the WAR, this chart suggests that 1 = 2+3+4...+19+20. That's right, having a 1st round pick is worth the same as all other 19 picks combined. I'd bet you didn't know that! Well, at least that's what this data is saying. You gotta tease it to figure out what else it might be saying.
Back to math. When I look at this data, the first place I go to is 1/x. So, it's a question of what constant to put in the numerator, and how to represent the denominator. Let's start with a simple function of: 0.278/Round. This is how that looks.
As you could have guessed, that first round is woefully undervalued by our first attempt. 0.278/1 is obviously 27.8%, and we needed to have 50%. In addition, the dropoff just isn't there either.
Let's try another attempt, this time, instead of x = Round, let's make it Round-squared. The numerator in this case is 0.626, so naturally, the 1st pick will come out to 62.6%. So, the 1st round pick should be somewhere between 1/x and 1/x-squared. However. Look at Round 2. In either scheme, the value is above the data.
So, there's something that is still off. We've been treating Round 1 as a value of 1, and Round 2 as a value of 2. But, what if we made Round 1 a value of 0.5 and Round 2 as a value of 1.5. In other words, the scheme would be 1 / (Round - 0.5) . In this case, the numerator is 0.2. This makes Round 1 worth 40% and Round 2 worth 13.3%. You can see how we're on the right track here.
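If you want to see where those numerators come from, here is a minimal sketch. It assumes a 20-round draft (the 19+20 above) and that each numerator is simply whatever makes the shares sum to 100% across all rounds. The names are mine, and the printed values can differ from the ones quoted above in the last digit because of rounding.

```python
ROUNDS = range(1, 21)  # assumption: 20 rounds


def normalizing_constant(denominator) -> float:
    """Numerator that makes the shares across all rounds sum to 100%."""
    return 1.0 / sum(1.0 / denominator(r) for r in ROUNDS)


# The three denominators tried in the text.
schemes = {
    "1/Round": lambda r: r,
    "1/Round^2": lambda r: r ** 2,
    "1/(Round-0.5)": lambda r: r - 0.5,
}

for name, denom in schemes.items():
    c = normalizing_constant(denom)
    round1 = c / denom(1)
    round2 = c / denom(2)
    print(f"{name}: numerator={c:.3f} round1={round1:.1%} round2={round2:.1%}")
```

Swapping in a different denominator, say Round minus some other offset, is a one-line change, which makes it easy to keep teasing the data.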
This is league-wide data, 2021-2023. LHH are "mirrored", so that all their pull data is on the left side of the chart, to match RHH. (click to embiggen)
At each launch angle level, the distance is higher the more you pull. It has to do with how well a ball is hit more than anything.
At 28 degree launch for example, distance is maximized when you aim for the LF/CF gap. The more you hook, or the more you slice, the more likely you mishit the ball (lower speed). There's also the effect of the spin of the ball (the more you square up, the more likely you have backspin, and the more you mishit, the more sidespin. Just think of how you golf.)
I've done Delta Maps using Distances by launch angle x speed and comparing to the league average, or showing wOBA changes year to year along the same lines.
Kyle Bland showed a really snazzy one by using launch angle and some derivation of spray angle, and comparing the frequency of the player (Bo Bichette in this case) to the league average. It's really nice. So, I did the same thing, not as nice, but, it is more accurate since I use the actual spray direction, as well as showing the numerical values. Make no mistake, if I was as talented as Kyle, I'd overlay what I just did with heat maps as well. I'm not, so I won't. (Click to embiggen.)
This shows how many batted balls Bo Bichette (2021-2023) has at that particular combination of launch angle and spray direction, compared to the league average.
A few notes here.
The top row is the spray direction, where -45 is 3B foul line and +45 is 1B foul line, with 0 up the middle
The left column is the launch angle from -90 (down the ground) to +90 (straight up), with 0 being horizontal to the ground
Any batted ball short of 10 feet is put into its own short distance basket, labelled above as Chop
Any batted ball that was caught in foul territory is under Foul
So a few more notes:
Bichette is a HEAVY groundball batter: in addition to all those reds you see in the grand column at +4, -4, and -12 degrees, there's the huge red of 42 more choppers than league average
You can also see the complete lack of popups, at 36 degrees of launch and higher
Similar to Kyle's snazzy chart, we can see a preponderance of groundballs hit to the 1B side, and a lack of popups to the left field
And far fewer foul outs than the league average
So, yeah, Kyle's presentation is brilliant, and we can tell a far better story by showing it relative to the league average. Thank you Kyle.
UPDATE: Here is Mookie Betts (click to embiggen)
Betts (2021-2023) is a big time flyball hitter. But good flyballs, not popups.
You can also see the complete lack of choppers, having 97 fewer than the league average. Betts has 149, and the league average is 97 above that, or 246.
He pulls all his line drives and flyballs, and really abandons the right side infield and short outfield.
A reminder that 28 degrees of launch angle is where you find most homeruns, though you can get them also at 20 degrees if you pull them enough
As we know, Coors helps batters tremendously with the carry of the ball, on the order of twenty feet, on 400 foot batted balls. However, Coors also happens to be the deepest park in MLB, sixteen feet deeper for homeruns. So, on the one hand, the environment adds 20 feet, while on the other hand, its configuration costs the batter 16 feet. The net effect is +4 feet. While that may not sound like much, each foot adds 3% HR, for an estimated +12% HR. Since 2020, Coors has been at +9%. So, that's a pretty good match in terms of actual HR being hit compared to expected based on the physical and environmental characteristics.
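The back-of-envelope estimate is simple enough to write down as a sketch. It assumes the effect is linear, at 3% HR per foot of net distance, which is how it's described above. The function and parameter names are mine.

```python
def expected_hr_change(carry_feet: float,
                       extra_fence_depth_feet: float,
                       hr_pct_per_foot: float = 0.03) -> float:
    """Expected change in HR rate from a park's carry and fence depth.

    Assumes a linear 3% HR per net foot, as in the text."""
    net_feet = carry_feet - extra_fence_depth_feet
    return net_feet * hr_pct_per_foot


coors = expected_hr_change(carry_feet=20, extra_fence_depth_feet=16)
print(f"{coors:+.0%}")  # +12%
```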
Here it is for all ballparks with GABP leading the way on one end, and Comerica on the other end (click to embiggen).
A typical batter will have about 1.85 swings per plate appearance, of which 90% are competitive swings (excluding half-swings and failed checked swings, etc). At 600 plate appearances, that comes out to 1000 competitive swings. Suppose you take a random sample of 100 swings? How representative of their true swing speed would that be? As you can imagine, it would be incredibly high. Now, what about 50 swings? 20? 10? What is the credibility level?
What I did was very straightforward: I took 100 random swings for a batter, and correlated to 100 other random swings for that batter. I did that for every batter with at least 200 swings. The correlation came in at r=0.98.
I ran this with 99 swings (for batters with at least 198 swings) and 98 and on and on down to 1 swing (min 2 total swings). Correlation at r=0.95 happened at only 33 swings. Correlation at r=0.90 happened at only 17 swings. Correlation at r=0.80 happened at only 7 swings.
Here's how the chart looks for every point from 1 to 100 swings (those are the blue dots). Click to embiggen.
The orange line is the regression amount, the ballast, the amount of league average swings to add. For you Bayesians out there: that's the prior amount you'd add to the Beta Distribution. As you can see, this number hovers at just under 2 swings. In other words, after 2 swings, the average swing speed of the batter in question is half-real. We can therefore say the Credibility Level is just under 2 swings.
The dotted line is the Reliability Level: swings / (swings + 1.8). While not as credible as pitch speed, swing speed is not far off.
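Here is a small sketch of that Reliability Level, along with how the ballast would be used to regress a batter's observed swing speed toward the league average. The 1.8 comes from the text. The league average and batter values in the example are made up purely for illustration, and the function names are mine.

```python
BALLAST_SWINGS = 1.8  # the regression amount from the text


def reliability(swings: float, ballast: float = BALLAST_SWINGS) -> float:
    """Share of the observed average that is signal: n / (n + ballast)."""
    return swings / (swings + ballast)


def regressed_swing_speed(observed_avg: float, swings: float,
                          league_avg: float,
                          ballast: float = BALLAST_SWINGS) -> float:
    """Add `ballast` league-average swings to the batter's own swings."""
    return (observed_avg * swings + league_avg * ballast) / (swings + ballast)


for n in (7, 17, 33, 100):
    print(n, round(reliability(n), 2))

# Hypothetical numbers: a batter averaging 76 mph over 5 swings,
# with a league average of 72 mph.
print(round(regressed_swing_speed(76.0, 5, 72.0), 1))
```

The loop should come out close to the correlations quoted above at 7, 17, 33, and 100 swings.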
Once you can hit a ball 430 feet, every extra foot is irrelevant. Hitting a ball 430+ feet is a HR, regardless of distance, and hence the wOBA value of 2.
When you hit a ball under 350 feet, well, adding distance, or SUBTRACTING distance, is about the same. When you hit a ball under 200 feet, every extra foot helps. But once you get to 220 feet, every foot HURTS. Until you get to 320-330 feet or so.
So, if you look at all batted balls 0 to 350 feet, as a group, it's basically immune to extra or lost distance. Adding a foot or subtracting a foot doesn't change anything.
The rapid acceleration happens at 350+.
Now, if you follow baseball, you can guess the reason: there's a gap between the infield and outfield. Infielders play up to 150 feet from home plate, while outfielders play starting at 280 feet from home, up to about 330 feet from home. So, you can get success between the infield/outfield, or beyond the outfielders (and/or beyond the fence).
When you hit a ball at 95 mph, at the ideal launch angle (roughly 24-32 degrees), that ball will travel about 350 feet. This is why the Hard Hit rate really starts at around 95mph. It's not arbitrary. 90 mph is not enough to get you to 350. And 350 really is a threshold that needs to be cleared. Naturally, 100 is better than 95 and 105 is better than 100. Just saying 95+ for hard hit is just a gateway to better understanding Exit Velocity.
And so, when you look at a ball having more or less carry because of wind or any other reason, it's players who hit the ball 350-430 feet that are going to be the most affected.
I love JT Realmuto. And it pleases me to no end that he would come out on top in various catcher metrics I've created during my time here at MLBAM working on Statcast. After I develop metrics, I always check to see how Kiermaier and Betts and Realmuto and so on do. He still comes out very well on throwing and blocking. And up until 2023, he was excellent in framing. 2023 however, he was very different.
When it comes to a very toolsy metric like what we have on Savant, the level of uncertainty is fairly low. Why is that? Because there is very little inference going on. With most fielding metrics, and really any of them pre-tracking, it's all about inferences. But here, we are simply reporting what was being measured or tracked. But, I know that seeing a number like -13 runs, when that is preceded by +7, +3, +4, 0, seems off.
Since I've started looking at the catcher locations on called pitches, I was interested in developing a new metric, Lunges. You know those pitches: the catcher is on one side of the plate, while the pitch is going the wrong way, so the catcher lunges to catch the errant pitch, even if it's in the strike zone.
The best catcher at Lunges (at least on 4-seam fastballs, RHH v RHP) is Matt Thaiss. You can see the description of the data I was going after in the previous article. In this one, I further limited it to pitches where the catcher was located on the inside part of the plate. He faced 21 pitches in the outside part of the strike zone, and all 21 were called strikes. That's well above the 85% for the league average. He caught 57% in the shadow area, where the league average is 38%. All in all, of the 58 pitches in these regions, he caught 34 that were called strike, while the expectation was 27. That's +7. Again, just limited to RHH v RHP on 4-seamers. Eventually, I'll make sure to cover everything.
Realmuto however. He only got 56% strikes on pitches clearly over the plate. When the league average is 85%, and JT is at 56%, that's certainly alarming. For pitches in the shadow area, JT only got 8% strikes (1 of 12), while Thaiss was at 57%. All in all, JT got 10 called strikes out of 36 pitches, whereas the league average is at 18. That's -8.
Now, I hear you, small sample size. Forget about: I hear you. I say it! You hear me. The larger point is that JT is at -13 runs for the season on all pitches. This is just one snippet to show where JT failed and where Thaiss succeeded. Given that Thaiss was -1 runs overall, this must mean that there were other areas where Thaiss did not do well. Lunges however, is where he did do well.
Now, off to watch some video of these two catchers to see if the eye test matches what we've just learned here.
Umpires are human. Catchers are human. Humans respond to stimulus.
The typical kinds of stimuli are light, heat, physical exertion. Everyone responds differently because everyone is different. The most important thing to remember when you apply sabermetrics is that people are human.
How do people respond to taking a snapshot of a 90 mile an hour 3-inch moving object? Exactly, everyone will be different. A pitcher will throw a ball with speed and movement. A batter will move a certain way before taking a pitch. The catcher will catch that ball a certain way. And an umpire, faced with all these stimuli, and using the batter's stance as a frame of reference, along with the home plate, will then make a judgement call as to whether this ball was thru the strike zone or not. This is hard to do.
Now, let's look at RHH facing RHP, at 4-seam fastballs that end up outside, two to three feet off the ground. I select that height so that the focus will be purely on side-to-side. At that height, it doesn't matter if it's Altuve or Judge batting.
For pitches that end up outside, I create three different regions of outside.
The first is pitches that are on the outside part of the plate. So, still a strike, but just barely. These are pitches that are 8 inches from the center, plus/minus 1.5 inches.
The second is pitches that are just outside of there, enough for part of the ball to maybe catch part of the strike zone. These are pitches 11 inches from the center of the plate, plus/minus 1.5 inches.
Finally, the third set is pitches just outside of that, and so are 14 inches, plus/minus 1.5 inches.
Got all that? Just put three balls stacked next to each other, starting with the outside part of the plate, and continuing going out from there. These balls are at 8, 11, and 14 inches from the center. A ball is almost 3 inches wide.
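As a sketch, the three regions can be written as a simple lookup. The centers (8, 11, 14 inches) and the 1.5 inch half-width come from the description above, with the half-width assumed to apply to the middle region as well. The labels, the function name, and the choice to put a pitch sitting exactly on a boundary into the region further out are mine.

```python
HALF_WIDTH = 1.5  # inches either side of each region's center

# Distance from the center of the plate, toward the outside, in inches.
REGION_CENTERS = {
    "outer part of plate": 8.0,
    "just off the edge": 11.0,
    "further outside": 14.0,
}


def outside_region(inches_from_center: float):
    """Return the region label for an outside pitch, or None if the
    pitch is not in any of the three regions."""
    for label, center in REGION_CENTERS.items():
        # Half-open interval, so adjacent regions never overlap.
        if center - HALF_WIDTH <= inches_from_center < center + HALF_WIDTH:
            return label
    return None


print(outside_region(8.0))   # outer part of plate
print(outside_region(11.2))  # just off the edge
print(outside_region(14.9))  # further outside
print(outside_region(3.0))   # None
```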
Does where the catcher position himself when catching an outside pitch matter, in terms of getting the called strike? (click to embiggen)
First, let's start with the easy one, the pitches that are 14 inches from the center, and should be 0% called strike rate. In reality, they are called strike 5% of the time. Indeed, whether the catcher is located on the inside or outside part of the plate is irrelevant: the pitch is far outside enough that it doesn't matter. It's a called strike 5% of the time. This is the blue line above.
Let's take the almost easy one, pitches that are over the outer part of the plate. When the catcher is positioned on the outside half of the plate, those pitches are called strike 98% of the time. It should be 100%, but 98% is pretty good. This is the red line above.
However. However, if the catcher is located inside, the called strike rate goes below 90%. This is those pitches that the catcher was expecting to be inside, the pitch goes outside, and the catcher has to dart out to catch. It looks ugly. And over 10% of the time, this is enough stimulus to fool the umpire. Umpires are human, just like you. 10% of the time they are wrong, and that's with years of experience. You'd be wrong even more.
Finally, let's look at those 50/50 pitches, the green line. This is where it matters the most. A catcher located on the outer half of the plate, and they get a pitch that is otherwise a 50/50 pitch will be called a strike 65% of the time. But a catcher located inside will instead only get a called strike 40% of the time. Remember, same location either way, but the catcher being outside will get the call 65% of the time, while being inside will get the call 40% of the time. Humans responding to stimulus. This is the result.
The angle is based on the ratio of the components of the velocity vector. You start by taking the z component (the up/down), and divide it by the remaining components (side to side and front-back). For the remaining components (x, y), you simply apply pythag: sum the squares, then take the square root. After you have that ratio, you take the arctan to give you the angle (in radians, which you can then convert into degrees).
Let's walk through an example. Suppose you have these velocity components (x,y,z): 4, -125, -20. The units won't matter, since they will cancel out, but if you must know, these are in ft/s. You start by combining the x,y components. The pythag of 4 and -125 is 125.06. As you can tell, virtually all (but NOT ALL) of the velocity is in the y direction, which is from mound to home. Naturally, there will be some amount of speed that is side-to-side, but when you are talking about pythag, a triangle, you can see the hypotenuse and the longest side are virtually the same.
Next, we take the downward value, -20 and divide by 125.06. That gives us -0.16. That's the ratio of the vertical velocity compared to the horizontal velocity. This number is naturally always going to be small.
Next, take the arctan of this small number, which gives you... a similar small number of -0.159. Again, when you are dealing with small numbers, the ratio of the sides and the angle (in radians) are going to be very very similar. They are basically both approaching zero. The closer the ratio is to 0, the closer the angle becomes 0.
The final step is to turn it into human numbers, converting radians into degrees. You do that by multiplying by 180/PI(), or 57.296. Fun fact: some software (I'm looking right at you BigQuery) doesn't have a PI() function, so you can use acos(-1), which is PI(). There's gotta be ONE BigQuery developer out in Google land that also likes sabermetrics. So, if you want to make me happy, just create a PI() function please. Anyway, -.159 in radians is -9 degrees.
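The steps above translate almost line for line into code. This is a minimal sketch using Python's standard math module, with a function name of my own choosing. It follows the arctan-of-ratio approach exactly as described.

```python
import math


def approach_angle_degrees(vx: float, vy: float, vz: float) -> float:
    """Vertical angle of the velocity vector, in degrees.

    vx is side to side, vy is mound to home, vz is up/down.
    Units cancel out, so ft/s or mph both work."""
    horizontal_speed = math.hypot(vx, vy)  # pythag of the x and y components
    ratio = vz / horizontal_speed          # vertical over horizontal
    radians = math.atan(ratio)
    return radians * 180.0 / math.acos(-1)  # acos(-1) is pi


angle = approach_angle_degrees(4.0, -125.0, -20.0)
print(round(math.hypot(4.0, -125.0), 2))  # 125.06
print(round(angle, 1))                    # -9.1
```

With more decimals the example works out to about -9.1 degrees, which is the -9 degrees above.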
I bring all this up because Eli noted that instead of calculating the VAA at plate crossing, we should be calculating the angle at the point where it correlates the most with the thing we are interested in, which in this case is whiff rate.
He ends up concluding that we should take the angle at 13 feet behind home plate. Now, why would that be? I think it's simply a question of variation, which is really what correlations are about. Here's an image from Alan Nathan's calculator that we extend to 20 feet beyond home plate, and 3 feet underground. The scales are obviously exaggerated. That red line, the release angle, is actually just minus 6 degrees. If I were to extend a tangent line starting at plate crossing, it'll be minus 9 degrees in this image (the VAA). If you look at the range in the release angles in reality, and compare it to the range in approach angles (at plate crossing), you will see the range is about 1.4X at plate crossing. So, wider variation, the more you take the timestamp away from release point. You can thank gravity for that. (Though when I look only at 4-seamers, the variation is greater at release.)
I think this is what Eli is capturing, some combination of speed and release angle, but I may be wrong. The next step really is to focus on common speeds, say look at all fastballs thrown at 93-94 for example. My expectation is that it wouldn't matter where you are going to measure the angle.
This is an exciting chart (click to embiggen). This data is limited to RHH v RHP, where the 4-seam fastball comes in 2 to 3 feet above the ground, and on the 1B side edge of the plate. Remember that: every pitch we will look at is an OUTSIDE pitch, one that lands on the 1B side of the plate. These pitches are called strikes 50% of the time. Of course, we know that how the catcher frames the pitch matters. And I'll prove it for those who still don't believe it.
The top line is the side/side location of the catcher's wrist. At -8 inches, that's the edge of the plate on the 3B side, and +8 inches is at the 1B side.
The column represents the number of frames prior to the ball crossing the plate. At 30 frames per second, this means -10 is one-third of a second prior to plate crossing.
All good so far? The percentages you see there are the eventual strike% call. So, let's start with the most obvious thing here, the first entry that shows 22%. What that says is that the catcher was set up on the inside part of the plate. Since we know that all of these pitches land on the outside part of the plate, these must have been major misses. You know when it looks like the catcher stabs at a ball, even if it's still in the strike zone, that it gets called a ball? Congratulations, we now have proof. Those pitches are called strikes only 22% of the time.
Now, look where the catcher is set up when the throw is perfect. At the -10 frame, we see that when the catching wrist of the catcher is 2 to 7 inches from the center of the plate, that pitch is called a strike 60% of the time. Which is of course way above the 50% average.
Now, the even more exciting part: look at the rows after the row labelled 0. Remember 0 is at plate crossing. So, at row 3, that's 3 frames after plate crossing. And when the catcher's wrist is 3 inches from the center of the plate (remember for a ball that is crossing the plate on the black), that pitch is called a strike 80% of the time. So, this is the catcher drifting his glove back toward the center.
What about the catcher that just sticks-the-landing? Well, look at that column 10, which means the wrist is 10 inches from the center, so beyond the black on the wrong side. Those pitches are called strike only 5% of the time, when the point is 5 frames after the plate crossing.
Basically, you really need the catcher to give the perception of the strike in order to get the strike. You can't just catch the ball and just stay there.
As we saw in Part 1 and again in Part 2, the gap in the final location of a pitch high/low (about 18 inches) is not equal to the gap in the initial location of the glove of the catcher (about 4 to 5 inches). While there is certainly a directional pattern, the magnitude is not there.
This is why those early attempts at trying to measure how well a pitcher hits their target were doomed to failure from the start, at least those that were focused on the 2D portion of the strike zone. The up/down location of the pitch just can't possibly compare in magnitude to the glove of the catcher, since the catcher is ALWAYS keeping his glove low. The very highest the catcher is going to try to hold his glove is two feet off the ground, which is still in the bottom half of the strike zone (the middle part of the strike zone is about 30 inches off the ground). So any pitch that comes in high will always look like it did NOT hit its intended spot. Except the glove is not the intended spot, but a reference point. You need to be able to convert the reference point into an intended point.
The side-to-side is a much different story. There is a 6 to 10 inch gap in the location of the glove side to side, based on whether the pitch is intended to be inside or outside. This is for a final actual location of the pitch of about 24 inches. While still not the 1:1 relationship, it's somewhat better than the high/low location of the glove. So, even if you are trying to measure command along ONE direction, the side/side, you still cannot just do a comparison as if the location of the glove is the intended location of the pitch. Again, even here, you need to be able to convert the reference point of the side/side glove target into an intended point.
Can we get there? Yes, it'll take a while. We have a lot of variables to consider. This one is the easy one, with the focus on 4-seam fastballs, where the pitch is the straightest in terms of the setup (though even 4-seam fastballs have some natural tail to them). What this really requires is a pitcher-by-pitcher understanding as to how catchers set up for each of their pitches, and how much the reference point represents the intended point. I suspect that some clubs already do this. We'll try to get this done as well, and hopefully get this data up for you to see.
Thank you for reading Part 1, which focused on inside pitches. Now let's look at how the catcher sets themselves up on outside pitches (click to embiggen).
Here we see a pretty muted pattern. First, we start with the green line, which shows no pattern at all in terms of left/right. So whether the pitch is intended to be high or low, the catcher does not set themselves up any different, on outside pitches. This is very different from inside pitches.
In terms of having the glove closer to themselves or the plate (the orange line): it's a bit over one inch, or about half the effect of inside pitches.
Finally, the blue line, how high/low the catcher sets his glove for pitches intended to be high/low: this gap is the most pronounced at almost 4 inches.
So, overall, it is similar to inside pitches in terms of total magnitude (4 to 5 inches), but directionally, on outside pitches, almost all of the shift is high/low.
When a catcher holds his glove down and inside, what does that mean? It probably means he wants the pitch inside, and probably low. But how does a catcher call for a pitch high and inside? As we know, the catcher is not going to hold the glove at his face.
The glove is really a reference point. And so, we have to try to infer the intended location based on the reference point. Can we do that? Yes.
We have the location of the catcher's left wrist (all catchers catch with their left hand) throughout the pitcher's delivery. So, what do I do with this information?
Well, I combine this information with the actual pitch location of every pitch. Those pitches I flag into five locations: high/low, inside/outside, or over the heart of the plate.
That's enough words, let's look at the data (click to embiggen). I limit the data for this presentation to RHH v RHP and 4-seam fastballs, on the idea that the catchers should set up similarly.
The data is captured at 30 frames per second. So, this data is from 2 seconds prior to pitch release to 1 second after pitch release. That's why you see the scale as -60 to +30.
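For those who want to map the frame numbers on the chart to seconds, here is a minimal sketch in Python. The 30 frames per second and the -60 to +30 range come from the text above; the function name is just illustrative.

```python
FPS = 30  # capture rate stated in the post


def frame_to_seconds(frame, fps=FPS):
    """Convert a frame offset (0 = pitch release) into seconds."""
    return frame / fps


# Negative frames are before release, positive frames are after.
for frame in (-60, -10, 0, 15, 30):
    print(frame, round(frame_to_seconds(frame), 2))
```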
Let's start on top with the orange line, which is how many inches the left wrist is behind the backtip of home plate. Our first interesting finding is that when the catcher is expecting a high pitch compared to low pitch that his glove will be behind the plate by about 2 to 3 inches more. In other words, his glove is 2 to 3 inches closer to himself, when the pitch is expected to be high.
How about the height of his glove? If you look at the final catch point, which is between frames 10 and 20, pretty close to frame 15, the high ball is caught when his wrist is almost 36 inches (almost 3 feet) above the ground. The low ball is caught with his wrist about 18 inches above the ground. Now, where was his wrist before the pitch is thrown? If we look at about 10 frames (one third of a second) prior to pitch release, we can see that the wrist is about 2-3 inches higher when the pitch comes in high, as opposed to coming in low.
So, let's stop there for a moment. These are pitches where the final location is about 18 inches apart, and yet the wrist is only 2-3 inches of difference. This makes it pretty clear that the glove is purely a reference point and not a final target. The catcher is just nudging his glove slightly, 2 to 3 inches, in order to get a pitch to have an 18 inch gap up/down. Even if the final location is exaggerated from its intended location, we're still talking about a large multiplier effect here. In other words, it's really hard visually to see if a catcher is indeed calling for a high pitch.
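To put a number on that multiplier effect, here is a quick sketch. It only divides the 18 inch gap in final pitch height by the 2 to 3 inch gap in wrist height quoted above; the function name is made up for illustration.

```python
def glove_multiplier(pitch_gap_inches, glove_gap_inches):
    """How many inches of pitch location each inch of glove shift stands for."""
    return pitch_gap_inches / glove_gap_inches


# An 18 inch gap in pitch height, from a 2 to 3 inch gap in wrist height.
for glove_gap in (2, 3):
    print(glove_gap, glove_multiplier(18, glove_gap))
```

So each inch the catcher nudges his glove stands for something like 6 to 9 inches of pitch height.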
Finally, how about side to side? Here we see about 3 inches of difference. The negative number means it's more inside. So, a high pitch will have the glove closer to the plate, and a low pitch will have the glove closer to the batter, all that by 3 inches of magnitude.
All in all, 2 to 3 inches of difference in each of the three dimensions gives us 4 to 5 inches in total. The high pitch has the glove further behind the plate, higher above the ground, and away from the batter. So if you are going to try to remember this, use the catcher's left knee and his chin as reference points. A low inside pitch means the glove moves toward the left knee, while a high inside pitch means the glove moves toward the chin.
Next time, I'll look at outside pitches, and how the catcher is set up for pitches high/low.
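One way to read the 4 to 5 inches is as a straight-line distance across the three dimensions. That is my assumption here, since the combining rule is not spelled out, and the specific inputs below are just picks from inside the 2 to 3 inch range.

```python
import math


def total_shift(behind_plate, height, side_to_side):
    """Straight-line (Euclidean) size of a glove shift, in inches.

    Treating the total as a Euclidean distance is an assumption.
    """
    return math.sqrt(behind_plate ** 2 + height ** 2 + side_to_side ** 2)


# Illustrative inputs: 2.5 inches behind the plate, 2.5 up/down, 3 side to side.
print(round(total_shift(2.5, 2.5, 3.0), 1))
```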
In March of 2021, in a series of tweets, I introduced the concept of Best Speed and Escape Velocity.
I also had some blog posts back in October of 2022 on the concepts of Best Speed and Escape Velocity when these metrics were published on Baseball Savant. And a few other posts like these around the same time. They are all worth re-reading. But if you want a quick summary:
As some of you may know, there is a problem when you average 60 and 110 mph batted balls, and compare it to two 85 mph batted balls. While both average 85 mph, you will get far more success with a 60+110 combo than 85+85 combo. Why is that? Because functionally, a 60 mph batted ball and an 85 batted ball produce similar results, while a 110 mph batted ball produces fantastic results. In other words, each MPH is not worth the same.
In addition, when you consider the talent it takes for a batter to hit a ball at 60 or at 85, it is virtually the same: both are mishits. So, on the field, they are worth about the same. And in terms of what they tell us about the batter, they are worth about the same. And so, we need to treat them similarly.
That's the idea behind Escape Velocity: you need to break through a certain threshold in order to do damage, both on the field, as well as to explain the talent of the batter. That threshold is close to 88 mph. Yes, I know, I know, what luck, that at 88 mph is when serious shtuff happens.
***
In the original formulation back in 2021, and when we introduced it to the Custom Leaderboard in Oct of 2022, Escape Velocity was set as MPH above 88. So, a 100 mph batted ball would count as +12. And naturally, anything below 88 was set to 0, since anything under 88 is essentially the same value. Not the same linear value that you are used to in Math, but the same value-value in a practical sense.
In this new re-introduction today, the floor is set at 88. So, a 100 mph batted ball counts as 100, but anything under 88 counts as 88. The magnitude of the scale hasn't changed, only where the scale starts. While I had a slight preference for the original scale (so 100 counts as +12), saber-thinker David Adler pointed out: everyone knows what 100 means. And, in the face of that simple logic, I gave up my slight preference in favor of the strong preference proposed by Adler. And so, Escape Velocity has been recreated as Adjusted Exit Velocity (or Adjusted EV), with a floor of 88. For you developers out there: greatest(88, exit_velo) instead of greatest(0, exit_velo - 88).
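For the non-SQL folks, here is the same idea as a small Python sketch, using the 60+110 versus 85+85 example from earlier. The function and variable names are mine, chosen for illustration.

```python
FLOOR_MPH = 88.0  # the threshold named in the post


def adjusted_ev(exit_velo):
    """New scale: anything under 88 counts as 88."""
    return max(FLOOR_MPH, exit_velo)


def escape_velocity(exit_velo):
    """Original scale: MPH above 88, never below zero."""
    return max(0.0, exit_velo - FLOOR_MPH)


def mean(values):
    return sum(values) / len(values)


combo = [60.0, 110.0]  # one mishit, one rocket
steady = [85.0, 85.0]  # two mishits

print(mean(combo), mean(steady))  # raw averages: both 85
print(mean([adjusted_ev(v) for v in combo]))   # 99.0
print(mean([adjusted_ev(v) for v in steady]))  # 88.0
```

The two scales differ by a constant 88 for every batted ball, so the gap between the two pairs is 11 mph either way.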
As for Best Speed: there is no change in its calculation, but it's been rebranded as EV50. It still works the same: the 50% hardest hit by a batter, and the 50% softest allowed by a pitcher are their "Best Speed". We average it out the same way. Except now you will see it as EV50. And to give it more prominence, you will now see it on the main Leaderboard.
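Here is a rough sketch of that calculation. It follows the description above (hardest half for a batter, softest half for a pitcher), but how an odd number of batted balls gets split is not stated, so rounding the half up is purely my assumption, as are the function name and the sample speeds.

```python
def ev50(exit_velos, hardest=True):
    """Average the hardest (batter) or softest (pitcher) half of batted balls.

    Rounding the half up for an odd count is an assumption, not a documented rule.
    """
    if not exit_velos:
        raise ValueError("need at least one batted ball")
    ordered = sorted(exit_velos, reverse=hardest)
    n_keep = (len(ordered) + 1) // 2
    kept = ordered[:n_keep]
    return sum(kept) / len(kept)


speeds = [60.0, 85.0, 100.0, 110.0]  # made-up batted balls
print(ev50(speeds))                 # batter view: hardest half
print(ev50(speeds, hardest=False))  # pitcher view: softest half
```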
And both of these stats are now part of the new default set on Custom Leaderboard. You no longer have to go hunt for those metrics. Though you may find you want to play with that Custom Leaderboard as there are hundreds of available metrics, both old school and new school, out there.
The really simple WAR method has Logan Webb at 4.7 WAR.
The simple WAA has Webb at +2.6 wins above average.
Since Webb pitched the equivalent of 24 full games, that's 15% of all innings for the Giants pitchers.
Let's take the 162 games of the Giants, and give out 92 games to the nonpitchers and 70 games to the pitchers. Since Webb has 15% of the innings, then he gets 70 x 15% or 10.5 games. This is his Game Space.
An average pitcher, given 10.5 games, would have 5.25 wins and 5.25 losses. Someone who is +2.6 wins above average would therefore have this record (what I call an Individualized Won-Loss Record, or The Indis):
7.85 - 2.65
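The steps to get to that record can be sketched as below. The 70 games, the 15% share, and the +2.6 all come from the text; the function names are just illustrative.

```python
def game_space(pitching_games, innings_share):
    """Games handed to one pitcher, from his share of the staff's innings."""
    return pitching_games * innings_share


def indis_record(games, wins_above_average):
    """Start from a .500 record over the Game Space, then add the wins above average."""
    wins = games / 2 + wins_above_average
    losses = games - wins
    return wins, losses


games = game_space(70, 0.15)             # 10.5 games
wins, losses = indis_record(games, 2.6)  # Webb at +2.6 WAA
print(f"{wins:.2f} - {losses:.2f}")      # 7.85 - 2.65
```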
What would a replacement level pitcher have as a record? Let's say it's around a .300 win%. So, a replacement level pitcher would have .300 wins x 10.5 games, or 3.15 wins. So, the replacement level pitcher is this:
3.15 - 7.35
And 7.85 is 4.7 wins above 3.15. And therefore, a 4.7 WAR.
You see, if both Bill James and Pete Palmer agreed to a two-dimensional representation of Logan Webb like so:
7.85 - 2.65
Then each of them could have decided on whatever baseline they want to compare to.
Pete said: compare to .500. Ok, so, 7.85 - .500 x 10.5 = 2.6 wins above average.
Bill in 1987 might have said: compare to .300. Ok, so 7.85 - .300 x 10.5 = 4.7 wins above replacement.
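In code form, the two views are the same function with a different baseline. This is only a sketch, with a made-up function name.

```python
def wins_above_baseline(wins, losses, baseline_win_pct):
    """Compare an individualized record to whatever baseline you prefer."""
    games = wins + losses
    return wins - baseline_win_pct * games


webb_wins, webb_losses = 7.85, 2.65
print(round(wins_above_baseline(webb_wins, webb_losses, 0.500), 1))  # Pete: 2.6
print(round(wins_above_baseline(webb_wins, webb_losses, 0.300), 1))  # Bill: 4.7
```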
And that's it. That's how we get an agreement on two diverging views: represent the data in two dimensions. Everyone is happy.
Now, how does Baseball Reference compare to Bill's original formulation of WAR? Bill, it should be noted, did the "plus 1" based on an educated guess. He actually didn't invent WAR for the purpose of creating WAR. No, he did it to compare Roger Clemens to Don Mattingly in 1986 (and retroactively Jim Rice and Ron Guidry in 1978). To do that, he came up with a reasonable replacement level. And plus 1 fit the bill (no pun intended) for that purpose.
When I introduced the WAR framework twenty years after that, there was a lot more work done. That's because I really needed to create something more expansive. Bill laid the groundwork, and now the rest of us just have to fit all the pieces. Others took that WAR framework and created their own implementation. If we look at 2023 pitchers on Baseball Reference, setting a value of 0.88 ERA above league average will get us to match on total pitching WAR for pitchers with at least 50 IP. Those pitchers had 408 WAR and doing +0.88 (instead of +1.00 like Bill originally proposed) gets us to 408 WAR as well. Is +0.88 better than +1.00? Let's table that for a few minutes.
Logan Webb for example had 216 IP, or the equivalent of 24 games. His ERA was 3.25. Since the league average ERA is 4.33, then +0.88 above that sets the zero-baseline as 5.21 ERA. So the WAR for Webb is simply this:
WAR = 24 * (5.21 - 3.25) / 10 = 4.7 WAR
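As a sketch, the formula above looks like this in Python. The inputs are Webb's numbers from the text. I am reading the division by 10 as roughly ten runs per win, which is my interpretation rather than something stated here, and the function and parameter names are illustrative.

```python
def simple_war(innings, era, league_era, baseline_gap=0.88, runs_per_win=10.0):
    """Really simple pitcher WAR: games times ERA below the zero-baseline, over 10."""
    games = innings / 9                      # 216 IP is 24 full games
    baseline_era = league_era + baseline_gap
    return games * (baseline_era - era) / runs_per_win


print(round(simple_war(216, 3.25, 4.33), 1))  # 4.7
```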
About twenty years ago, Keith Woolner in his VORP calculation (which is a forerunner to WAR, but in the form of runs) distinguished between relievers and starters. He reasoned that it was easier for a pitcher to have a lower ERA as a reliever than as a starter. He was right. This eventually led to me creating the Rule of 17, which basically states that a pitcher, while pitching in relief, will have 17% more strikeouts, 17% fewer HR, have a BABIP 17 points lower, and give up 17% fewer runs. Walks were flat (my theory is that walks are the cost of doing business, and everyone basically accepts a certain amount of walks for their pitching style). Was it exactly 17 in each of those cases? Yes, pretty close to it. So what this means is that if your replacement level was an ERA of 5.00 for a starting pitcher, it would be 17% less, or 0.85 less, as a relief pitcher.
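Here is the Rule of 17 as a small sketch, applied to a starter's line to get what he might look like in relief. The starter line below is made up for illustration (only the 5.00 ERA comes from the text), and so are the function and key names.

```python
def as_reliever(strikeouts, home_runs, babip, era):
    """Apply the Rule of 17 to a starter's numbers. Walks are left alone."""
    return {
        "strikeouts": strikeouts * 1.17,  # 17% more strikeouts
        "home_runs": home_runs * 0.83,    # 17% fewer HR
        "babip": babip - 0.017,           # 17 points lower
        "era": era * 0.83,                # 17% fewer runs
    }


# Hypothetical starter line, with the replacement-level 5.00 ERA.
line = as_reliever(strikeouts=100, home_runs=20, babip=0.300, era=5.00)
print(round(line["era"], 2))  # 4.15, or 0.85 less than 5.00
```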
What does Baseball Reference do? I'm not sure exactly, but a gap of 0.50 runs per game fits the bill. In any case, I'm just trying to show the basic idea that there are different standards. So, instead of +0.88, the standards are +0.54 for relievers and +1.04 for starters. And pitchers who throw in each have a sliding scale in-between.
For those interested, a simple way to get the rate between the two is to do this: GS*2 / (G + GS). So, a pitcher who throws only as a SP gets 100%, a pure RP gets 0%. If you start 10 games and pitch in 40 games, then that's 40%, which is 10*2 / (40 + 10). And so, the marginal baseline is 40% of 0.50, or 0.20. And therefore +0.54 + 0.20 = +0.74. In any case, just remember it's a range of +0.54 to +1.04.
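The sliding scale can be sketched like so. The +0.54 floor and the 0.50 gap are the best-fit values from above, not anything official from Reference, and the function names are my own.

```python
def starter_share(games_started, games):
    """GS*2 / (G + GS): 1.0 for a pure starter, 0.0 for a pure reliever."""
    if games + games_started == 0:
        raise ValueError("pitcher has no games")
    return 2 * games_started / (games + games_started)


def baseline_gap(games_started, games, reliever_gap=0.54, role_gap=0.50):
    """ERA above league average that marks the zero-baseline for this pitcher."""
    return reliever_gap + starter_share(games_started, games) * role_gap


print(round(baseline_gap(10, 40), 2))  # swingman: 0.74
print(round(baseline_gap(30, 30), 2))  # pure SP: 1.04
print(round(baseline_gap(0, 60), 2))   # pure RP: 0.54
```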
You can see therefore that Bill's educated guess of +1.00 worked pretty well for Roger Clemens (a starting pitcher) in 1986, when you compare to the +1.04 that Reference uses (or at least that I best-fit toward).
Anyway, what does this mean for Logan Webb? Instead of using +0.88 which led to this:
WAR = 24 * (5.21 - 3.25) / 10 = 4.7 WAR
We instead use +1.04, which leads to this:
WAR = 24 * (5.37 - 3.25) / 10 = 5.1 WAR
So, that's just the simple way to get there. There are of course other adjustments to consider, relating to fielders, park, opposing batters, and leverage. For Webb specifically, Reference says that comes out to an adjustment of 0.4 wins in his favor, so he ends up with 5.5 WAR. Some pitchers would obviously get an adjustment going the other way, all depending on their specific context.
Here's how the above simple WAR does in comparison to Reference WAR (click to embiggen). This is an r=0.94. The concluding part 3 is here.