Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Tuesday, January 09, 2024

Explaining WAR simply

In the 1987 Baseball Abstract, Bill James invented WAR. He didn't call it that, and it was not WAR per se, but it was 90% of the WAR game. His idea was ridiculously simple, simple enough that you can put it in a tweet. It went like this:

  • WAR = IP/9 * (lgERA + 1 - ERA) / 10

He actually didn't have that divide-by-10, so it's really RAR (runs above replacement) that he introduced.

A few years earlier, Pete Palmer introduced WAA (wins above average). It was even simpler:

  • WAA = IP/9 * (lgERA - ERA) / 10

Notice the difference? That "plus 1"? Yes, the entire dispute in the 1980s and 1990s between Bill James and Pete Palmer was that plus 1. Most of you who weren't around may not have realized that my heroes, the friendly group that started sabermetrics to begin with, could not see eye to eye here.

The solution that would have put them on the same page would have been to turn their WAR and WAA creations into Individualized Won Loss Records. Had they done that, they would have not only been on the same page, they could have likely co-authored books. But alas, it was not meant to be. Their dispute basically opened up the doors for the rest of us to fill in the gap.

Anyway, where was I? Oh, yes, explaining WAR simply. First, let's explain Wins Above Average. So, I said this:

  • WAA = IP/9 * (lgERA - ERA) / 10

Let's break that down. What is IP/9? That is simply the number of games. Throw 9 innings, and that counts as 1 full game. Throw 90 innings, and that's the equivalent of 10 full games. 180 innings is 20 games, and 225 innings is 25 games. We don't really care if 225 innings is spread out over 25 actual games or 35 actual games. We're just interested in full-time equivalent games. Ok, so, that's IP/9.

What is lgERA - ERA? Well, ERA is earned runs allowed per 9 innings (or per full-time equivalent game). lgERA is the League ERA. So, lgERA minus ERA is how much better a pitcher's ERA is compared to the league. If the league ERA is 4.00 and the pitcher's ERA is 2.50, then the pitcher is +1.50 ERA better than the league, or +1.5 runs per game better than the league.

So, when we multiply IP/9 (or number of games) by lgERA - ERA (or runs above average per game), we are left with total runs above average. A pitcher with 180 innings (or 20 games) with a 2.5 ERA in a 4.00 league (or +1.5 runs per game) is worth 20 x 1.5 = 30 runs above average.

Finally, that divide-by-10 is to convert runs into wins. Being 30 runs above average is equivalent to 3 wins above average. Or 3 WAA.

Now, you might see the problem with WAA. Actually, it's not a problem, but a limitation. When you are below average, but still eating innings, you will have a negative number. And the more innings you eat, the more negative your number. Given that teams pay a lot of money for an average pitcher, and still a substantial amount for a below-average pitcher, showing a negative number in the form of WAA for a pitcher who is actively contributing seems odd. That was Bill's point.

Pete chose league average as his "zero" point. That's how it works when you compare to average. But, we want zero to represent something else: no value. And to do that, you have to figure out the point at which a pitcher is so bad that he is actually not contributing anything. And that's where that "plus 1" in Bill's version comes into play.

Pete had this:

  • (lgERA - ERA)

Bill proposed this:

  • (lgERA + 1 - ERA)

So, instead of comparing a pitcher to the league average, he is being compared to 1 run above the league average. Our 2.50 ERA pitcher, instead of being compared to the 4.00 league ERA is instead compared to the 5.00 "zero baseline" ERA, or the replacement-level ERA. He is now +2.5 runs above replacement. Times those 20 games, and he is +50 runs above replacement (RAR), or 5 WAR.

A league average pitcher with 180 innings would have a 0 WAA and a 2 WAR. How did I get to 2 WAR? 180 innings is 20 games. And his 4.00 ERA, which matches league average, is +1 runs better than replacement. 20 times 1 is 20 RAR, or 2 WAR.
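To make the arithmetic concrete, here is a minimal Python sketch of the two formulas. The function names and the `baseline_offset` parameter are my own labels for illustration, not anything Bill or Pete used; the 10-runs-per-win divisor and the plus-1 replacement offset come straight from the formulas above.

```python
def runs_above_baseline(ip, era, lg_era, baseline_offset=0.0):
    """Runs saved relative to a baseline ERA of (lg_era + baseline_offset).

    baseline_offset=0.0 compares to league average (Pete's version);
    baseline_offset=1.0 compares to replacement level (Bill's version).
    """
    games = ip / 9  # full-time equivalent games
    return games * (lg_era + baseline_offset - era)


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs to wins using the divide-by-10 rule of thumb."""
    return runs / runs_per_win


# The 2.50 ERA pitcher with 180 innings in a 4.00 league
waa = runs_to_wins(runs_above_baseline(180, 2.50, 4.00))
war = runs_to_wins(runs_above_baseline(180, 2.50, 4.00, baseline_offset=1.0))

# The league-average pitcher with 180 innings
avg_waa = runs_to_wins(runs_above_baseline(180, 4.00, 4.00))
avg_war = runs_to_wins(runs_above_baseline(180, 4.00, 4.00, baseline_offset=1.0))

print(waa, war, avg_waa, avg_war)
```

The only thing that changes between WAA and WAR here is the baseline, which is the whole point of the plus 1.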

That's it! 90% of WAR can be explained by this:

  • WAR = IP/9 * (lgERA + 1 - ERA) / 10

Next time, I'll show you the core basics of WAR as you see it on Baseball Reference, and creating separate standards for starting and relief pitchers.  (Update: click here for part 2.)

Friday, January 05, 2024

To the sublime CoreWOBA from the ridiculous OPS

One of the saber stars from twenty years ago pointed out that you can use calculus to make sense of the ridiculous mixing of denominators of OBP and SLG that OPS insists upon.  Using his result, it is the same thing as saying this:

  • BB = 1
  • 1B = 1 + PA/AB
  • 2B = 1 + 2*PA/AB
  • 3B = 1 + 3*PA/AB
  • HR = 1 + 4*PA/AB

That "PA/AB" is the ridiculous part of OPS.  Now, watch what happens when I change all those "1" to "2", and remove that PA/AB (meaning we just treat PA/AB = 1).

  • BB = 2
  • 1B = 2 + 1
  • 2B = 2 + 2
  • 3B = 2 + 3
  • HR = 2 + 4

Which of course is this:

  • BB = 2
  • 1B = 3
  • 2B = 4
  • 3B = 5
  • HR = 6

The above is the numerator of CoreWOBA.  
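Here is a rough Python sketch of the two sets of weights. The function names and the sample stat line are hypothetical, made up for illustration; only the weights themselves come from the lists above. The sketch stops at the numerator, since that is all that is described here.

```python
# Total bases earned by each event; a walk earns none.
TOTAL_BASES = {"BB": 0, "1B": 1, "2B": 2, "3B": 3, "HR": 4}


def ops_implied_weights(pa_per_ab):
    """Event weights implied by OPS: 1 for reaching base, plus
    total bases scaled by the batter's own PA/AB ratio."""
    return {event: 1 + tb * pa_per_ab for event, tb in TOTAL_BASES.items()}


def core_woba_weights():
    """CoreWOBA numerator weights: change the 1 to a 2, treat PA/AB as 1."""
    return {event: 2 + tb for event, tb in TOTAL_BASES.items()}


def core_woba_numerator(counts):
    """Weighted sum of a batter's event counts."""
    weights = core_woba_weights()
    return sum(weights[event] * counts.get(event, 0) for event in weights)


# Hypothetical stat line, for illustration only
season = {"BB": 60, "1B": 100, "2B": 30, "3B": 3, "HR": 25}
print(core_woba_weights())
print(core_woba_numerator(season))
```

Notice that the OPS-implied weights change from batter to batter because PA/AB does, while the CoreWOBA weights are the same for everyone.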

Why settle for the monstrosity that is OPS when CoreWOBA makes it so nice and simple? Yes, yes, I know what you are going to say.  That's called: Inertial Reasoning.  I've been hearing the arguments for twenty years.  They sound no different today than they did then.

Monday, January 01, 2024

Explaining how the worst-sequenced batting lineup will only cost you 0.25 runs per game

Suppose you have the best-sequenced batting lineup, with the following batters and their wOBA

  1. 0.400
  2. 0.370
  3. 0.350
  4. 0.340
  5. 0.330
  6. 0.320
  7. 0.310
  8. 0.300
  9. 0.290

Each batter gets 1/9th of a plate appearance (PA) more per game than the next batter. So, we have these number of PA per game:

  1. 5.000 0.400
  2. 4.889 0.370
  3. 4.778 0.350
  4. 4.667 0.340
  5. 4.556 0.330
  6. 4.444 0.320
  7. 4.333 0.310
  8. 4.222 0.300
  9. 4.111 0.290

Now, let's swap the first batter and the last batter. What happens? Well, instead of 5 PA going to the .400 wOBA batter, 5 PA will go to the .290 wOBA batter. That's a drop of .110 wOBA, or .110/1.2 = .092 runs, per PA. For 5 PA, that's a drop of .458 runs. However, our .400 wOBA batter will now increase the 9th place by .092 runs per PA, albeit for only 4.111 PA. So, that's .377 more runs for that slot. The swap of the 1st and 9th batters will give us a total net runs value of .081 runs.

Do the same for the 2nd/8th batters, and we have a net change of .039 runs. The 3rd/7th is .015 runs, and the 4th/6th is .004 runs. That gives us a total of 0.14 runs of change, based purely on the shifting of talent. The real change is a little bit more than that, as the number of total PA will be less. The synergy effect of that change will be less than 0.14 runs. And so, the expectation will be a change of 0.2 runs, maybe 0.25 runs per game.
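The swap arithmetic can be sketched in a few lines of Python. This is a minimal sketch that only covers the linear talent-shifting part (the 0.14 runs); it does not model the lost plate appearances or the synergy. The names are mine, and the 1.2 divisor is the wOBA-to-runs conversion used above.

```python
WOBA = [0.400, 0.370, 0.350, 0.340, 0.330, 0.320, 0.310, 0.300, 0.290]
PA = [5 - slot / 9 for slot in range(9)]  # 5.000, 4.889, ..., 4.111
WOBA_SCALE = 1.2  # wOBA points per run, per PA


def swap_loss(i, j):
    """Runs per game lost by swapping the batters in slots i and j (0-indexed)."""
    return (PA[i] - PA[j]) * (WOBA[i] - WOBA[j]) / WOBA_SCALE


def lineup_runs(woba_by_slot):
    """Linear run value of a lineup: each slot's PA times its batter's runs per PA."""
    return sum(pa * w / WOBA_SCALE for pa, w in zip(PA, woba_by_slot))


pair_losses = [swap_loss(0, 8), swap_loss(1, 7), swap_loss(2, 6), swap_loss(3, 5)]
total_loss = lineup_runs(WOBA) - lineup_runs(WOBA[::-1])

print([round(x, 3) for x in pair_losses])
print(round(total_loss, 2))
```

Reversing the whole lineup is the same as doing the four swaps at once (the 5th batter stays put), so the two calculations agree.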

(3) Comments • 2024/01/02 • Batting_Order

Saturday, December 30, 2023

Introducing Predictive wOBA

Hitting a 450 foot HR is very indicative of a batter's talent. It shows that he has raw power and it shows that he can really put the barrel on the ball.

Hitting a 110 mph high popup to an outfielder for an easy out is also a good indication of a batter's talent. It shows that he has raw power and that a small mistiming is what kept him from hitting a 450 foot HR. This is what is called a Major League Out. For that particular PLAY, an out is an out, and is always bad. For that particular PLAYER, a Major League Out is almost a HR.

Similarly, while a Texas Leaguer that clears the infield and lands in front of the outfield is always good for that PLAY, it is not a good indication of talent for that PLAYER.

So, let's talk about Expected wOBA and Predictive wOBA. Expected wOBA is the expected value of that PLAY. A Texas Leaguer is going to have a near 100% hit probability, and so if you have a low launch speed and a high launch angle, you are going to get an Expected Hit Probability that will approach 100%. It will explain that PLAY in RETROSPECT. It is much (much much) better to think of Expected Value in Retrospect, as in: The Expected Hit Probability WAS... Obviously, the word Expected can be used in both backward (expected was) and forward (expected will be), and so is confusing and ambiguous. The x-stats that you see, whether at Savant or anywhere else on the Web, are almost always meant to be retrospective, and simply measuring the PLAY.

Predictive wOBA is different. Predictive wOBA is tied to the PLAYER and not the PLAY. This is a critical distinction to make. A Major League Out is a much better outcome in describing the talent of a player than a Texas Leaguer Hit. When you see a Major League Out, as a fan, you should be disappointed, but as a scout, you should be elated. And the reverse when you see a Texas Leaguer Hit. The distinction between Expected wOBA (describing the PLAY) and Predictive wOBA (describing the PLAYER) is what you need to constantly remind yourself of.

Let me show you three charts (click to embiggen). The first maps the current season's actual wOBA (actually it's wOBAcon, since we are only looking at batted balls, or Contacted plate appearances) to next season's wOBA (wOBAcon actually), for the 2020-2023 seasons, minimum 100 batted balls. Why do we compare to next season's wOBA? Because that is a (mostly) unbiased estimate of a batter's talent. That is a correlation of almost r=.5, which is pretty good. You can see that the slope of the estimate is close to .5, which means that next season's wOBA can be estimated as half-way between the current season's wOBA and the league average. So, a HR (wOBA of 2.000) in 2022 will indicate a wOBA of 1.180 of talent in 2023 (half-way between 2.000 and league average of around .360).

As we know, Actual Outcomes are filled with vagaries of the fielders and the park and the ball and on and on. This is why we prefer Expected wOBA over Actual wOBA. Expected wOBA is focused on those launch characteristics most in control of the batter (launch angle and speed), without worrying about whether the ball carries for 300 feet or 320 feet, or is pulled at 20 degrees or 30 degrees, or how well the fielding alignment is positioned or how well the fielder reads the ball. All of those variables are what turn an Expected wOBA into an Actual wOBA. And Expected wOBA describes a batter's talent better than Actual wOBA, which you can see by the correlation having an r above .55. The slope of the line also suggests that you would weight the Expected value at 58%, and the league average at 42%.  Every combination of Launch Angle (from minus 90 to plus 90) and Speed (from 0 to 125) has a distinct Expected wOBA value.  The chart is obviously massive at 181 x 126 entries.
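Both of those estimates are the same operation: a weighted blend of the observed value and the league average. Here is a minimal sketch, assuming the league wOBAcon of around .360 quoted above; the function name and the .450 sample value are hypothetical.

```python
LEAGUE_WOBACON = 0.360  # approximate league average, per the text


def estimate_talent(observed, weight, league=LEAGUE_WOBACON):
    """Blend an observed value with the league average.

    weight is the share given to the observed value (the slope of the line);
    the remainder goes to the league average.
    """
    return weight * observed + (1 - weight) * league


# Actual wOBA: slope of about .5, so a HR (2.000) lands half-way to league average
hr_talent = estimate_talent(2.000, 0.50)

# Expected wOBA: 58% on the expected value, 42% on the league average
# (.450 is a made-up Expected wOBAcon, for illustration)
x_talent = estimate_talent(0.450, 0.58)

print(round(hr_talent, 3), round(x_talent, 3))
```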

Now we come to the star of the show, Predictive wOBA. How does it work? We break up a batter's launch characteristics of each play, first along Launch Angle, into three categories. We have the Ideal Launch Angle, the Sweetspot range, of 8 to 32 degrees. We have launch angles above 32 and launch angles below 8 degrees.

Then we break up a batter's Launch Speed into four categories: 105+, 100 to 104.999, 95 to 99.999, and under 95 mph.

This gives us 12 combinations of speed and angle. For each combination, we get a Predictive Value (analogous to Expected Value, but in terms of the PLAYER, and not the PLAY). Here are those values.

So, when a batter gets his Major League Out, or any batted ball at over 32 degrees of launch at 100+ mph, that has a Predictive wOBA value of .838. This is one of the BEST things a batter can do, as it indicates TALENT. That's what Predictions are: an estimate of the TRUE TALENT of a PLAYER. That perfect hit, a ball hit at the Ideal Launch Angle of 8 to 32 degrees, at 105+ MPH: that is only SLIGHTLY more indicative of talent than a Major League Out: that has a Predictive wOBA value of .867.

The worst thing a batter can do is a high launch angle and low-speed, as that has a true talent value, a Predictive wOBA value of .206.
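The binning itself is easy to sketch in Python. Treat this as a rough illustration under stated assumptions: the function names are mine, I am treating exactly 8 and exactly 32 degrees as inside the Sweetspot range, and the lookup holds only the three values quoted in the text. Placing the .206 value in the above-32, under-95 bin is my reading of "high launch angle and low-speed", not something the text pins down.

```python
def angle_bin(launch_angle):
    """Three launch-angle categories (degrees)."""
    if launch_angle > 32:
        return "above"
    if launch_angle >= 8:
        return "sweetspot"
    return "below"


def speed_bin(launch_speed):
    """Four launch-speed categories (mph)."""
    if launch_speed >= 105:
        return "105+"
    if launch_speed >= 100:
        return "100-104.999"
    if launch_speed >= 95:
        return "95-99.999"
    return "under 95"


# Only the values quoted in the text; every other bin is left out on purpose.
QUOTED_VALUES = {
    ("above", "105+"): 0.838,         # over 32 degrees at 100+ mph
    ("above", "100-104.999"): 0.838,  # same value: this bin stretches across two
    ("sweetspot", "105+"): 0.867,     # the perfect hit
    ("above", "under 95"): 0.206,     # ASSUMPTION: the high-angle, low-speed bin
}


def predictive_value(launch_angle, launch_speed):
    """Predictive wOBA for one batted ball, or None if the bin is not quoted."""
    return QUOTED_VALUES.get((angle_bin(launch_angle), speed_bin(launch_speed)))


print(predictive_value(40, 110))  # a Major League Out
print(predictive_value(20, 106))  # the perfect hit
```

With the full table filled in, a batter's season value would just be the average of `predictive_value` over all of his batted balls.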

Once we apply the Predictive wOBA on each batted ball for every player and aggregate it at the season level, we can then compare to the next-season's Actual wOBA. And this is what we get: a correlation of r=0.61. Indeed, if you include all three measures, the Actual wOBA, the Expected wOBA and the Predictive wOBA, we STILL get an r=0.61 to next season's wOBA. This suggests that creating these 12 bins (it's actually 8, as there are some bins that stretch beyond one bin) is sufficient to describe a batter's profile.  And we can completely ignore a batter's Actual wOBA as well as their Expected wOBA.

Can you improve this? The next step really to make it truly predictive is to also incorporate the amount of batted balls in the sample. The higher the number, the more indicative the outcomes are. But, we'll save that for a future thread.

So, we no longer need to compare Expected wOBA to Actual wOBA and talk about luck or Random Variation as being the distinguishing feature as to why the Expected wOBA diverged from Actual wOBA. No. What we actually care about is Predictive wOBA, and we don't even care about Actual wOBA any more, not if we care about the True Talent of our batter.

Next time, I'll repeat this process for Pitchers. I haven't done it yet, so I'm just as curious as you are.

(37) Comments • 2024/01/12 • Batted_Ball Statcast

Thursday, December 28, 2023

Improving WAR - Re-solving DIPS (part 2)

From 2016 to 2023, one of the pitchers with the lowest BABIP in MLB is Justin Verlander. With 3196 balls in play, he has allowed 142 fewer hits than league average. That is a substantial number, by far the highest number in that time frame. In second place is Kershaw, at 95 hits better than average. In third place is Scherzer, at 83 better than average.

This seems like a perfect refutation of Voros and DIPS, which we discussed in Part 1. If I asked you to name the three best pitchers since 2016, Verlander, Kershaw, and Scherzer could very well make up that top 3. So, that potentially the three best pitchers in MLB also happen to have the best rates of hits on balls in play is not noteworthy in the least. The next names on the list however are Julio Teheran, Cristian Javier, John Means, Tony Gonsolin, Yusmeiro Petit, and on and on it goes. deGrom is 114th out of 654 pitchers. Gerrit Cole is 89th. Aaron Nola is 466th. Wheeler is 296th. The second WORST pitcher on hits on batted balls also happens to be the 8th best in FIP-based WAR: Kevin Gausman. These 8 pitchers by the way lead in WAR on Fangraphs, a metric that ignores balls in play. Looked at holistically, this better describes the original issue Voros found: how much attribution can we possibly give the pitcher on balls in play?

Part of the problem we have is how I even introduced it in the first paragraph. I said Verlander is among league-lows in BABIP. But more accurately, we should say that Verlander AND HIS FIELDERS are among the league-leaders. We can't just bypass his fielders. And we still have the issue that so much of what happens on batted balls goes beyond the pitcher and his fielders and their park. Random Variation weighs heavily, in ways that you don't see in other stats.

Ok, let's get into it. We can use Statcast data to directly determine the contributions of the pitcher. We can look at their launch angle and speed to determine how effective they are. When we do that, we can see that Verlander gives up a lot of soft batted balls, to the point that he ALSO happens to be the best pitcher in baseball since 2016 on launch-based hits on balls in play. Well, take that Voros! Except, well, the magnitude is not there. When we look at Verlander and his fielders, their BABIP suggests 142 fewer hits than league average. But when we look at Verlander and his allowed launch angle and speed, that suggests 76 fewer hits than league average. The breakdown for Verlander looks like this:

  • +22 fielders with Verlander
  • +76 Verlander using launch angle+speed
  • +44 everything else, including Random Variation
  • ====
  • +142 Verlander's team, when Verlander is on mound

Here is Kevin Gausman:

  • -15 fielders with Gausman
  • -68 Gausman using launch angle+speed
  • -2 everything else, including Random Variation
  • ====
  • -85 Gausman's team, when Gausman is on mound

Gausman is interesting in that we can explain the ENTIRETY of the poor BABIP with himself and his fielders. He's had the bad luck of having poor fielding behind him, 15 hits worse than average. But the rest of the outcome is because of Gausman himself. Relying on FIP-only for Gausman would not be a good idea.

We can do this for every pitcher. I will show you a chart (click to embiggen), that shows, on the x-axis, how well each pitcher, and their team, do, compared to league average. You can see Verlander on the far-right and Gausman on the far-left.

On the y-axis is the direct contribution of each pitcher. In some cases, the two correspond, like with Verlander and Scherzer and Kershaw and Gausman. In other cases, there is little overlap. Take for example Adam Wainwright:

  • +10 fielders with Wainwright
  • -75 Wainwright using launch angle+speed
  • +26 everything else, including Random Variation
  • ====
  • -39 Wainwright's team, when Wainwright is on mound

This is a mess to resolve. Wainwright has been hit very very hard. Indeed, he's been league-worst, using launch angle and speed, at 75 more hits allowed than league average that we can directly attribute to his launch characteristics. He's had the good fortune of playing with good fielders. When he was on the mound, they made 10 more outs than average fielders would have. There were another 26 extra outs that we can't attribute to the pitcher or his fielders. Whether this is Random Variation, or it's the fielding alignment mandated by his coaches, or Wainwright somehow managing to get more balls hit closer to his fielders, we can't really tell. All in all, Wainwright's team, with Wainwright on the mound, only gave up 39 more hits than league average.
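The three breakdowns all follow the same bookkeeping: take the team total, subtract the fielders and the pitcher's launch-based number, and whatever is left is "everything else". A minimal Python sketch, with names of my own choosing, using the numbers above (positive means fewer hits than league average):

```python
def everything_else(team_total, fielders, pitcher):
    """Hits vs. league average left over after crediting fielders and pitcher."""
    return team_total - fielders - pitcher


# (team total, fielders, pitcher via launch angle+speed), from the breakdowns above
BREAKDOWNS = {
    "Verlander": (142, 22, 76),
    "Gausman": (-85, -15, -68),
    "Wainwright": (-39, 10, -75),
}

residuals = {
    name: everything_else(team, fielders, pitcher)
    for name, (team, fielders, pitcher) in BREAKDOWNS.items()
}

for name, leftover in residuals.items():
    print(f"{name}: {leftover:+d} everything else, including Random Variation")
```

The leftover is a residual, not a measurement, which is why it is the hardest piece to attribute to anyone.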

Sometimes you get into issues like Zach Eflin, who is better than league average on how hard he is hit, and yet is worse than league average when he is on the mound with hits on balls in play. Do we really want to attribute to Eflin things that he has no control over, simply because he happens to be on the mound when those bad things happen? Why not attribute some of that to his fielders, who are equally not-complicit, but are equally present? Or maybe, stop attributing things that we don't know who to attribute to, simply because we've identified they are present? Sins of the Father and all that.

This is what it looks like if you compare the direct contribution of the pitcher, using their launch angle and speed, to the "everything else" I've been talking about. As you can see, virtually no correlation. In other words, after having identified the contribution of the pitcher directly by how hard they are being hit, whatever is left over has no association to that. Whatever is left, which is going to be mostly Random Variation, has likely very little to do with that pitcher.

When you look closely at the first chart, we can come up with the general point: about half of the results we can attribute to the pitcher. In some cases more, in some cases less. In some cases, there's a reverse effect (like Eflin). But, if we simply use as our starting point that we'll count half of the outcome and give it to the pitcher, then we've taken a big step forward in better attributing outcomes to the underlying contribution.

Should we completely ignore hits on balls in play? No. The pitcher is not a pitching machine. There is some influence there.

Should we completely accept hits on balls in play? Also no. The pitcher is not in total control here. There's a lot happening that has no bearing on the pitcher.

Should we split the difference, give them half, and move on? For the pre-Statcast years: yes. Without any additional information as to their direct impact, then we have to infer their impact. And it's about half of what you see. Basically, BABIP is somewhere between Pitching Machine 4587 and that pitcher. And that's how much attribution we should give the pitcher.  In Statcast years, we have more information, and so we can better attribute the impact of the pitcher to the outcome when they are present.

We can of course be a little fancier, and figure out fielder influence as well, but that's a story for another thread.

Improving WAR - Resolving DIPS (part 1)

Twenty years ago, Voros shook the saber community with one of the most important saber discoveries to that point, and still a top ten discovery of all saber-time. He called it DIPS, or Defense-Independent Pitching Statistics. My tiny contribution to that was FIP, which is merely a shortcut to the full-fledged DIPS. Had I not invented FIP, Voros would have eventually created it anyway.

The illustration that Voros provided was extremely compelling. In 1999 and 2000, Pedro Martinez had perhaps the greatest stretch of two pitching seasons ever, in the history of baseball. It's difficult to even decide which of the two seasons was the better one. His ERAs were 2.07 and 1.74, and this is in the middle of the high scoring era. He had 313 strikeouts in one of the seasons and 284 in the other. And this is while pitching only 213 and 217 innings each season. In the season where he gave up 32 more hits, he also gave up 8 fewer HR. All in all, it's hard to decide which of the two seasons was better, and in any case, the two stood together as perhaps the best back-to-back pitching seasons.

What did Voros point out? If you remove the strikeouts and homeruns, and compare the non-HR hits to all remaining batted balls, what he called BABIP (batting average on balls in play), Pedro had a .236 BABIP, among the league lows, in one season and a .323 BABIP, among the league highs, in the other season. This seemed ridiculous on its face. How could perhaps the greatest pitcher ever, having one of his two best pitching seasons ever, allow hits on balls-in-play at a close to league-high rate? And how did he pair that up with a league-low rate in the other season?

This would suggest that allowing non-HR hits on balls-in-play might be pretty random. After all, Pedro would not pair a league-leading strikeout rate one season with a league-low strikeout rate another season and STILL be one of the best pitchers ever. You couldn't do that with walks either, or homeruns. It just doesn't work like that. But, non-HR hits on balls-in-play? Well, it happened. And it wasn't just Pedro either. While pitchers had a fairly stable SO, BB, HR year to year, their BABIP fluctuated greatly.

In retrospect, we should have known. Because Random Variation would have told us. But, no one ever looked, not until Voros. The key point of his discovery is that Voros created the denominator: balls in play. That was the key. Once that was done, then you could apply basic statistical principles to determine how much Random Variation could have impacted BABIP. Assuming 500 balls in play, then one standard deviation was roughly 0.46 divided by root-500 or 20 points. Two standard deviations is 40 points. So, going from 2 standard deviations worse than average to 2 standard deviations better than average is not that noteworthy from a performance standpoint. Look hard enough, and someone will do that year after year. In 1999-2000, that just happened to be Pedro. Even Pedro was subject to Random Variation.
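That standard deviation is just the binomial formula. Here is a minimal sketch, assuming a league BABIP of .300 (my assumption, chosen because the square root of .300 times .700 is about 0.46, the figure used above); the function name is also mine.

```python
import math


def babip_sd(balls_in_play, league_babip=0.300):
    """One standard deviation of BABIP from Random Variation alone,
    treating each ball in play as an independent hit-or-out trial."""
    return math.sqrt(league_babip * (1 - league_babip) / balls_in_play)


sd_500 = babip_sd(500)
print(f"1 SD: {sd_500 * 1000:.0f} points")
print(f"2 SD: {2 * sd_500 * 1000:.0f} points")
```

Because the denominator sits under a square root, it takes four times as many balls in play to cut the Random Variation in half.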

Still, what do you do with this information, that Pedro had a .323 and .236 BABIP in back to back seasons? This is where you get into ATTRIBUTION and IDENTIFICATION. Suppose that pitching was done via pitching machines. And through Random Variation, you will end up with some games with 3 hits and other games with 13 hits. Nothing changes. It's the same machine, the same opposing batters, the same fielding alignment. Nothing changes. Except, because of Random Variation, you will get a random result of hits. We've identified the entity on the mound (Pitching Machine 4587). But do we attribute the results to that machine? Or, is the machine simply inconsequential?

Now, humans are different: they are humans. And when it comes to human behaviour and human talent, they can influence results. Now, just because they can influence SOME of the results, doesn't mean they can influence ALL the results. We can identify who the pitcher is on the mound, but do we attribute everything that happens to the pitcher? After all, we have human fielders involved, and we have the vagaries of the park and weather that day. The batters change, and heck, every ball is like a snowflake: no two balls are alike.

Just because we've identified Pedro, and we've calculated a BABIP of .323 one season and .236 another season doesn't mean we attribute all of that to Pedro. There's other entities involved here. Pedro cannot possibly absorb all those outcomes, given that he's one influence.

At the time twenty years ago, I was involved in a discussion and research called Solving DIPS, which basically determined, through basic statistical principles, that Random Variation was the large agent, while the pitcher and fielders were also significant agents, as was the park.

Next up: we'll set aside all that theory and look at things more factually.

Wednesday, December 27, 2023

Is Spencer Torkelson confident, or over-confident, in his swing?

And does Altuve abandon his swing too often?  

I don't know.  But to help get us there, we can look at how often a batter has a full swing, at each plate location and ball-strike count (click to embiggen).  The first set of numbers is the league average. I (for now anyway) define a full swing as follows: take a batter's 50% fastest swings, take that average, subtract 10 mph, and that's the minimum threshold of swing speed for a full swing.  Anything below that is an abbreviated swing.  The league average rate of abbreviated swings is about 10%.

The second set of numbers is Spencer Torkelson, who, with 95% of his swings as full swings, is among the league leaders.  That he is also among league leaders in strikeouts is not a coincidence.  The last set of numbers is Jose Altuve, who, with 80% of his swings as full swings, is among the league lows.  That he is among the league-lows in strikeouts is also not a coincidence.  Also note that he reserves his abbreviated swings especially for 2-strike counts, to a much larger degree than league average.
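The full-swing definition translates into a few lines of Python. This is a minimal sketch of the rule as described, not the production code: the function names are mine, the swing speeds are made up, and rounding the "50% fastest" down for an odd number of swings is my own choice.

```python
def full_swing_threshold(swing_speeds, margin_mph=10.0):
    """Average of the batter's fastest half of swings, minus 10 mph."""
    if not swing_speeds:
        raise ValueError("need at least one swing")
    fastest_first = sorted(swing_speeds, reverse=True)
    top_half = fastest_first[: max(1, len(fastest_first) // 2)]
    return sum(top_half) / len(top_half) - margin_mph


def full_swing_rate(swing_speeds):
    """Share of swings at or above the batter's own full-swing threshold."""
    threshold = full_swing_threshold(swing_speeds)
    full = sum(1 for speed in swing_speeds if speed >= threshold)
    return full / len(swing_speeds)


# Made-up swing speeds (mph) for one batter, for illustration only
swings = [78, 76, 75, 74, 72, 70, 68, 66, 60, 50]
print(full_swing_threshold(swings))
print(full_swing_rate(swings))
```

The threshold is personal to each batter, so the same mph can be a full swing for one batter and an abbreviated swing for another.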

Tuesday, December 26, 2023

Are batters confident or over-confident on ball-strike counts that favour the batter?

Look at this chart. You will notice that batters, when a pitch is in The Heart of the Plate, have the slowest swing speed at 0-2 counts (70.1 mph) and fastest swing speed at 3-0 counts (74.4 mph).  Indeed, at EVERY count, the more balls, the higher the speed, and the more strikes the lower the speed.  Roughly speaking, every ball, the speed increases by 0.5 mph, and every strike, the speed decreases by 1 mph.  That's for The Heart of the Plate.

This directional progression (though not the same magnitude) is maintained when the pitch is in The Shadow Zone as well as the Chase Region.  It's only in the Waste Region where the ball-strike count does not matter.

While this progression makes sense in the Heart of the Plate, it makes no sense in the Chase Region.  At this point, the pitch is at least a few inches off the plate.  At a 3-0 count, there's no (good) reason for the swing speed to be at 71.9 mph, while it is 64.2 mph at 0-2.  This is a good sign that the batter is being overly aggressive at 3-0 in the Chase Region.

We can learn more by looking at the Run Values by location and count.  Focus on the Swing columns, and start with Heart of the Plate.  Swinging at 0-2 is providing far more benefit than swinging at 3-0, when the pitch is in the Heart of the Plate.  Even though the batter is swinging less hard.  Indeed, if you follow the progression, it is almost a complete reverse of the speed progression: the more strikes, the better the batter is doing on swings, while the more balls, the worse the batter is doing.  

My initial guess is that swinging at 0-2 at a pitch in the Heart of the Plate has the batter with a more defensive swing, hence the lower speed.  And at 3-0, the batter is more aggressive, not worrying about any swing-and-miss, since the worst case leaves them at 3-1.  However, overall, this is not working out.

Naturally, not all batters are going to behave the same way.  I am sure if we look at the best and smartest batters, like Juan Soto and Luis Arraez for example, we'll likely learn what the more optimal approach should be.

What I'd like to learn is if this batting approach ability is something that can be taught, or is it something that pitchers will exploit in a batter early on, and thereby doom that batter to a shorter career.  So much to learn...

Monday, December 25, 2023

Dear batters, On a 3-1 count, never swing.  Ever

As you may know if you've read my past threads on the topic of swings and counts: batters swing too much. 

Here's the complete chart (click to embiggen) of Runs per 100 pitches, split between Take and Swing.  There's a further breakdown by the Pitch Location, as well as the Ball-Strike Count. Focus especially on the 3-1 count.  While a pitch in the Heart of the Plate is always a swing, for a 3-1 count, it is barely a swing.  On a 3-1 count in the Shadow-In zone (so just barely inside the strike zone), it's a Take.  And obviously, any pitch outside the strike zone is always a take.

Given that batters are not that discerning, it is much easier to say that a batter should never EVER swing at a 3-1 count than it is to ask them to distinguish between a pitch in the Heart of the Plate, from anywhere else in the zone, whether inside or outside the strike zone.

Swing Speed, By Plate Location and Count

One of the very first things I did with Statcast data was break up the plate location into zones, beyond just in/out. Humans have a terrific grasp of nuances, and so, we should lean on those nuances. Instead, too often (much too often), we think in binary terms or worse, we categorize things in binary terms.  But rarely are things binary. 

Take the strike zone.  There's a difference between a pitch thrown in the heart of the plate, and another one that is just inside the edge of the strike zone.  The batter, pitcher, catcher, umpire all respond to that nuance.  And so, to simply say "in the strike zone" loses that flavour.  And so, I split up that strike zone into Heart of Plate and Shadow-In.  Even pitches outside the strike zone should be separated.  There's a difference between a pitch just outside the strike zone, and one that is way outside.  For pitches in Shadow-Out, the batter is just as likely to swing as to take a pitch.  There's a Waste region where the batter is rarely fooled, and so will rarely swing.  And between the two is the Chase region, a region where a good batter can lay off a pitch, while a bad batter will swing much too often.

With the forthcoming (do not ask me when) data on swing speeds, we can actually track the behaviour of the batter: how fast do they swing based both on the plate location and count?  Well, here it is (click to embiggen):

(2) Comments • 2023/12/26 • Bat_Tracking

Saturday, December 23, 2023

Swing Speed: Arraez v Acuna

Acuna has a swing speed of 77.4 mph, one of the fastest in the league.  Arraez has a swing speed of 63.8, one of the slowest in the league.  When we limit each of their swings to their personal 90% fastest swings (meaning we drop their 10% slowest), here is how their distributions stack up (click to embiggen).

As you can see, their shapes are similar, but just shifted over by 13-14 mph.  Notice that around 67-73 mph they overlap: Arraez at his fastest 20-25% of swings is Acuna at his 20-25% slowest of swings.  

Now, look what happens when we show the run production by swing speed:

Arraez is overall -4 runs on swings.  But at 67+, he is a healthy +7 runs (and naturally -11 runs below 67 mph).  Acuna on the other hand is at his worst at under 76 mph, -6 runs, while he is a superlative +16 runs at 76+.

As you can see, Arraez at 67-73 and Acuna at 67-73 are totally different. Arraez at his top speed means he did everything he wanted to do, while Acuna at his low speed means that there's an indication of something going wrong. That's why you can't just look at swing speed on its own: it really needs to be evaluated based on that batter's swing distribution.

More to come...

(5) Comments • 2023/12/25 • Bat_Tracking

Improving WAR - Harper at 1B or RF: What is a positional adjustment, anyway?

One of the things that was very obvious with regards to fielding positions is we have different baselines for comparison. Anyone who plays Fantasy Baseball knows this immediately. Two players with the same batting line, one at SS and one at 1B, will have very different purchasing costs. And anyone in Fantasy Baseball will come up with a comparison level to figure out that a .320 wOBA SS would cost the same as a XXX wOBA 1B. Let's say in Fantasy Baseball, they figure that a .320 wOBA SS will cost the same as a .350 wOBA 1B. So, that's the comparison level, a 30 point adjustment. That's what is needed. So, a .300 wOBA SS would cost the same, and therefore have the same value as a .330 wOBA 1B. And a .350 wOBA SS would have the same value as a .380 wOBA 1B.

We also know this instinctually as fans, but we don't have the obvious tradeoff in the real world that we can more easily calculate with Fantasy Baseball. The conversion value that I have for a 1B / RF is roughly 5 runs. So, what does this mean? Well, say you have a 1B who is a league average fielder (compared to the other 1B in the league) and a RF who is a league average fielder (compared to the other RF in the league), and your RF creates 100 runs on offense. In order to pay the 1B the same amount, and therefore have the same value, that 1B would need to create 105 runs on offense. This happens because of the obvious reality: the average fielding RF is a better fielder than the average fielding 1B. How much better? About 5 runs.

It also works the other way: you have a player that creates 110 runs on offense as a RF, and you have another player that creates the same 110 runs on offense as a 1B.  And the RF is a league average fielder (compared to other RF).  Then in order to have these two players worth the same value, the 1B has to be +5 runs better as a fielder (compared to other 1B).
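The two equivalences above can be checked with a minimal Python sketch. The names `POS_ADJ_RUNS` and `total_value` are hypothetical, and setting 1B to 0 and RF to +5 is only one convenient way to place the scale; the one number taken from the post is the roughly 5-run gap between the positions.

```python
# Positional adjustment in runs per season, relative to 1B.
# Assumption: only the ~5 run RF/1B gap comes from the post; the zero point is arbitrary.
POS_ADJ_RUNS = {"1B": 0.0, "RF": 5.0}

def total_value(offense_runs, fielding_runs_vs_position, position):
    """Offense + fielding relative to positional peers + positional adjustment."""
    return offense_runs + fielding_runs_vs_position + POS_ADJ_RUNS[position]

# Average-fielding RF creating 100 runs vs average-fielding 1B creating 105 runs.
rf_a = total_value(100, 0, "RF")
first_a = total_value(105, 0, "1B")

# Same 110 runs on offense: average-fielding RF vs a 1B who is +5 against other 1B.
rf_b = total_value(110, 0, "RF")
first_b = total_value(110, 5, "1B")

print(rf_a, first_a)
print(rf_b, first_b)
```

In both pairings the two players land on the same total, which is all the positional adjustment is meant to guarantee.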

Now, the average RF as a fielder is obviously a better fielder than the average 1B as a fielder. That positional adjustment is not necessarily required to be merged into offense or defense, but it makes things easier if we do merge it into defense. Again, it is not required to merge it; it can stand alone by itself.

Consider Bryce Harper.

Harper in his Statcast years has been -27 for range in the outfield, and +8 for his arm, or -19. With about 6000 innings, that’s the equivalent of just over 4 full seasons. And so, he averages about -5 runs per season, compared to the average fielding RF.

If Harper ends up being as good a fielder as the average fielding 1B, then his defensive value is going to be the same, whether he plays 1B or RF.
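Here is a rough sketch of the Harper arithmetic, assuming a full season is 162 games of 9 innings (the post only says 6000 innings is "just over 4 full seasons"). The variable names are made up for illustration.

```python
INNINGS_PER_SEASON = 162 * 9  # assumed definition of a full defensive season

range_runs = -27
arm_runs = 8
innings = 6000  # "about 6000" per the post

total_runs = range_runs + arm_runs          # -19
seasons = innings / INNINGS_PER_SEASON      # just over 4
runs_per_season = total_runs / seasons      # roughly -5, compared to the average RF

RF_VS_1B_RUNS = 5  # the average RF is about 5 runs better than the average 1B

# Put both positions on one scale where the average-fielding 1B is 0.
harper_rf_value = runs_per_season + RF_VS_1B_RUNS  # below-average RF
harper_1b_value = 0.0                              # if he turns out to be an average 1B

print(round(runs_per_season, 1), round(harper_rf_value, 1), harper_1b_value)
```

On that common scale the two numbers come out within a fraction of a run of each other, which is the sense in which his defensive value is the same at either position.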

The positional adjustment is not a penalty or a reward. It is simply a way to compare average fielders at each position, with the understanding and knowledge that the average fielding SS is a much better fielder than the average fielding 1B. And we have a defensive spectrum that establishes the range in values.

And merging the positional adjustment with his positional-fielding value makes the most sense. But, again, not necessary to merge it, and it can stand on its own. But it is needed.


Tuesday, December 12, 2023

FIP and xwFIP

Another excellent article from Josh, incorporating BABIP through the lens of xwOBAbip to give us xwFIP.  In other words, he keeps the actual BB, SO, HBP, HR while replacing the rest of the BIP by using the xwOBA data.  And it gives us a step forward.

FIP of course remains ubiquitous and offers the ideal Naive method for pitcher evaluation.  The amount of effort to calculate FIP is almost non-existent, while also being completely transparent.  And as we know, FIP represents a pitcher's past performance better than ERA does.  Using FIP unaltered for future prediction is a happy byproduct of its construction, but not its raison d'etre. FIP was, is, and always will be an evaluation of a pitcher's past performance.
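To show how little effort is involved, here is a sketch of FIP in its commonly published form. The post does not spell the formula out, so treat this as the standard version rather than anything specific to this article. The `fip_constant` default is a placeholder (the constant is set per season so that league FIP matches league ERA), and the stat line is made up.

```python
def fip(hr, bb, hbp, so, ip, fip_constant=3.10):
    """Commonly published FIP form.

    fip_constant is a placeholder: the real value changes by league and season.
    ip is innings as a true decimal (one third of an inning is 0.333, not .1).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + fip_constant

# Made-up stat line, for illustration only.
example = fip(hr=20, bb=50, hbp=5, so=200, ip=180.0)
print(round(example, 2))
```

Everything that goes in is a counting stat from the pitcher's own line, which is where the transparency comes from.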

(1) Comments • 2023/12/14 • Pitchers

Friday, December 08, 2023

Individualized Won-Loss Record of Pedro Martinez

I wrote this on Bill James's site last year, but since that site may come down, I will reproduce it below.

***

Just a general point regarding WAR v Win Shares, which we can bypass altogether if we just focus on Win Probability Added (WPA), which has the advantage of guaranteeing everything adds up, not only at the game level, but at the individual play level.

And if you look at Pedro's WPA, he comes in at +51 wins above average for his career.

His W/L record is 219-100, or +119, or +59.5 wins above average.

His runs allowed rate is 66% of league average, and Pythag (using 1.82 exponent) says that's close to a .680 record, or +58 wins above average.

So, trying to come to terms with how good Pedro is is pretty straightforward, as we have good agreement using multiple methods. He's +50 to +60 wins above average. This is good enough for my illustration below.
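Two of those three estimates are easy to reproduce. The sketch below uses hypothetical function names; the 1.82 exponent, the 66% runs-allowed rate, and the 219-100 record are the ones quoted above. Turning the Pythag percentage into wins needs a number of games, which is not stated here, so the sketch stops at the winning percentage.

```python
def pythag_win_pct(runs_allowed_ratio, exponent=1.82):
    """Winning percentage for a pitcher allowing `runs_allowed_ratio` times the
    league-average runs, assuming league-average run support."""
    return 1.0 / (1.0 + runs_allowed_ratio ** exponent)

def wins_above_average(wins, losses):
    """Wins above a .500 record over the same number of decisions."""
    return (wins - losses) / 2

pct = pythag_win_pct(0.66)
waa_from_record = wins_above_average(219, 100)

print(round(pct, 3), waa_from_record)
```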

So, if we were to create an "Individualized Won Loss Record" for Pedro, it should be pretty straightforward: let's give out for each pitcher a "game slice" of .42 games for each 9 innings. Pedro's 2827 IP is 314 9-inning games and so he'd get 132 game slices. The average is obviously 66-66.

Since Pedro is about +50 to +60 wins above average, using any method you choose (and using +50 in this illustration), then his Individualized Won-Loss record will come in at 116-16 or so. If you chose .37 games for each 9 innings, then it's 108-8 record. It doesn't matter (too much) what you use, whether .37 or .42 or whatnot.

It will matter (a bit) when you compare to the ".300 level" pitcher, or whatever baseline you choose. A 116-16 record is 76 WAR and 108-8 is 73 WAR.
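The record-building steps above can be sketched as follows. The function names are hypothetical; the inputs (2827 IP, +50 wins above average, .42 or .37 games per 9 innings, a .300 baseline) are the ones used in the illustration.

```python
def individualized_record(ip, wins_above_avg, games_per_9=0.42):
    """Split a pitcher's game slices evenly, then shift by wins above average."""
    games = ip / 9 * games_per_9
    wins = games / 2 + wins_above_avg
    losses = games / 2 - wins_above_avg
    return wins, losses

def wins_above_baseline(wins, losses, baseline=0.300):
    """Wins above what a baseline-level pitcher would win in the same games."""
    return wins - baseline * (wins + losses)

w42, l42 = individualized_record(2827, 50, games_per_9=0.42)
w37, l37 = individualized_record(2827, 50, games_per_9=0.37)

print(round(w42), round(l42), round(wins_above_baseline(w42, l42)))
print(round(w37), round(l37), round(wins_above_baseline(w37, l37)))
```

The slice size changes the record noticeably but moves the wins above the .300 baseline by only about 3, which is the "it will matter (a bit)" point.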

The key point is that I can make everything add up at the season, game, or play level. And I can do so by using the centering point of .500. And I really, really, really think the entire problem of WAR v Win Shares is we are not talking about it using two dimensions. Because if either of them is appreciably different from this 108-8 or 116-16 record, then we'd have something more tangible to talk about that would actually move the argument forward.

Bill, can you provide the Win Shares / Loss Shares of Pedro's career?

***

Bill: No editorial responses here, because I don’t want this to become a debate exactly, but I can’t produce Pedro’s Win Shares/Loss Shares right now because I haven’t used that spreadsheet in a couple of years and don’t remember what it was called, where it is or how to use it. I’ll look into it, but the next three weeks are the busiest time of the year for me, because this is when we write the annual Bill James Handbook. But I’ll try to remember to get to that.

(5) Comments • 2023/12/09 • WAR

Thursday, December 07, 2023

Bill James Walkoff, Part 2: Live-blogging

My thoughts on the rest of the book are here in part 1.

Below I will live-blog Bill's chapter on Win Shares (and WAR).  I have not read it, so as I read it, I will update this thread.  I only have an hour, so I may have to pick this up tomorrow, we'll see.  For those new to the topic, Bill has generally been as tough on WAR as I have been on Win Shares.  Obviously, both of us feel we are on firm ground.  

To prepare yourself, you may want to read this back-and-forth we had (though because Bill did not give explicit permission to post his words, I only showed my words).

Anyway, time to open the book.  Jump the line when ready...


(19) Comments • 2023/12/15 • Bill_James

Bill James Walkoff, Part 1: Cold-blogging

Bill released his latest (and last) handbook, called Bill James Handbook Walk-off Edition.

I'm going to do a Part 2, Live-blogging Bill's Win Shares / WAR article. I have not yet read it, so I will reserve commentary on that.

In this one, I have read the rest of his book, and I'll comment on a few things that jumped out at me.


(3) Comments • 2023/12/07 • Bill_James

Wednesday, December 06, 2023

Statcast Catch Probability: Kiermaier v Castellanos/Schwarber

Since 2016, the best fielding outfielder in baseball has probably been Kevin Kiermaier.

The Problem

One of the issues we have with fielding stats is that there are a lot of routine plays that are counted. Unlike batters and pitchers who actively participate in their confrontation, thereby making similar quality matchups over several hundred plate appearances, this does not apply for fielders. This is because fielders are not active participants in the confrontation, but rather recipients of opportunities.

So, what does this mean? Well, for any one plate appearance, Kiermaier would have a 25% to 40% chance of reaching base, all depending on the pitcher and park. But out on the field, he has a 0% to 99.999% chance of making an out. Furthermore, because Kiermaier faces a similar set of pitchers and parks as any other player, in the aggregate over the course of a season, the context of his chance of reaching base is going to be close to the league average. As a fielder, that's not the case at all, because we have no expectation that he'll face a proportionate share of easy and hard plays like other CF. There's nothing that's built into baseball that would allow for that.

The Theory

Therefore, it's all about quality of opportunities. What is an opportunity for a fielder? Well, the core is what you would think: the more distance he has to cover, the harder the play. And the more time he has, the easier the play. Let me throw some numbers at you.

  • From pitch release to fielder, if the ball is in the air for at least 4.5 seconds, and the fielder has to run at most 50 feet to get to the ball, there is a near 100% chance of the ball being caught.
  • Similarly, if the ball is in the air for at most 4.5 seconds and the fielder has to run at least 90 feet to get to the ball, there is a near 0% chance of the ball being caught.

So, at 4.5 seconds, between 50 feet (100% out) and 90 feet (0% out) is where the real opportunity to make a play happens. The sweetspot, the point at which it's a 50/50 play is 75 feet: with the ball in the air for 4.5 seconds, and the fielder needing to run 75 feet to get to the ball, there have been 483 batted balls of which 230 were caught, or 48%.

Actual Data

How does Kevin Kiermaier do in such plays? At 70 feet, he's 38 for 38 (100% out). At 80 feet, he's 14 for 30 (47%). In other words, Kevin Kiermaier is 5 feet better than league average in such plays. Whereas the league average outfielder is 50/50 when they have to run 75 feet, Kiermaier's sweetspot is at 80 feet.

It also goes in reverse. I've combined the two worst outfielders according to Statcast OAA (Outs Above Average) into one: Nickyle Schwarbellanos. I'll call him Nickels for short. At 80 feet and 4.5 seconds, Nickels caught only 5 of 43 plays. At 70 feet, they were 26 for 55 (47%). In other words, they are 5 feet worse than league average.

And that's the difference between the absolute best outfielder and the absolute worst: 10 feet. Give our bad outfielder a series of 70 foot plays, and give our best outfielder a series of 80 foot plays, and they'll both catch ~50% of those for outs.

The Analogies

If you need a basketball analogy: imagine setting up a free throw line that is 30 feet from the basket for a poor shooter, and 40 feet for a good shooter. But, we don't record the distance, only the success. And both shooters end up with a 50% free throw rate. Without knowing the distance, we'd assume similar opportunities and so, equal talent.

Or a football analogy: imagine all good kickers can only kick a FG from the 40 yard line, while all bad kickers can kick from the 30 yard line. Both will end up with a 50% success rate on FG. But, we don't track distances, only success rate. They both get 50% success rate, and so both kickers look equal.

Or even a school analogy: your kid is in AP Math, while someone else is in remedial Math. Both score 90%. On their transcript, it only shows "Math" as the subject.

So, quality of opportunity, or competition, or context is needed. We can't assume it evens out the way we can mostly assume batters will face a similar quality of pitchers. We need to track quality of opportunities.

(Click to embiggen) This chart shows the percentage of balls caught by Kiermaier and Nickels, for every combination of distance and time. Directionally, it all makes sense: 100% is in the top left corner, where we have lots of time, or limited distance; while 0% is in the bottom right, where we have limited time or too much distance.

I put a green box as a reference point, at 80 feet and 4.5 seconds. If you look at the Nickels chart at the bottom and scan all the 100% boxes, then you look up at Kiermaier's chart, you can see that Kiermaier matches them at 100% or close to it. However, you will notice that Kiermaier has a few extra 100% boxes. At 70 feet and 4.5 seconds, Kiermaier is 100%, while Nickels is 47%. Even at 65 feet, Kiermaier is 100% while Nickels is still at 86%. Or 70 feet and 5 seconds: 100% for Kiermaier and 80% for Nickels.

So, how do we end up with a value metric? It's pretty straightforward. If the league average for a particular play given distance and time is 70% out probability, then any play made is worth +0.3 outs (or 100% minus 70%), while any ball that lands is worth -0.7 outs. All we end up doing is adding up all these individual plays. Kiermaier ends up with 84 more outs made than the league average outfielder, while each of Castellanos and Schwarber are at 64 fewer outs made.
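The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the actual Statcast OAA code: the function name is hypothetical and the sample plays are made up.

```python
def outs_above_average(plays):
    """plays: iterable of (league_catch_probability, caught) pairs.

    A catch is credited (1 - probability) outs; a ball that lands is
    debited the probability.
    """
    total = 0.0
    for catch_prob, caught in plays:
        total += (1.0 if caught else 0.0) - catch_prob
    return total

# Made-up opportunities: (league-average out probability, was it caught?)
sample = [(0.70, True), (0.70, False), (0.95, True), (0.10, True)]
oaa = outs_above_average(sample)
print(round(oaa, 2))
```

The 70% catch adds 0.3 and the 70% miss subtracts 0.7, exactly as in the text; the routine 95% catch adds almost nothing, while the 10% catch adds 0.9.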

There is very little ambiguity here. Go back to the 80 feet, 4.5 seconds opportunities: Kiermaier is at 47% plays made, while Nickels is at 12%. What other variables could affect the quality of those opportunities? Well, there is the wall as a potential impediment. And running back is harder than running in, in terms of success rates. Indeed, when we consider those parameters, the quality of the Nickels in the 80/4.5 group of plays is a bit tougher: 29% chance of making an out by a league average outfielder given the Nickels opportunities compared to 37% for the Kiermaier plays. The main point is that the quality of opportunities is largely driven by distance and time. The other variables, which are considered in OAA, are secondary variables. So, imagine any other variable not considered: how much effect could those have? Very little, since now we are into the tertiary variables group.

This chart makes it very clear. All plays are broken down by the quality of opportunity, from group "0" (meaning 0 to 9.99%) to "9" (meaning 90 to 99.99%). In every single group, Kiermaier is ahead of Nickels, in many cases far, far ahead. That last group is why fielding stats are problematic, if you don't know distance and time. There's nothing to distinguish there in terms of quality: it's as if the worst field goal kicker and the best field goal kicker always kicked from the 10 yard line. There's just not enough difficulty there to be able to find out who the better kicker is. Or even more apropos: combine FG kicks and Point-After-Kicks into one group. And if you have so many opportunities like that, it becomes a sea of noise that drowns out all the other signal. Those plays in the "9" group represent almost 80% of all plays.  (Those are similar to the Point-After-Kick.)
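The grouping itself is simple to sketch. The function names and sample plays below are made up for illustration; only the group definition ("0" is 0 to 9.99%, "9" is 90 to 99.99%) comes from the text.

```python
from collections import defaultdict

def opportunity_group(catch_prob):
    """Group 0 covers 0-9.99%, ..., group 9 covers 90-99.99%."""
    return min(int(catch_prob * 10), 9)

def catch_rate_by_group(plays):
    """plays: iterable of (league_catch_probability, caught) pairs."""
    made = defaultdict(int)
    total = defaultdict(int)
    for catch_prob, caught in plays:
        group = opportunity_group(catch_prob)
        total[group] += 1
        made[group] += 1 if caught else 0
    return {group: made[group] / total[group] for group in total}

# Made-up plays, for illustration only.
sample_plays = [(0.95, True), (0.95, True), (0.92, False), (0.45, True), (0.45, False)]
rates = catch_rate_by_group(sample_plays)
print(rates)
```

Comparing two fielders group by group, rather than on one overall catch rate, keeps the routine group "9" plays from swamping the rest.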

Starting in 2016, you should feel high confidence that we've got fielding evaluated well enough for outfielders. The main variables are distance and time, and the relationship there is quite clear. The secondary variables are direction and wall, and the relationship there is not as clear, but also not as impactful. Any tertiary variables unconsidered would be quite limited in impacting any conclusions.


Tuesday, November 28, 2023

Imminent Strike 3

The best 2-strike batters are typically the best batters overall. Since 2020, Freddie Freeman leads all batters in Batting Run value (+192 runs) overall, as well as when limited to two strikes (+71 runs). Some batters get their entire value by being a good 2-strike batter, like Marcus Semien: +50 runs with 2-strikes, while being +58 runs overall.

What I am interested in is the Imminent Strike 3. What is that? The definition I've created is simple enough: the batter has 2 strikes, and the upcoming pitch is going to enter The Heart of the Plate. If the batter takes that pitch, it will be strike 3. So, the batter needs to swing at that pitch.

There are four basic results that I've broken the Imminent Strike 3 down into:

  • Poorly Accepted - Called Strike 3
  • Poorly Avoided - Swing and Miss, or Batted Ball Out
  • Neutrally Accepted - Fouled Off
  • Smartly Avoided - Base Hit
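The definition and the four results above can be sketched as a small classifier. The outcome labels and the `classify_imminent_strike3` name are hypothetical, not Statcast field names; the category names are the ones in the list.

```python
# Hypothetical outcome labels mapped to the four categories in the list above.
CATEGORY_BY_OUTCOME = {
    "called_strike": "Poorly Accepted",
    "swinging_strike": "Poorly Avoided",
    "batted_ball_out": "Poorly Avoided",
    "foul": "Neutrally Accepted",
    "base_hit": "Smartly Avoided",
}

def classify_imminent_strike3(strikes, in_heart, outcome):
    """Return the category, or None when the pitch is not an Imminent Strike 3
    (fewer than 2 strikes, or pitch not in the Heart of the Plate)."""
    if strikes != 2 or not in_heart:
        return None
    return CATEGORY_BY_OUTCOME.get(outcome)

print(classify_imminent_strike3(2, True, "called_strike"))
print(classify_imminent_strike3(2, True, "foul"))
print(classify_imminent_strike3(1, True, "base_hit"))
```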

We can of course break things down even more, by looking at xwOBA, which we can do next time.

Since 2020, Victor Reyes has never received a called strike 3, when the pitch enters The Heart of the Plate: he had 191 Imminent Strike 3, and zero called strikes. Among the power hitters, Jose Ramirez is the model batter, with 519 Imminent Strike 3 and only five Poorly Accepted (called strike 3). In terms of Do Something, Ramirez clearly was going to do something, and not just accept a strike 3 down the middle.

On the flip side are batters who clearly need to learn. Elly De La Cruz had 111 Imminent Strike 3, with 21 of those as called strikes. The league average is 6% and EDLC is at 19%. Julio Rodriguez is the model for these batters, with 281 Imminent Strike 3, of which 32, or 11% are Poorly Accepted (called strike 3).

How about the leaders/trailers in Poorly Avoided (out while swinging)? The league average is 41%, with Nick Allen leading at 55%. Among the good batters at the top is Joey Votto, at 47% Poorly Avoided. Batters with the fewest Poorly Avoided were Jordan Walker (only 24%), with Nathaniel Lowe in the top 10 at 33%.

The Neutrally Accepted is, as the name implies, Neutral: you foul off a pitch, and it's still 2 strikes. Maybe it's not so neutral, since the pitch is in The Heart of the Plate, and so, there is some advantage in making sure you live to fight another pitch. Then again, it's the best pitch you will see on Strike 2, so, maybe not so good? Anyway, I guess that makes it the definition of neutral. Oswaldo Cabrera is at 50% at one end, and Esteury Ruiz is at 21% at the other end. Make of it what you will.

Finally, Smartly Avoided the Imminent Strike 3, with a base hit: Oscar Gonzalez leads with 23%. Among the good batters is Michael Brantley, also at 23%. Marwin Gonzalez only had 6% Smartly Avoided plays at Imminent Strike 3. Indeed, it is really hard to be an overall good hitter while having so few Smartly Avoided plays in this category. You have to get all the way to 11% Smartly Avoided (league average is 15%) with Sean Murphy to find a good overall batter.

(Click chart to embiggen) Does all of this mean anything? Or is it just another interesting, but ultimately useless, categorization? Juan Soto is overall at +158 runs since 2020, which is third highest after Freeman and Aaron Judge, with a very high +39 runs with two strikes. But he is minus 11 runs with Imminent Strike 3. Does this suggest that Juan Soto, a genius hitter, can be beat? Maybe. It does seem that you could challenge Soto with two strikes.

Kyle Tucker and Matt Olson on the other hand are the best batters in baseball when Imminent Strike 3. They should not be challenged. Even among the unchallengeable, which includes Aaron Judge and Austin Riley we find Teoscar Hernandez: he is at almost plus 30 runs when the pitch is in The Heart of the Plate, while being almost minus 30 runs when the pitch is NOT in The Heart of the Plate. Teoscar is an overall above average batter who has been reduced to a league average batter with two strikes. So it would seem that he should not be challenged with two strikes.

Anyway, as I said, I don't know if this means anything. I don't know that it has to mean anything. But, this is just another way of organizing data, which is really half the battle with sabermetrics. The other half is giving it meaning. And the last ten percent is making sure the math checks out.


Monday, November 27, 2023

Deconstructing Leveraged Index

About twenty years ago, I introduced Leveraged Index to our little baseball community in a three-part series, which you can read about at the old The Hardball Times (since subsumed by Fangraphs).

The concept was straightforward enough: determine for each game state (inning, score, runners on base, outs) how much impact that situation had to winning or losing the game. Directionally, it made perfect sense: close and late had a higher leverage than blowouts. Of course, sabermetrics is not about directionality, but magnitude. We all know that there's a platoon advantage by handedness, but HOW MUCH is that advantage? We all know that it's easier for a pitcher to throw one inning in relief rather than six innings as a starter, but what is the DEGREE to which it's easier? We all know that Coors is hitter friendly but what is the magnitude of that effect?

And so, as part of that series I also released the Leveraged Index chart.

A few years later, in a long-forgotten article, I also discussed the Leveraged impact focused only on the base-out state. Bases empty? That's low leverage. Bases loaded? That's high leverage. We all know that. With the bases empty, a walk has limited impact. With the bases loaded, it guarantees a run.

Here is the Leveraged Index by Base-Out state (click to embiggen).

Baseball Reference uses a form of this chart on their site, in order to deleverage RE24 (run expectancy by the 24 base-out states), as RE24/boLI. Baseball Reference has so much data oozing out of its site, I would bet that most folks are not aware of this fantastic page.

In 2023, the pitchers to look at are Blake Snell and Spencer Strider. As we know, the performance of Snell with runners on base was much better than his performance with bases empty, while Strider was his opposite. If you treat both scenarios the same, you would think that Snell and Strider had equivalent pitching seasons. But, if you give more weight to facing batters with runners on base, when those situations have more leverage, then Snell easily outpitched Strider.

So, if you allow the leverage to drive your opinion, then RE24 is what you want, and we can see that Snell was almost 30 runs better than Strider. Which of course tracks with their ERA: Snell gave up 45 ER on 180 IP, while Strider gave up 80 ER (or 35 more) in 186 IP. Add 2 ER and 6 IP to Snell, and you have identical IP and 33 fewer ER for Snell. RE24 gets you to a similar place.

But if you deleverage those high-leverage base-out scenarios, Snell is only 4 runs better than Strider. Which of course tracks with their similar wOBA. See, what happens is that each bases loaded situation counts as 2.5X to 2.9X as much as a regular scenario, while bases empty count as 0.4X to 0.9X, depending how many outs. Is a bases empty, 2 out (leverage index of 0.39) strikeout worth exactly the same as a bases loaded, 2 out (leverage index 2.85) strikeout? Or, is the bases loaded one worth 2.85/.39 = 7.3X as much as the bases empty one? In terms of the run impact, it's 7.3X, no question. In terms of a strikeout is a strikeout is a strikeout, they are identical.
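The two ways of counting that strikeout can be put side by side in a short sketch. The names are hypothetical, the table holds only the two leverage values quoted above, and the `unit_run_value` figure is made up; the sketch also assumes, as a simplification, that a strikeout's run impact scales exactly with the base-out leverage.

```python
# Base-out Leveraged Index, only the two values quoted in the text.
BO_LI = {
    ("bases_empty", 2): 0.39,
    ("bases_loaded", 2): 2.85,
}

def deleverage(run_value, bo_li):
    """RE24/boLI-style adjustment: divide the run value by the base-out leverage."""
    return run_value / bo_li

unit_run_value = 0.10  # made-up run value of a strikeout at leverage 1.0

empty_k = unit_run_value * BO_LI[("bases_empty", 2)]
loaded_k = unit_run_value * BO_LI[("bases_loaded", 2)]

ratio = loaded_k / empty_k  # context matters: about 7.3X
flat_empty = deleverage(empty_k, BO_LI[("bases_empty", 2)])
flat_loaded = deleverage(loaded_k, BO_LI[("bases_loaded", 2)])  # a strikeout is a strikeout

print(round(ratio, 1), round(flat_empty, 3), round(flat_loaded, 3))
```

Keeping the leverage gives the RE24 view; dividing it out gives the view where both strikeouts count the same.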

How do you see it? Is a 10 yard pass inside the opponent's 10 yard line worth the same as a 10 yard pass at your own 20? You tell me. Is a 20 foot pass through the slot to an open player for an easy goal worth the same as a 20 foot pass from the goalie to his defender? You tell me. You get to decide whether a pass is a pass is a pass. Or whether the context matters. You get to decide whether all strikeouts are the same or whether the context matters.

Now, as much as all the above is fun to think about, I'm here to finish the job in terms of Deconstructing Leveraged Index. I just presented the Base-Out Leveraged Index. Now I'm going to present the other half, something I've never done: the Inning-Score Leveraged Index. Here it is (click to embiggen).

A half-inning of "16" means the bottom of the 8th. All the black lines are the top of the inning. Here we can see that leverage is maximized when the batting team is down by 1 run. This is of course not a surprise, directionally. But as I said, sabermetrics is about the magnitude, not the direction. We can confirm direction, but the value is in establishing the magnitude. And that's what we have here.
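For reading the chart's x-axis, here is a small helper that decodes the half-inning number. It assumes the numbering starts at 1 for the top of the 1st, which is what "16" being the bottom of the 8th implies; the function name is made up.

```python
def decode_half_inning(half_inning):
    """Map a half-inning index (1 = top of the 1st) to (inning, 'top' or 'bottom')."""
    inning = (half_inning + 1) // 2
    half = "top" if half_inning % 2 == 1 else "bottom"
    return inning, half

print(decode_half_inning(16))
print(decode_half_inning(17))
```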

And we can also see something plainly obvious: a tie-game and the batting team down by 2 are worth roughly the same, while batting team ahead by 1 or down by 3 are worth roughly the same.  So, if you are going to deploy a good reliever, and your choice is a tie game or the batting team down by 3, it is a very easy call: tie game.  To suggest otherwise is to miss the impact, the leverage of the game.  Sometimes sabermetrics is about reminding you of directionality.

Similar to whether you want the full effect of the situation via RE24, or deleveraged via RE24/boLI, you also get to decide whether you want the full effect of the inning-score as well. Do you want to treat the batting team down by 1 in the 9th inning (leverage index of 4.0) identically to the batting team down by 1 in the 1st inning (leverage index of 1.0)? You tell me. WPA and WPA/LI are both available, which you can also see at Fangraphs.

Obviously, Mariano Rivera would not enter a game unless the leverage was high enough. When you have one of the greatest pitchers of all time, and you can deploy him when his performance can matter the most, when you can leverage his talent, then naturally, you care about the leverage of the inning-score. Do you put Ozzie Smith at SS or 1B? You leverage his talent when he can make the most impact. Ozzie can make .05 more plays than a random fielder at SS or at 1B, but he will see a lot more of those impactful plays at SS than at 1B. Do you put Kevin Kiermaier in CF or LF? It's all about leveraging opportunities.

There is no easy answer, no straightforward solution to evaluating the performance of players. There are only dozens of good questions, and so, potentially dozens of solutions to offer.

On a side note: you might think you can just multiply the boLI (base-out) by the isLI (inning-score), but it doesn't work exactly like that. It works sometimes, and doesn't work other times, notably when you care the most (ninth inning, close game). Multiplying two probability distributions only works when they are independent, and in this case, they are decidedly not independent.

Tuesday, November 21, 2023

What hath the new Pickoff Rules wrought?

Prior to the 2023 season, there were a lot of words spoken about the three major rule changes.

Invisible Rules

The prevention of shifts was the primary one, though in terms of an invisible rule, that one was pretty much it. With a restoration of the way baseball has been played, not having shifts would actually prove to be a non-story. You might have gotten a few what-if stories, but those were doomed to exist only in 2023. No one, and I mean no one, is going to do a what-if story on lost or gained hits due to the lack of shifts in 2024 or beyond.

The second was the larger bases. Again, that is another invisible rule. No one, and by that I mean almost no one, knew what the actual size of the base is. Going from a 15-inch square to an 18-inch square is not something anyone would notice even after being told that that change was being made.

Visible Rules

The third rule was certainly NOT an invisible rule: a clock countdown would be prominent. Technical infractions would be called for delay of game. Delay of game is a common rule in every sport. Through 2022, delay of game on the other hand was a tactic in baseball, not an infraction. We never needed it in the old days because baseball always had a clock: as Bill James said, it was called The Sun. Managers may try to delay games for hope of sunset, but start the game early enough, and empower the umpire to be aware of the sun, and the games hum along, without anyone complaining.

In the NHL, coaches would sometimes delay games as a tactic, as allowed by rule. This was done by swapping goalies. Swapping goalies was reserved for two things, in its history: poor play of the goalie and injury to goalie. And, like with relievers entering a game, as a courtesy, the new goalie got to warm up. Until one coach decided to swap goalies back and forth to get a de facto timeout. The referees would always allow the goalie warmup each time, though it became apparent quickly that this was getting out of hand. The NHL put a stop to it by eliminating ALL goalie warmups. If you can't police yourself, then the league will come in with a zero-tolerance policy.

And so, we get to 2022, where time of game reached its breaking point. Enter the pitch clock. Watching it ONCE in the minor leagues, and it was clear, this is what baseball needed. Anyone objecting to the pitch clock did it based on Political Reasoning, having never seen it in action. You see, when it comes to politics, you hold your position first, then find the evidence to support it. It is a ridiculous way to operate, whether personally or professionally. But, watch a game with the pitch clock in action, as I did last year in the minors, and next thing you know, you are watching an entire minor league game from start to finish, something I've never done on TV. This rule change was overwhelmingly positive, perhaps the most accepted rule change of any sport at any time in its history. Political Reasoning died at the first countdown. It was glorious.

A few doomsday folks clung on, thinking of Chris Webber. I had the thought as well, that we don't want a game to end that way. But, that thought was based on Inertial Reasoning, of wanting things to be as much like before as possible. That aside, being attentive to the rules is part of playing the game. You could have a rule in basketball that you can wave off a timeout request that you don't have. But they instead decided that that is a technical infraction. A catcher cannot cradle a loose ball with his mask. A pitcher has to deliver the pitch in a certain way, or a balk is called, even if he's not fooling any runner in the moment. A game winning run can result with a runner on 3B, with the catcher or pitcher being inattentive to the rules. That's part of the game. If a pitcher lets the clock run out, well, that's the pitcher being inattentive. If the batter calls a second timeout, well, that's the batter being inattentive. Do it with 3 balls or 2 strikes, and the player is being incredibly inattentive.

Disengagement Rule

Lost in all that was the disengagement rule (aka pickoff/stepoff). In order to keep the pitcher from resetting the clock, you have to limit the number of times they can step off. This has the INTENDED consequence of helping the baserunners. Everyone loves the stolen base. Everyone. I'm sure everyone's favorite player is someone like Tim Raines or Rickey or Ichiro or Buxton or Trea Turner. It's much more fun rooting for Tatis to steal home than for him to hit a HR.

Entering 2023, I set the over/under on the increase in SB at 40%. We actually ended up at an increase of 40.9%. The larger bases helped a little bit (they decrease the distance between bases by 4.5 inches), but the overwhelming reason is the limited number of pickoffs.

Prior to 2023, there were no rules to limit the number of pickoffs. If you wanted to make a dozen consecutive pickoff throws, you could. The only thing stopping you was the fans booing you. Which would happen. Similar to the NHL policing the goalie-timeout, the league stepped in, and killed two birds with one stone. It's not every season that you get one spectacularly-accepted rule (the pitch clock) combined with a provision that leads to an overwhelmingly beloved intended consequence (more SB, almost no change in CS).

Research

Ok, that was a lot of writing to get to the reason I wanted to write in the first place. In 2010-2022, with no pickoff attempted yet, given the choice between a pickoff or a pitch, 9.3% of the time, the pitcher would go for a pickoff. Had the pitcher already attempted a pickoff on the runner, 15.2% of the time, the pitcher would go for a pickoff. You would think the pitcher must be really successful doing this, but no, only ONE percent of pickoff attempts actually led to a successful pickoff. So, the other 99% of the time, it was to keep the runner close enough to the bag to keep him from stealing, the very thing we wanted. And the pitcher did that without consequence. This is unlike pitching to a batter, where you always get a ball or a strike at the minimum. With a pickoff, it's the proverbial: if you don't succeed at first (*), try and try (and try and try and try) again.

(*) Literally the first time and literally the first base.
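
To put a rough number on the try-and-try-again approach, here is a small sketch. It assumes every pickoff throw succeeds independently at the 1% rate quoted above; that independence is my simplification for illustration, not something the data here establishes, and the function name is made up.

```python
def chance_of_at_least_one_pickoff(throws, success_rate=0.01):
    """Chance that at least one of `throws` pickoff attempts gets the runner.

    Assumption (illustrative only): every throw is independent and has
    the same success rate.
    """
    return 1 - (1 - success_rate) ** throws


for throws in (1, 2, 3, 12):
    pct = 100 * chance_of_at_least_one_pickoff(throws)
    print(f"{throws:>2} throws: {pct:.1f}% chance of a pickoff")
```

Under that assumption, even a dozen straight throws come up empty nearly nine times in ten, which fits the idea that the throws were mostly about holding the runner.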

How about in 2023? Without a prior pickoff attempt, a pitcher would attempt a pickoff only 6.1% of the time. But with a prior attempt, that dropped down to 4.5%. And with two prior attempts, that dropped further still to 2.1%. Funny what happens when you actually have a consequence for your actions. With the runners given a little more free rein, they took it. The pitchers were more successful in their pickoffs (going from under 1% pre-2023 to over 1% in 2023), but that was the small price the runners were willing to pay. The SB, as noted earlier, exploded.
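
As a quick side-by-side, the sketch below lines up the pickoff rates quoted in this post and computes the relative drop. The dictionary and function names are mine, and note the buckets don't match perfectly: the 2010-2022 figure covers "at least one" prior attempt, while 2023 is split into exactly one and exactly two.

```python
# Percent of decisions (pickoff vs. pitch) where the pitcher threw over,
# keyed by prior pickoff attempts on the runner. Figures quoted in the post.
PICKOFF_RATE_2010_2022 = {"0": 9.3, "1+": 15.2}
PICKOFF_RATE_2023 = {"0": 6.1, "1": 4.5, "2": 2.1}


def pct_drop(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return 100 * (before - after) / before


# (2010-2022 bucket, 2023 bucket); the "1+" comparisons are only approximate
# because the older data lumps all prior-attempt counts together.
for old_key, new_key in (("0", "0"), ("1+", "1"), ("1+", "2")):
    before = PICKOFF_RATE_2010_2022[old_key]
    after = PICKOFF_RATE_2023[new_key]
    print(f"prior={new_key}: {before}% -> {after}% "
          f"(down {pct_drop(before, after):.1f}%)")
```

The point of the comparison is the direction flip: before 2023 a prior throw made another throw more likely, and in 2023 it made it less likely.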

Game Theory

This is STILL not what I wanted to write about. Pitchers were most successful when they attempted a 2nd pickoff, after having been unsuccessful on their first attempt. Basically, the runner, having survived the first attempt, took greater liberties. And when the pitcher tried to pick them off, they did very, very well. Of course, they had to pick their spots, hence they could only attempt this move 4.5% of the time.

The runners were most successful when they attempted a steal on a pitch to the plate, after a 2nd pickoff was already on the books. This is the intended consequence of the disengagement rule: if the pitcher is unsuccessful on the third attempt, the runner is awarded the next base, a Stolen Balk if you wish. The runner is very aware of this, and so took even greater liberties. The pitcher was really handcuffed here, and only attempted the pickoff 2.1% of the time. The balance went heavily to the runner, and it paid off.

So, the pitcher wants to make that second pickoff throw, with the runner taking larger chances, while the runner wants to survive a second pickoff throw, so that the runner can take even more chances.

Outcome Results

Let's describe all this in terms of runs per 1000 opportunities. In the 2010-2022 time period, the baserunner was at almost exactly 0 runs (-0.1 runs), when no prior pickoff was attempted. He was at -0.6 runs with at least one prior pickoff. So, the pickoffs worked for the pitcher. But more importantly, basestealing was a breakeven proposition overall. Whatever gains may have been earned by the Ichiros of the world were undone by the slower baserunners taking too many chances. It was like the sacrifice bunt: yeah, it works when it works, but it really hurts when it doesn't, so that, overall, it's net-neutral. That's why the sac bunt goes down in frequency. And that's what's been happening with basestealing, as only the really good basestealers can make a go of it, and even then, they don't run as often as in the go-go 80s.

To give you context for these numbers: in 2016-2022, the fastest runners, those with a Sprint Speed of 30 feet/second, were +3.7 runs per 1000 opportunities. Those at 28 feet/second, slightly above league average in speed, were at -0.1 runs, as were the very slowest runners. Neither of these groups should be running: the slow ones because they are too slow, and the slightly above average ones because they take too many chances, not picking their spots.

In 2023, things changed. Without a prior pickoff, basestealing did have a small gain, at +0.3 runs per 1000 opportunities. But with 1 prior pickoff, that jumped to +3.2 runs. And with 2 prior pickoffs, that jumped even more to +4.7 runs. In other words, the pickoff rules turned the average runner of 2023 into the fastest runners of 2016-2022. That is one heckuva advantage to the runner.
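
To see the size of the swing, the sketch below puts the run values quoted above into a small lookup and subtracts one era from the other. The names are hypothetical, and the 2010-2022 data only has a "none" and an "at least one" bucket, so both the 1-prior and 2-prior states of 2023 are compared against that same "at least one" figure.

```python
# Runner's runs per 1000 opportunities, keyed by prior pickoff attempts.
# Figures quoted in the post. For 2010-2022, key 1 means "at least one".
RUNS_PER_1000_2010_2022 = {0: -0.1, 1: -0.6}
RUNS_PER_1000_2023 = {0: 0.3, 1: 3.2, 2: 4.7}
FASTEST_RUNNERS_2016_2022 = 3.7  # Sprint Speed of 30 ft/s, for context


def swing_to_runner(prior_pickoffs):
    """Change in runner's runs per 1000 opportunities, 2023 minus 2010-2022."""
    old_bucket = min(prior_pickoffs, 1)  # older data lumps 1 and 2 together
    return RUNS_PER_1000_2023[prior_pickoffs] - RUNS_PER_1000_2010_2022[old_bucket]


for prior in (0, 1, 2):
    value_2023 = RUNS_PER_1000_2023[prior]
    versus_fastest = "above" if value_2023 > FASTEST_RUNNERS_2016_2022 else "below"
    print(f"{prior} prior: {value_2023:+.1f} in 2023, "
          f"swing of {swing_to_runner(prior):+.1f}, "
          f"{versus_fastest} the fastest-runner benchmark")
```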

So, the pitchers get into this weird spot: pickoff attempts are good when they work, but are very costly when they don't. Runners are playing a waiting game. It's a game of chicken.

A Simple Model

Here is my attempt at a simple model. When the pitcher makes 4 straight pitches, with no pickoffs, this is a net benefit to the runner. One pickoff followed by 4 pitches has the same benefit to the runner. But two pickoffs followed by 4 pitches is a benefit to the pitcher. And the best combination for a pitcher that I tried: three pitches, 2 pickoffs, pitch. Obviously, this is at the league level, so things will be different for different kinds of runners and pitchers. This is the ultimate in Game Theory. We devoted a section in The Book to discussing the Pitchout and Game Theory. With the Pitchout almost entirely gone, Aspiring Saberists can instead turn to the Pickoff and try their hand there.
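
For anyone who wants a starting point, here is a minimal sketch of the bookkeeping only. It walks a sequence of pitches and pickoff throws and reports how many prior pickoffs are on the books at each pitch. The 'P'/'K' encoding and the function name are my own invention, and it assumes the same runner stays on the same base the whole time. It does not reproduce the run values behind the model above, since the value of each pickoff throw itself is not given here.

```python
def prior_pickoffs_at_each_pitch(sequence):
    """Return the prior-pickoff count in effect at every pitch.

    `sequence` is a string of actions (hypothetical encoding):
    'P' = pitch to the plate, 'K' = pickoff throw.
    """
    priors = 0
    states = []
    for action in sequence:
        if action == "K":
            priors += 1            # one more pickoff on the books
        elif action == "P":
            states.append(priors)  # the runner sees this state on the pitch
        else:
            raise ValueError(f"unknown action: {action!r}")
    return states


# The four combinations described above.
for seq in ("PPPP", "KPPPP", "KKPPPP", "PPPKKP"):
    print(seq, prior_pickoffs_at_each_pitch(seq))
```

The last sequence shows why the order matters: the runner spends three pitches in the 0-prior state and only one pitch in the 2-prior state, whereas two early pickoffs leave all four pitches in the 2-prior state.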

