Tangotiger Blog

Saturday, August 11, 2018

Nola and his fielding adjustment

By Tangotiger

This is just a collection of my tweets from yesterday and today. The basic point is that Nola (a) has one of the best BABIPs and (b) plays with one of the worst fielding teams. And so, it boils down to: how do you adjust for his context?

And more specifically: do we treat his fielders as having the expectation to play at their typical fielding level FOR THE SEASON or ON THOSE PARTICULAR TIMES WHEN NOLA IS ON THE MOUND? It's a nuanced distinction that has a very specific implication to Nola.

Here they are:

Thursday, January 25, 2018

Statcast Outside Lab: Barrels and FIP

By Tangotiger

The 2018 Hardball Times Annual is out, and it's an online only edition. Here's my take on Eli's work.

***

SANTA = -4.18 + 17.3*{Ball Percent} + 96.7*{Near_Barrel_Percent}

If we assume 144 pitches per 9 innings, we can change "ball percent" to "balls per 144 pitches", and same for near barrel. We get:

SANTA = -4.18 + 17.3/144*{Balls per game} + 96.7/144*{Near_Barrel per game}

Which is

SANTA = -4.18 + 0.12*{Balls per game} + 0.67*{Near_Barrel per game}

This is baselined against "Strikes per game", as well as all the non-barrel contacts, which are the missing variable. In other words, the above is saying that each ball is worth 0.12 runs above "the rest".
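To make the unit change concrete, here is a minimal sketch of the two equivalent forms of the equation. It assumes the percentages are expressed as fractions of all pitches (0.36 meaning 36%), which is my reading of the conversion above; the function names and the sample pitcher are made up for illustration.

```python
# Sketch: the SANTA regression in rate form and in per-game form.
PITCHES_PER_GAME = 144  # the post's assumption: 144 pitches per 9 innings


def santa_from_rates(ball_pct, near_barrel_pct):
    """SANTA with inputs as fractions of all pitches (assumed convention)."""
    return -4.18 + 17.3 * ball_pct + 96.7 * near_barrel_pct


def santa_from_counts(balls_per_game, near_barrels_per_game):
    """Same equation with inputs as counts per 144 pitches."""
    ball_weight = 17.3 / PITCHES_PER_GAME
    near_barrel_weight = 96.7 / PITCHES_PER_GAME
    return (-4.18 + ball_weight * balls_per_game
            + near_barrel_weight * near_barrels_per_game)


# Rounded per-game weights, as quoted in the text.
BALL_WEIGHT = round(17.3 / PITCHES_PER_GAME, 2)
NEAR_BARREL_WEIGHT = round(96.7 / PITCHES_PER_GAME, 2)

# Hypothetical pitcher (not real data): 36% balls, 2% near-barrels.
rate_form = santa_from_rates(0.36, 0.02)
count_form = santa_from_counts(0.36 * PITCHES_PER_GAME, 0.02 * PITCHES_PER_GAME)
```

Both forms give the same number for the same pitcher; the only thing that changes is the unit the weights are quoted in.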

Eli seems to suggest that a strike and a non-barrel contact are worth roughly the same by lumping them, and... it looks like he's right! I don't have the data like Eli has it, but it looks like a non-barrel wOBAcon is around .280, which is about -0.04 runs. A strike is closer to -0.07 runs or so. So, that probably averages out to -0.06 runs. With the ball 0.12 runs above "the rest", that implies the ball at +0.06 runs.

That seems in the ballpark of being ok.

I don't have the data exactly ready like Eli has, but a near-barrel and above would have a wOBA probably close to 1.100, or about 0.770 wOBA points above average or 0.62 runs above average. Baselined to Eli's "the rest", that makes it 0.56 runs above the rest. The near-barrel (and above) being 0.67 runs above "the rest" may be off, but maybe my assumptions aren't good enough.

All to say that:

a. we could have gotten to where Eli is using linear weights

b. his equation basically checks out

If it was me, I'd try to merge the two approaches, the linear weights approach (which is the "truth") and the regression approach (which captures some subtleties, like the number of pitches per 9 innings being different for great and bad pitchers), and try to get some "simpler" numbers. Like if you can get "100" instead of "96.7", I'd go for that. And the same for the ball term, if that can be either 15 or 20.

Anyway, love the work and approach.

***

Here is a similar themed idea, but relying on BB and K, instead of balls and strikes, as well as Barrels, as opposed to Near-Barrels from Boguslaw.

Anyway, both of these guys are on the right path, in trying to focus on barrels and barrel-like events, as well as something related to balls and walks.

Maybe the answer is different weighting for Super Barrels, Barrels, and Near-Barrels.

But, what is great with what Eli did is that he kept it simple. Rather than the "soup" approach where you can't see what is going on, you take a methodical approach, so you can isolate the variables.

() Comments • • Pitchers • Statcast

Friday, September 01, 2017

Statcast Lab: simple method for hard pitches

By Tangotiger

One of the things I proposed about ten years ago was to find a pitcher's "top speed", by simply taking the 25% fastest of his pitches, and taking that average. Why 25%? Because the average pitcher throws some 55% or so "fastballs", and so just about every pitcher will throw at least half that much as his fastball.

You can see a few studies I've done in this regard, and it works out well enough. Anyone can reproduce what I do with very little effort, and armed just with the speed of a pitch.

What follows is one little extra step, making the researcher go from "very little effort" to "little effort". After you've established a pitcher's "top speed" (i.e., the average of his 25% fastest pitches), choose a baseline that is 3mph below that line. This becomes a pitcher's "minimum hardball speed". You go back to all his pitches, and any pitch that is above his minimum hardball speed now counts as a hard pitch. And that's it.
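The two steps can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are mine, the sample speeds are invented, and how to pick "25%" when the pitch count doesn't divide evenly (here, simple rounding) is my choice, not something the method specifies.

```python
def top_speed(speeds, share=0.25):
    """Average of the fastest `share` of a pitcher's pitches."""
    ordered = sorted(speeds, reverse=True)
    # Rounding the cutoff count is an assumption of this sketch.
    n = max(1, round(len(ordered) * share))
    return sum(ordered[:n]) / n


def hard_pitches(speeds, drop_mph=3.0, share=0.25):
    """Pitches above the 'minimum hardball speed' (top speed minus 3 mph)."""
    floor = top_speed(speeds, share) - drop_mph
    return [s for s in speeds if s > floor]


# Hypothetical pitcher with eight pitches (mph), not real data.
speeds = [95.0, 94.0, 93.0, 92.0, 91.0, 88.0, 84.0, 80.0]
hard = hard_pitches(speeds)
hard_share = len(hard) / len(speeds)
```

For this made-up pitcher the top speed is the average of the two fastest pitches, and everything within 3 mph of that counts as hard.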

When you do that, you go from making use of 25% of each pitcher's pitches to an average of 53% of each pitcher's pitches. I should point out that BOTH have their place. The reason to like the "top speed" approach is that you are guaranteed to keep the sample for each pitcher proportionate to his actual number of pitches: 25%. The reason to like the "hard pitch" approach is that you are focusing on those hard pitches, while allowing few to no breaking pitches in there. On a pitcher by pitcher basis, you end up with each pitcher having 30% to 92% (Britton) of his pitches identified as "hard" pitches. Sale is interesting as he's at 35%, and that's because he has a wide range on his fastball/sinker.

Why 3mph below the line rather than 2 or 5 (and you can definitely make a case to use a threshold anywhere between those two numbers)? This breakdown of pitches at various levels points to somewhere close to 3 mph below the average of the top speed as being heavily fastball pitches. And as I said, choosing 3 mph lets you end up with 53% of the pitches.

Anyway, I hope you find this as useful and simple as I do.

Note: I'm excluding Dickey and Wright for knucking reasons.

(2) Comments • 2018/04/13 • Pitchers • Statcast

Monday, July 24, 2017

Does varying pitch speeds contribute positively to performance?

By Tangotiger

Terrific work from Ryan. And the answer seems to be "not really". Or maybe it's that pitchers have already optimized their personal approach such that there's no more trend to see.

It's like saying that Vlad should chase fewer outside pitches, when in reality, he probably chased exactly the number he needed to. Every hitter and pitcher is optimizing his approach based on his skillset, so, my expectation is that we should likely not see trends.

(3) Comments • 2017/08/04 • Pitchers

Tuesday, June 20, 2017

SP rankings: Kershaw v Scherzer, and playing time

By Tangotiger

So Bill James noted that Scherzer has pulled ahead of Kershaw. MGL noted that Kershaw is half a run better than Scherzer, per game.

Bill has a seemingly aggressive weighting, giving more weight to the recent performances. Basically, the last 365 days are going to get some 60-65% of the weight. MGL likely follows a Marcel-like weighting scheme. That means that the most recent 365 days probably get 1/3 of the weight, maybe 40%.

This year, Scherzer is 0.8 or 0.9 WAR ahead of Kershaw. Last year, Kershaw pitched 65% as much as Scherzer, so that, while pitch for pitch, Kershaw was way ahead, when you include playing time, they were around even. The years before that Kershaw was ahead.

The WARcels has a weighting scheme of 60/30/10, which is fairly close to what Bill uses implicitly. From that standpoint, WARcels has Scherzer even or just ahead of Kershaw.
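As a rough illustration of what a 60/30/10 scheme does, here is a sketch of just the weighting step. It is not the full WARcels procedure, the function name is mine, and the WAR values are invented to show the mechanics (a pitcher who is behind over three years can still come out ahead if his most recent season is the better one).

```python
WEIGHTS = (0.6, 0.3, 0.1)  # most recent season first, per the 60/30/10 scheme


def weighted_war(war_by_season):
    """Weighted average of the last three seasons, most recent first."""
    return sum(w * x for w, x in zip(WEIGHTS, war_by_season))


# Hypothetical pitchers (not real WAR figures).
pitcher_a = weighted_war([7.0, 6.0, 6.0])  # strong most recent season
pitcher_b = weighted_war([6.0, 6.0, 8.0])  # better total, weaker recent season
```

With these made-up inputs, pitcher B has more total WAR over the three seasons, but pitcher A rates higher once the recent season gets 60% of the weight.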

So, that's what it comes down to, the amount of weight to give this year, and how much weight to give the playing time. Once you do that, then the answer will become more obvious.

(9) Comments • 2017/07/07 • Pitchers

Friday, May 26, 2017

Statcast Lab: Speed pitch-by-pitch within a plate appearance

By Tangotiger

This is an interesting article by Rob. He shows that the speed of pitches goes up, the longer you take. Unfortunately, I don't have the timestamps in my data yet, so I can't do the same cool stuff he did. But let me do something in parallel.

One of the interesting things that you have to be aware of in data is the selection bias issue. For example, suppose that a pitcher waits only 12 seconds to throw a second fastball after having just thrown a fastball. Why would he do that, instead of waiting 21 or 24 seconds? One possibility is that he wants to keep the hitter on his toes, and is quick pitching him. What is unclear in Rob's data is whether the pitcher is actually more effective. Is it possible that the longer he takes, the benefit is not only to the pitcher, but also to the batter? So, Rob (or anyone that has the timestamp data) can follow up his study to show the "linear weight run value" of the next pitch, in each of his groupings.

For my part, what I did in parallel is to look at all fastballs (coded as FF, FT, SI in the BAM data), and look for the subsequent pitch, making sure that the subsequent pitch was the same pitch type (so, pitch numbers 2-3 are FF-FF, or pitch numbers 3-4 are SI-SI, but not FF-SI) within the same plate appearance.

The average of the first pitch in the two-pitch sequence was 92.84mph and the second was 92.97mph. So, setting aside the "wait time", we see that the pitcher will gain 0.13mph on his subsequent pitch of the same pitch type. Maybe it's because he gets into a rhythm and it's just easier on the 2nd pitch than the first. Therefore, a possible "familiarity effect".
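For anyone who wants to reproduce the pairing, here is a minimal sketch. The tuple layout, function names, and sample pitches are all assumptions for illustration; the real BAM data has its own field names.

```python
FASTBALLS = {"FF", "FT", "SI"}


def same_type_pairs(pitches):
    """Find back-to-back fastballs of the same type within a plate appearance.

    `pitches` is a list of (pa_id, pitch_number, pitch_type, mph) tuples,
    sorted by plate appearance and pitch number (assumed layout).
    """
    pairs = []
    for prev, cur in zip(pitches, pitches[1:]):
        same_pa = prev[0] == cur[0]
        consecutive = cur[1] == prev[1] + 1
        same_fastball = prev[2] == cur[2] and cur[2] in FASTBALLS
        if same_pa and consecutive and same_fastball:
            pairs.append((prev[3], cur[3]))
    return pairs


def average_gain(pairs):
    """Mean of (second pitch speed - first pitch speed) across pairs."""
    return sum(second - first for first, second in pairs) / len(pairs)


# Hypothetical pitches, not real data.
pitches = [
    (1, 1, "FF", 92.0), (1, 2, "FF", 92.5), (1, 3, "SI", 91.0), (1, 4, "SI", 91.5),
    (2, 1, "FF", 93.0), (2, 2, "SL", 85.0), (2, 3, "FF", 93.5),
]
pairs = same_type_pairs(pitches)
gain = average_gain(pairs)
```

Note that FF followed by SI is skipped, as is any pair that straddles two plate appearances or has a non-fastball in between.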

This is what I did. I first broke up the data based on whether it was the first pitch of the plate appearance or not. Then I not only looked for back-to-back, but also a three-pitch sequence and a four-pitch sequence. And in every situation, the speed went up by right around that 0.13 figure.

Rob's data seems to suggest that that 0.13 value itself (or some number close to it) also had a relationship to the "wait time". So, not only do we have a "wait time" effect, but also simply a "familiarity" effect.


() Comments • • Pitchers • Statcast

Wednesday, April 05, 2017

Pitch velocity: new measurement process, new data points

By Tangotiger

Dave does a great job noting the changes. The short summary is that every ballpark is using Statcast and we are reporting in real-time the velocity of the pitch out-of-hand. The average release point is about 54.5 feet. Here, let me show the breakdown:


How does that compare to the velocity at y=50 (meaning 50 feet from the back tip of home plate), which was the previous number being reported? Glad you asked. Here are two charts, one based on the difference, and the other based on the rate. Each chart uses both the extension of the pitcher, as well as the pitch speed out of his hand.


Because the percentage retained is virtually entirely based on the distance, we can collapse the above chart like so

***

Just as interesting, an industry-leading site Brooks Baseball has been reporting measurements at y=55, meaning taking y=50 data and inferring speed at y=55 (whether the pitcher releases at 54 feet or 56 feet).

There are good reasons to have a fixed point (whether y=50 or y=55 or ... see below) as well as the actual release point. Both will be tracked. But in terms of the real-time tracking number, the out-of-hand is what you will be seeing.

UPDATE: As I noted above, I said BOTH will be tracked. The out-of-hand is what you will see. In order to see the fixed point, you can interpret it from the XML file. The key value you are after is vy0, which is in feet per second, which you can convert to MPH by multiplying by 0.681818. It's velocity along the y-axis. Thanks to Dan Brooks below for reminding me that to get the speed toward the plate, you need all three axis values, vx0, vy0, vz0. You'd square them all, add them up, and square root.
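As a small sketch of that conversion (the function name is mine, and the sample components are invented rather than taken from a real pitch):

```python
import math

FPS_TO_MPH = 3600 / 5280  # feet per second to miles per hour, about 0.681818


def speed_mph(vx0, vy0, vz0):
    """Total speed in mph from the three velocity components in ft/s."""
    return math.sqrt(vx0 ** 2 + vy0 ** 2 + vz0 ** 2) * FPS_TO_MPH


# Hypothetical components in ft/s (not a real pitch).
y_only = abs(-132.0) * FPS_TO_MPH        # velocity along the y-axis only
full = speed_mph(5.0, -132.0, -4.0)      # all three axes
```

The sign of each component drops out once you square it, and the full speed is always at least as large as the y-axis number alone.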

***

Ok, the "fixed" point, presumably to make sure every pitcher is being compared the same. Let's say two pitchers throw a ball, one that releases it 7 feet from the mound, and the other releases it 5 feet from the mound. By the time the ball reaches y=50 (meaning 50 feet from the back tip of home plate), both balls are traveling at 95mph. Are they equally impactful from the perspective of the batter?

The guy who released it with longer extension (i.e., closer to the plate), released it, out of hand, at 95.5. The guy who released it with shorter extension, released it, out of hand, at 95.8. Are those two equivalent, from the perspective of the batter?

I don't know (yet). If they are not equivalent, then there's no real purpose to reporting the y=50 value. We don't calculate data for the purpose of calculating data. We organize baseball data to be able to answer baseball questions.

It may very well be that the best way to organize the data is to show: (a) speed out of hand, and (b) x,y,z position of the ball at T minus 250 ms, where T=0 is front of home plate (or perhaps T=0 is where y=2 feet from back tip of home plate). Once we figure out what we want, then we'll do that.

(41) Comments • 2017/04/05 • Pitchers • Statcast

Saturday, March 11, 2017

Updated version of DRA

By Tangotiger

First thing, kudos to Jonathan for being meticulous about it. Secondly, totally agree in terms of using FIP as the baseline, and he says it succinctly at the end. I would even use an additional baseline, and that is simply strikeouts minus "walks" per PA, where walks is BB-IBB+HBP. You'd be surprised (shocked?) at how well that does. Thirdly, the presentation of the three tests is perfect: (a) current RA9, (b) internal reliability, (c) future RA9.
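That extra baseline is simple enough to write down directly. A minimal sketch, with a function name of my own choosing and an invented stat line:

```python
def k_minus_bb_per_pa(so, bb, ibb, hbp, pa):
    """Strikeouts minus 'walks' per plate appearance, where walks = BB - IBB + HBP."""
    walks = bb - ibb + hbp
    return (so - walks) / pa


# Hypothetical season line (not a real pitcher).
rate = k_minus_bb_per_pa(so=200, bb=50, ibb=5, hbp=5, pa=800)
```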

So, it's fascinating that DRA can (a) describe current RA9 as well as FIP AND (c) is slightly better than FIP in forecasting AND (b) does so by being more internally consistent (meaning it can better describe why it does so well). My GUESS is that it underweights HR and overweights SO, but can then counteract that with other data. So, for describing current RA9, it tried to do WORSE by changing the weights of HR and SO, but then is able to balance that out for current RA9 by using more inputs. And does so without affecting future RA9. All the while making it more internally consistent. Just a guess, but that's what I'd do.

One thing this shows is how DIPS will survive (in the form of FIP). Remember, FIP is nothing more than DIPS, which means the whole idea of FIP I'd consider to be 99% Voros. Voros is the lawyer that created the DIPS argument, and I'm the paralegal that dots the i's and crosses the t's, with FIP. I'm sure Voros would like that analogy.

Anyway, the devil is in the details, and I'll be interested to learn about the specific pitchers, and where the value-added comes from.

(1) Comments • 2017/03/11 • Pitchers

Thursday, January 05, 2017

Should a few really bad starts change your forecast?

By Tangotiger

MGL takes a look. He shows that pitchers who had a bad ERA overall, with some really bad starts in there, ended up matching MGL's forecast. His forecast did NOT include a "flag" for number of bad starts. It just looked at their overall seasonal stats.

More interestingly, MGL's control group is pitchers who had the (similar) bad ERA overall, without a big number of bad starts. Again, his forecast did not look for number of bad starts. And their overall forecast was the same as the studied group. Except this control group actually outperformed their forecast.

This tells me that either:

MGL's forecasting system is not good enough (say 5% likelihood)
Number of bad starts is actually a good indicator, but in the other direction: given the same ERA, the pitcher with fewer bad starts is the one who does better going forward (say 20% likelihood)
Sample sizeitis (say 75% likelihood)

But for those hanging their hat on "if not for those really bad starts...", they won't find it here.

(15) Comments • 2017/01/06 • Forecasting • Pitchers

Thursday, December 01, 2016

DRA inroads

By Tangotiger

A good piece by RJ on Jonathan's work.

RJ had asked me for an interview on this, but I told him at the time that I haven't kept up with DRA enough to be able to offer any good comments. Nonetheless, Jonathan does great work, so, DRA is in good hands.

(8) Comments • 2017/03/11 • Pitchers

Tuesday, November 22, 2016

Porcello v Verlander X3

By Tangotiger

This is a link to the one I just posted:

As you can see, when we adjust Verlander’s WAR, we shouldn’t adjust it based on the fact that the Tigers cost him some ten hits, but rather adjust him so that the Tigers fielders actually HELPED him to two hits.

...with a healthy level of comments from MGL among others:

It looks like your assumption is that pitchers with low BABIP necessarily had very good defense behind them regardless of how good the team’s defense was for the year and vice versa if a pitcher had a high BABIP. So it’s a Bayesian type adjustment, which is indeed correct. Also if B-R is using team DRS to adjust pitcher WAR without regressing that DRS they are making a mistake. Same with using UZR.

This is Poz's article:

For one thing, I think it’s quite likely that Detroit played EXCELLENT defense behind Verlander, even if they were shaky behind everyone else. I’m not sure how you can expect a defense to allow less than a .256 batting average on balls in play (the second-lowest of Verlander’s career and second lowest in the American League in 2016) or allow just three runners to reach on error all year (the lowest total of Verlander’s career).

This is Bill James:

The logic of the Baseball Reference WAR analysis is that, given the same defense behind them, the same park, Justin Verlander WOULD HAVE allowed significantly fewer runs than Rick Porcello. The question this pushes us to is, Is this actually a reasonable thing to believe? No, it isn’t. Maybe it is a reasonable adjustment in theory, I don’t know. Maybe if we compared 100 different pitchers, this would be a useful and instructive adjustment in the other 98 cases; I don’t know. But we’re talking about this case.

***

In Bill's article he also had a preamble about pitcher Wins and how they are used in Cy Young voting. It's Classic Bill, which means he was able to restructure a complex topic into something easy to grasp. The whole thing is a great read. His conclusion:

The Won-Lost record was no longer the king of the library. From 1992 to 2005 other statistics were basically AS important in the Cy Young voting as the won-lost record, and since 2006 the other stats have been MORE important than the won-lost record.

() Comments • • Awards • Pitchers

Saturday, September 24, 2016

De/Re-constructing Bill James Season Score Metric

By Tangotiger

Bill James has Season Score, which is a sort of Game Score for the season. The statistics he uses are: W, L, SV, GF, IP, ER, R, BB, SO. The only main stats he is not using are hits and homeruns. Let's accept that the only thing we want are these stats he is using. Now, how can we create a metric out of that?

Part 1

The currency that you should strive for is wins. So, let's start with W and L. Let's assume that the replacement level win% is .385. So, in order to get a win-level metric using W and L, you would simply do:

W - .385*(W+L)

Which is .615 W - .385 L

Part 2

Next up is IP and R. Let's assume that replacement level is 6 R / 9 IP. So, to convert IP and R into runs above replacement, we'd do IP * 6/9 - R. To convert to wins, divide all that by 10.

Which is: IP * .067 - R * .10

He also uses ER, so replacement level for that is 5.50 ER / 9 IP. Repeating like above

Which is: IP * .061 - ER * .10

The average of the two is: IP * .064 - R * .05 - ER * .05

And instead of IP, let's use IPouts, which means that there are 3 outs per inning. So, we finally get:

IPouts * .021 - R * .05 - ER * .05

Part 3

Now we have SO and BB. This one is a lot easier, since SO = BB is replacement level. Hence, we do SO - BB. To convert to runs you multiply by 0.3 and to convert to wins you divide by 10.

Which is: (SO - BB) * .03

Part 4

Now we want to know how to weight all of that. We can just add them all up, so that they each get an equal weight. But, let's say we want to weight them as 50%, 33.3%, 16.7%, with the IP/R having the most weight and the SO/BB having the least amount of weight. Applying these weights to the above and we get:

IPouts * .0105 - R * .025 - ER * .025

+ .205 W - .128 L

+ (SO - BB) * .005

Part 5

Let's rescale so we don't have all those decimals. So, we'll multiply everything by 40. It doesn't matter what you multiply by, as long as everything gets multiplied by the same thing.

IPouts * 0.42 - R - ER

+ W * 8 - L * 5

+ (SO - BB) * .20
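Parts 1 through 5 can be replayed in code. This is a sketch of the derivation as I read it, with names of my own; it carries full precision instead of rounding at each step, so the innings term comes out near 0.426 rather than the 0.42 you get from rounding along the way.

```python
REPL_WIN_PCT = 0.385          # replacement-level win% (Part 1)
REPL_R9, REPL_ER9 = 6.0, 5.5  # replacement-level R and ER per 9 IP (Part 2)
RUNS_PER_WIN = 10.0
RUNS_PER_K_MINUS_BB = 0.3     # Part 3
WEIGHTS = {"runs": 1 / 2, "wl": 1 / 3, "kbb": 1 / 6}  # Part 4
SCALE = 40                    # Part 5


def season_score_coefficients():
    """Rebuild the per-stat coefficients from the replacement-level assumptions."""
    # Wins per inning, averaging the R-based and ER-based replacement levels.
    wins_per_ip = ((REPL_R9 + REPL_ER9) / 2) / 9 / RUNS_PER_WIN
    coefs = {
        "ipouts": wins_per_ip / 3 * WEIGHTS["runs"],
        "r": 0.5 / RUNS_PER_WIN * WEIGHTS["runs"],
        "er": 0.5 / RUNS_PER_WIN * WEIGHTS["runs"],
        "w": (1 - REPL_WIN_PCT) * WEIGHTS["wl"],
        "l": REPL_WIN_PCT * WEIGHTS["wl"],
        "so_minus_bb": RUNS_PER_K_MINUS_BB / RUNS_PER_WIN * WEIGHTS["kbb"],
    }
    return {name: value * SCALE for name, value in coefs.items()}


coefficients = season_score_coefficients()
```

The R and ER terms land on exactly 1 each, the W and L terms round to 8 and 5, and the SO-BB term is 0.20.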

Bill James says...

Bill sent me his formula. For this post, I'll remove the part that deals with Saves and Games Finished. Here's the rest of his metric:

(Thirds of an inning pitched * .425 – Earned Runs Allowed - Runs allowed)
Plus 8 times Wins
Minus 5 times Losses
Plus (Strikeouts Minus Walks) divided by 5

And that's how a metric is born. His metric can be perfectly explained if you think in terms of Wins Above Replacement. If you like the weightings of the three components, then you'll love his metric. Indeed, his metric seems pretty consistent with my Cy Young predictor, in terms of focusing on IP, ERA, W and SO. He considers a bit more, so it has a bit more practicality. But it would seem that it should do quite well as a Cy Young predictor.

So, when you see a pitcher with a Season Score of 320, just divide that by 40. That pitcher would be an 8 WAR pitcher.
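Here is a sketch of the starter portion of the metric as quoted above, plus the divide-by-40 reading. The Saves and Games Finished part is left out because it isn't given here, the function names are mine, and the stat line is invented.

```python
def season_score_starter(ip_outs, er, r, w, l, so, bb):
    """Season Score without the Saves / Games Finished component."""
    return (ip_outs * 0.425 - er - r) + 8 * w - 5 * l + (so - bb) / 5


def implied_war(season_score):
    """Read a Season Score as wins above replacement by undoing the x40 scaling."""
    return season_score / 40


# Hypothetical season: 220 IP (660 outs), 60 ER, 65 R, 18-7, 230 SO, 50 BB.
score = season_score_starter(ip_outs=660, er=60, r=65, w=18, l=7, so=230, bb=50)
war = implied_war(score)
```

For this made-up line the score is about 300, which reads as roughly 7.5 wins.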

I'll look at relief pitchers in part 2, just as soon as I can figure out how he got there.

(2) Comments • 2016/09/25 • Pitchers

Friday, May 13, 2016

Cy Young odds, 2016

By Tangotiger

Taken with a grain of salt...

58%	Clayton Kershaw
13%	Jake Arrieta
6%	Max Scherzer
5%	Madison Bumgarner
4%	Stephen Strasburg
3%	Jon Lester
3%	Noah Syndergaard
3%	Johnny Cueto
2%	Jose Fernandez
3%	Rest

32%	Chris Sale
10%	Jose Quintana
8%	Corey Kluber
7%	Danny Salazar
5%	David Price
5%	Felix Hernandez
4%	Jordan Zimmermann
4%	Masahiro Tanaka
3%	Cole Hamels
3%	Chris Archer
3%	Marcus Stroman
2%	Rich Hill
2%	Taijuan Walker
2%	Drew Smyly
11%	Rest

(62) Comments • 2016/08/18 • Awards • Pitchers

Thursday, May 12, 2016

FIP: come for the ERA-look, stay for the wOBA-design

By Tangotiger

A very good primer by Neil on what FIP is. That people USE it for more than its intended construction, that's not a FIP-issue. As I noted in the comments:

I agree with the analogy of FIP to wOBA. They both:

use a subset of performances(*)
weight the values based on their run impact, regardless of when they happened
scale the result to a common scale

(*) FIP ignores batted balls in field of play, and baserunner movement (SB, CS, WP, etc). wOBA ignores baserunner movement.

(3) Comments • 2016/05/12 • Linear_Weights • Pitchers

Friday, May 06, 2016

New DRA

By Tangotiger

Jonathan has changed DRA, and you can see the new considerations for the model here. These are a couple of my minor comments:

Jonathan, thanks for the continued improvements.
Just so I am following along, is your chart showing that the IBB is dependent on the identity of the catcher (over and above the team that the battery is on)? If so, can you show us this "IBB impact" for the leaders/trailers for catchers?
Also, the putout impact of the 1B is dependent on the putout impact of the 3B? If so, is that because the talent level of the 3B allows the SS to move over which allows the 2B to move over? But that if you only focused on the impact of the 2B, that that did not have any relevance? Fascinating if true.
...
I also see that your inning parameter, which was so prevalent in your original model has all but disappeared, and limited to just the "extra innings or not".
That was one where it was almost certainly an overfitting, especially when those values would change each year. So, it's good that you are using baseball knowledge in constructing your models, rather than relying on a regression/kitchen sink approach.
In that respect, you'd probably want a "9th inning, tying runner at bat or on base" parameter, which should be similar to the XI parameter you have. After all, bottom of 9th tie game is much closer to impact as XI tie game than 3rd inning tie game.

(83) Comments • 2016/10/27 • Pitchers

Friday, April 22, 2016

Tale of Two Jakes

By Tangotiger

I love Game Score, perhaps even more than Bill James does and he created it. Bill recently wrote "I was just relying on the Starting Pitcher rankings, which is a system that I believe in." The central component of those rankings is Game Score (or what I have dubbed Game Score Classic). Fangraphs has recently rolled out my version of Game Score (or what I have called Game Score v2.0). Confusing? Not to me, a product of the New Coke v Coke Classic Orwellian campaign. To you? Maybe!

Game Score is, in essence, a little bit of ERA, a little bit of FIP, a little bit of Linear Weights, and a little bit of WAR. All without anyone knowing, and described in an incredibly simple formula, and all along a useful scale of 0-100 (more or less), where 50 is average. Here is what Jake Arrieta's 140 games look like in chunks of 4 games (so, games 1 - 4 is labelled as "4", etc).


As you can see, the first 84 games (of which 78 were starts), is Arrieta as an ordinary pitcher. Remember, 50 is average. When you get to 60, that is equivalent to being a .600 pitcher. You can see that in any 4 consecutive games, he never averaged over 60.

Ordinary Arrieta had 19 starts with a Game Score above 56, out of 78. He had another 21 starts at 35 or worse. Given that every pitcher starts with a score of 40 when they take the mound, 35 is not only below average, but below replacement. So, that's what Ordinary Arrieta looked like: about 1/4 of his starts were above average and 1/4 of his starts were below replacement. The average Game Score for Ordinary Arrieta was 46.
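The bucketing is easy to reproduce if you have a list of Game Scores. A minimal sketch, using the thresholds from this paragraph (above 56, and 35 or worse); the function name and the sample scores are made up.

```python
def bucket_starts(game_scores, good_above=56, bad_at_or_below=35):
    """Count starts above the 'good' line, at or below the 'bad' line, and in between."""
    good = sum(1 for g in game_scores if g > good_above)
    bad = sum(1 for g in game_scores if g <= bad_at_or_below)
    return {"good": good, "bad": bad, "middle": len(game_scores) - good - bad}


# Hypothetical Game Scores, not Arrieta's actual starts.
buckets = bucket_starts([70, 57, 56, 49, 35, 20])

# The shares quoted in the text: 19 and 21 starts out of 78.
share_good = 19 / 78
share_bad = 21 / 78
```

Both shares come out close to a quarter, which is where the "about 1/4" comes from.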

Arrieta's last 24 starts have been Extraordinary. He had 22 starts with a Game Score higher than 56. So, whereas Ordinary Jake needed 78 starts to get above the 56 threshold 19 times, Extraordinary Arrieta needed only 24 starts to exceed that threshold 22 times. And his other two starts had a Game Score of 49. Overall, he has an average Game Score of 76. I haven't looked to see what active pitchers have done for a 24 start string, though I presume Kershaw must have reached that level at least once.

It's the 32 starts in-between that we find the Two Jakes colliding with each other (6/8/2014 through 6/16/2015). He had 23 starts above the 56 level, and only 2 starts below the 35 level. He had another 7 starts in-between. This is an overall average of 62. To put that in perspective, Kershaw's career level is 63. Chris Sale in 2015 was a 60.

So, in those 32 starts, we find half of Ordinary Arrieta and half of Extraordinary Arrieta. And combined, that's already a star pitcher.

(10) Comments • 2016/04/23 • History • Pitchers

Wednesday, April 20, 2016

Who are the top 10 starting pitchers in baseball?

By Tangotiger

Bill James has a method that is based on Game Score with more recent starts weighted more, with some adjustments for parks, injuries. We can compare those to Steamer forecasts. While the two systems are not exactly comparable in terms of intent (Bill is backward looking, while Steamer is forward looking), they are more similar in intent than dissimilar.

Number 1 in both systems, far and away, is Kershaw.

Steamer has Sale #2, while Bill has him at #7. Bill has Arrieta at #2, while Steamer has him tied for 4th. They both have Scherzer third. Bill has Greinke 4th, while Steamer has him all the way down at 13th.

Tied for 4th on Steamer is Thor, and THAT is what tells you about Bill's system being "backward looking". I had this discussion with Bill when he first rolled it out, talking about Strasburg and Tanaka. Bill's system is about earning points, so, it's not really a forecast. But, it comes close to it.

Bill has Price 5th, and Steamer has him tied for 8th. Bill has Bumgarner 5th while Steamer has him 16th. It should be noted that Bill includes the playoffs, and I don't know what Steamer does with playoff data for forecasts, but clearly, they should count.

Steamer has Jose Fernandez 6th, so again, he won't appear on Bill's top 20 list for a while because of the nature of his system.

Steamer has Kluber 7th and Bill has him at 12. Steamer has Stras at 8 and Bill has him at 13.

Bill has Lester at 8 and Steamer has him at 12.

Keuchel is 9th in one and 11th in the other, and at this point, it doesn't really matter which is which.

(1) Comments • 2016/04/20 • Pitchers

Saturday, March 26, 2016

How many bullets does Felix have left?

By Tangotiger

You know how it goes, right? Felix has been pitching so much for so long, how can he possibly keep going? You don't hear it like: Felix has been pitching so much for so long, he must be indestructible. That's because of the bias of the pitching arm, that it's just one pitch away from being a blown tire. But, what in fact is the reality?

Is reality Don Sutton and Greg Maddux? From the time Sutton was a 21-yr old rookie, he's been pitching 200+ innings non-stop every year, except for a strike year and, basically, retirement, a feat that Maddux nearly matched. Or is reality Catfish Hunter, who after 4 straight seasons of top 5 Cy Young finishes had 700 IP left in his career?

The boring answer and the true answer is always "in the middle". But, where in the middle? Rather than start with an answer and construct the narrative, let's start with the question and find the answer.

***

I looked for all pitchers born between 1931 and 1980. At the age of 29, they had to have at least 2 WAR according to Baseball Reference, with at least 180 IP. In the 4 seasons from ages 26-29, they had to have a total of at least 14 WAR and a total of at least 700 IP. This gave me 64 pitchers. On average, these 64 pitchers had 5.3 WAR at age 29 (Felix had 4.4) and a total of 20 WAR at ages 26-29 (Felix had 21).

In the rest of their career, these pitchers had 22 WAR and 1444 IP (or the equivalent of 7 seasons of 206 IP). And this is what we would forecast for Felix, if we knew nothing about his career prior to age 26. But since we do know, let's continue.
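The selection criteria and the "equivalent seasons" arithmetic can be written out as a short sketch. The function and parameter names are mine, and the sample pitcher is hypothetical; only the thresholds come from the text.

```python
def qualifies(birth_year, war_age29, ip_age29, war_26_29, ip_26_29):
    """Selection criteria for the comparison group of star pitchers."""
    return (
        1931 <= birth_year <= 1980
        and war_age29 >= 2
        and ip_age29 >= 180
        and war_26_29 >= 14
        and ip_26_29 >= 700
    )


def ip_per_equivalent_season(ip_remaining, seasons=7):
    """Express remaining innings as `seasons` full seasons of equal length."""
    return ip_remaining / seasons


# Hypothetical pitcher, not a real record.
in_group = qualifies(birth_year=1975, war_age29=5.0, ip_age29=220,
                     war_26_29=20.0, ip_26_29=850)
per_season = ip_per_equivalent_season(1444)
```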

The question: if you have a star pitcher who's been pitching great from age 26 to 29, does it matter how often he was pitching through to age 25, in determining how much baseball life he's got left at age 30 and later?

I will select from these 64 pitchers based on how many innings they threw prior to age 26, with the top 10 being the true young workhorses. They are, in order from most remaining bullets to least: Don Sutton, Greg Maddux, who lasted twice as long as the average of the 64. Then we have John Smoltz, Mark Buehrle who lasted right around the average. Then we continue with Vida Blue, CC Sabathia, Camilo Pascual, Don Drysdale, at about 70% of the group average. Finally we have the flameouts: Catfish Hunter and Sam McDowell.

As you can see, it's pretty much all over the place. These 10 pitchers averaged 1406 IP after age 29, or the equivalent of 7 full seasons of 201 IP. That is, the most worked pitchers prior to age 26 are no different than the entire group, in terms of number of bullets remaining.

In terms of WAR, these 10 averaged 20 wins, which is a bit lower than the 22 for the whole group.

***

All-in-all, you will see stories of Felix, every year, of how he's one of the leaders in IP through age.... 26, 27, 28, 29. Every year you have seen those stories. And every year someone will write about "is this the year?". They do this because, eventually, some year WILL be the year. And no one remembers all the bad predictions. And when the prediction finally hits, the writer can stand up and say "See? I told you!" When that inevitably happens, you salute that writer, in whatever manner you think that prediction deserves.

Roy's salute (image) in his last game for Canadiens (video):

(10) Comments • 2016/06/05 • Forecasting • Pitchers

Saturday, January 09, 2016

Cody Allen v Aroldis Chapman

By Tangotiger

Cody Allen led the league in FIP, while having the 7th worst BABIP. Aroldis was very close to Allen in both metrics.

Allen, however, was far, far worse with men on base. His FIP with bases empty was 1.16, while with runners on base it was 2.67. By luck or talent, he did perform much worse with men on base.

His team's BABIP with Allen on the mound was .309 with bases empty and .377 with men on base. In this case, we don't know whether to hold Allen or his fielders (or both) responsible for the abysmal performance with men on base.
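For anyone who wants to run this kind of split themselves, here is a rough sketch using the standard public definitions: BABIP = (H - HR) / (AB - SO - HR + SF), and FIP = (13*HR + 3*(BB + HBP) - 2*SO) / IP + constant. The counts below are invented purely for illustration and are NOT Allen's or Chapman's actual lines. The 3.10 constant is a placeholder assumption, since the real one changes by league and year. Treating "IP" in a base-state split as outs recorded in that state divided by 3 is also an assumption.

```python
def babip(h, hr, ab, so, sf=0):
    """Batting average on balls in play: (H - HR) / (AB - SO - HR + SF)."""
    balls_in_play = ab - so - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def fip(hr, bb, hbp, so, ip, fip_constant=3.10):
    """Fielding Independent Pitching.

    fip_constant is league/year specific; 3.10 is only a placeholder.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + fip_constant


# Hypothetical counts by base state (illustration only, not real data).
splits = {
    "bases empty": dict(h=20, hr=1, ab=100, so=40, sf=0, bb=8, hbp=0, ip=28.0),
    "men on base": dict(h=25, hr=2, ab=90, so=30, sf=1, bb=10, hbp=1, ip=24.0),
}

for state, s in splits.items():
    split_babip = babip(s["h"], s["hr"], s["ab"], s["so"], s["sf"])
    split_fip = fip(s["hr"], s["bb"], s["hbp"], s["so"], s["ip"])
    print(f"{state}: BABIP {split_babip:.3f}, FIP {split_fip:.2f}")
```

The calculation only gives you the two splits side by side. It says nothing about whether the gap belongs to the pitcher or to his fielders, which is the whole debate.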

Naturally, all that leads to lots of runs scoring, far more than would be expected if you just look at his seasonal line. We just don't know whether to hold Allen or his fielders accountable for all those excess runs.

Chapman also had a much better FIP with bases empty than with men on base. However, the Reds' BABIP with Chapman on the mound was much lower with men on base than with bases empty (.295 v .367). As a result, far fewer runs scored than you would expect if you just look at his seasonal line. We just don't know how much to credit Aroldis and how much to credit his fielders.

And so the FIP v BABIP debate continues. Every year, we find more pitchers in this boat, and we are always trying to figure out how much to hold the pitcher and how much to hold his fielders accountable.

(12) Comments • 2016/01/12 • Pitchers

Tuesday, December 15, 2015

Hoffman v Gooden’s first 5 years

By Tangotiger

Terrific stuff from Poz.

(6) Comments • 2015/12/17 • Pitchers
