
Tangotiger Blog


Sunday, May 24, 2020

Catcher WOWY

By Tangotiger 03:36 PM

In 1979, in games started by Gary Carter, the Expos allowed 463 runs and made 3705 outs, for 3.37 runs per 27 outs. That season, in games started by his backups, the Expos allowed 5.08 runs per 9 IP.  That difference is a whopping 1.71 runs per 9 IP: with Carter starting, the Expos allowed runs at only 66% of the rate they did with his mates.

However, Carter had 136 starts compared to the 24 from his mates.  In terms of the WEIGHT of that 1.71 difference, we would use the harmonic mean, which is 41 games (or 1073 outs).
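
As a rough sketch of the arithmetic so far (not the actual code behind this post), here is the 1979 calculation in Python. The function names are made up for illustration; the inputs are the figures quoted above.

```python
def runs_per_27_outs(runs, outs):
    """Runs allowed per 27 outs, i.e. per 9 innings."""
    return 27.0 * runs / outs


def harmonic_mean(a, b):
    """Harmonic mean of two positive sample sizes."""
    return 2.0 * a * b / (a + b)


carter_ra9 = runs_per_27_outs(463, 3705)  # games started by Carter, 1979
mates_ra9 = 5.08                          # games started by his backups, as quoted

diff = mates_ra9 - carter_ra9             # runs per 9 IP saved with Carter
ratio = carter_ra9 / mates_ra9            # Carter's rate relative to his mates
weight_games = harmonic_mean(136, 24)     # weight given to that difference

print(round(carter_ra9, 2), round(diff, 2), round(ratio, 2), round(weight_games, 1))
```

The harmonic mean is pulled toward the smaller sample, so the 24 backup starts, not the 136 Carter starts, mostly determine how much this one season counts.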

We can go through every season of his career and repeat this process.  And in his career, his weighted runs allowed is 86% of that of his mates (3.66 RA / 9IP with Carter starting compared to 4.28 when his mates start).  In other words, his team has allowed 0.62 fewer runs to score per 9 IP, when Gary Carter started the game behind the plate, compared to his mates that season.

Gary Carter started 1954 games (or if you use outs, the equivalent of 1964 9-inning games).  And 0.62 runs per game times 1964 games (unrounded) is 1204 fewer runs allowed.  That is the best figure in the last 100 years.
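
One way the season-by-season roll-up could look is sketched below. This is an assumption about the mechanics, not the exact procedure used here: each season's rates are weighted by the harmonic mean of the two out totals, and the career difference is then pro-rated to the catcher's own outs. The function name `career_wowy` and the sample season are hypothetical.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two positive sample sizes."""
    return 2.0 * a * b / (a + b)


def career_wowy(seasons):
    """seasons: list of (runs_with, outs_with, runs_without, outs_without).

    Returns (weighted RA/9 with, weighted RA/9 without, career runs saved).
    """
    weight_total = 0.0
    with_sum = 0.0
    without_sum = 0.0
    outs_with_total = 0
    for runs_w, outs_w, runs_wo, outs_wo in seasons:
        w = harmonic_mean(outs_w, outs_wo)          # season weight, in outs
        with_sum += w * 27.0 * runs_w / outs_w
        without_sum += w * 27.0 * runs_wo / outs_wo
        weight_total += w
        outs_with_total += outs_w
    ra9_with = with_sum / weight_total
    ra9_without = without_sum / weight_total
    games = outs_with_total / 27.0                  # 9-inning game equivalents
    return ra9_with, ra9_without, (ra9_without - ra9_with) * games


# Hypothetical single season, NOT real data: 100 games with, 40 games without.
example = [(100, 2700, 50, 1080)]
print(career_wowy(example))
```

With real data, each tuple would hold one season of runs and outs in the catcher's starts and in his backups' starts.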

Using this method, here are the top 11:

  • 1204 runs reduced: Gary Carter
  • 1004: Mickey Cochrane
  • 961: Tony Pena
  • 871: Yadi Molina
  • 807: Brad Ausmus
  • 757: Andy Seminick (who??)
  • 750: Rick Ferrell (who??)
  • 701: Johnny Bench
  • 682: Jason Varitek
  • 652: Al López (who?? he did have MVP votes in 7 seasons)
  • 647: Russell Martin

Fans of WOWY will recognize this as...well, WOWY.

The main problems, at least for THIS iteration, are as follows:

  • Are the backups of Gary Carter disproportionately worse than the backups of Cochrane and Pena and so on?
  • Was Gary Carter paired disproportionately with the best pitchers on his staff, compared to his mates?
  • How much can Random Variation affect these results?

The Random Variation one is a big one.  In the Gary Carter starts, his teams allowed 7246 runs.  And since this is 1204 lower than his mates, his mates would come in at a pro-rated 8450 runs.  Just because we've OBSERVED 8450 (pro-rated) runs doesn't mean that's the true rate.  How much Random Variation is there in runs allowed?  I should figure it out at this point, but let's say one standard deviation is 3 runs per game.  With 1964 starts, you take the square root and multiply by 3 runs and that gives you 133 runs.  Since we have observations on both Carter and his mates, the Random Variation of the DIFFERENCE is 133 times root 2, or 188 runs.  So, based on that number 3 that I totally made up, we can reduce the 1204 by twice 188 runs due to Random Variation (for 2 standard deviations).  That still leaves us with a whopping 828 runs lowered.  So, whatever you can say Random Variation contributes to the noise in that 1204 runs, it won't be able to wipe away even half of it.  Most of it is real.
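
The back-of-envelope numbers in that paragraph can be reproduced with a few lines of Python. This is only a sketch: the 3 runs per game is the made-up figure from the text, and the variable names are illustrative.

```python
import math

sd_per_game = 3.0     # assumed SD of runs allowed per game (made up in the text)
games = 1964          # Carter's 9-inning game equivalents
observed_diff = 1204  # observed runs saved relative to his mates

sd_one_side = sd_per_game * math.sqrt(games)  # noise in one runs-allowed total
sd_diff = sd_one_side * math.sqrt(2)          # noise in the difference of two totals
remaining = observed_diff - 2 * sd_diff       # what is left after 2 SD

print(round(sd_one_side), round(sd_diff), round(remaining))
```

Changing `sd_per_game` shows how sensitive the conclusion is to that guess: the noise scales linearly with it.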

Anyway, that's all I've got for Iteration 1.  If an Aspiring Saberist wants to take it from here, go for it.

***

Note: the reason I started this was Ryan Doumit, since he was getting hurt by the framing numbers.  Using this method, his teams allowed 188 more runs.  If you go all the way to the bottom of his Fangraphs page and look right under the Defense column, on the last line, you will see this number: -178 runs.  In other words, this WOWY method (188 runs) supports the evaluation of Doumit and his framing numbers (178 runs)!

***

Note 2: since someone will ask, Mike Piazza was better than his mates by 322 runs.  Make of it what you will.  I just know whatever number this method spit out, someone is going to complain.


#1    Guy 2020/05/24 (Sun) @ 16:53

Very impressive numbers for Carter here. Controlling for pitchers would probably cut down that run total somewhat.  But in your linked article, you report that Carter’s backup catchers were actually pretty average (at least in the dimensions you measured—could be poor framers, I suppose).

I hope someone does pick up the baton here and see what a more fully-developed WOWY shows. These huge runs saved totals certainly hint at the possibility that catcher defensive value could be considerably larger than commonly understood. Another factor worth controlling for, if anyone runs with this, is day/night games. I’d guess Carter mainly sat out for day games, and offense in day games was notably higher in Carter’s day.


#2    Tangotiger 2020/05/24 (Sun) @ 21:58

I tried to control for the pitcher by looking at each pitcher’s RA/9IP in the 5 seasons before and after (so 11 seasons in all).  I could try to make that tighter (+/-3 seasons), but I was trying to limit the “personal catcher” effect.

Anyway, Carter is still #1.

In the numbers below, the more negative the number, the more runs are suppressed.

Two catchers who are among the most affected:
Tony Pena goes from -961 to -666
Salvador Perez from -475 to -595

So, Pena was disproportionately paired with good pitchers and Perez with bad pitchers.

Jason Varitek was the most affected, going from -682 to -195.

Going the other way is Javy Lopez going from +239 down to -14.  In other words, the poor guy who wasn’t catching the good Braves pitchers gets his due here.

***

The observed standard deviation in terms of % of runs saved compared to mates is 5.5%.  I think it’s fair enough to say the true SD is 4%, maybe 3%, and not lower than 3%.  But to be conservative, let’s make it 2.2%.

In practical terms, this means that if a team allows 4.5 runs / 9 IP, then 2.2% is one SD = 0.10 runs / 9 IP.  Per 162 games, that makes one SD = 16 runs.  This is enormous.
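
For anyone who wants to try other assumptions, the conversion from a percentage SD to runs per season is a one-liner. A minimal sketch using the figures above, with illustrative variable names:

```python
team_ra9 = 4.5         # team runs allowed per 9 IP
sd_pct = 0.022         # assumed true SD of catcher effect, as a share of runs
games_per_season = 162

sd_per_game = team_ra9 * sd_pct                 # runs per 9 IP
sd_per_season = sd_per_game * games_per_season  # runs per 162 games

print(round(sd_per_game, 2), round(sd_per_season))
```

Plugging in 3% or 4% instead of 2.2% gives the season-level spread implied by the less conservative guesses.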

The other fielding positions have one SD of less than 10 runs.

It’s possible we’ve been too harsh on catchers overall.

 


#3    Guy 2020/05/25 (Mon) @ 14:08

“It’s possible we’ve been too harsh on catchers overall.”
If you told me you had a crystal ball, and knew that 5 years from now we will have made one big change to the WAR system, my first guess would be that we discovered that it undervalues catchers in a systematic way. That could be wrong, of course, but to me it’s the single most plausible structural shortcoming.

We know from your work that catchers (at least the starters) face a systemic disadvantage as hitters, presumably due to wear and tear from their defensive load. On top of that, measuring their defense has always been a challenge. This work suggests the defensive value may be quite substantial. That’s especially the case if, as has long been thought, backup catchers are selected mainly for their defensive ability. If that’s true, and good starters like Molina and Carter can still save their teams hundreds of defensive runs, that would be enormously important. 

**

Varitek: wow that’s a big correction! Did he catch only for Pedro?


#4    dlf 2020/05/26 (Tue) @ 09:04

“Rick Ferrell (who??)
...
Al Lopez (who??..)”

Proof that Tom (properly, if I may say) pays no attention to the Hall of Fame.  Both Ferrell and Lopez are in Cooperstown.  The former is considered one of the weakest selections ever and the latter is in more for his managing than his playing.

For what it is worth, Tom’s approach matches the anecdotal reputations of both Ferrell and Lopez.  They were considered excellent fielders by the reporters of their day. For many decades, Lopez held the record for games caught until later passed by Bob Boone.  While it isn’t universally true (e.g. Jeter), players at defense-first positions rarely last long absent excellent fielding.


#5    Tangotiger 2020/05/27 (Wed) @ 21:57

Good stuff, thank you for the historical info.  Lopez having MVP votes in 7 seasons certainly supports it too.

Varitek: I should do a breakdown of him.  I should also see what happens if I change the range for the pitchers’ RA/9 to +/- 3 years.


#6    Rally 2020/07/23 (Thu) @ 22:25

Missed this thread the first time around, but somebody linked to it on BTF in a Yadier Molina thread.

In the process of trying to replicate these numbers.  The process seems very simple: look at runs and outs with the player on the field and without.

Easy enough to compare to other positions and see if the spread of catcher runs is bigger than, or in line with, the others.  Accepting that a catcher is saving 10X as many runs as my metrics on Baseball Reference show is not easy. But I suppose it is possible.

But if we find similar spreads at other positions where does that leave us? I think that would indicate we are on the road to impossible results.  If we’ve got career fielders at every position saving 800 or 1000 runs, wouldn’t the spread of team runs scored have to be vastly greater than what we observe?

So far I’ve only run 1B and C, for years 2016 to 2019. Best season by a catcher in this span was Jeff Mathis, for the Rangers last year. Hitting .158 with both OBP and SLG barely over .200, Mathis would have to be a good defender to stay out there, and we can assume he is, considering he still has a job entering his 16th season.

Mathis was the catcher for 1954 outs, while the Rangers gave up 322 runs, or 4.45 per 9 innings. With other catchers, 6.36 per 9, so Mathis saved over 150 runs.

I did not make the pitcher control in my query.  Mathis caught most of the innings for Lance Lynn and Mike Minor, so adjusting for that might bring his number down.  But it’s a chicken or egg thing there.  Both pitchers had their best MLB season, but was Mathis lucky to catch them? Or were they lucky to throw to Mathis?  That’s what we’re trying to figure out.

I did find one problem with using the geometric mean in the first base group though. In 2016, Erik Kratz played one game at first base for the Pirates. In the 8th inning with 2 out and a runner on third, he entered the game at first.  The next 5 batters got hits, and 5 runs scored before the final out was recorded.

How many of those runs was Kratz responsible for? Maybe one, there was a single to right field.  No idea if it was a play that a 1B should have made, but the other hits were all to center or left.

Using the geometric mean we have Kratz with one out and 5 runs allowed, and the team without him at 4351 outs and 753 runs.  The geometric mean of 1 and 4351 is 66, and giving him credit for 66 outs at that rate means charging him with -319 runs!

I think a better way is to cap the innings used for credit, they can be the lower of the actual outs while the player is in the field, or the geometric mean.

(aside - I think harmonic mean and geometric mean are the same thing.)

So with that change I’ve got Kratz on the field for one out and being something like -4.8 runs. Still a bit high, but at least it’s reasonable.
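
A minimal sketch of that cap in Python, using the Kratz figures quoted above. The function names are made up for illustration, and positive values mean runs saved relative to teammates.

```python
import math


def capped_weight(outs_with, outs_without):
    """Lower of the player's actual outs and the geometric mean of the two samples."""
    return min(outs_with, math.sqrt(outs_with * outs_without))


def runs_vs_mates(runs_with, outs_with, runs_without, outs_without, weight):
    """Runs saved (+) or cost (-) per out versus teammates, times the weight in outs."""
    rate_with = runs_with / outs_with
    rate_without = runs_without / outs_without
    return (rate_without - rate_with) * weight


# Kratz at 1B, 2016: 1 out and 5 runs with him; 4351 outs and 753 runs without.
uncapped = runs_vs_mates(5, 1, 753, 4351, math.sqrt(1 * 4351))
capped = runs_vs_mates(5, 1, 753, 4351, capped_weight(1, 4351))

print(round(uncapped), round(capped, 1))
```

The uncapped figure lands within a run of the -319 quoted above (which uses the rounded weight of 66), and the capped one comes out near -4.8.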

With that adjustment, Mathis is the only catcher better than +100 for a season. 27 are +50 or better. 20 are -50 or worse, and 2 lower than -100.  Call that 48 extreme values.

For 1B I’ve got 4 over +100, 18 over +50, and 26 at -50 or lower. So 44 extreme values.

Just getting started, but doesn’t look like the spread of defense for catchers is much different than for 1B.


#7    Tangotiger 2020/07/25 (Sat) @ 10:07

First off, I would say not to focus on an individual season, since an individual season would have to be heavily regressed.  Focus on careers, and then regress careers.

That said: it’s brilliant to use other positions to see the effect using runs and outs. 

I have done this in the past for other positions, looking at hits and outs (BACON or even wOBAcon). And the list was pretty good, if not “wide”, about 2X what we are used to.  I’ll find that link in a minute.


#8    Tangotiger 2020/07/25 (Sat) @ 10:19

Here’s WOWY using BACON:

http://www.insidethebook.com/ee/index.php/site/comments/best_worst_wowy_since_1993_through_age_34

So, I really like the idea of doing the same thing, using Runs per out.  We can convert BACON into runs, then compare the two.  That difference is random variation.

(Though, I should use wOBAcon, not BACON.)


#9    Rally 2020/07/25 (Sat) @ 11:13

As far as individual seasons, I was doing the harmonic mean wrong in #6.  The outs used for Kratz being on the field for 1/3 of an inning should not be 66, it should be about 2.

(1/1 + 1/4351)/2 = .500115

1/.500115 = 1.9995

So that’s not something to worry about.
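
The two means are easy to compare with Python's standard library (`statistics.harmonic_mean` and `statistics.geometric_mean`, the latter available from Python 3.8). A quick sketch using the Kratz sample sizes from above:

```python
import statistics

outs = [1, 4351]  # outs with Kratz at 1B, outs without him

hm = statistics.harmonic_mean(outs)   # 2 / (1/1 + 1/4351)
gm = statistics.geometric_mean(outs)  # sqrt(1 * 4351)

print(round(hm, 4), round(gm, 1))
```

The harmonic mean of two numbers can never exceed twice the smaller one, which is why a one-out sample gets a weight of about 2 rather than 66.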


#10    Rally 2020/07/25 (Sat) @ 13:55

I ran all positions for years 1925-2019. My numbers are a bit different since I looked at outs and runs while the player was on the field through Retrosheet.  For seasons long ago, I will miss some data where Retrosheet doesn’t have a play-by-play record.

But my catcher results are close enough to Tango’s to make me think I’ve got the code right.

Carter +1045
Yadi +951
Ausmus +878
Cochrane +824
Pena +726

Only 2 defensive players topped Carter. If I gave you a few guesses as to who it would be you’d probably get one of them.

Has to be a guy recognized as a great fielder, who played for a long time, right? So Ozzie, Brooks, Willie, somebody like that?  Or maybe someone who didn’t have a super long career, or one still active, who dominates the advanced metrics leaderboards?  Like Andruw or Andrelton.

It’s Willie Mays, +1800

But he’s only #2. You’d probably never guess who was #1, although you’ve heard of him. At +1821 runs, it’s Lou Gehrig.

Lou was 115 runs better than his backups in 2986 matched outs.  He was on the field for 47455 outs, so we multiply that number by about 16 to get his full career run impact. 

For Gary Carter, his matched inning total was about half of his actual innings.  For Mays, it was between a quarter and a third.  We have a very low matched innings total for Gehrig since he was the Iron Horse, and didn’t take much time off. Cal Ripken will have the same problem.
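
The pro-rating step for Gehrig can be sketched as below. It uses the rounded figures from this comment, so it lands near, not exactly on, the +1821 quoted above; the variable names are illustrative.

```python
matched_runs = 115   # runs better than backups over the matched sample
matched_outs = 2986  # matched outs (harmonic-mean weighted)
career_outs = 47455  # outs with Gehrig on the field

scale = career_outs / matched_outs  # about 16
career_runs = matched_runs * scale  # pro-rated career impact

print(round(scale, 1), round(career_runs))
```

The size of `scale` is the leverage being applied: the smaller the matched sample relative to the career, the more any noise in it gets multiplied.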

After the top 2, top non-catchers:

+943 LF Bob Johnson
+940 2B Frank White
+911 CF Curt Flood
+888 SS Arky Vaughan
+872 2B Willie Randolph
+863 1B John Olerud
+859 3B Brooks Robinson
+825 1B Steve Garvey
+822 1B Lu Blue
+764 CF Willie Davis
+759 3B Eric Chavez

I don’t know much about Blue or Johnson’s reputations. Vaughan is one of the greatest SS of all time, but his legend is based on his offensive ability. I thought he was considered average or a bit above on defense, but not outstanding. Garvey, of course, is Satan’s best friend.  The others have good or great defensive reputations.

At the bottom of the list is a recent SS who will come as no surprise to readers here.

-616 2B Eric Young Sr
-658 3B Ed Sprague
-729 RF Dwight Evans (big surprise, gold glover)
-746 RF Magglio Ordonez
-764 CF Richie Ashburn (another big surprise, huge range numbers but helped by his ballpark)
-857 RF Bobby Abreu
-974 SS Alexei Ramirez

Ramirez was about average between TZ and DRS. He never won a Gold Glove, but I don’t remember anyone suggesting he was bad out there. He played a bit less than half as much as Derek Jeter, so on a per-out basis he is substantially worse than Jeter.

-1133 Carlos Lee
-1322 Captain of the Yankees.

As a shortstop, Alex Rodriguez was -459 in less than half as much time as Jeter.  So maybe there wasn’t actually a good choice to make in 2004.

A big surprise is seeing Andruw Jones at -487, given how well he rates in most advanced stats. He’s right in the same range as Dave Winfield.


#11    Guy 2020/07/26 (Sun) @ 08:57

Really interesting work, Rally. There is clearly a real signal here: the top players are generally regarded as excellent fielders, and vice versa. And to the extent there are exceptions (Andruw, Evans) maybe we’ll learn something.

But as you say, the magnitudes seem implausibly high. The most obvious culprit is that WOWY applies huge leverage to relatively small “without you” samples. But it’s surprising that the potential biases in these samples (pitchers, home/away, teammate fielders, opposing hitters, parks) don’t wash out more than this over long careers. Do you have any theories about why that is?

Maybe the only answer is to use something like wOBAcon (including DP and excluding HRs if possible) for non-catchers. But runs allowed per game is a nice, clean, easy to understand metric.

Can you tell if starters are better or worse than bench players at each position?  That would be interesting to see.


#12    Tangotiger 2020/07/26 (Sun) @ 13:55

Note that the way I am doing the catchers here, I’m doing it year by year.  It works for catchers, it doesn’t work for Ichiro.

And for catchers, after I sum it up at the career level based on the harmonic mean, I then pro-rate it to their total outs.  That won’t make sense for Ichiro, who would have missed most of his games when he was old, so we would be pro-rating his old-age results to his young days.

Therefore, there are a lot of gotchas that apply to non-catchers that get exposed in a process like this.  This is why I focused on catchers.

You also have to be careful to neutralize by year, if you are using multi-year pitcher performance.

That said: there’s signal there.  You just have to be careful.

