THE BOOK--Playing The Percentages In Baseball


Tuesday, December 25, 2012

Bias in ERA

Posted at 10:45 PM

ERA is a terrible idea.  It takes something factual, the number of runs scored, and then decides which of those runs to attribute to one pitcher or another (in the case of multiple pitchers in the same inning), and then, of those attributed, decides which are “earned” and which are “unearned”.

When you take something factual, and decide to split it up in some systematic fashion, you have to worry about systematic biases.  And the larger the sample, the more the systematic bias will shine through.

Of the 1318 runs allocated to Curt Schilling, 1253 are declared as “earned” (a rate of 95%). He faced 13284 batters (in 3261 innings).  That’s a shade under .1 runs per batter faced.

Of the 1357 runs allocated to Kevin Brown, 1185 are declared as “earned” (a rate of 87%). He faced 13542 batters (in 3256 innings).  That’s an even smaller shade over .1 runs per batter faced.

Why do we have a bias?  Curt Schilling was a flyball pitcher while Kevin Brown was a groundball pitcher.  And errors are assigned to infielders far more often than they are assigned to outfielders.  The very fact that Kevin Brown allows a ball to hit the ground will likely lead to him getting less “earned” blame for anything bad that happens, as that “bad stuff” gets transferred to his fielders.  But Kevin Brown had zero expectation of “perfect” fielders.  A groundball allowed by Brown comes with the reality that it’ll get muffed more often than a flyball allowed by Schilling.

By allowing the scorer to decide that a pitcher gets absolved of blame in this biased manner, we perpetuate the systematic bias in the metric.  And we end up with Curt’s ERA at 3.46, and Brown at 3.28.

“Yeah, yeah, whatever”, might be your reaction.  After all, I picked the two most extreme pitchers of the current generation (true).  And their RA9 is 3.75 for Brown and 3.64 for Schilling.  ERA’s biased measure of +0.18 for Schilling becomes the unbiased -0.11 for Schilling.  So we’re talking about a +/-0.15 gap at the extreme.  True enough.  If you don’t care, you don’t care.  At least be aware of the issue, so you know enough to discard it as being mostly irrelevant.
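The arithmetic behind those numbers is easy to check. Here is a minimal sketch that recomputes ERA and RA9 from the career totals quoted above; the innings are treated as whole numbers exactly as quoted, and the names (`per_nine`, `pitchers`, `summary`) are just illustrative.

```python
def per_nine(runs, innings):
    """Runs allowed per nine innings pitched."""
    return 9 * runs / innings

# Career totals as quoted in the post:
# (runs, earned runs, innings, batters faced)
pitchers = {
    "Schilling": (1318, 1253, 3261, 13284),
    "Brown": (1357, 1185, 3256, 13542),
}

summary = {}
for name, (runs, earned, innings, batters) in pitchers.items():
    summary[name] = {
        "earned_share": earned / runs,        # fraction of runs scored as "earned"
        "runs_per_batter": runs / batters,    # all runs, earned or not
        "era": per_nine(earned, innings),     # earned runs only
        "ra9": per_nine(runs, innings),       # all runs
    }

# Schilling minus Brown, under each measure
era_gap = summary["Schilling"]["era"] - summary["Brown"]["era"]
ra9_gap = summary["Schilling"]["ra9"] - summary["Brown"]["ra9"]

for name, s in summary.items():
    print(f"{name}: ERA {s['era']:.2f}, RA9 {s['ra9']:.2f}, "
          f"earned share {s['earned_share']:.0%}")
print(f"Schilling minus Brown: ERA {era_gap:+.2f}, RA9 {ra9_gap:+.2f}")
```

The sign of the gap flips between the two measures, which is the whole point: the only thing that changed is whether the scorer's earned/unearned decisions are included.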

The other issue is the mid-inning pitching change, where runs are allocated only if the following pitcher lets those runners score.  So, the run gets counted entirely to the initial pitcher only if the following pitcher allows those runs to score.  The reality is that we have a SHARED responsibility, but baseball record-keeping is fixated on giving entire credit to one player or another, be it runs allowed or games “won”.  To reflect the reality of shared responsibility, we charge the pitcher who leaves the game with runners on base a portion of a run for each of those runners, whether they scored or not.  And we give the relieving pitcher the remaining positive amount of the run if the runner scored, and a NEGATIVE run amount if the runner was left on base (so that at the team level, it all adds up for that inning).
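To make the bookkeeping concrete, here is a minimal sketch of that split for a single runner. The function name `split_runner` and the 0.25 share are assumptions for illustration only; in practice the share would come from how often a runner scores from that base-out situation.

```python
def split_runner(expected_share, scored):
    """Split one left-on-base runner's run between two pitchers.

    expected_share: assumed chance (0 to 1) that the runner scores from
        the situation at the moment of the pitching change.
    scored: whether the runner actually came around to score.
    Returns (charge to departing pitcher, charge to relieving pitcher).
    """
    if not 0.0 <= expected_share <= 1.0:
        raise ValueError("expected_share must be between 0 and 1")
    departing = expected_share                      # charged no matter what happens next
    actual = 1.0 if scored else 0.0
    reliever = actual - expected_share              # positive if he scores, negative if stranded
    return departing, reliever

# Hypothetical runner with a 25% chance of scoring at the time of the change.
print(split_runner(0.25, scored=True))    # reliever absorbs the rest of the run
print(split_runner(0.25, scored=False))   # reliever gets credit for stranding him
```

Either way the two charges sum to the run that actually scored (1 or 0), so the team total for the inning is unchanged.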

Be aware of the issues, and then decide if it’s relevant enough for you.


#1    Nick      (see all posts) 2012/12/26 (Wed) @ 07:31

How would you suggest properly accounting for the mid inning change?


#2    pm      (see all posts) 2012/12/26 (Wed) @ 08:46

“The other issue is the mid-inning pitching change, where runs are allocated only if the following pitcher lets those runners score.  So, the run gets counted entirely to the initial pitcher only if the following pitcher allows those runs to score.”

If you assumed that their relievers didn’t allow any of their inherited runners to score, Schilling would have a 3.36 career ERA and Brown would have a 3.02 ERA. Schilling’s relievers allowed 32% of the inherited runners to score while Brown’s relievers allowed 37%.

But the interesting thing is that Brown left a lot more runners on base mid-inning than Schilling. Brown left 17.1 runners per 33 starts while Schilling left 8.6 per 33 starts. And the crazy thing is that the inherited runner stat includes relief appearances, of which Schilling has 123 more in his career.

Brown had 10 seasons where he left 10+ runners on base during his starts (4 with 25+ runners) while Schilling has just 2 seasons with 10+ runners. There is some value in being able to complete your starts, which is why Schilling averaged 7.1 innings/start while Brown was at 6.8.


#3    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/12/26 (Wed) @ 09:24

I’m not sure ERA is worth fixing but…

“How would you suggest properly accounting for the mid inning change”

the easiest rational fix is to give half the run to the pitcher who put the batter on base, and half to the pitcher that gave up the run. Or even share the run among every pitcher who allowed the batter to advance: if pitcher A walks him, pitcher B lets him steal a base and pitcher C gives up a single that scores him, assign a third of the run to each.

I assume this was never done because scorekeepers and statisticians didn’t want to record and track fractional runs, the newspapers didn’t want to print them and the math was a lot harder before iPhones (ha! thought I was going to say computers, didn’t you!).
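The equal-share idea above could be sketched like this. It assumes each pitcher counts once no matter how many bases he let the runner take, which is one reading of the suggestion rather than something spelled out in it; `share_run` is a made-up name.

```python
from fractions import Fraction

def share_run(pitchers_involved):
    """Split one run equally among the pitchers who put the runner on
    base, let him advance, or let him score.

    pitchers_involved: pitcher names in the order they were on the mound
        for each step of the runner's trip around the bases.
    Assumption: a pitcher responsible for several steps still gets
    only one equal share.
    """
    distinct = list(dict.fromkeys(pitchers_involved))  # dedupe, keep order
    if not distinct:
        raise ValueError("at least one pitcher is required")
    share = Fraction(1, len(distinct))
    return {pitcher: share for pitcher in distinct}

# A walks him, B lets him steal second, C gives up the single that scores him.
print(share_run(["A", "B", "C"]))   # a third of the run each
# A puts him on base, C gives up the run.
print(share_run(["A", "C"]))        # half each
```

Using `Fraction` keeps the thirds exact, so the shares always add back up to one whole run.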


#4    Tangotiger      (see all posts) 2012/12/26 (Wed) @ 12:20

You assign the bequeathed runners along the lines of the run expectancy matrix, regardless of whether the run eventually scores or not.

For the inherited runners, it’s the difference between the runner actually scoring (or not), and the bequeathed run value.

Best captured by RE24 (though RE24 does also include a park adjustment).
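For anyone who wants to see the mechanics, here is a rough sketch of an RE24-style tally for one inning with a mid-inning pitching change. Each event is charged the change in run expectancy plus any runs that scored on it. The run-expectancy numbers are made up for illustration (not from any published table), there is no park adjustment, and the sign convention (positive is bad for the pitcher) is a choice made for this example.

```python
# Illustrative run-expectancy values only, keyed by (runners, outs).
# ("END", 3) marks the end of the inning.
RUN_EXPECTANCY = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("12-", 0): 1.50,
    ("12-", 1): 0.95,
    ("-2-", 1): 0.70,
    ("-2-", 2): 0.35,
    ("END", 3): 0.00,
}

def re24_by_pitcher(events):
    """Sum (RE after - RE before + runs scored) per pitcher.

    events: iterable of (pitcher, state_before, state_after, runs).
    """
    totals = {}
    for pitcher, before, after, runs in events:
        delta = RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs
        totals[pitcher] = totals.get(pitcher, 0.0) + delta
    return totals

# Hypothetical inning: starter A leaves two on with one out, reliever B
# gives up a double that scores both, then gets the last two outs.
inning = [
    ("A", ("---", 0), ("1--", 0), 0),  # walk
    ("A", ("1--", 0), ("12-", 0), 0),  # single
    ("A", ("12-", 0), ("12-", 1), 0),  # strikeout, then A is pulled
    ("B", ("12-", 1), ("-2-", 1), 2),  # double, both inherited runners score
    ("B", ("-2-", 1), ("-2-", 2), 0),  # out
    ("B", ("-2-", 2), ("END", 3), 0),  # out
]

totals = re24_by_pitcher(inning)
for pitcher, value in totals.items():
    print(f"{pitcher}: {value:+.2f} runs vs. expectation")
```

Traditional scoring would charge both runs to A and none to B. In this sketch both pitchers share the blame, and the two totals add up to the two runs scored minus the 0.50 expected at the start of the inning, since RE24 measures runs relative to expectation rather than raw runs.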


#5    DavidJ      (see all posts) 2012/12/26 (Wed) @ 12:28

Even more basic and obvious than the groundball pitcher vs. flyball pitcher bias is the contact pitcher vs. strikeout pitcher bias, no? Even with identical GB/FB rates, the pitcher with the lower K rate (everything else being equal) would be expected to be charged with fewer earned runs, simply because an error can only happen if the ball gets put in play in the first place (well, except for the occasional passed-ball third strike, I guess). So, Brown’s ERA (relative to Schilling’s) not only benefits from the GB/FB bias, but also from the contact/strikeout bias. Right?


#6    Jeff P      (see all posts) 2012/12/26 (Wed) @ 12:59

Assigning values for bequeathed runners based on run expectancy makes a lot of sense. It would be interesting to do something similar with starters and wins/win expectancy. What’s the win probability when a given starter leaves a game? If the pitcher always got complete game wins, they would have a 1.00.


#7    Tangotiger      (see all posts) 2012/12/26 (Wed) @ 23:15

David: excellent point!

Jeff: you can see RE24 on Fangraphs and BR.com (though note that they use a park-based RE24 matrix).


#8    mkd      (see all posts) 2012/12/30 (Sun) @ 11:13

I was just doing a thing on how much Jim Palmer was helped by the all-world Oriole defenses playing behind him and started comparing ERAs to RA9s. The idea is that you can roughly estimate how good a pitcher’s defense was by looking at the difference between the two numbers: the smaller the difference, the better the defense.

Of the Top 200 rWAR pitchers the largest splits are all early 20th century pitchers: George Mullin (1.17), Noodles Hahn (1.14), Jack Powell (1.02), Rube Waddell (1.01), Joe McGinnity (0.98). This makes sense because there were a ton more errors in the old days. Jim Palmer is 180th of 200 with a split of 0.32 (2.86 ERA, 3.18 RA9). I would argue that this speaks to the insanely high quality of the 1970s Orioles defense.

I bring all this up because Curt Schilling is 200th of 200 with a split of just 0.178 (3.46 ERA, 3.64 RA9). Whatever the reasons (strikeout pitcher, flyball pitcher, etc.), there were not a lot of errors being committed behind Curt Schilling.


#9    dc Roach      (see all posts) 2013/01/04 (Fri) @ 02:30

I started tackling this mid inning pitching change issue last year, and just calculated the adjusted ERA numbers for the 2012 Nats pitching staff yesterday. (I looked up the run expectancy tables for a commenter and started reading the blog which led me here!)

http://www.federalbaseball.com/2013/1/3/3830146/e-era-2012-adjusting-for-inherited-runs

Critiques on my approach would be more than welcome! I believe I used the 1950-1968 numbers because they were the best thing I was able to find when I did the original calculations in 2011.


#10    Tangotiger      (see all posts) 2013/01/04 (Fri) @ 02:55

Run expectancy matrices are hugely driven by the runs per inning environment, with some “shaping” based on the kind of run environment (HR-heavy, BB-heavy, etc).  By definition, the value in the bases empty, 0 outs state must match the run environment exactly.  So, that sets the whole thing off.  Then the other 23 entries work off that (and each other), with some “shaping” based on the frequency of HR, walks, etc.


#11    Tangotiger      (see all posts) 2013/01/04 (Fri) @ 02:55

For those who don’t know:

http://www.tangotiger.net/markov.html

