Tuesday, December 25, 2012
Bias in ERA
ERA is a terrible idea. It takes something factual, the number of runs scored, decides which of those runs to attribute to one pitcher or another (when multiple pitchers work the same inning), and then decides which of the attributed runs are “earned” and which are “unearned”.
When you take something factual, and decide to split it up in some systematic fashion, you have to worry about systematic biases. And the larger the sample, the more the systematic bias will shine through.
Of the 1318 runs allocated to Curt Schilling, 1253 are declared as “earned” (a rate of 95%). He faced 13284 batters (in 3261 innings). That’s a shade under .1 runs per batter faced.
Of the 1357 runs allocated to Kevin Brown, 1185 are declared as “earned” (a rate of 87%). He faced 13542 batters (in 3256 innings). That’s an even smaller shade over .1 runs per batter faced.
Why do we have a bias? Curt Schilling was a flyball pitcher while Kevin Brown was a groundball pitcher, and errors are assigned to infielders far more often than to outfielders. The very fact that Kevin Brown allows a ball to hit the ground means he will likely get less “earned” blame for anything bad that happens, because that “bad stuff” gets transferred to his fielders. But Kevin Brown had zero expectation of “perfect” fielders. Allowing a groundball comes with the reality that it’ll get muffed more often than a flyball allowed by Schilling.
By allowing the scorer to absolve a pitcher of blame in this biased manner, we perpetuate the systematic bias in the metric. And we end up with Schilling’s ERA at 3.46 and Brown’s at 3.28.
“Yeah, yeah, whatever”, might be your reaction. After all, I picked the two most extreme pitchers of the current generation (true). And their RA9 is 3.75 for Brown and 3.64 for Schilling. ERA’s biased measure of +0.18 for Schilling becomes the unbiased -0.11 for Schilling. So we’re talking about a +/-0.15 gap at the extreme. True enough. If you don’t care, you don’t care. At least be aware of the issue, so you know enough to discard it as being mostly irrelevant.
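For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the rates above from the career totals quoted in this post. It assumes the innings figures are whole innings (fractional thirds are ignored), and the names `pitchers`, `per_nine`, and `summarize` are mine, chosen only for illustration.

```python
# Career totals as quoted in the post.
# Assumption: innings are treated as whole numbers (no partial innings).
pitchers = {
    "Schilling": {"runs": 1318, "earned": 1253, "innings": 3261, "batters": 13284},
    "Brown": {"runs": 1357, "earned": 1185, "innings": 3256, "batters": 13542},
}


def per_nine(runs, innings):
    """Scale a run total to a per-nine-innings rate."""
    return 9 * runs / innings


def summarize(p):
    """Return the rates discussed in the post for one pitcher."""
    return {
        "earned_share": p["earned"] / p["runs"],       # share of runs ruled "earned"
        "runs_per_batter": p["runs"] / p["batters"],   # all runs per batter faced
        "era": per_nine(p["earned"], p["innings"]),    # earned runs only
        "ra9": per_nine(p["runs"], p["innings"]),      # all runs
    }


summary = {name: summarize(p) for name, p in pitchers.items()}

# Positive gap means Schilling looks worse than Brown on that metric.
era_gap = summary["Schilling"]["era"] - summary["Brown"]["era"]
ra9_gap = summary["Schilling"]["ra9"] - summary["Brown"]["ra9"]

for name, s in summary.items():
    print(f"{name}: earned share {s['earned_share']:.0%}, "
          f"ERA {s['era']:.2f}, RA9 {s['ra9']:.2f}")
print(f"ERA gap {era_gap:+.2f}, RA9 gap {ra9_gap:+.2f}")
```

The sign flip between the two gaps is the whole point: the ordering of the two pitchers depends on whether the scorer's earned/unearned rulings are included.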
The other issue is the mid-inning pitching change, where the runners left on base are charged to the departing pitcher only if a following pitcher lets them score, and in that case the run is charged entirely to the departing pitcher. The reality is that we have a SHARED responsibility, but baseball record-keeping is fixated on giving entire credit to one player or another, be it runs allowed or games “won”. To reflect the reality of shared responsibility, we charge the pitcher who leaves the game with runners on base a portion of a run for each of those runners, whether they scored or not. And we give the relieving pitcher the remaining positive amount of the run if the runner scored, and a NEGATIVE run amount if the runner was left on base (so that at the team level, it all adds up for that inning).
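One way to sketch that shared-responsibility split: charge the departing pitcher the chance that the runner scores, and give the reliever whatever is left over. The function name `split_inherited_runner` and the 0.40 scoring probability below are hypothetical, made up purely for illustration. A real version would need scoring chances by base and number of outs, which this post does not supply.

```python
def split_inherited_runner(score_prob, scored):
    """Split responsibility for one inherited runner.

    score_prob: assumed chance (0 to 1) that the runner scores from the
                base/out situation at the pitching change (hypothetical input).
    scored:     True if the runner actually came around to score.

    Returns (departing_share, reliever_share) in runs. The two shares
    always sum to the actual runs scored by that runner: 1 or 0.
    """
    if not 0.0 <= score_prob <= 1.0:
        raise ValueError("score_prob must be between 0 and 1")
    actual_runs = 1.0 if scored else 0.0
    departing_share = score_prob                # charged whether he scores or not
    reliever_share = actual_runs - score_prob   # positive if scored, negative if stranded
    return departing_share, reliever_share


# Illustrative only: 0.40 is a made-up scoring chance, not a measured one.
print(split_inherited_runner(0.40, scored=True))
print(split_inherited_runner(0.40, scored=False))
```

If the runner scores, the departing pitcher is charged 0.40 of the run and the reliever the remaining 0.60. If the runner is stranded, the departing pitcher is still charged 0.40 and the reliever gets -0.40, so the inning total stays at zero.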
Be aware of the issues, and then decide if it’s relevant enough for you.
How would you suggest properly accounting for the mid-inning change?