Sunday, August 23, 2015
DRA component run values
Jonathan was kind enough to send me a link to their latest breakdown, one that we asked for in some form. I'm writing this as I'm looking at the data, so, I don't know what we're going to find. The focus will be on Pedro 1999 v 2000.
In 2000, Pedro is noted as being 56 runs better than average. The main components that number is adjusted for add up to +4 runs, meaning that Pedro's performance in 2000 was +52 runs, with about 4 more runs uncovered via context.
Now, let's look at 1999, where he's noted as "only" 40 runs better than average, and his context was -1 runs. That means he's +41 runs before context, and +40 runs once context is considered.
So, the pre-context difference for Pedro was 11 runs (+52 minus +41), with the advantage going to the 2000 season. That the 2000 season is ahead of the 1999 season suggests that the central component is Runs Allowed.
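The arithmetic above can be written out as a quick check. This is only a sketch: the function name `pre_context_runs` is made up for illustration, and it assumes the context figure is simply subtracted from the published runs-above-average total, which is how the breakdown is being read here.

```python
def pre_context_runs(runs_above_average, context_runs):
    """Back out the context adjustment from the published total.

    Assumption: published total = pre-context performance + context runs.
    """
    return runs_above_average - context_runs


# Figures quoted in the post for Pedro.
pedro_2000 = pre_context_runs(56, 4)    # 52
pedro_1999 = pre_context_runs(40, -1)   # 41

# Pre-context gap between the two seasons, in favor of 2000.
gap = pedro_2000 - pedro_1999           # 11

print(pedro_2000, pedro_1999, gap)
```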
However, each of those numbers is about 25 runs lower than would be suggested by Runs Allowed. This would ALSO suggest that there's a regression toward the mean built in to all the runs allowed figures. This is interesting, because this is what should happen!
That is, assume that ALL PERFORMANCE was luck. Given 200 IP, some pitchers will allow 3 runs per game and others will allow 5 runs per game. Just by luck, because that's the assumption. Do we want to represent that as +1 runs per game better for one pitcher and -1 runs for the other pitcher, just because they happened to be on the mound when luck happened? Or do we just show them as 0, just like we would if the pitchers were actual pitching machines?
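The all-luck thought experiment can be sketched as a small simulation. Everything here is an assumption for illustration: the per-inning run distribution in `WEIGHTS` is made up (it works out to about 4.1 runs per 9 innings), and every simulated pitcher draws from that same distribution, so any spread in the results is luck by construction.

```python
import random
import statistics

# Hypothetical distribution of runs allowed in a single inning.
# These weights are invented for illustration, not taken from real data.
RUNS = [0, 1, 2, 3, 4]
WEIGHTS = [0.73, 0.15, 0.07, 0.03, 0.02]


def simulate_ra9(rng, innings=200):
    """Simulate one pitcher-season and return runs allowed per 9 innings."""
    runs = sum(rng.choices(RUNS, weights=WEIGHTS, k=innings))
    return 9 * runs / innings


rng = random.Random(2015)  # fixed seed so the run is repeatable
seasons = [simulate_ra9(rng) for _ in range(1000)]

# Identical "talent" for everyone, yet the seasons are spread out.
print("lowest RA9: ", round(min(seasons), 2))
print("average RA9:", round(statistics.mean(seasons), 2))
print("highest RA9:", round(max(seasons), 2))
print("std dev:    ", round(statistics.pstdev(seasons), 2))
```

Since every simulated pitcher is the same pitching machine, the sensible rating for each of them is 0 relative to average, no matter where the season happened to land.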
But, if we assume that it's MOSTLY skill, we still want to regress PARTLY. And this is what DRA is doing, it seems. A vote for DRA is a vote for regressing the observed results by removing the portion that is luck-based.
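Partial regression can be sketched the same way. This is not DRA's actual method, which isn't spelled out here; it's the generic shrink-toward-the-mean idea, with a hypothetical `skill_share` parameter standing in for the fraction of the observed result that is kept.

```python
def regress_toward_mean(observed_runs, skill_share, mean_runs=0.0):
    """Keep only the skill portion of an observed runs-above-average figure.

    skill_share is a hypothetical knob: 0.0 means all luck (show everyone
    as average), 1.0 means all skill (no regression at all).
    """
    if not 0.0 <= skill_share <= 1.0:
        raise ValueError("skill_share must be between 0 and 1")
    return mean_runs + skill_share * (observed_runs - mean_runs)


observed = 20.0  # illustrative observed runs better than average

all_luck = regress_toward_mean(observed, 0.0)        # 0.0
mostly_skill = regress_toward_mean(observed, 0.75)   # 15.0
all_skill = regress_toward_mean(observed, 1.0)       # 20.0

print(all_luck, mostly_skill, all_skill)
```

The whole argument then comes down to what value of `skill_share` is appropriate, which is the too-much-or-not-enough question.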
The question therefore, since we've gone down the regression rabbit hole, is whether there's too much regression, or not enough. That's the first thing you need to do to evaluate DRA, since this seems to be the primary driver of the results.
So, I would therefore ask Jonathan and crew to also include the luck values in runs as another component to display, to make it very clear what is going on here.