BaseRuns - Addendum
Some data and formulas used in my BaseRuns article
By Tangotiger
1974-1990, all data generated from Retrosheet and Ray Kerby
A couple of quick points as to why I did certain things. I tried to weigh the merits of not breaking up "official stats" like strikeouts, so that we can apply these measures historically. However, some "K" are also safe plays. That is rare enough that we can ignore it. But then I run into the same problem with the sac bunt, which also has safe plays. This is why you see that annoying ".08" in the baserunner field. I also have a problem with partial innings, and with the case where runners are left on base. So, I introduced "implied outs", which is the number of outs "left in the game", to balance things out. The "lwts" values and the "lwts_rc" values are identical except for the outs. However, you can get outs with a single, which is why the two numbers are not exactly the same. Under the Retrosheet scoring system of events, anything that happens following a single is credited to the single. Even a double-steal counts as "1" in the SB field.
Anyway, there are a lot of technical details that don't amount to a hill of beans, but I had to make certain assumptions and decisions so that everything added up. I'm not sure that all of the decisions are correct, and I suspect that I may change some of this in the future.
"Freq" is how often that event occurred. If you multiply the "freq" by the "lwts_rc", you get the total number of runs scored. If you multiply "freq" by "lwts", you get zero (or pretty close to it).
The BaseRuns formula follows the true definition of scoring runs: BaseRunners x scoreRate + HR. BaseRunners is denoted by "A", scoreRate is "B" / ("B" + "C"), and HR is "D". Again, you can make the case that the CS removes a baserunner, except that sometimes the runner is still safe on a CS. And you can also take it from the point of view of "initial baserunners". Again, I'm not 100% sure that these decisions are the best ones.
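As a minimal sketch of the formula just described, the calculation A x B / (B + C) + D can be written in a few lines of Python. The function name `base_runs`, the zero-denominator handling, and the sample inputs are assumptions made for illustration; they are not taken from the article's data, and how A, B, C, and D are built from the event counts is not shown here.

```python
def base_runs(a: float, b: float, c: float, d: float) -> float:
    """Estimate runs as A * B / (B + C) + D.

    a: baserunners ("A")
    b, c: the two terms of the score rate, B / (B + C)
    d: home runs ("D")
    """
    denominator = b + c
    if denominator == 0:
        # Assumption: with no B or C there is no score rate to apply,
        # so only the home runs are counted.
        return float(d)
    score_rate = b / denominator
    return a * score_rate + d


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(base_runs(a=10, b=5, c=15, d=2))  # 10 * 0.25 + 2 = 4.5
```

The parentheses around B + C matter: without them, "B" / "B" + "C" would read as 1 + C rather than as a rate between 0 and 1.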