Tuesday, July 29, 2014
Regressions run wild
I like the lead-in here. But then when it gets to the regression... well, you know how I feel about using regressions as a final step. According to the results, the difference between a triple and a double is 0.62 runs, more than double the actual difference.
It's also not clear how the outs were handled. Was all the data handled at the per-game level? Since nearly every team-game has about the same number of outs, that (basically) removes the out as a variable.
The author has some good ideas when he talks about OBP, focusing on the idea of 3+ runners reaching base in an inning. That's one of the things that BaseRuns, for example, did NOT control for: the fact that you cannot average more than 3 runners left on base per inning. Obviously, at the levels we're talking about, breaking that limit is not going to happen in MLB, or really any league. But I've wanted to see a model that adheres to the constraint that there are three bases (and that manages to look nice, like BaseRuns).
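To make the three-bases point concrete, here is a minimal sketch, assuming one commonly cited simple version of BaseRuns (A = H + BB - HR, B = (1.4*TB - 0.6*H - 3*HR + 0.1*BB) * 1.02, C = outs, D = HR). The coefficients differ across published versions, and the function name and the per-inning inputs below are hypothetical, chosen only for illustration. Runners left on base fall out as A * C / (B + C), and nothing in that expression caps it at 3.

```python
def base_runs_per_inning(singles, doubles, triples, hr, bb, outs=3.0):
    """Return (runs, left_on_base) per inning under one simple BaseRuns version.

    Assumption: the coefficients are from a commonly cited basic formula,
    not necessarily the version any particular analyst uses.
    """
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr

    a = hits + bb - hr                                  # baserunners
    b = (1.4 * total_bases - 0.6 * hits - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = outs                                            # outs
    d = hr                                              # automatic runs

    score_rate = b / (b + c)          # share of baserunners who score
    runs = a * score_rate + d
    left_on_base = a * (1 - score_rate)  # equals a * c / (b + c)
    return runs, left_on_base


# Roughly league-like rates per inning (made-up inputs): LOB stays well under 3.
typical = base_runs_per_inning(0.70, 0.20, 0.02, 0.11, 0.35)

# An absurd inning of 100 walks and nothing else: LOB blows far past 3.
extreme = base_runs_per_inning(0, 0, 0, 0, 100)

print("typical runs, LOB:", typical)
print("extreme runs, LOB:", extreme)
```

At realistic rates the left-on-base figure is comfortably below three, which is why the issue never shows up in MLB data. Only at extreme inputs does the formula strand more runners than there are bases.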
So, I don't know what to make of it all. The author is motivated and willing and has good ideas. But darn, that regression upsets me. It's stuff we were talking about ten years ago and moved on from. Regressions are useful first steps, but they can't override logic.