Wednesday, November 15, 2023
History of The Marcels
Back in the early 2000s, when I started blogging heavily on baseball and hockey, I was intrigued by, then aghast at, the "forecasting" systems being offered, some for a price. They all came with a pseudo-promise of some sort or other.
It was the same story with the stock market, which I used to follow back in the 1990s. I saw an article at the time about evaluating stock predictions. And wouldn't you know it: only one of the ten brokerage houses even beat the index. Basically, nobody can predict anything, really. No one has any special insight. You throw thousands of people together, and Random Variation will simply start putting some folks ahead of others.
It's also when I learned how Mutual Funds would get above-average results: you'd have a fund company that has two similar types of funds. One will do better than the other. Guess what happens: one absorbs the assets of the other, but NOT the history. So, now you get survivorship bias: all the remaining mutual funds are above average! And then they create a NEW second one, to keep that cycle going.
This is also how they sell those free betting tips. You call some 1-800 number with three picks being offered for free. Well, they set up 8 different lines, each with a different combination of picks, which covers every possible outcome of three two-way picks. One of the lines will get all three right, and therefore 12.5% of the callers (assuming they are spread evenly across the lines) will be happy with those results, and stick with that phone line.
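To make the arithmetic concrete, here is a minimal sketch of that scheme. It assumes each pick is a simple two-way choice (labeled "home" or "away" here purely for illustration) and that callers are spread evenly across the lines; neither detail comes from a real tip service.

```python
from itertools import product

PICKS = 3  # three games, each pick assumed to be a two-way choice

# One phone line per possible combination of picks: 2**3 = 8 lines.
phone_lines = list(product(("home", "away"), repeat=PICKS))


def perfect_lines(actual_results):
    """Return the phone lines whose picks all matched the actual results."""
    return [line for line in phone_lines if line == tuple(actual_results)]


# Whatever actually happens, exactly one line got everything right.
winners = perfect_lines(["home", "away", "away"])
share_happy = len(winners) / len(phone_lines)
print(len(phone_lines), len(winners), share_happy)  # 8 1 0.125
```

No skill is involved anywhere: the one "perfect" line exists by construction, whatever the results turn out to be.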
Anyway, back to baseball. I decided to try my hand at forecasting. I started with something simple, and just used the three most recent seasons. It worked pretty well. Then I started adding more and more. And something curious happened. Each addition would help 51% of the batters and hurt 49% of the batters. No matter what I tried, other than age, nothing really stuck much. A different 51% of batters were helped each time, but there was no real bias. Each iteration was a lot of work for such little gain. So, I took a step back and decided to have as my baseline just a Naive model: last three seasons, age, and regression (toward the mean).
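For readers who want to see the shape of such a Naive model, here is a rough sketch with the three ingredients: a weighted average of the last three seasons, regression toward the league average, and an age adjustment. Every constant in it (the season weights, the amount of regression, the peak age, the per-year age step) is a placeholder assumption for illustration, not a value taken from this post, and the names `Season` and `naive_forecast` are made up here.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from this post.
SEASON_WEIGHTS = (5, 4, 3)  # most recent season weighted heaviest
REGRESSION_PA = 1200        # plate appearances of league-average play mixed in
PEAK_AGE = 29               # age at which no adjustment is applied
AGE_STEP = 0.005            # fractional change per year away from PEAK_AGE


@dataclass
class Season:
    pa: int      # plate appearances
    rate: float  # any per-PA rate stat, e.g. on-base percentage


def naive_forecast(seasons, league_rate, age):
    """Forecast a rate stat from up to three seasons, most recent first."""
    pairs = list(zip(SEASON_WEIGHTS, seasons))
    weighted_pa = sum(w * s.pa for w, s in pairs)
    weighted_total = sum(w * s.pa * s.rate for w, s in pairs)

    # Regression: blend the player's record with league-average performance.
    regressed = (weighted_total + REGRESSION_PA * league_rate) / (
        weighted_pa + REGRESSION_PA
    )

    # Age: nudge up for players younger than peak, down for older ones.
    age_factor = 1 + (PEAK_AGE - age) * AGE_STEP
    return regressed * age_factor


history = [Season(600, 0.400), Season(600, 0.400), Season(600, 0.400)]
print(naive_forecast(history, league_rate=0.320, age=29))
```

Written this way, a player with no playing time at all comes out exactly at the league average (before the age adjustment), because the regression term is the only thing left in the blend.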
Then, I compared that to what was being published publicly, and something interesting happened: the Naive model was as good as, or better than, virtually everything out there. So, instead of trying to squeeze every little gain out of the model, I decided to publish it as-is, and call it Marcel The Monkey Forecasting System, aka The Marcels, as the most basic forecasting system anyone should expect. So, instead of trying to be the best, I'm basically saying: this is the worst (acceptable). And boy did that clear the field. If you can't beat The Marcels, then what is the value-added of your system?
And so, I published it, and kept it up for a while. In the meantime, others have implemented my model (though I haven't checked their code, so I can't confirm they are totally faithful, but I'm sure they are all excellent).
And that's how The Marcels work and came to be.
What simple changes would you make to improve it even more?
Seems as if ballpark adjustments could help? Forecasting players with no playing time as something less than average?