Banner Years
© Tangotiger
Suppose you have two players: one who had a banner year followed by three average years, and another who had three average years followed by a banner year. Which player will be better in the fifth year?
The question essentially deals with how much weight to give each of the seasonal performances, and whether recent performance is a better indicator than older performance. So, what do you think?
True Talent Level
Before we go on, we should understand what a player's seasonal performance represents.
Every player has a true talent level, which is fairly static, but might change from day to day because of changes in training, approach, or diet. Layered on top of this is the environment in which a player performs: the pitcher, the fielders, his teammates, the park, the weather, the base-out situation, the inning-score, etc, etc. And to top it off, the player only has 4 chances each day to perform (small sample size). So, a player's seasonal performance represents a combination of his talent, his environment, and luck. The more he can play, and the more his environment matches what the average player sees, the more representative his performance will be.
Unfortunately, one season's performance does not balance everything out. Rather than represent his true talent level, it is only an indicator of his true talent level. Of course, if we can increase the sample size from 500 AB to 2000 AB, things look a lot better. To do that, we need to span seasons, which brings along another implication: change in talent level from year to year. While the talent level is fairly static day-to-day, as we start expanding this into many years, we start to lose confidence as to what a player's true talent level is.
Ok, now that I got that out of the way, if we look at a group of players, many of these problems are reduced, so that we are left with the aggregate performance level being fairly representative of a group's true talent level. Let's go back to our two players. Have you decided which player would be better, if any? Have you decided how much to weight each of the banner years, if at all?
Study 1 - The Five Year Cycle
I looked at all players from 1919-2001 who over a 5 year period had at least 300 PA (this is a selection bias as well, as only players of a certain talent level will play this much, this long). Then I looked for players who had a banner year (at least 25% better than league average) followed by performing at an average level (within 10% of league average) for three straight years. Unfortunately, I only found 18 such players. I did the same for the second group, but this time I looked for the banner year at the end of the cycle. 19 players fit the bill. Here are the results, along with the group's performance in year 5:
          Banner year first   Banner year last
Year 1          133%               104%
Year 2          102%               100%
Year 3          101%               100%
Year 4          102%               131%
Year 5          102%               107%

(Stats based on Linear Weights Ratio.)
We see here that having a recent banner year is more predictive than having an old banner year. In fact, the group of players with old banner years essentially matched their recent performance, with no apparent carryover from the banner year: they were average for the last three years, and were average again.
For the group of players with a recent banner year, we see that it does have some impact. However, they kept very little of that performance level in the following year. If you were to do a simple average of the four years, you would get a predicted value of 109%. That is, without giving the recent performance any extra weight, you can predict the group's expected performance level.
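Before moving on, for anyone who wants to replicate the selection above, here is a minimal sketch of the kind of filter described. It is not Tango's actual code: the data layout (per-player seasonal records with a normalized Linear Weights Ratio and PA) and the per-season handling of the 300 PA cutoff are assumptions of the sketch.

# Hypothetical sketch of the Study 1 selection. Each record is assumed to be
# (player_id, year, lwts_ratio, pa), with lwts_ratio normalized so 1.00 = league average.

def is_banner(ratio):
    return ratio >= 1.25           # at least 25% better than league average

def is_average(ratio):
    return 0.90 <= ratio <= 1.10   # within 10% of league average

def find_strings(seasons, banner_first=True):
    # Return 5-year strings: a banner year plus three average years (in either order),
    # with a fifth year tacked on to observe.
    by_player = {}
    for pid, year, ratio, pa in seasons:
        by_player.setdefault(pid, []).append((year, ratio, pa))
    strings = []
    for pid, rows in by_player.items():
        rows.sort()
        for i in range(len(rows) - 4):
            window = rows[i:i + 5]
            years = [y for y, _, _ in window]
            if years != list(range(years[0], years[0] + 5)):
                continue                               # require consecutive seasons
            if any(pa < 300 for _, _, pa in window):
                continue                               # playing-time cutoff, applied per season here
            r = [ratio for _, ratio, _ in window]
            if banner_first:
                ok = is_banner(r[0]) and all(is_average(x) for x in r[1:4])
            else:
                ok = all(is_average(x) for x in r[0:3]) and is_banner(r[3])
            if ok:
                strings.append((pid, years[0], r))
    return strings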
Study 2 - The Four Year Cycle
Seeing that the first year didn't have any predictive value for the first group, let's redo our study, this time looking at a 4 year cycle instead of 5 years. (This will also allow us to increase our sample size.) There were 61 players in the first group and 61 in the second. Here are the results, along with the group's performance in year 4:
          Banner year first   Banner year last
Year 1          1.32               1.02
Year 2          1.02               1.00
Year 3          1.02               1.34
Year 4          1.09               1.13
We see here that a 3-year sample should suffice (if we assume that the small sample size of the first study is not an issue). We also see that a recent banner year has more of an effect than an old banner year. However, there is still not that much of a difference.
Study 3 - The Stars
Let's look at one last group of players, before we get too far ahead of ourselves: the stars. There are 655 strings of players who were at least 25% better than league average for three years running. (We might have an issue here as players like Hank Aaron have many strings of 3 great years. I'm not sure how to resolve this, nor if we have to.) Here are the results, along with the group's performance in year 4:
Year 1   1.49
Year 2   1.49
Year 3   1.49
Year 4   1.42
Interesting. Even though our group of stars was 49% above league average for three straight years, their next year averaged 42% above. What this implies is that these players were probably slightly lucky for three straight years, and that their true talent level is actually 42% above average. These players retained 86% of their value above average, meaning that they regressed 14% towards the mean.
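To make the arithmetic explicit: the excess over average shrinks from 49% to 42%, and 42/49 = 0.857, or about 86% retained. Equivalently, 1.49 - 0.14 * (1.49 - 1.00) = 1.42, which is the observed Year 4 mark.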
Let's put it all together
Let's go back to our banner year groups. If we assume that 14% of the weight should be given to regression towards the mean, then we can use the following weights to predict next season's performance level:
24% - Year 1
24% - Year 2
38% - Year 3
14% - average
By using these weights, we can predict the Year 4 values that were produced by the last two studies. As a shorthand, you can say 1 part average, 2 parts each first two years, 3 parts third year.
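As a quick check of that claim (my arithmetic, not part of the original article), here is a small sketch that applies those weights to the Study 2 lines above:

# Apply the 24/24/38/14 weighting to the two Study 2 groups.
# 1.00 stands for the league-average mark that the 14% regresses toward.
def project_year4(y1, y2, y3, league_avg=1.00):
    return 0.24 * y1 + 0.24 * y2 + 0.38 * y3 + 0.14 * league_avg

print(round(project_year4(1.32, 1.02, 1.02), 2))   # -> 1.09, the old-banner group's Year 4
print(round(project_year4(1.02, 1.00, 1.34), 2))   # -> 1.13, the recent-banner group's Year 4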
I also see no evidence that a player's banner year, far above his previously established level, is a signal of a new, higher established level. In essence, such a player was slightly unlucky for two years, and very lucky for one year. He was a 110 player who happened to play at 100 for two years, and at 130 in the third year. While it is possible that some players do have a higher established level of performance, we cannot find it in the traditional seasonal stat lines.
In the past, I've also looked only at HR, to see whether some part of a player's talent can signal a new established level of performance, and the results match pretty much with what these new studies show. If you want to look for a new established talent level, you have to look elsewhere.
Note: My comments, as well as MGL's, were retained. Comments from the following posters were removed (though I will gladly reinsert them, if said posters would grant me permission): R.P. Coltrane, Contrarian, Walt Davis, DW, Sancho Gamwich, Mike Green, Stephen Hobbes, F. James, Kenny, Mikael, Shaun, David Smyth, Brad Wenban,
October 31, 2002 - Tangotiger
(www)
Good comments, guys. I actually meant to address these two issues, and I'm glad you brought them up.
Age: definitely should be looked at, but I can tell you that there is no big age bias, even with the 149 group. I will do the breakdown, hopefully before the end of the day.
Banner selection: one of the considerations I had was that I did not want to select players who were, say, 110-110-110-140, because of the "regression towards the mean" issue I brought up. That is, even though you've got a guy who you think is at a 110 level for three years, he might actually be 107 or 113, etc. The closer you are to 100, the more likely it is that this is a 100 player. Furthermore, by introducing all players, I get into trouble with losing players from the sample. While it is unlikely that you will lose a player from the pool who is 100-100-100, it is very possible that you might lose a player who is 80-80, thereby introducing a bias. Of course, this also depends on position.
Having said that, I was thinking about running through the data anyway, and see what happens. And with the much larger sample size this would allow me, I can select 35% above "previous 3 years" to really highlight the banner years. I'll try to get to this next week.
October 31, 2002 - Tangotiger
(www)
Walt, good comment at the end, and this is exactly what I did with the HR study I linked to. And rather than seeing a "retention", we see essentially that what the player did the year after the banner year was repeated the year after that as well.
Again, what I am talking about is not "retention", even though I used that term. We are presuming that a player's performance level is a sample of his true talent level. Therefore, by selecting 130-100-100, I am choosing those players that had a great year followed by 2 average years. This does not imply that this player had an injury or something that forced him to go down to 100. The more years I tack on, the smaller my sample size. You are correct that I can simply show year1, select on years 2 through 4 (whether 100-100-130, or 130-100-100), and then look at year 5. My guess is that year 5 production will be only slightly different than year 1 production, age notwithstanding. This is a good idea, and I will run that next week as well.
Walt: any comment about the Hank Aaron issue?
October 31, 2002 - MGL
Excellent work!
You lost me a little on this one (you have a great writing style which generally makes everything crystal clear). Either you deviated a little from such a style or I suddenly became thick. The latter is entirely possible. In any case, if you could explain the following (as if I were a 6-year-old child):
"Let's put it altogether
Let's go back to our banner year groups. If we assume that 14% of the weight should be given to regression towards the mean, then we can use the following weights to predict next season's performance level:
24% - Year 1
24% - Year 2
38% - Year 3
14% - average
By using these weights, we can predict the Year 4 values that were produced by the last two studies. As a shorthand, you can say 1 part average, 2 parts each first two years, 3 parts third year."
(Also, why the blue font? Or is that just my browser? I can't seem to highlight, in order to "copy and paste", any of the blue text!)
For those of you who question why 3 149% years followed by a 142% year is "3 lucky" years and one "more indicative of talent" year, that is exactly true! It's hard to explain why. I'll try. First of all, the 142% is exactly what we would expect after 3 years of 149%!
ANY TIME WE SAMPLE A PLAYER'S TALENT (1 YEAR, 2 YEARS, 5 YEARS) WE EXPECT THAT HIS TRUE TALENT LEVEL IS EXACTLY EQUAL TO THE SAMPLE MEAN (LWTS RATIO, OPS, OR WHATEVER) PLUS A REGRESSION TOWARDS THE MEAN OF THE POPULATION THAT PLAYER COMES FROM!
Without knowing the age, height, weight, etc, of that player, we have to assume that that player comes from a population of professional baseball players only. So the mean towards which our 3-year sample will regress is 100% (the mean normalized lwts ratio of all players, which is, by definition, 100%). And of course, the smaller the sample we have (say 2 years of 149%, as opposed to 3 years), the more we expect the next year to regress. Without doing the work, I KNOW that 2 years of 149% will be followed by something LESS than 142%.
Any given year, where we don't know a priori what the value of that year is, is expected to be neither lucky nor unlucky. That is the 142% year - simply a random (for players who already had 3 good years, of course) year whose value is unknown before we calculate it. Therefore, we neither expect it to be a lucky nor an unlucky year.
The 3 149% years ARE, BY DEFINITION, LUCKY YEARS! That is because we purposely chose players with 3 good years, relative to the league average. Any time we purposely choose good or bad years, we are, again, by definition, choosing "good (or bad) AND lucky (or unlucky)" years. That is why all good or bad years will regress towards more average ones!
Believe it or not, now that we have 3 years of 149% followed by 1 year of 142%, we have a "new" sample talent of around 146.7 (ignoring the 1st 149% year, as it is getting old). In fact, let's bump up the 146.7 to 147 because of the first 149% year. Even though this is our "new" talent sample for our player, it is still not our best "estimate" of his talent, i.e., our prediction for the next year! We still think that all 4 years have been lucky (remember, all samples above average are automatically lucky and all samples below average are automatically unlucky; how lucky or unlucky depends upon the size of the sample; here we have 4 years of above average performance - it is still lucky performance, but not by that much), so we think that our 147% projection is TOO HIGH! Again, without looking at the database, I can tell you that the 5th year will be something less than 147% - probably around 145% (maybe less, because we may start to have an age bias - the longer the sample you look at, the more likely it is that you are looking at older players).
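[Editor's illustration, not something derived in the thread: one common way to formalize "the smaller the sample, the more we expect the next year to regress" is a shrinkage estimate of the form sample + (mean - sample) * k/(n + k). The constant k stands in for how tightly true talent is clustered; k = 0.5 is simply the value that reproduces the 149-to-142 result for a 3-year sample.]

# Illustrative shrinkage toward the population mean (assumed to be 1.00 here).
# k is a stand-in constant, chosen so that 3 years of 1.49 regress to about 1.42.
def regressed_estimate(sample_mean, pop_mean, n_years, k=0.5):
    w = n_years / (n_years + k)          # weight on the observed sample
    return w * sample_mean + (1 - w) * pop_mean

for n in (1, 2, 3, 4):
    print(n, round(regressed_estimate(1.49, 1.00, n), 3))
# More years at 1.49 pull the estimate closer to 1.49; fewer years pull it toward the mean.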
All that being said, although I don't think it is a problem either, I would address the age thing - either control for age or at least include it in your results.
I would also do some park adjusting, or at least include some park info in your results. I am a little concerned that a banner year is weighted towards players who have changed home parks, so that, for example, a banner year followed by 3 average years will tend to be hitter's park followed by pitcher's park, so that in the 5th year, the player is more likely to be in a pitcher's park, whereas 3 average years followed by a banner year suggests that the player is more likely to be in a hitter's park for the 5th year. This would "screw up" the weighting system...
October 31, 2002 - Tangotiger
(www)
MGL, yes, I agree with almost everything you said. Two points:
1 - yes, not my best writing work, as I wrote it in 30 minutes, but what is it that is unclear? Was it the weighting thing at the end? It basically means that you put more weight on the most recent year, and you give some weight to regression towards the mean. Or was it something else?
2 - As for the 149,149,149, which I selected for, the 4th year was 142. However, you then say that this group is actually a "147" group. Not! Because my group is "fairly large", I would say that this selected group of 149,149,149 is a 142 player. And, if I looked at the 5th year, I would bet that this group would also exhibit 142. I would also bet that the year prior to 149 would also be 142. I would say *every* year around the 149 years would be 142. Do you agree? (Age of course is an issue if I start going crazy and start to consider pre-24 and post-36 years, etc.)
However, for a single player, if I had a 149,149,149,142 player, since I didn't select such a player, then I would have to guess that he is a 147 player.
I think we are on the same page, but I'm not sure.
*** As for parks and changing teams, etc, yes that is always a problem. It's "possible" that the park may have an influence in the selection of my players, but I doubt it. The banner year was 25% above the base years, and so, while playing at Coors does increase the chances that a player will be selected in the banner year, I don't think this is the case. I'll look into it though.
*** By the way, the more I look into this, this is just like MGL's hot/cold streak study. While he is looking at 15-day periods, I am looking at 3-year periods. We are (or will in my case) looking at the pre-selected and post-selected period, and we are (or might/will in my case) finding that those two values pretty much match, regardless of the intervening period.
October 31, 2002 - Tangotiger
(www)
First off, I'm not trying to capture ALL banner years, just some of them. As well, I am not suggesting at all that 149,149,149 is banner performance. I am using that type of player to show that a 149,149,149 player is not in fact a 149 player but a 142 player.
So, when you look at a 130,100,100 player, a player that certainly had a banner year, we should treat the 130 with some hesitation, since, as we've seen, this performance was "lucky" in some respect.
****
Anyway, I've re-run it, so that we have "x", 149,149,149, "y". That is, how did the players with 3 great years do just before the "banner 3 years", and just after? Here are the results:
Year 1   1.42
Year 2   1.49
Year 3   1.49
Year 4   1.49
Year 5   1.41
This population of players had 593 strings.
Now, if we break it down by age (in Year 5), this is what we get:
Age     Year 1   Year 2   Year 3   Year 4   Year 5     n
34+      1.46     1.51     1.51     1.48     1.37     173
30-33    1.44     1.49     1.49     1.47     1.41     229
29-      1.36     1.46     1.49     1.51     1.45     191

Again, as you can see, the "3 selected years" were pretty constant around that 149 level. The before/after years are consistent with the age grouping. But, in all cases, the before year was less than the selected period, even for the old guys.
There is also about an annualized 2% change in performance level between year 1 and year 5, which is also consistent with my findings in aging patterns previously done.
So, the "true talent level" is year 1 and year 5, and everything in-between is "lucky".
October 31, 2002 - MGL
Stephen, there is no flaw in my argument! All players (as a group, NOT every single player - heck, some 149/149/149 players are actually 155 players!) will regress towards the mean, because...
Some of those players are actually 150 (let's ignore the exactly-149 players) or better players and got UNLUCKY, and some of those players are less than 149 players and got lucky. Those are the only two choices. The chances that any given player is less than a 149 player and got lucky are MUCH higher than the chances that he is better than a 149 player and got unlucky, simply because there are many, many more sub-149 players.
The chances that a random 149/149/149 player is actually a sub-149 player who got lucky, as compared to an above-149 player who got unlucky, go down as the sample size of the 149 group goes up (for example, 149/149/149/149). However, the upper limit (when the sample size is infinite) of the real talent of a sample group of players who have a sample performance of 149 is 149! It can never be higher and it can never be exactly 149. It must be lower! That is why all samples of performance above or below average will ALWAYS regress towards the mean of the population they come from. This is not my argument or opinion. It is a mathematical certainty, based on the "upper limit theorem" I describe above.
The only caveat is the definition of the "population they come from". In Tango's study, he looked at ALL players. Any player who had a 149/149/149 period qualified. Yes, this group is comprised of mostly good players (140, 135, 155, etc.), but they still come from a population of all ML players (technically, all ML players who have played full time for 3 consecutive years, which is probably an above-average population, so they will not regress towards 100%, but maybe 110%). Now if we only look at first basemen or players over 6 feet tall, then the number towards which we regress will change...
David, your writing is excellent (isn't that what I said?). I just got hung up on the part I quoted in my last post. Could you re-explain that part please? As I also said, it may just be me being thick. Why did you and DS claim that I said that you didn't write the article well? That's an example of only telling part of the story and thereby distorting the truth (like politicians and commentators do all the time in order to prove a point). I know you guys didn't intentionally do that, but it is a bugaboo of mine...
October 31, 2002 - Tangotiger
(www)
MGL, sorry for the bugaboo.
To go back to your question, let me amplify. The 149 performance is regressed 14% towards the mean to match the "expected probable" true talent level of 142. So, generally speaking, we should regress all 3 year performances by 14%.
Now, of the three remaining components (year x, year x-1, year x-2), we weight the most recent season (x) at 38%, and the other two at 24% each.
As a shorthand, rather than remembering kooky percentages, you can apply integer weights of "3" for "x", "2" for "x-1" and "x-2", and "1" for "mean". Maybe I should have skipped this part, as it's probably more confusing than it should be.
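[For example (editor's arithmetic, using the Study 2 lines above): with the 3-2-2-1 weights, the old-banner group projects to (2x132 + 2x102 + 3x102 + 1x100) / 8 = 109.25, and the recent-banner group to (2x102 + 2x100 + 3x134 + 1x100) / 8 = 113.25, matching the observed Year 4 values of 1.09 and 1.13.]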
November 1, 2002 - MGL
I hope you guys can indulge a stats novice for a minute.
It's been a few years since I took true score theory, but from what I remember, outcomes are a function of true score plus measurement error. So, in other words, in some book in heaven somewhere, it may be written that Barry Bonds is a .320 hitter. Everything else represents measurement error. I think I understand regression to the mean. If the average baseball player hits .280, then we would expect Barry Bonds to follow a "true score" season with something less than .320. If I were a betting man, I would go with that.
You actually have a nice handle on what's going on! Basically, any player who hits better or worse than average over any period of time is "expected" to hit closer to the mean than his sample hitting indicates. It's as simple as that. It is not conjecture. It is a mathematical certainty. It is a fundamental aspect of sampling theory. I think you completely understand how that works.
Given that we have a sample of a player's hitting (1 year, 5 years, whatever), that sample number is ALWAYS the "limit" of our "best estimate of his true talent", which is, of course, the same as his projection. For example, if Bonds' sample BA is .320, that is the "limit" of his true BA. Now the only thing left is to determine whether a player's sample performance, like Bonds' .320 BA, is the upper or lower limit of his true level of performance. The way we do that is simple, once we know the mean performance of the population that our player was selected from. If that mean is less than the player's sample performance, then the sample performance is the upper limit of his true talent. If it is greater, then his sample performance is the lower limit of his true talent. In practice, it is usually easy to guess whether that mean is greater or less than a player's sample performance. In some cases, however, it is not so easy.
For Bonds, if his sample BA is .320, we are pretty sure that no matter what, the mean BA of the population that he comes from is less than that, so we estimate his true BA at something less than .320. That doesn't mean that we KNOW, or that it is 100% certain, that his true BA is less than .320. That's where a lot of people are making a mistake. There is a finite chance that he is a true .320 hitter, a true .330 hitter, or even a true .250 hitter who has been enormously lucky. All these things have a finite chance of being true. It's just that if you add up all the various true BA's times the chances of their occurrence, sampling theory tells you that you get a number which is closer to the population mean than his sample BA. How much closer is completely a function of how large your sample of performance is and nothing else.
The other tricky part that gets people in trouble is "What IS the population of players that a particular player comes from, and what is the mean of that population?" After all, that is an important number, since that is the number that we need to regress to. Finding out or estimating that number can be tricky sometimes. If we pick a player from a list of all players without regard to anything other than that he has a high or low BA, or whatever we happen to be looking for, then we know that the population is ALL BATTERS. It doesn't matter that we are picking a player who has a high or low BA deliberately. There is no "selection bias" as far as the population and its mean is concerned. Remember, no matter what criteria we use to choose a player, the population that that player belongs to, for purposes of estimating a performance mean that we will regress to, is the group of players that we are selecting FROM, NOT the group of players that we think that our player belongs to (good hitters, for example)! If we pick a .320 player from a list of all ML players (or some random sample of all NL players), then that player comes from a population of ALL players and hence the population mean that we regress to is the mean of all ML players.
Now if we find out something else about that player we chose, then all of a sudden we have a different population of players, and we have to choose a different mean BA, which is not all that easy sometimes. For example, if we find out that that player is a RF'er, then all of a sudden we have a player who comes from the population of ML RF'ers and NOT all ML players. Obviously, the mean BA of all ML RF'ers is different from that of ALL ML players. Same thing if we find out that our player is tall or heavy or LH or RH or fast or slow, etc.
Anyway, for the umpteenth time, that's regression to the mean with regard to baseball players, in a nutshell, for whatever it's worth...
So if Barry Bonds hits .320 for three years in a row, his failure to regress represents luck. But why does that mean that his true score is not .320? Why can't Barry just be a lucky player who happened to hit at his true score level for three straight years?
See the explanation above. Yes, he could be a true .320 player, just like he could be a true .350 player or .280 player. It's just that the best mathematical estimate of his true BA is NOT .320, it is something less, depending upon the mean BA of his population (big, possibly steroid-laden, black, LH, RF'er who has a very nice looking swing, has a great reputation, a talented father, etc.) and how many AB's the .320 sample represents...
Whew!
November 1, 2002 - Tangotiger
(www)
Good job, MGL!
The mean of the players who played in those 5-year spans, with at least 300 PA, is 115%. Now, this may sound like a lot, but don't forget, we have a lot of repeating players in there (like Aaron).
I don't think that the regression towards the mean would be towards 115%, but I'd like to hear from the statistics-oriented fellows about their thoughts on this matter. I would guess at this point that the Aaron situation comes up, and I should identify unique players only.
November 1, 2002 - Tangotiger
(www)
Since age is an issue, and I can easily control for it, I will re-run using that.
As well, the "mean" of the players is 115%. If we look only at one age group for the 5 year period (say ages 26-30), we see that each year they average 115%. If we select any other time period like 24-28, you also get similar results. And of course, no player could possible exist more than once in each age group. Therefore, the mean is 115%.
Therefore, I should probably select players that center around 115%, and that center around the 27 age group. I'll get to this next week.
November 1, 2002 - MGL
Tango, why don't you think that a sample mean of any player you choose who has played 5 years with > 299 AB per year will regress towards 115%? It will, if that 115% is the mean of the population of all such players. Now, in order to get (estimate) the mean of that population, you cannot weight players' numbers. For example, you cannot have more than one Aaron in your group. The population mean must come from a random, non-weighted sample of all players who have played at least 5 years with 300 or more AB per year (or whatever your criteria were). So you must find all players who fit that description and give each player equal weight, even though some of those players (Aaron, for example) may have many such 5-year spans.
What numbers (let's just call it BA) do you use for those players who appear more than once (have more than one 5-year span with 300 or more AB's each year)? You would take the average BA over all years for that player, as that would represent the best estimate of that player's true BA for any 5-year period...
November 1, 2002 - MGL
BTW, I contradicted myself (in a subtle way) in my second-to-last post. I said that in order to determine what specific population a player comes from, we look at the "list of players" that we selected our player FROM. Then I went on to say that if we found out afterwards that a player was tall, we would change our population (from ALL ML players to ALL tall ML players). This appears to be a contradiction, which it is.
What I meant was that we can use any characteristic we know about our player (either before or after we chose him) to define or estimate the population of players he comes from. We cannot, however, use his BA (or whatever it is we are trying to regress) to determine what population he comes from (for example, if it is .320, we cannot say "Oh, he must be from a population of good hitters"), because that is what we are trying to ascertain in the first place (the chances that he IS a good hitter versus the chances that he is a bad or average hitter who got lucky, etc.). It's sort of analogous to the dependent and independent variables in a regression analysis. The characteristics of the player we are regressing (like height, weight, position, etc.) are all the "independent" variables and his BA (or whatever number we are trying to regress) is the dependent variable. The "independent" variables determine the population that he is from for purposes of figuring out what number we should regress to (the mean BA of that population), while the dependent variable (the sample BA of that player) CANNOT be used to make any inferences about that population (for purposes of establishing a BA to regress to)...
November 1, 2002 - Tangotiger
(www)
MGL, maybe you missed my last post, but if I only look at one 5-yr period, say ages 24-28, then of course Aaron can only exist once in this string. And, the players in this group are 115% of league average. Now, if I select some other age group, the unique players in that group are also 115%.
However, if I decide to combine the two groups, I might have two Aarons, and two Ruths, etc. I don't see why I would want to remove one of them from the groups.
I think it would be easier to keep all the age groups separate (24-28, 25-29, 26-30, etc, etc) and report on each one separately. This removes the conflicting players and addresses the Aaron issue. However, I don't see the problem in then combining these age groups afterwards, AND KEEPING the mean at 115%.
Or maybe I'm missing something?
November 2, 2002 - MGL
As far as I know, and I am no math or statistics maven (maybe slightly ahead of an ignoramus but something short of a sciolist), linear algebra is an advanced, college and graduate level, field of mathematics. So anyone who comprehends nothing more than linear algebra is indeed more advanced than I...
November 2, 2002 - Tangotiger
(www)
Contrarian: I've already admitted my shortcomings in many areas, including statistics. I've taken enough that I can follow conversations, but that's as far as I would take it. I also know enough to apply the basics. This is no news to people who've been reading me, and any of my comments should be taken like that.
I am always interested to hear from Walt Davis, and frankly I just missed his second post (the way Primer regenerates the site, there is a lag, and Walt's post got sandwiched in-between).
I have no problems with people criticizing my approach, or my comments, or anything I do. It would be nice though if you would provide an email address so that we can correspond privately, and you can elaborate further.
November 2, 2002 - MGL
Actually, I wanted to add one more thing about "regression" as it relates to projecting talent in baseball, assuming of course, that not EVERYONE is ignoring this thread now that the cat's out of the bag (that Tango and I are ignoramuses when it comes to statistics).
While the mean of a population from which a player comes determines the upper or lower limit of his true BA (from now on, when I use BA, it is simply a convenient proxy for any metric which measures talent), it isn't that useful in terms of knowing how much to regress a player's sample BA in order to estimate his true BA. In fact, it isn't necessary at all. Nor does the size of the player's sample BA tell us how much to regress, UNLESS AND UNTIL WE KNOW OTHER CHARACTERISTICS OF THE POPULATION.
What I mean by that is that there are actually 2 things that tell us exactly how much to regress a player's sample BA to determine the best estimate of his true BA, and one of them is not the mean BA of the population to which the player belongs.
One of those things IS the size of the sample BA (1 year, 4 years, etc.). The other is the DISTRIBUTION OF THE TRUE BA'S OF ALL PLAYERS IN THE POPULATION.
Once we know those 2 things, we can use a precise mathematical formula (it isn't linear algebra, I don't think) to come up with an exact number which is the best estimate for that player's true BA.
Let's back up a little. In normal sampling statistics, a player's BA over some time period would be a sample of his own true BA, and our best estimate of that player's true BA would be exactly his sample BA. So if player A had a .380 average during one month and that's all we knew about this player, regular sampling theory would say that our best estimate of his true BA was .380, and we could use the number of AB's that the .380 was based on (the sample size) to determine how "confident" we were that the .380 WAS in fact his real BA, using the standard deviation of BA, which we can compute using a binomial model, etc., etc. Most of you know that.
Now here is where we sort of veer away from a normal "experiment" in sampling statistics, when it comes to baseball players and their talent. We KNOW something about the population of all baseball players, which means, both mathematically and logically, that the .380 sample BA in one month (say 100 AB's) is not necessarily a good estimate of that player's true BA. We know logically that if a player hits .380 in a month, he is NOT a .380 hitter. The only reason we know that, however, is because we know that there is either no such thing as a .380 hitter, or at least that a .380 hitter is very rare. If in fact we knew nothing about the range of what ML baseball players usually hit, we would then HAVE TO say that our player was a .380 hitter (within a certain confidence interval, which would be around plus or minus 90 points to be 95% confident, as the SD for BA in 100 AB's is around 45 points).
So now the question, as always, is, given that our player hit .380 in 100 AB's and given that we KNOW that players rarely if ever have true BA's of .380, what IS the best estimate of our player's true BA (still within the various confidence intervals)?
Let's say the mean BA of the population of ML baseball players (for the same year as our .380 sample) is .270. According to my other posts, that is the number that we regress the .380 towards, and the number of AB's the .380 is based on (100) determines how much we regress. Well, the first part is always true (the .270 is the lower limit of our player's true BA), but the second part is only true given a certain set of characteristics of the population of baseball players. IOW, it is these characteristics that FIRST determine how much we regress the .380 towards the .270. Once we establish those characteristics, the more sample AB's we have, the less we regress.
What are those characteristics we need to determine before we can figure out how much to regress the .380 towards the .270? It is the percentage of batters in the population (ALL ML players in this case, since we know nothing about our .380 hitter other than that he is a ML player) who have various true BA's. IOW, we need to know how many ML players are true .210 hitters, how many are true .230 hitters, true .320 hitters, etc. Obviously, there is a whole continuum of true BA's among ML players, but it would suffice for this kind of analysis if we estimated the number of players in each range. Now, estimating the number of players in baseball for each range of true BA's is not easy to do, and is a little circuitous as well. The only way to do that is to look historically at players who have had a long career and assume that their lifetime BA is their true BA. Of course, even that lifetime BA would have to be regressed in order to get their true BA, so that's where the "circuitous logic" comes from - "in order to know how much to regress a sample BA, we have to find out the true BA's of ML players, and in order to find out those true BA's we have to know how much to regress a player's lifetime BA..."
We have other problems in terms of trying to figure out how many players in ML baseball have true BA's of x. For example, not many players who have true BA's of .210 have long careers, so if we only looked at long careers to establish our percentages, we might miss some types of players (those with very low true BA's). In any case, let's assume that we can come up with a fairly good table of frequencies for the true BA's of all ML players. It might look something like <.200 (.1%), .200-.220 (1%), .220-.230 (3%), ..., .300-.320 (2%), etc.
NOW we can use Bayesian (lower on the totem pole than linear algebra) probability to figure out our .380 player's true BA! The way we do that goes something like this:
What are the chances that our player is a true .200-.220 hitter (1%, if we know nothing else about this hitter other than that he is a ML player) GIVEN THAT he hit .380 in 100 AB's (much less than 1%, of course)? What are the chances that he is a .300-.320 hitter given that he hit .380, etc. (more than 2%, of course)?...
Do all the multiplication and addition (arithmetic, MUCH lower than linear algebra) and voila, we come up with an exact number (true BA) for our .380 hitter (which still has around a 90-point-either-way 95% confidence interval).
Remember that the mean BA doesn't tell us anything about how much to regress or what the final estimate of the true BA of our .380 hitter is; it only tells us the limit of the regression, and in fact, we don't even need to know that number, as in the calculations above. For example, let's say that the mean BA for all ML players were .270, as in the above, but that all ML players had the same true BA. The true BA for our .380 hitter, or ANY hitter with any sample BA in any number of AB's, would be .270. Let's say that 1% of all ML players had true BA's of .380 and 99% had true BA's of .290. What would our .380 player's true BA be?
It is either .380 or .290, so it's not really a "fair" question. We could answer it in 2 ways. One would be that "there is an X percent chance that he is a .290 hitter (who got lucky in 100 AB's) and a Y percent chance that he is a .380 hitter (who hit what he was "supposed to" in 100 AB's)". The other answer is that he is a .zzz hitter, where the .zzz is X percent times .290 plus Y percent times .380, divided by 100. Here's how we would do that calculation:
The chances of a .290 hitter hitting .380 or better are .025 (.380 is 2 standard deviations above .290 for 100 AB's). The chances of a .380 hitter hitting .380 or better are .5. So if we didn't know anything about the frequencies of .290 or .380 hitters in our population, our player is 20 times more likely to be a .380 hitter than a .290 hitter (.5/.025), or he has a 95.24% chance of being a .380 hitter. But since 99% of all players are .290 hitters and only 1% are .380 hitters, we now have the chances that our player is a .380 hitter at 20%, rather than the initial 1%. So we can say that our hitter has a 17% chance of being a .380 hitter and an 83% chance of being a .290 hitter, or we can say that our hitter is a .305 hitter. We get the chance of our hitter being a .380 hitter by the following Bayesian formula: The ratio of the chance of being a .290 hitter who hit .380 or better (.99 times .025, or .02475) to the chance of being a .380 hitter who hit .380 or better (.01 times .5, or .005) is 4.95 to 1. That means that it is 4.95 times more likely that our .380 hitter is a true .290 hitter who got lucky, so the chances of our hitter being a .290 hitter are .8319, and hence .1681 for being a .380 hitter.
That same above Bayesian calculation would apply for any number of categories of true BA's in the population and the percentage of players in each category.
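[Here is a minimal sketch of that calculation (editor's code, not MGL's). It uses a normal approximation to the binomial for the tail probabilities, the same shortcut MGL uses above; the two-category prior (99% true .290 hitters, 1% true .380 hitters) is taken straight from his example.]

import math

def prob_at_least(observed_ba, true_ba, ab):
    # P(sample BA >= observed | true BA), normal approximation to the binomial.
    sd = math.sqrt(true_ba * (1 - true_ba) / ab)
    z = (observed_ba - true_ba) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))    # upper-tail probability

def posterior_true_ba(observed_ba, ab, prior):
    # prior: dict mapping candidate true BA -> frequency in the population.
    weights = {t: p * prob_at_least(observed_ba, t, ab) for t, p in prior.items()}
    total = sum(weights.values())
    posterior = {t: w / total for t, w in weights.items()}
    expected = sum(t * p for t, p in posterior.items())
    return posterior, expected

posterior, best_estimate = posterior_true_ba(0.380, 100, {0.290: 0.99, 0.380: 0.01})
print(posterior)       # roughly 83% chance he is a true .290 hitter, 17% a true .380 hitter
print(best_estimate)   # roughly .305, the weighted "best estimate" of his true BA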
Now, given the difficulty in determining the categories and frequencies for true BA's in the population of ML baseball players, and given the cumbersome nature of the ensuing Bayesian calculations, we can forgo all of that by using a linear regression formula to approximate the same results. If we used a single regression formula for, say, the above example (a player who hits .380 in 100 AB's), we would take a bunch of data points constituting all players with a certain BA in 100 AB's (our independent variable) and regress this on those same players' BA for the next year, or preferably multiple years. As usual, this will yield two coefficients, A and B, in our y = Ax + B linear equation, and B will be close to the mean BA of all baseball players (actually the mean BA in our sample group we are using in the regression analysis). Remember that these coefficients will only work for 100 AB's. If we want to do the same thing for a player with 500 AB's, we have to do a new regression analysis and derive a new equation, OR we can do a multiple regression analysis where number of AB's is one of the independent variables. Unfortunately, due to my status as a statistics ignoramus, I don't know whether there is a linear relationship if we include # of AB's (I don't think there is), in which case you would have to do a non-linear analysis, which is beyond my abilities...
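[A sketch of the mechanics of that single-regression shortcut (editor's code; the two arrays are placeholder numbers, not real data):]

import numpy as np

# Each pair is (BA over ~100 AB's in year 1, BA in the following year). Real data would
# come from a seasonal database; these values are placeholders to show the mechanics.
ba_year1 = np.array([0.380, 0.310, 0.255, 0.290, 0.220, 0.335])
ba_next  = np.array([0.310, 0.295, 0.265, 0.285, 0.250, 0.300])

A, B = np.polyfit(ba_year1, ba_next, 1)   # fit y = A*x + B
print(A, B)
print(A * 0.380 + B)                      # the regressed projection for a .380 sample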
November 3, 2002 - MGL
In case there is anyone on the planet still reading this thread, the 4th sentence in the third paragraph from the bottom should read (among lots of other spelling and grammar errors):
But since 99% of all players are .290 hitters and only 1% are .380 hitters, we now have the chances that our player is a .380 hitter at 17% (NOT 20%), rather than the initial 1%.
November 4, 2002 - Tangotiger
(www)
Sancho, thanks for the links. The first one I had not seen, and it's not a bad one. As for Albert, I'm frankly disappointed. There's a long list of math professors who have tackled baseball issues and really either miss something, or write so dryly that I miss something. (Of course, there's an even longer list of sabermetricians who miss some math issues as well.)
November 5, 2002 - Tangotiger
(www)
(e-mail)
Shaun, I agree age should be taken into account, and I'm currently working on this. I should have something to show as soon as I get the time (which these days is not too much).
As for contract status, certainly this would have an impact. However, by having an aggregate of players, this impact should not be too noticeable. And of course, since my data is from 1919 onwards, there's an even smaller share of the population that would be affected by this at all.
As for learning and improving, etc.: this is the issue. Is it the case that the player is learning and improving, or is it simply random chance that the player happens to have a banner year? Hopefully, with the new data I have, we'll have a better answer.
November 6, 2002 - MGL
Brother of Jessie,
Sorry but your argument is mathematically (statistically) unsound. I don't have the time right now to explain why.
BTW, Tango's 149 149 149 149 142 observation is not a revelation. It doesn't "need" explaining, nor is it open to "criticism".
It is a mathematical certainty that no matter what the distribution of true linear weights ratios in the population of baseball players is, any player or players who show above or below average performance in any number of years will "regress" towards the mean (100 in this case) in any subsequent year. How much they regress (in percentage, like the 7/49, if you want to put it that way) depends entirely on how many PA's the historical data is comprised of. In this case, 4 years of 149 regressed to 142 in the 5th year. One year of 149 will regress to, I don't know, something like 120. Tango's observations were just to make sure that nothing really funny (like the statistician's view of the world being completely F'ed up) was going on. We don't need to look at ANY data to tell us that 4 149's will be followed by something less than but closer to the 149, or that 2 149's will be followed by something even less than and less close to 149. Again, it is a mathematical certainty, even if there is lots of learning and developing going on with baseball players. The learning and developing can only decrease the amount of regression; it cannot eliminate it! Of course, what we will and do find if we look at real-life data is that this learning and developing (to the extent that people "read into" these banner years) is small or nonexistent beyond the normal or average age progression. The reason we know this is that if the learning and developing were a large or even moderate factor, we would see much smaller regressions after banner years than we do. The regressions we do see comport very nicely with what a statistical model would predict if no learning and developing were going on. Given that, there can be only one conclusion - THAT A BANNER YEAR IS 99 PARTS FLUCTUATION AND 1 PART LEARNING AND DEVELOPMENT (i.e., the concept of a "breakout year" is a FICTION!)
November 7, 2002 - MGL
Mr. James,
You need to read my protracted discussion on "regression" as it applies to baseball talent. I think it is in this thread, but I'm not sure.
Despite your moniker, you got no shot to "shoot me down" on this one!
A player's stats get regressed to the mean of the population that he comes from. Yes, if we assemble ALL-STARS and choose players from that group, the mean is greater than 100. The same is true if we assemble right fielders and choose a player or players from that group. If we assemble a group of sub-6-foot players, our mean will probably be less than 100.
Tango looked at all ML players and chose those who had high lwts ratios (an average of 149) for 4 straight years. EVEN THOUGH THESE ARE OBVIOUSLY GREAT PLAYERS, THEY CAME FROM A GROUP OF ALL ML PLAYERS. That is why you regress toward the mean of all ML players (actually, you regress toward the mean of all ML players who have played at least 5 years). If you assemble All-Stars and choose from that group, THERE HAS TO BE AN INDEPENDENT REASON FOR YOU CALLING THEM ALL-STARS, OTHER THAN THE CRITERIA YOU USED TO SELECT PLAYERS FROM THAT GROUP. IOW, in order to regress those same 149 players to a mean greater than 100, you would need to assemble your ALL-STARS first by using some criteria independent of having a high lwts ratio for 4 straight years - say, a high lwts ratio for the previous 3 years. If you do that, then you regress toward the mean of all players who have had high lwts ratios for 3 straight years and have played for at least 5 more years. Get it!
You regress towards the mean of the population that your player comes from! You cannot make any inferences about that population based upon your sampling criteria! That's the whole point of regression!
Read this - it is important:
To put it another way, in Tango's example, he looks at the entire population of baseball players. They include all players of all true talent. They have a certain mean lwts ratio, which we can easily measure, and of course, we define it as 100. Next he ignores those players who have not had a minimum number of PA's for at least 5 years, right? So now we have a population of players who have had a min # of PA's for 5 straight years. We take the mean of that population, which is probably higher than 100 (say 105). That is the number we regress to! The fact that we now select only those players who have had at least a 125% lwts ratio for 4 straight years DOES NOT CHANGE THE POPULATION THAT THESE PLAYERS WERE SELECTED FROM. That is the population whose mean we regress to! Yes, that group of > 125% players are ALL-STARS as a group. Their true lwts ratio is much greater than 105, but it is NOT 149, as we can see from the 5th year ratio of 142. By definition, when we regress to the mean (this is not my "opinion"; it is a rule of statistics), we regress to the mean of the population from which we chose our players, regardless of what criteria we used to select those players. By criteria, I mean "What range of lwts ratios?", like the > 125% that Tango chose. If we choose criteria (these are independent variables) like what position, or what weight, OR WHAT WAS THEIR LWTS RATIO IN THE PRIOR YEAR OR YEARS, then we have a new population and hence a new mean to regress to. Anyway, I got off on a tangent as far as the important thing to read...
When Tango chose those players who had ratios above 125% for 4 straight years, the reason we regress at all is that those players selected consist of: 1) players with a true ratio of less than 149 who got lucky, 2) those players with a true ratio around 149, the sample mean, and 3) those players who have a true ratio GREATER than 149. We don't know in what proportion they exist, but even though it is more likely that a player who has a sample mean of 149 is a true 149 player, and it is less likely that he is a less-than or greater-than 149 true player, there are many, many more players in our original group (our population) that were true sub-149 players, so it is much more likely that an "average" player in our 149 sample group of players is a true sub-149 player who got lucky. It just so happens that the proper mean to regress to is the mean of the original group, whatever that is (105?).
If we had chosen a group of ALL-STARS, based on, say, 3 years worth of lwts ratios above 125%, we now have a group whose true lwts ratios is around 135 or so. Now, if FROM THAT GROUP, we select those players who have had 4 MORE year of > 125%, then we have the same experiment as Tango's, except that that group is ALREADY made up of players who have a true ratio of around 140, as opposed to in Tango's example, the group he selcted from are ALL ML players who have played for 5 years (etc.). They only have a mean ratio of around 105. So in the second experiment, where we choose from a group of KNOWN good players (on the average, not all of them), many more of the players we select are good players who did not get lucky or got a little lucky for the next 5 years (after the initial 3 years of > 125% performance). Many more (percentage-wise) in the second group, as oposed to the first group, are also true > 149 players who got unlucky. That's why the true ratio in the second group is more than 142 (probably 145 or so). It is still not 149, since the mean of the group of ALL-STARS is only around 135, so we still have to regress the 149 sample mean to 135. The "reason" for this is that we still have some lingering players who are not very good, but managed to make it into the ALL-STAR group through luck, and ALSO managed to make it into the > 125% for the next 5 years group. Obviously not many players will make it through these 2 hurdles, but as long as there is a finite chance that any true sub-149 player will make it, the true ratio of the 149 group will ALWAYS be less than 149! You may say, wait, there is an equal chance that a > 149 players made it through both groups as a sub-149 player, so they would cancel each other out! In fact that is true! There is an equal likelihood that any player in our 149 group is a true 154 or a true 144 (each one is 5 points different from the mean). But here is the kicker that makes us regress downward in either experiment: There are many more players in either population who are true sub 149 players than there are who are > 149 players, so an average player in our 149 group is MORE likely to be a 144 player who got lucky than a 154 player who got unlucky, simply becuase there are more 144 players!
Now if we chose our ALL-STARS such that our estimate of the average true ratio in that group of ALL-STARS were 155 (let's say we selected all players who had a ratio for 3 years of greater than 140 - not too many players, of course), then if we did the same experiment, and the sample ratio of players who had > 125% for the next 4 years was still 149, we would actually regress upwards, such that our estimate of the 149 group's true mean ratio would be more like 153 or so!
I hope this explains everything, because I just missed my appointment at the gym!
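To make the selection effect described in the last few paragraphs concrete, here is a minimal simulation sketch. The talent and noise spreads below are made-up assumptions for illustration only (none of them come from Tango's actual study): players are drawn from a pool, those whose observed ratios clear 125 for four straight seasons are kept, and the kept group's true mean and a fresh fifth season both come out below its observed level.

```python
import random

random.seed(1)

# Hypothetical pool of "5-year regulars": true lwts ratios centered near 105.
# All three spread parameters are assumptions, not measured values.
POP_MEAN, POP_SD, NOISE_SD = 105, 15, 12
N = 200_000

sample4, true_talent, year5 = [], [], []
for _ in range(N):
    t = random.gauss(POP_MEAN, POP_SD)                  # player's true ratio
    seasons = [random.gauss(t, NOISE_SD) for _ in range(4)]
    if all(s > 125 for s in seasons):                   # ">125% four straight years" filter
        sample4.append(sum(seasons) / 4)                # the group's observed 4-year level
        true_talent.append(t)                           # what the group is actually worth
        year5.append(random.gauss(t, NOISE_SD))         # a fresh, unselected season

n = len(true_talent)
print(n,
      round(sum(sample4) / n, 1),       # inflated by the selection
      round(sum(true_talent) / n, 1),   # lower than the observed level...
      round(sum(year5) / n, 1))         # ...and year 5 tracks the true level, not the sample
```

The kept group's year-5 average lands near its true mean rather than its selected 4-year level, which is the pattern behind the 149-versus-142 discussion above.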
November 7, 2002 - MGL
I'm done trying to explain how it works with baseball talent (or any similar experiment). Either we are misunderstanding one another or you are very stubborn or both. Maybe someone else can explain it to you or maybe we can just drop it.
If a sample of players (yes, NOT a random sample), using a measurement criterion (like above 125% for 4 straight years), drawn from the population of baseball players DID NOT regress toward the mean, then you would NOT see the 5th year at 142 - you would see it at 149, right?
Do an experiment which should make everything obvious - in fact, you don't even have to do the experiment - the results should be obvious:
Look at all BB players from 2001. Take only those who had an OPS above .900 (a non-random sample - obviously). Their average (mean) OPS is something like .980. What do you think their OPS as a group is in 2002? It is the .980 regressed toward the mean of the entire population of baseball players (around .770), which will be maybe .850 or .880. The 2002 OPS is also the best estimate of these players' true OPS (given a large enough sample of players, the best estimate of those players' average true OPS is the next year's - or any large random sample's - OPS). We KNOW this, right? We know that if we take any non-random sample of players defined by a certain performance (less than .700 OPS, greater than .800, etc.), their subsequent (or past) OPS will REGRESS (toward the mean of the whole population of BB players)! That is regression toward the mean (for an experiment like this)! There does not have to be random sampling, although the exact same thing would happen if we took a random sample!
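For what it's worth, working backwards from the figures quoted in that example (.980 observed, a league mean of about .770, and a next-year guess of .850 to .880), the implied regression is roughly half to three-fifths of the gap. A quick sketch of that arithmetic:

```python
# Only the numbers quoted in the example above are used here.
group_2001, league_mean = 0.980, 0.770

for frac in (0.48, 0.62):                          # fraction of the gap given back
    regressed = group_2001 - frac * (group_2001 - league_mean)
    print(frac, round(regressed, 3))               # 0.48 -> 0.879, 0.62 -> 0.850
```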
What do you think would "happen" (what would the future look like) if we looked at all those players who had an OPS of over 1.000 for one week? (See my thread on hot and cold streaks.) Would their future (or past) OPS regress toward the mean or wouldn't it? Would their average OPS (of around 1.100) remain the same? Do you not understand what I am getting at here? What do you think the true talent (OPS) of these one-week 1.100 guys is? I know you don't think it is 1.100, which would mean they would continue at a 1.100 clip. I know you know that they will continue at a pace closer to the league average (probably around .880). What do you call that other than regression to the mean?
(Light bulb went off in my head!) Now I see what you are saying! My apologies. Yes, technically, these "higher than average groups" (the 149 for 4 straight years guys or the better than 1.000 OPS for one week guys) will regress toward THEIR mean lwts ratio or OPS and NOT the mean of the league as a whole. Yes, that is true and I think that is what you are trying to say. Again, my apologies. You are absolutely correct. IN PRACTICE, you can use the mean of the whole league to regress to, because you don't know what the mean of the group you selected is - in fact, that is what you are trying to figure out. IOW, if we take the 149 ratio guys and want to figure out their true ratio or what their ratio will be in the next year (same thing, basically), then technically we must use their true ratio to regress to, but that's what we are trying to figure out in the first place - what that true ratio is. Since we don't know that, and the only thing we know is the mean ratio of all players, we have to regress that 149 toward that all-player mean of 105 or whatever it is. Yes, technically that 149 doesn't get regressed toward 105. It gets regressed toward something less than 149 and more than 105, but since we don't know what that is, it LOOKS like it gets regressed toward the 105.
Anyway, there is no argument anymore, unless you think that the true ratio of that 149 group is 149, rather than something like 142 (the 149 regressed toward the true ratio). If you do, you must wonder why the next year comes out to 142. If you do, you must think that one year of a more-than-.900 OPS is a player's true OPS, again, in which case you must wonder why these guys show a .800 OPS or so the next year. And if you do, you must certainly wonder why a group that shows a 1.100 in a week does not continue at that clip, even though we did not randomly select this group of players (we only looked at players who had greater than a 1.000 OPS in a certain week)...
Cheers!
November 7, 2002 - Tangotiger
MGL, no - F James specifically said that these 149, 149, 149 players would not regress, except for aging. In fact, they do regress to 142.
This group of players will regress towards THEIR mean, I agree. In fact, they will regress 100% towards their mean. But since we don't know what their mean is (without looking at other non-sampled years), we take the next best thing: the mean of the population they were drawn from. This mean is in fact 115%. Therefore, given the number of years (3), the number of players (I don't remember, let's say 100), and the number of PAs (let's say 500 / player / year), the best players will regress 7/34 (20%) towards the mean of the population they were drawn from. Different years, different # of players, and different # of PAs will regress differently.
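As a quick check of that arithmetic (a sketch using only the numbers quoted in the comment above):

```python
# 149 regressed 20% of the way toward the pool mean of 115.
sample_level, pool_mean, regress_fraction = 149, 115, 0.20

estimate = sample_level - regress_fraction * (sample_level - pool_mean)
print(round(estimate, 1))   # 142.2, in line with the observed year-5 level of ~142
```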
Now, I know little of statistics, and perhaps Walt Davis or Ben V can put this matter to rest.
I'll be back in a week or two with detailed data, broken down by age.
November 8, 2002 - MGL
I'm gonna try one more time! Forget my last post!
Forget about the expression "regression to the mean". It is a generic expression which could have different meanings depending upon the context. Pretend it doesn't exist.
Remember that when I say that a value (call it value 1) gets "regressed" to another value (call it value 2), THAT MEANS two things and two things only:
1) Value 2 represents the direction in which we move value 1 (it can be moved either up or down).
2) If we don't know how much to move value 1 (which we usually don't in these types of experiments), value 2 represents the limit of the movement.
For example, if value 1 is 149 and value 2 is 135, we know 2 and only 2 things, assuming that we have to "move" value 1. One, we move it down (towards the 135), and two, we move it a certain unknown amount but we never move it past 135.
How does this very vague above concept apply to baseball players? I'm glad you asked.
First, I am going to call "value 2", as described above, "the mean" and I am going to substitute the word "regress" for the word "move" as used in the context above. This is literally an arbitrary choice of words. We might as well say "wfogbnlnfl to the slkvdn". I'm using the word "mean", not in any mathematical sense, but to represent the "limit of how much we move value 1". Likewise, I am using the word "regress", also not in any mathematical sense, but purely as a substitute for the word "move".
So "regression to the mean" from now on simply means "We move value 1 either up or down, depending on whether value 2 is less than or greater than value 1, and value 2 also represents the most we can move value 1."
Now here are some experiments in which I will attempt to apply the above methodology. You tell me whether it should be applied or not (and if not, why). If you think that it is appropriate to apply, you must also tell me what value we should use for value 2. The correct answers will appear at the end of this post.
Experiment #1:
Let's say that we have a population of BB players and we don't know whether all players in that population have the same true OPS or not. Either everyone has the same true OPS (like a population of all different denominations of coins, where every coin has the same true H/T "flipping" ratio), or some or all of the players have different true OPS's (like if we had a population of coins and some coins were 2-sided, while others were 3-sided or 4-sided, etc.).
Now let's say that we randomly sample 100 of these players and look at a 1-year sample of each player's OPS. We basically have 100 "man-years" of data that were randomly sampled (not exactly, but close enough) from a population of all baseball players.
Let's say that the mean OPS of these 100 players is .780. This is our value 1, by the way. Let's also say that WE DO NOT KNOW WHAT THE MEAN OPS OF THE POPULATION OF ALL PLAYERS IS. Remember that we randomly sampled these 100 players and 1 year's worth of data for each player from a population of all players and all years.
What is the best estimate of the true OPS of these 100 players, given that the average OPS for all 100 players for 1 year each is .780?
In order to arrive at that estimate, did you need to determine a value 2 and did you need to move value 1 (.780) in the direction of value 2 and does value 2 represent the furthest you should move value 1? If the answer is yes to all 3 related questions, how did you arrive at value 2? If the answer is yes to some and no to some (of the above 3 questions), please explain.
Experiment #2:
Same as experiment #1, but we now know (G-d tells us) that the mean OPS of the population of all players is .750 AND we know that all players have the same true OPS.
Again, what is the best estimate of the true OPS of our 100 players chosen randomly (and their 1-year stats each)? This is a no-brainer right? It is not a trick question. The answer is as obvious as it seems.
Given your answer to the above question, did you move value 1 (the .780), is there a value 2 (and if so, what is it), and if the answer to both questions is yes, do we know exactly how much to move value 1 ("towards" value 2)? IOW, is there regression to the mean (remember my above definition - movement, direction, and limit, where "regression to" is "movement towards" and "the mean" is "value 2")?
Experiment #3: (We are getting closer and closer to Tango's experiment)
Same as above (#2), only this time not only do we know that the mean OPS of all players is .750, we also know (again from G-d, not from any empirical data) that all players in the population have different true OPS's. IOW, some players have true OPS's of .600, some .720, some .850, some .980, etc. In this experiment we don't know, nor does it matter, what percentage of players have true OPS's of x, what percentage have true OPS's of y, etc. We only know that different players have different true OPS's. So in our random sample of 100 players, each player could have a true OPS of anything, assuming that every OPS is represented in the population in unknown proportions.
Now, remember, as in the previous experiments, the mean OPS of our 100 randomly selected players for 1 year each (at this point it doesn't matter that we used 1-year data for each player; we could have used 2-year or 6-month data) is .780. Remember also that we KNOW the true average (mean) OPS of all the players is .750. And don't forget (this is what makes this experiment different from #2) that we KNOW that different players in the population have different true OPS's, of unknown values and in unknown proportions (again, the last part - the "unknown values and in unknown proportions" - doesn't matter yet).
So now what is your best estimate of the true average (mean) OPS of the 100 players? Is this an exact number? Do we use a "regression to the mean"? If yes, what is value 2 and do we know exactly how much to move (regress) value 1 (again, the .780) towards value 2?
Here are the answers to the questions in experiments 1-3:
Experiment #1 (answer):
The best estimate of the average true OPS of the 100 players is .780, the same as their sample OPS. There is no "regression to the mean". There is no value 2; therefore there is no movement from value 1. (Technically, we could say that value 2 is .780 also, the same as value 1, and that we regress value 1 "all the way" towards value 2.) The above comes from the following rule in sampling statistics:
When we sample a population (look at the 1-year OPS of 100 players) and we know nothing about the characteristics of the population, as far as the variable we are sampling (OPS) is concerned, the sample mean (.780) is the best estimate of the population mean.
Experiment #2 (answer):
The answer is that no matter what the sample OPS is (in this case .780), the true OPS of any and all players (including the average of our 100 players) is .750! This is simply because we are told that the true OPS of all players is .750! Any sample that yields an OPS of anything other than .750 MUST BE DUE TO SAMPLING ERROR, by definition. I told you this one was a no-brainer! In this experiment, there is "regression to the mean" (again, per my definition - re-read it if you forgot what it was). Value 1 (.780) gets moved towards value 2 (.750). It just so happens that we know exactly how much to move it (all the way). Value 2 is still the limit on how much we can move value 1 in order to estimate the average true OPS of the sample group of players. And in this case, value 2 is equal to THE MEAN OF THE POPULATION! How do you like that? In this experiment, regression to the "mean" is really to the "MEAN"!
Experiment #3 (answer):
Remember, we still know the population mean (.750). This time, however, not only are we not told that all players have the same true OPS, we are told that they definitely don't. The answer is that the best estimate of the true OPS of our 100-player sample (with a sample average OPS of .780) is something less than .780 and something more than .750. We don't know the value of the "somethings", so there is no exact answer other than the above (given the information we have). So again we have "regression to the mean", with value 2 still being .750, value 1 still .780, and the amount of regression or movement unknown. The movement must be down, since value 2 is less than value 1, and the limit of the movement is .750, since that is the value of value 2. (We can actually estimate the amount of the movement given some other parameters, but that is not important right now - we are only interested in whether "regression to the mean" is appropriate in each of these experiments, and if it is, what the value of value 2 is.) BTW, as in experiment #2, value 2 happens to be the population mean, so the expression "regression to the mean" is somewhat literal, although again, that is somewhat of a coincidence.
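A minimal simulation of Experiment #3, with assumed spreads for true talent and single-season noise (only the .750 population mean and the .780 sample figure come from the setup above): among randomly drawn 100-player groups that happen to show roughly a .780 average, the average true OPS lands between .750 and .780.

```python
import random

random.seed(2)

POP_MEAN = 0.750   # known population mean OPS (from the experiment)
TRUE_SD  = 0.080   # assumed spread of true OPS across players
NOISE_SD = 0.060   # assumed one-season sampling noise per player

true_means = []
for _ in range(50_000):
    group_true = [random.gauss(POP_MEAN, TRUE_SD) for _ in range(100)]
    group_obs  = [random.gauss(t, NOISE_SD) for t in group_true]
    if 0.776 <= sum(group_obs) / 100 <= 0.784:      # groups that happened to show ~.780
        true_means.append(sum(group_true) / 100)

# The true mean of such groups lands between .750 and .780, as the answer says.
print(len(true_means), round(sum(true_means) / len(true_means), 3))
```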
Back to some more experiments (leading up to Tango's)...
Experiment #4:
We know the average OPS of all players is .750 and we know that all players have the same true OPS. This time, however, we select X players, not randomly, but all those who had an OPS in 1999 greater than .850. Let's say that there were 25 such players and that their average (sample) OPS was .912 (for that 1 year).
What is the average true OPS of our 25 players? Again, an easy (trick) answer! It is still .750, since you are told that all players have a true OPS of .750. Again, it doesn't matter what criteria we choose to select players or what any player's 1-year sample OPS is. All sample OPS's that differ from .750 are due to statistical fluctuation (sample error), by definition. Again, "regression toward the mean", where the "regression" is 100% and the "mean" (value 2) is the mean OPS of the population. So we have "regression to the mean" even though we did not choose a random sample of players from the population. We chose them based upon the criteria we set - a sample OPS greater than .850 in 1999.
Experiment #5 (same as Tango's):
Same population of players. It has a mean OPS of .750. Unlike the above experiment, each player can have a different true OPS. This time we only look at players who had a sample OPS of greater than .990 during the month of June in 2000. The average (sample) OPS of this group (say 50 players) is 1.120.
What is your best estimate of the true OPS of this group of players? (This question is of course exactly the same question as "What is your best estimate of what this group of players will hit in July 2000 or August 2000, not counting things like changes in weather, etc.?") Well what is it, and how do you arrive at your answer? Is your answer reasonable?
In order to arrive at your answer, was there any "movement" from the 1.120, like there was in the last experiment (the .912 had to be moved toward the .750 - in fact, all the way to it)? In which direction - up or down? Why? If you did move the 1.120 to arrive at a different value for the "best estimate of the sample players' true OPS", how much did you move it? How much should you move it? Is there a limit on how much you should move it? If you did move it, is there a value 2 that tells us in what direction to move value 1 (the 1.120)? How did you arrive at the value 2? Does this value 2 (like all value 2's are supposed to do) represent the limit of the movement? If not, why not, and what is the limit of the movement? Was there "regression to the mean" in arriving at your estimate of the sample players' average true OPS? If yes, what value represented "the mean"?
I'm not going to answer any of above questions in the last experiment. If you answer them (and the others) correctly, you will know everything there is to know about Tango's and similar experiments and whether there is or is not "regression to the mean", in the generic sense, in determining sample players' OPS, no matter how we choose our sample (randomly or not), and what value is represented by "the mean", given that "regression" simply means "some amount of movement"...
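Since the answers to Experiment #5 are left open above, here is a minimal simulation of that setup, with assumed spreads for true talent and one-month noise (only the .750 mean and the .990 cutoff come from the setup, so the exact figures won't match the 1.120 quoted there): the selected group's true level, and its next month, sit far below its June number but above the population mean.

```python
import random

random.seed(3)

POP_MEAN, TRUE_SD = 0.750, 0.080    # population mean (given); assumed spread of true OPS
MONTH_NOISE_SD = 0.180              # assumed sampling noise for roughly a month of PAs

true_sel, june_sel, july = [], [], []
for _ in range(100_000):
    t = random.gauss(POP_MEAN, TRUE_SD)
    june = random.gauss(t, MONTH_NOISE_SD)
    if june > 0.990:                                   # the selection criterion in Experiment #5
        true_sel.append(t)
        june_sel.append(june)
        july.append(random.gauss(t, MONTH_NOISE_SD))   # a fresh, unselected month

n = len(true_sel)
print(n,
      round(sum(june_sel) / n, 3),   # the gaudy June figure
      round(sum(true_sel) / n, 3),   # their true level: far below June, above .750
      round(sum(july) / n, 3))       # July tracks the true level, not June
```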
November 8, 2002 - MGL
Well, Frank, not only did you not answer my last few questions, but it is obvious to me that you know virtually nothing about basic statistical principles (at least not enough to have an intelligent discussion about this kind of analysis).
Your statement:
"Rey Ordonez has NO chance of showing up in Tango's sample, even if you simulated his performance over a thousand years."
is of course patently wrong. Rey Rey or even my 13 yo cousin does not have NO chance of showing up in Tango's sample. Everyone has SOME FINITE chance, no matter how infinitesimal. The tails of a bell curve reach out to an infinite distance.
That being said, the discussion (with you) is probably over. Nevertheless, my sample will contain slightly better than average players, whereas Tango's sample will contain distinctly better than average players. We all know this, of course.
Nevertheless, the answer to my questions will be the same whether we use my sample or Tango's sample. Both samples, of course, are non-random, which was the crux of your original criticism. Tango's is very non-random and mine is slightly non-random. At what point do you switch answers (that you do or don't use "regression to the mean" to estimate the sample's true lwts ratio or OPS, and that "the mean" is not the mean of all players)? You don't like "one week" because it encompasses most players (again, it is still non-random, as the worst players will tend not to have any or many one-week OPS's above .980, whereas the great players will have many - in fact, probably around half of all their weeks will be above .950)? What about 2 weeks? 6 months? One year? 2 years? 4 years like in Tango's study? At what point do we no longer "regress to the mean of the population" in order to estimate true OPS or lwts ratio? Your argument is silly!
The reason I used one week is to make it obvious to you that even if we take a non-random sample of players from the population of all players, and our selection criterion is greater or less than a certain OPS or lwts ratio, the estimate of the true OPS or ratio of that non-random group must be "less extreme" than the sample result. You know that; everyone knows that. The one-week experiment makes it obvious. You are just digging in your heels at this point, for whatever reasons. Remember that what we use for value 2 (the so-called "mean") is simply the number that represents the limit of our regression.
Read this!!!!
Here is where you are getting screwed up, no offense intended. In the one-week experiment, you know intuitively that the true OPS for our sample players is actually somewhere near the mean of the population (a little higher), so you don't mind using the population mean as the number to regress towards (remember that number, value 2, just tells us DIRECTION and LIMIT). In Tango's experiment, you know intuitively that the sample group is mostly very good players with very high true ratios - which is true - so you don't like using the population mean as value 2. That's fine. The true OPS of Tango's sample players is nowhere near value 2 (the mean of the population); we still use it though to give us direction and limit - that's all value 2 is - remember? The concept of value 2 being the limit becomes silly when we choose mostly very good players of course, but we still use it to give us direction because it is the only KNOWN value that we have. In those extreme cases, we don't really NEED it, as we can just say that the true ratio of Tango's sample players is "something slightly less than 149", but we can also say that it is "149 regressed towards 105" (or whatever the mean ratio of all 5-year players is). That's all I have been trying to say.
Basically, Frank, once you establish value 2 as the direction and limit of your regression (and value 2 is always, by definition, the mean of the population), then you can decide how much to regress your sample mean (the 149 in Tango's case). Without doing a regression analysis or having other information about the distribution of true ratios in the population, you can only guess at how much to regress.
In my case - the one-week guys - you regress a lot, which you know intuitively. Still doesn't change value 2, does it, even though you know intuitively that the final answer (the best estimate of the sample players' true OPS or ratio) is going to be close to value 2? Let's take 6-month players. Intuitively, you know to regress less than you do for the one-week players, but still a lot. Again, we still use value 2, or the population mean, to tell us direction and limit. If you didn't use a precise value for value 2, you might make a mistake and regress too much! For example, if we did the same experiment and used one-month samples above an OPS of .980, you would know intuitively that their true OPS's were a lot less than that, right? Let's try to guess .850. Is that too high or too low? Without knowing the population mean and using it as our value 2 (the limit of the regression), we can't tell! If I told you that the population mean is .750, then you would say, "Phew, that's not too low!" In fact, you would probably now say that their true OPS was around .780 or .800, wouldn't you? IOW, you need to have that value 2 in order to have SOME IDEA as to how much to regress! And that value 2 is always the population mean, arbitrarily and by definition, simply because you know that the true OPS of your sample of players cannot be less than that (since you selected your players based on a criterion of some time period of performance GREATER than the average). How about 1-year samples above .980? We know to regress even less, but we still need a value 2 to give us SOME IDEA how much to regress and to make sure that we don't regress too much! When we get to 4-year samples, as in Tango's study, we don't all of a sudden say "No more regression!" We still regress, only this time a very small amount! Do we need to know what direction? Well, it's kind of obvious, since we know from our selection criteria that our sample includes more players who got lucky than unlucky. So we know to regress downwards. Do we need to know exactly what the limit of regression is - i.e., do we need to know value 2? No, of course we don't, but it doesn't change the fact that, like the 1-week, 1-day, 1-year, or 3-year experiments, there still is a value 2, which is still the mean of the overall population, and that technically that number establishes the direction and limit of our regression.
If in Tango's study you don't want to call it "regression to the mean", that's fine. Who cares? It is only semantics. If you want to call it "a little regression downwards", that's fine too. It doesn't change my discussion an iota! I hate when someone takes an "expression" (out of context), criticizes it (with valid criticism), and then uses THAT criticism to invalidate a person's whole thesis. Why don't you just say, "I agree with your whole thesis (in fact, as I said before, there is no agreeing or disagreeing - I'm just parroting proper statistical theory back to you in my own words and relating it to these baseball experiments), but I don't like sentence #42"?
It's like if I tell you the entire and correct theory of the origin of the universe (I know that it is debatable), and I finish off by telling you that the universe is contracting rather than expanding, and you tell me that I'm completely full of crap (political commentators do this all the time)!
Here's an example, BTW, of where knowing value 2, and making sure that it is the mean of the whole population, IS important, even when we take multi-year samples. Let's say we did the same experiment as Tango, but we select players who were only 5% above average for 2 straight years. Let's say that the average OPS for that sample was .788. Even though we are reasonably certain that our sample is made up primarily of above-average players (no Rey Rey's in this sample), do we need to know value 2 in order to "hone in on" how much to lower the .788 to arrive at an estimate of the true OPS of our sample? Sure! Let's say I don't tell you what the mean OPS of the population of ALL 3-year players (players who have played for at least 3 or 4 years) is. You might choose .775 as the best estimate of our sample players' true OPS. Whoah! If I then told you that the mean OPS in the population (of all 3-year players) was .776, you would know that you went too far! If I told you that it was .750, then you would know that you were in the ballpark (no pun intended). So while sometimes knowing value 2 is important and sometimes it is not (it is obvious in which direction to regress and it is obvious about how far), it doesn't change the fact that we always HAVE a value 2, and that it is always the mean of the population!
Whew....
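For readers who want more than a guess at the amount, one standard way to put a number on "how much to regress" - not necessarily the method MGL or Tango actually used - is to compare the spread of true talent to the sampling noise in the observation. Under a simple observed-equals-true-plus-noise model, the fraction to regress toward the population mean is the noise variance divided by the total variance. A sketch with made-up spreads:

```python
# A simple shrinkage heuristic (an assumption of this sketch, not a method
# taken from the thread): observed = true + independent noise, so
#   fraction to regress = noise_var / (true_var + noise_var),
# and the noise shrinks as the sample of PAs grows.

def regression_fraction(true_sd: float, noise_sd: float) -> float:
    return noise_sd ** 2 / (true_sd ** 2 + noise_sd ** 2)

# Illustrative, made-up spreads (in OPS points): a week is nearly all noise,
# four seasons is mostly signal.
for label, noise_sd in [("1 week", 0.300), ("1 season", 0.060), ("4 seasons", 0.030)]:
    print(label, round(regression_fraction(true_sd=0.080, noise_sd=noise_sd), 2))
```

With these assumed spreads, the one-week sample regresses almost entirely, a full season roughly a third of the way, and four seasons only a little - the same ordering MGL walks through above.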
November 8, 2002 - Tangotiger
F James: I think I wrote this already, but it might have gotten lost amid all of MGL's explanation: the year before the 149, 149, 149 string was 141 and the year after the string is 142. Subsequent years drop off slightly from 142, and in fact match what you would expect from normal aging. (This will become more clear when I do the breakdown by age... eventually, whenever that is.)
Essentially, MGL's point boils down to: whatever period you take, however many players you take, however many PAs that performance makes up, you have to regress to some degree. The amount to regress is related to the variables I just mentioned. By choosing 1 day, we are regressing almost 100%; by choosing 5 years of performance between the ages of 25 and 29, with 1 googol PAs in each of those years, you regress close to 0%. Everything in between is subject to more analysis.
Given my sample of 3 years, 600+ players, and about 500 PAs per player per year, the regression of the 149 group is 20% towards the mean of 115, which gives the true talent level of 142 (more or less).