New To Nerd Math? Start Here
One of the frequent questions we get is “If we don’t know what your numbers mean, where should we start?” I usually don’t have a great answer; I suggest a few introductions and then tell them to get good at googling. Well, that’s over. At Lookout Landing, one of the commenters has taken it upon himself to collect and categorize a large variety of articles about various topics relating to the analytics of baseball.
So, now, start here. That post contains links to a lot of important information, covering nearly the whole spectrum of analysis. It’s a pretty awesome reference tool. Thanks, Fett.
Comments
49 Responses to “New To Nerd Math? Start Here”
Good work, Dave. You and your brother have always been my favorite nerds.
Didn’t Tom Tango set up a wiki for a similar reference at one point?
Tough to keep up with all that is out there. This is a fantastic reference.
Forgive me if this has been covered before because I am new here, but I find a couple of things about the USS Mariner interesting. In particular, the disconnect I see between how you are evaluating pitchers and how you are evaluating hitters. OBP is the holy grail around here, and I don’t disagree that it is important. Having runners on base makes pitchers pitch out of the stretch, and that diminishes the effectiveness of their pitches. A high OBP also means that hitters are seeing a lot of pitches, meaning you are helping the other hitters get an idea of what the starter is throwing as well as tiring the arm of the starter and most often getting to the bullpen sooner. Another big plus. And of course having guys on base makes it easier to score runs. It goes without saying that having a guy at third makes it more likely you will score than if you have no one on base. However, what I find interesting is that all of your metrics that evaluate hitters seem to leave out something that I think is very important: the ability to drive in runs. (I should say that I have been in love with baseball for the 29 years of my life so far and played the game for 20 years, although not at a high level.) Now before you go branding me an idiot or some sort of new baseball metric caveman, let me explain. There is a clear difference between getting on base and getting a runner home who is already on base. Now, I should say that I don’t think RBIs are a very good measure of how effective a guy is at driving in runs. It just means he is up a lot with runners on base. So let’s not use RBI, but what we need, and maybe it exists and this post is pointless, is some sort of metric that measures a player’s effectiveness at driving in runs. You know, a percentage of times he gets a runner home from first, second, and third. I am not sure how the stat would look. I feel like the Mariners need a guy with a high percentage of runners driven in to anchor their current lineup.
I realize that having a lot of guys who get on base equates to certain numbers of runs scored but wouldn’t the number of runs scored increase or decrease based upon having guys with high runs driven in percentages? In other words the more efficient your 4,5,6 hitters are at driving in runs the better for your high OBP team.
The other thing I find interesting is that while evaluating pitchers you concede that the HR is important. And you seem to discount the importance of the WHIP stat. Firstly, “Two other big factors that we’ve identified that can have a great effect on run scoring are home run rates and stranding runners. In general, flyball pitchers give up more home runs than groundball pitchers, which is why a groundball is a positive event for the pitcher and a flyball is not.” You concede that the number of HRs a pitcher gives up has a “great effect” on how many runs that pitcher gives up. Now wouldn’t it make sense then that having hitters who can hit home runs would help your team score runs at an equivalent level of importance? I am not saying you go out and bring in a bunch of guys who hit .240 with 30 home runs. I am just saying that home runs are an important part of having a successful team because of the speed with which the home run can score runs. It is easier to have a guy hit a solo home run than to have two to three guys string together a hit or walk to score that same one run. Scoring efficiency is important and the home run helps a team be more efficient. You also say that, “However, the non-outs that flyball pitchers give up are more harmful,” which seems to me to indicate that a guy who hits a fly ball with a lower than average percentage of that fly ball being caught would be a valuable addition? If a guy hits an average amount of fly balls but those fly balls get caught 5% less than the league average you are looking at a very valuable piece to include in a winning team.
“There’s nothing that ERA or WHIP will tell you that those component statistics do not, but ERA and WHIP certainly leave a lot of the underlying information out.” Now I agree that ERA is a flawed stat. But to me WHIP is important. It’s like a quick overview of how a pitcher is doing. Without getting into all of the advanced metrics we can quickly look at WHIP and see how often there are runners on base. I mean it seems to me that WHIP is the pitcher’s equivalent of OBP. OBP doesn’t tell me how often a guy gets on by hitting a line drive, or a fly ball or a ground ball. And it doesn’t tell me how often a guy gets on base with other runners on or in key situations like late in a one run game. But it does tell me how often on average a guy is on base and we use it to judge hitters. WHIP is the same thing for pitchers, so while it shouldn’t be the be all and end all of pitchers’ stats I do think it has an important role to play. “In this age of wonderful information, there’s just no reason to use ERA and WHIP for serious analysis of a pitcher’s ability. We have better tools at our disposal. We’re doing ourselves an injustice if we continue to lean on inferior information.” Couldn’t this argument then be made about the limited info that OBP is supplying?
Sorry about the length of this post. But I want to say one more thing. Advanced Metrics are great but do they take into account how often a player succeeds in key situations? Because if they don’t then there are still some flaws in relying on them exclusively to judge a player, or to build a team.
I’d suggest reading up the sections in the post above about wOBA and clutch hitting.
Short answer (because I’m low on time) – wOBA does not underrate “power hitters” who can drive in runs, and clutch hitting is not predictable, so therefore, not useful in roster building.
I never said anything about power hitters. I was talking about guys who drive in runners who are on base more efficiently than other guys. Is there a number for that? I would rather have a guy who gets a runner home from 2nd with less than 2 outs 25% of the time than a guy who does it 15% of the time. Even if the 15% guy has a moderately higher OBP. And I am sorry, but if you are a GM building a team, looking at a player who consistently comes up with timely hits is very important to roster building. If it was between a guy whose OBP was .380 and a guy whose OBP was .410, but the .410 guy had an OBP of less than .300 in the 8th and 9th innings, who would you want?
You’re assuming that there are differences in abilities at this mythical ability to drive in runs. At the major league level, there are not, at least not beyond a player’s normal abilities that are already accounted for.
There is a mountain of great information contained in the links that Fett put together. Really, read them. They will answer your questions.
I am just a little sceptical that you can just use these metrics to build a winning team. I mean I understand that the numbers will identify which players are better than others and you want as many of the good ones as possible. But are there stats that you might have on hand that show that using the metrics you guys use does in fact build a winner? ie. the last 5-10 World Series Champions. I would be interested to know how the last 5-10 WS champs team OBP/OPS/OPS+ etc etc etc stacked up to league averages. What positions and places in the batting order were more effective in those categories than others and so on. I don’t expect you to have that info but do you know where I could find it?
“You’re assuming that there are differences in abilities at this mythical ability to drive in runs. There are not.”
Are you trying to tell me that there aren’t going to be guys better at getting runners who are on base home, and by better I mean more efficient, than others? There are guys who are better at getting on base than others. There are players who are better at drawing walks than others etc etc etc. So how can you justify saying that there are not going to be differences between guys in how often they get runners home in different situations?
Follow the links and answer your own questions.
You can sort team leaderboards on FanGraphs. You’ll find that the correlation between things like wOBA and runs scored is remarkably high.
Try the links in the article in Lookoutlanding that is described in the post.
But if you want to just look up player and team totals for the last few years for specific stats, start with baseball-reference and fangraphs.
So how can you justify saying that there are not going to be differences between guys in how often they get runners home in different situations?
The differences are entirely captured by their context neutral performance.
You don’t have to like this, but the evidence is clear – there is simply no predictive ability in things like hitting with runners in scoring position. None. Nada. Zilch.
Sorry.
You’re assuming that clutch hitting is a repeatable skill.
That will be a topic you need to research.
Another topic you should research is BABIP. And DER.
Those measures will tell you why WHIP is no more useful than ERA for predicting pitcher performance.
OK well you guys don’t need to talk down to me like I’m an idiot. I am new to all these numbers and I am trying to read up on all of them. These are just questions that are coming to mind as I am trying to come to grips with all of this new info. You guys have been around it a lot longer so I figured I would just ask you guys when I am struggling with a concept. Maybe I will just have to try to work it out for myself.
Telling you you are wrong and pointing you towards how you can get better information is not talking down to you.
Hi Bonegar,
I believe that the question that you were trying to raise, related to ‘driving in runs’ is this:
* For middle of the order hitters (4-6?), do these statistics such as wOBA correctly account for the value of the ability to drive in runners on base, or do they overvalue OBP abilities for these hitters?
It is reasonable to investigate whether we should weight the value of OBP vs SLG differently for players in different lineup spots. And if so, how big of a difference this makes.
The negative reaction you received from this site occurred because you made statements like:
In fact, the statistics that we look at DO evaluate the ability to drive in runs. This is exactly why we like stats such as wOBA, over a stat like OBP, when measuring overall offensive contributions.
I would recommend studying wOBA. I think that you will find that it measures all aspects of run creation well – both the ability to get on base and the ability to drive in runners.
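For anyone who wants to see why wOBA separates hitters that OBP treats as identical, here is a rough sketch. The weights below are illustrative assumptions only (the published weights are recalculated every season), the two stat lines are made up, and the function and parameter names are hypothetical. The denominator follows the commonly published form, AB + BB - IBB + SF + HBP.

```python
# Illustrative linear weights; assumed values, NOT any specific season's official numbers.
WEIGHTS = {"ubb": 0.69, "hbp": 0.72, "singles": 0.89,
           "doubles": 1.27, "triples": 1.62, "hr": 2.10}


def obp(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr):
    """On-base percentage: every time on base counts the same."""
    hits = singles + doubles + triples + hr
    return (hits + bb + hbp) / (ab + bb + hbp + sf)


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr):
    """Weighted on-base average: each event is weighted by its run value."""
    ubb = bb - ibb  # intentional walks are excluded
    numerator = (WEIGHTS["ubb"] * ubb + WEIGHTS["hbp"] * hbp
                 + WEIGHTS["singles"] * singles + WEIGHTS["doubles"] * doubles
                 + WEIGHTS["triples"] * triples + WEIGHTS["hr"] * hr)
    return numerator / (ab + bb - ibb + sf + hbp)


# Two made-up hitters with the same number of hits and walks.
singles_hitter = dict(ab=500, bb=80, ibb=0, hbp=0, sf=0,
                      singles=130, doubles=15, triples=0, hr=5)
slugger = dict(ab=500, bb=80, ibb=0, hbp=0, sf=0,
               singles=90, doubles=30, triples=0, hr=30)

for name, line in (("singles hitter", singles_hitter), ("slugger", slugger)):
    print(f"{name}: OBP {obp(**line):.3f}, wOBA {woba(**line):.3f}")
```

Both hitters reach base equally often, so OBP can't tell them apart. wOBA rates the slugger higher because extra-base hits move runners further, which is the "driving in runs" value the question was about.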
Bonegar, most people come aboard the USS Mariner much as you are.
First, a person is PROUD. He has many good ideas, which it does not seem that others have considered. He played professionally. He coached at a community college. He understands swing mechanics. Etc.
When those ideas are refuted, he DENIES the refutation.
Upon discovering that denial may be misguided, he becomes ANGRY. These people may know their shit, but they are assholes, he thinks. Can’t they explain themselves without being so condescending?
As he begins to dig deeper into the volumes and volumes of information available, he realizes that in fact it is very difficult to explain everything to a person who hasn’t studied any of it and he begins to ACCEPT that these people are maybe not idiots and not assholes.
Finally, once he has sufficiently distanced himself from pride, denial, and anger, he begins to LEARN. And that goes a long way towards explaining why the baseball community in Seattle is so exceptionally learned.
Just because I have a different opinion doesn’t make me wrong. I am of the opinion that certain guys are better at getting hits in key situations than other guys. Whether through being mentally tougher or having swings that are more consistent so that pressure and nerves don’t affect the swing as much or whatever. I have to think that external factors like pressure and everything else impacts on players and some of them handle it better. So maybe there aren’t advanced metrics that show that one guy is more “clutch” than another guy and therefore you can’t factor it in when building a team. But, in my opinion, there are guys who get the job done in key situations more than others. Otherwise why even play the games?
CCW you may be right although right now I am feeling more stupid than anything. I consider myself to be a very bright guy but some of these stats and how they relate I am having a hard time wrapping my head around.
It isn’t a matter of opinion. The things you’re talking about have been investigated, and found to be without significant merit.
People have pointed you towards resources where you can see this for yourself; if you choose not to, that’s up to you, but don’t expect us to continue to be nice and play pattycake if you can’t be bothered to read when you’re pointed directly at the information.
However, having a different opinion that is not well supported by facts versus someone who has an opinion supported by facts probably will make you wrong.
And……I think you do the entire discussion a disservice by framing it as one person’s opinion vs. another person’s opinion.
Alex – “As Dave Cameron of FanGraphs writes, wOBA is useful ‘when you just want to know how a batter did at the plate, regardless of who was on base or what the score was at the time.'”
This is what I am having a hard time reconciling. I want to know what numbers are out there to let me know how a batter did at the plate when there were guys on third, second, first, or a combination of those three, and of course the number of outs in the inning needs to be factored in as well. Not to mention what inning the game is in and what the score is. Maybe I am just missing this part of my education? But I can’t seem to find where this number exists or where I can read why this type of number isn’t worthwhile. You all seem to agree that “clutch” isn’t a repeatable skill. And I can’t seem to find the evidence that you guys must have to back that assertion up. At the risk of alienating you all by asking for more help, can someone help me here? I am looking for info on this topic of “clutchness” so to speak. And I can’t find it and I fear it’s because I am missing something.
Maybe FEELING STUPID should have been one of the steps… Seriously, though, there is a ton of information to digest. If I were you, I wouldn’t aim to try to understand every last bit of it. Few people here do. But you should have an open mind and understand that baseball analysis is no longer in its infancy. Many of these metrics have been developed over several years, tested and re-tested against decades of statistics. In many cases, opinions have become irrelevant because there are facts that answer the question. I’m not sure we’re quite there on the question of clutch, honestly. That one may be debated forever. But we’re getting close.
I don’t think you’re looking very hard. There are 3 articles about clutch in the very article that is the subject of this post. The irony, here, is this post is about all the information that is available. And you’re sitting here asking where the information is…
This is definitely not an easy exercise; you shouldn’t feel stupid. Tons of intelligent people have spent many years developing this stuff!
Just don’t assume that all your previous opinions are necessarily correct. If all this analysis did nothing but tell us things we already knew, then what would be the point?
On clutch hitting, many statistical analyses have been performed. For example, in ‘The Book’, Tom Tango studies clutch hitting and concludes that while it does appear to be a skill, the effect is pretty small. It’s also difficult to tell who has the skill, and it requires a large sample size to measure.
The big problem in trying to assemble a team based on clutch hitting is that if you look at who the most clutch players are one year, this has almost no correlation to who is ‘clutch’ next year. You go get a bunch of players you think are clutch, and then next year they aren’t clutch anymore.
CCW – Sorry. Like I said I must have missed something. And I did. I have been reading that page for a couple hours now. And I just went back and looked through the topics again and found the clutch ones. So I am off to read them. Like I said I must have been missing something and I was. I apologize.
Sounds like you are looking for WPA (Win Probability Added), which measures the effect of the player’s at bat, in terms of how much it increases or decreases his team’s chance to win the game. (Look for the WPA section).
The problem with WPA and similar stats is that they are not predictive of how well the player will perform in the future. WPA measures the actual value of the player’s performance in terms of each situation he was in at the time. However, much of this value or lack of value comes from whether the player was lucky or unlucky, regarding when he got his hits or made his outs.
Also search that thread for the word ‘Clutch’; there are three articles in that category.
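To make the WPA idea above concrete, here is a minimal sketch of the bookkeeping: each plate appearance is credited with the change in the batting team's win expectancy. The win-expectancy numbers are hypothetical placeholders picked for illustration (real values come from win-expectancy tables built on historical game states), and the function name is made up.

```python
def wpa(plate_appearances):
    """Sum of (win expectancy after - win expectancy before) over a batter's PAs.

    Each element is a (we_before, we_after) pair for the batter's team.
    """
    return sum(after - before for before, after in plate_appearances)


# The same event, a single, in two hypothetical game states.
single_in_blowout = (0.950, 0.955)   # team already far ahead: barely moves the needle
single_in_late_tie = (0.500, 0.680)  # tie game, late innings: a large swing

print("blowout single WPA:", round(wpa([single_in_blowout]), 3))
print("late-tie single WPA:", round(wpa([single_in_late_tie]), 3))
```

The hitter did exactly the same thing both times. Only the timing differs, which is why WPA describes what happened but says little about what the hitter will do next.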
Yeah thanks Alex I am reading it now. I feel like I just landed on a new planet and I am trying to learn a whole new language. Slowly this is all starting to make sense and I wish I had waited another 3 hours before posting anything cause I feel like an idiot.
Though baseball is a numbers sport, it isn’t necessary to fully understand them – especially the more advanced SABRmetrics – to fully enjoy it.
Though I’ve followed this forum and others like it for some time, the newer metrics still don’t roll off the tongue and resonate with importance the way the old HR, RBI, AVG, ERA acronyms do – but I expect that will change with time and familiarity.
It’s hard and irritating to be told that GWRBI doesn’t mean much when I’d spent hours memorizing such outdated minutiae from my baseball cards since 1st grade.
Oh, and math nerds usually aren’t known for their diplomacy or humility, so take it all with a grain of salt.
Familiarity is very important, yes. (Also, it takes less time to say ‘whoa-buh’ than ‘are-bee-eye’!) 18 months ago I had no idea what to make of these ‘wOBA’ numbers people were talking about. This guy is .325? Is that good? Bad? This other guy is .350, how much better is that?
Why can’t this be scaled to batting average?
But once you learn how it works, the old stats are just annoying because you understand how inaccurate they are, and you wish those baseball cards and Mariners broadcasts would tell you a stat that you actually cared about, and actually represented the player’s value accurately.
Very true, but in this case, the restraint shown was remarkable! The patience that Dave displayed gives me hope for Jose Lopez. Very nicely done!
Increasingly I find myself turning up my nose in disgust whenever I hear or read someone marveling at how great so-and-so was because of their sub 3.00 ERA or flashy Win/Loss record.
Consider this recent Facebook exchange: “Say, Fred, did you read Caruth’s terrific BABIP article on FanGraphs the other day? No? Oh, you were busy reading Baker’s mindless anti-Edgar column in The Times?”. Un-friend.
Hey, Briggstar, Geoff puts a lot of thought into his drivel.
Bonegar, don’t feel bad. Most of us came to this (at one point or another – mine was around 2000, others were later) much like you have. And keeping up with how quickly baseball analytics is progressing is not easy, either. But, for someone who is as obviously interested and passionate about the game as you are, it is worth the effort. As others have said, keep an open mind.
The one thing I would add to the comments already made (though I suspect both articles are already linked in Fett’s outline): go to Tango’s blog. At the top, there is a link to both a Sabermetric Wiki (which is kind of similar to what Fett is trying to do) and SABR101 Required Reading.
SABR101 is where to start – understand linear weights, and how/why we came to that, and you will have the basis for understanding all of the more advanced metrics that Fangraphs is famous for. Everything they do there (on the offensive side, anyway) is based on this. Start there – it will help.
And, BTW, when I started trying to learn this stuff, I had a pretty good guide. Some guy named Cameron…..
I would add that while I am mathematically inclined, I don’t have time in my adult life to learn these stats in depth. My approach is to gain a general understanding of what the stat represents and then follow (at a high level) the discussion around the stat. There are a lot of really smart people who know this stuff inside-out and they will vet ideas for accuracy. If the idea/stat passes peer review I add it to my collection of understanding. I view the approach as one similar to new medical ideas. I don’t have the time to research each article posted in a medical journal but if it passes peer review…
I only dig deep into the areas where I have a particular curiosity.
Great, great point DC. That’s basically my approach too.
Looks like Leslie Anderson is on the market. Think the M’s would be interested?
Great site, and I really appreciate the knowledge sharing. I am pretty new to all this too, so go easy! What I don’t understand though is how people say that clutch doesn’t exist at all. I think what the non-believers want is some way to justify something that is pretty intangible, because how do you define which is or isn’t a higher pressure situation?
Although it is hard to quantify, it is a fact that all people perform better under pressure.
So if you go up to bat with runners on base then surely there is more pressure to bring that runner home than if you come to bat with 1 out in the top of the 3rd, or if you’re batting in the bottom of the 9th with the game on the line then there is more pressure to not be the last out than, say, if you were batting in the top of the 1st with nobody on base.
I know the guys in the majors are the elite, but even elite athletes perform differently when the pressure is on. So isn’t it true that there is more pressure and some guys will perform better than others?
What people keep saying here is, “We (and others) have torn the numbers down to the axles, and the answer is, ‘no, the ability to hit the game winning [fill in the blank] under pressure (or ‘high leverage’), relative to other situations, is not a measurable or predictable skill. Really. We’ve looked at hundreds of thousands of at bats across many, many seasons, and it’s true.’”
Now nobody has to ‘believe’ those assertions. This isn’t a faith-based site, although any Mariner fan is plenty used to praying by now. But if you go to the articles cited and read them — provided you’ve got enough stat background to follow along and really it’s not a lot of background — you will see the basis of what they’re saying and the soundness of the methodology. It’s not an opinion, but hard, cold, fact about variation in ability to hit, take a walk, etc. If you can bring additional data and higher level analysis to the table, you will be a star but in the meantime, the first step is to read the texts.
Gibbo, I’ll go easy here…mostly because it isn’t my style to do otherwise, but still.
Please investigate the links in the outline that is the subject of this post. Read the SABR101 articles I linked to. The answers are all there.
In short, some of what you say is “fact” (with no evidence offered) has been proved otherwise. There are lots of studies, but I would challenge you to do just this one thing:
Go to ESPN (or any mainstream stats site of your choosing that provides archival stats), and look at who leads the league year-to-year in whatever “clutch” split stat you choose. If such an ability existed, why wouldn’t the same guys lead the league year-after-year? Take a look at the wild variation (relative to their overall stats – of course a great hitter can be expected to be great “in the clutch,” as well) in their “clutch” stats.
Over time, what you will see is that all hitters more or less regress to the mean level of production that they display as an overall hitter. What you see, in terms of the variation year-to-year in “clutch” stats, is really a product of the smaller sample sizes you see when you slice and dice a season into “split” stats.
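The sample-size point above can be checked with a quick simulation. This is a sketch under stated assumptions, not a study: every simulated hitter has exactly the same on-base chance in "clutch" and other plate appearances (so there is no clutch skill by construction), and the talent spread, the 100/500 plate-appearance split, and all function names are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_season(p, clutch_pa, other_pa, rng):
    """Return (overall OBP, clutch OBP minus other OBP) for one season."""
    clutch_on = sum(rng.random() < p for _ in range(clutch_pa))
    other_on = sum(rng.random() < p for _ in range(other_pa))
    overall = (clutch_on + other_on) / (clutch_pa + other_pa)
    gap = clutch_on / clutch_pa - other_on / other_pa
    return overall, gap


def run(n_hitters=300, clutch_pa=100, other_pa=500, seed=1):
    rng = random.Random(seed)
    # Assumed talent spread: true OBP centered on .330, clamped to a sane range.
    talents = [min(max(rng.gauss(0.330, 0.030), 0.200), 0.500)
               for _ in range(n_hitters)]
    year1 = [simulate_season(p, clutch_pa, other_pa, rng) for p in talents]
    year2 = [simulate_season(p, clutch_pa, other_pa, rng) for p in talents]
    r_overall = pearson([o for o, _ in year1], [o for o, _ in year2])
    r_gap = pearson([g for _, g in year1], [g for _, g in year2])
    return r_overall, r_gap


if __name__ == "__main__":
    r_overall, r_gap = run()
    print("year-to-year correlation, overall OBP:", round(r_overall, 2))
    print("year-to-year correlation, clutch gap: ", round(r_gap, 2))
```

Overall OBP should carry over from one simulated year to the next, while the clutch gap should hover near zero, even though plenty of individual hitters will post a big "clutch" number in any single season. That does not prove clutch skill is absent in real players; it only shows that wild year-to-year swings in split stats are what small samples produce on their own.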
As human beings, we have a tendency to remember the memorable, and disregard the rest. The guy who comes up big “in the clutch” is remembered that way even though there is plenty of contrary evidence that shows he isn’t any better – or worse – when the pressure is on than he is overall.
To flip it around, too:
If someone performs measurably better when it is a “clutch” situation…why are they slacking off the rest of the time? Why wouldn’t a highly paid professional athlete be putting out their best effort all the time?
“Facts” like these are called beliefs.
I really like Andrew Dolphin’s analysis of clutch hitting here.
Which is basically what he wrote in “The Book”.
Summed up as, it exists, it is hard to measure definitively, and it isn’t a big difference.
I believe it exists too. I would probably call it hyper-concentration, though. But, it’s not something you can just call on whenever you want for the reason Jeff Nye gave.
My two examples:
1) Omar Vizquel on the last play of Chris Bosio’s no-hitter.
2) Derek Jeter’s play against Oakland during the 2001 ALDS.
So, yes it probably exists in some sense, but it’s not significant or measurable for MLB players.
There is one thing that always gets left out of the clutch discussion – self selection
Yes, there are people in life who suck at handling pressure. Guess what? They don’t make it to major league baseball. They probably don’t even make it to college baseball. The competitive pressures of needing to perform under pressure weed them out at a very early age.
By the time you get to MLB, you have a group of players who have been selected in part for their ability to perform under pressure. And, when they compete against each other, the differences between them are very small.
dchapelle, thanks for the link. I was aware of Dolphin’s conclusions, but hadn’t really read them before (and in the interest of brevity, didn’t mention them in response to Gibbo).
I think it still boils down to “the great hitters hit” plus a whole bunch of statistical noise, a few contact-hitting outliers, and a small effect for some players that tends not to be predictive year-to-year and isn’t really a large effect. Mostly, we remember great hitters who have memorable at bats as clutch hitters when, mostly, they are just the same great hitter in the clutch as they are normally.
I can’t readily find it (I did try) but there was a great entry last summer that debunked the then accepted wisdom that Jose Lopez was hitting great in the clutch. He’d had a couple of very well-publicized at-bats and had come through (I witnessed two of them), but when you looked at any reasonable definition of “clutch” (other than “the games I remember where he delivered”, which is the kind of self-selection others have commented on), he was exactly the same hitter or worse in clutch situations as he was the rest of the time.
I have to admit that until I read the article, I was in the “Lopez is having a good clutch season” camp, but the evidence was really hard to ignore.
Breadbaker, was it here? Or maybe here? Honestly, neither of those quite fits. I remember something about the (non-)predictive value of hot streaks (and more relating to Ibanez) in this vein, and neither of these is it. But both are interesting.