Linear_Weights
Tuesday, March 22, 2016
I have a simple method to determine True Talent wOBA at the component level. I posted it five years ago (see post 11), and it has never been referenced since, whether by me or anyone else. And it may be one of the most insightful things you come across.
For example, we all know that the run value of a HR is 1.40. But, what if instead we did this for a hitter’s HR coefficient:
PA/(PA+132) * 1.40
That becomes the new “skill” value for the HR.
Whether you regress the number of HR or you regress the coefficient for the HR, it comes out to the same thing, because we want to do this anyway:
PA/(PA+132) * 1.40 * HR
So, whether you do:
X * HR
where X = PA/(PA+132) * 1.40
Or you do:
X * 1.40
where X = PA/(PA+132) * HR
We still have the exact same thing.
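As a quick check on that equivalence, here is a minimal sketch. The 1.40 run value and the 132 constant come from the text above; the function names and the 600 PA / 30 HR season line are hypothetical, chosen only for illustration.

```python
import math

HR_RUN_VALUE = 1.40   # run value of a HR, from the post
HR_CONSTANT = 132     # regression constant for HR, from the post

def reliability(pa, constant=HR_CONSTANT):
    """Share of the observed performance we keep: PA / (PA + constant)."""
    return pa / (pa + constant)

def runs_by_regressing_count(pa, hr):
    """Regress the HR count first, then apply the full run value."""
    regressed_hr = reliability(pa) * hr
    return HR_RUN_VALUE * regressed_hr

def runs_by_regressing_coefficient(pa, hr):
    """Regress the coefficient first, then apply it to the raw HR count."""
    skill_value = reliability(pa) * HR_RUN_VALUE
    return skill_value * hr

pa, hr = 600, 30  # hypothetical season line
a = runs_by_regressing_count(pa, hr)
b = runs_by_regressing_coefficient(pa, hr)
print(round(a, 2), round(b, 2), math.isclose(a, b))
```

Both routes are the same product of three numbers, so they agree up to floating-point rounding. The shortcut is simply that the regressed coefficient can be stored once per hitter and reused.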
And then go to post #12 for examples of the method in action.
See, the typical thing is to regress wOBA, but that would make each individual component regress the same amount.
Almost everyone else will regress the amounts of each component (component-level regression), and then feed it back into wOBA or Linear Weights. And that's perfectly fine.
But if you want an incredibly sweet shortcut, follow the method I posted above: instead of regressing the amounts, you can instead regress the coefficient values!
Sunday, March 13, 2016
Yes!
If you look at his batting-neutral numbers, the ones that treat the value of a HR and a walk and a single the same regardless of the base-out situation, Votto is +348 runs better than average according to Fangraphs (look for wRAA), and +354 runs according to Baseball Reference (look for BtRuns).
But if you walk with first base open or you hit HR with the bases empty, the actual run impact would end up going down. However, if you take advantage of the situation, and walk when there's a runner on first, and don't strike out when there's a runner on third, etc., the actual run impact would end up going up.
So, what happens when Joey Votto is batting? Well, on both sites, you can look for RE24, which looks at how Votto does in each of the 24 base-out states and gives him credit for his performance relative to those states. On Fangraphs he's at +395 runs, and on BR.com he's at the identical +395 runs.
Instead of being at +350 runs in neutral situations, he's close to +400 runs in actual situations. So, Votto is a situationally smart hitter.
(Technical interlude: what we actually want is to compare RE24 to batting runs times boLI, the Leverage Index of the base-out state. But that's really getting into the weeds there. We can do that in the comments if you want to.)
Looking at all 200 hitters with at least 2700 PA from 2007-2015, Votto is the 26th-best situational hitter in MLB, putting him at the 87th percentile. Remember, this is comparing Votto situationally to Votto in neutral conditions. Number 1 is Jason Heyward. He's followed by Cargo, Giancarlo Stanton, Chase Utley (naturally), Drew Stubbs (yup, a below-average hitter who is actually above average based on the situation), Dexter Fowler, Ryan Braun, Victorino, Jimmy Rollins, Todd Helton. On the flip side, the hitter who is the least situationally aware is Kyle Seager, followed by AJ, Delmon Young, Navarro, and Mike Aviles.
So, if you want to know how a hitter SHOULD hit, talk to the guys who are actually performing above expectation. They'll tell you how to approach a situation. That means listen to Heyward and Utley and Rollins... and Joey Votto.
Sunday, February 28, 2016
This is very heavy on the math. But the payoff will be there.
Saturday, February 20, 2016
There's nothing really new here for the Straight Arrow readers. This is more for those stumbling across wOBA for the first time.
As I was reading down this list, I was thinking "I'd like to see this one". That was on the first one. And the second. And the third and fourth... all, without exception, are exactly what I'd like to see. I can't even say that I'd prefer to listen to one over the other. They are all right up my alley. So, whoever over at SABR chose these presenters, you did a fantastic job. And of course, the presenters themselves have chosen terrific questions to answer.
I do hope that the rest of the public will get to see these presentations in some form at some point in time.
Tuesday, January 12, 2016
For nonpitchers: add 0.003 wins per PA
For SP: add 0.011 wins per IP
For RP: add 0.007 wins per IP
Monday, October 12, 2015
Jonathan asks the question:
So, ask yourself this: if wOBA / TAv are the standard means of evaluating batters, shouldn’t the fundamental measure of pitcher value be the extent to which they limit batter wOBA / TAv?
Of course it should.
Not so fast! With linear weights (or wRAA as you will find it on Fangraphs, which is Runs Above Average based on wOBA), we treat all PA the same, regardless of the base-out situation. A HR is +1.4 runs whether the bases are empty or the bases are loaded.
With RE24, we apply a different run value for a HR based on the base-out situation. A HR with the bases empty is +1.0 runs, while a HR with runners on base will be higher, and much higher with the bases loaded.
Can you make a case for one over the other? Sure, it depends on what you are after.
Now, to be logically consistent, must you do the same for pitchers? No, it's not a necessity. It depends on the reason you do it for batters. If the reason you prefer Linear Weights to RE24 for batters is that the batter is not "responsible" for the base-out situation he sees, and so, it is "unfair" in terms of the number of opportunities faced, then that's a reasonable choice. This applies especially for leadoff hitters. It also presumes that a hitter won't change his approach based on the base-out situation, which of course is ludicrous.
And it gets to the point I keep making: do we want to assign the impact of an event to a hitter simply because he happens to be involved, even if he may not "own" everything about the change in that event?
But for pitchers, it's different. If Verlander walks the bases loaded and then allows a HR, before striking out the side, that's 4 runs allowed (or +3.5 runs above average). If we followed Linear Weights, we'd give him +1 run for the 3 walks, +1.4 runs for the HR and -0.8 runs for the three K, for a total of +1.6 runs above average. Where did the other 1.9 runs go? Well, they went in how Verlander sequenced the events. He owns that and no one else.
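The Verlander arithmetic can be laid out as a short sketch. The event totals are the ones quoted above; the 0.5 runs for an average inning is an assumption inferred from "4 runs allowed (or +3.5 runs above average)", and the variable names are hypothetical.

```python
# Linear Weights credit for the inning, using the totals quoted in the post.
walks_runs = 1.0        # three walks combined
hr_runs = 1.4           # one HR at the context-neutral value
strikeout_runs = -0.8   # three strikeouts combined

linear_weights_total = walks_runs + hr_runs + strikeout_runs

# What actually happened: 4 runs scored in the inning.
runs_allowed = 4
average_inning_runs = 0.5   # assumption inferred from the post's +3.5 figure
actual_above_average = runs_allowed - average_inning_runs

# Whatever Linear Weights does not explain is the sequencing.
sequencing_runs = actual_above_average - linear_weights_total

print(round(linear_weights_total, 1))
print(round(actual_above_average, 1))
print(round(sequencing_runs, 1))
```

The residual is the part that a context-neutral measure cannot see: the same six events in a different order would have cost fewer runs.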
If you include balls in play, the fielders also take their share of the credit for that, but overall, the pitcher is going to own more than 50% of the sequencing, maybe closer to 75%.
Friday, October 09, 2015
The two players have a virtually identical number of PA (a difference of only six). Baseball Reference has Beltre at +220 runs above average with the bat, while it has Chipper at +558 runs. A 328-run gap is a clear and decisive victory for Chipper. Except we see that Chipper has 85 WAR and Beltre has 84, as close to even as you can get. So, what happened?
First and foremost is a 250-run gap in their defense. You can't believe it can be that much for what amounts to 15 full 162-game seasons? Well, we pretty much figure that the best fielder in baseball is about +20 runs better than average per season, and Beltre is among the best in baseball. It works. Then, Beltre played in the tougher AL, which accounts for another 40 runs or so. Chipper, according to Baseball Reference, played in an environment where the runs-to-win conversion was 10.5, while for Beltre it was 10.0.
Suddenly, a 300-run gap in offense, after accounting for defense, league, and playing environment, shrinks to a 1-win gap.
Basically, given a choice of Chipper's career or Beltre's career (should it end this season), WAR at Baseball Reference says it's a tossup.
Friday, August 28, 2015
Let me show you how to create a simple "value" metric for pitchers. This may very well be the first value metric I ever created, back when I was a pre-teen. It'll make sense prima facie. But when you go below the surface, its issues will be exposed.
We'll take the Jays, who are 71-56, which is 33 wins above replacement, which we can split into a share for nonpitchers (19 WAR) and a share for pitchers (14 WAR). The key is that everything has to add up. For a pitcher, all we know about is his W-L record. Remember, simple pre-teen metric. The Jays pitchers obviously have a 71-56 record, but we want to somehow get that down to 14 WAR. You can get there by simply subtracting .45 wins from every decision.
We'll look at two pitchers to see how this affects them: Drew Hutchison has 14 decisions, of which we remove 6 wins, which turns his 12-2 record into 6 WAR. RA Dickey is 8-10, which now becomes 0 WAR. You can do this for all Jays pitchers, and you see you'll get 14 WAR for them.
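Here is a minimal sketch of that pre-teen metric, using the records quoted above. The function name is hypothetical; the .45 deduction per decision is the one from the post.

```python
def wl_war(wins, losses, deduction_per_decision=0.45):
    """Simple W-L 'WAR': wins minus a fixed deduction for every decision."""
    decisions = wins + losses
    return wins - deduction_per_decision * decisions

# Records quoted in the post.
records = {
    "Jays staff": (71, 56),
    "Hutchison": (12, 2),
    "Dickey": (8, 10),
}

for name, (w, l) in records.items():
    print(name, round(wl_war(w, l), 2))
```

The staff total comes out a shade under 14, Hutchison a shade under 6, and Dickey right around 0, which is the rounding the post uses. Nothing in the function knows about run support, which is exactly the weakness discussed next.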
Now, BASED ON WHAT WE KNOW, which is that we only know about a pitcher's W-L record, what I did is valid. But it comes with a huge assumption: all other things equal. That is, the pitcher W-L record, as assigned by MLB, is representative of a pitcher's performance. But Hutchison has received 3 runs per game more in run support than Dickey! Clearly, the W-L record doesn't represent our pitchers very well. It ignores a key piece of context: the other half of baseball, which is the run support.
So, you need to heavily regress this metric if you don't consider run support. Yes, Hutch will still end up ahead of Dickey if all you have is the W-L. But at least, we won't give FULL WEIGHT to the W-L. We look at it with a high degree of skepticism. Maybe we end up with 3.5 WAR for Hutch and 2.5 WAR for Dickey or something, rather than 6 and 0.
The same applies for fielding, be it UZR or anything else. A player can be shown to be +30 runs above average, but, did it ignore some key piece of context? Or could it be biased in some manner? If you see one fielder at +30 and another at -20, it's almost certain there is not a 50 run difference between the two players, in terms of their performance. Chances are, the +30 fielder had good context that was not apparent or considered. And the -20 had bad context that wasn't considered.
The same applies for running and hitting and everything else. Except for hitting, things are A LOT more apparent. It's why we don't really care about regressing, say, the results by 5% or 10%... the margin of error is too small to bother with. But just because we don't bother with it doesn't mean it doesn't exist or apply. It's there. And the honest thing to do is to apply it to all the value metrics, some more than others.
Thursday, May 07, 2015
Terrific stuff from Dan. We've talked about the stuff at the top half of his article from several years ago. And the bottom half adds another layer of discussion.
Monday, March 23, 2015
Spencer wrote an article that showed the run value of a swinging strike at around .16 runs and the called strike at .04 runs. I asked him for proof. And he provided his evidence. It's clear to me what the problem is. If it's not to you, try to think it through. My answer is below the line.
David made a few changes, including some new stuff from MGL and updates to the FIP park factors, which we previously discussed.
Saturday, February 07, 2015
What I don't understand is why the current half-baked version keeps getting thrown out as if the process is finished.
As the one who had a leading hand in developing WAR, I can tell you unequivocally that it is NOT a finished process, and I never say that it's a finished process. And given that I support TWO competing versions (Fangraphs and Baseball Reference), which sometimes don't agree in their estimates, no one can conclude at all that WAR is a finished process.
So, you are building a strawman.
At the same time, WAR *is* the best thing available. If you choose to do anything else, you are doing something inferior. It's really that simple. If you think YOUR process is BETTER than WAR, then bring it on. Bill James is bringing it on, so, we can evaluate his method.
But no one else is doing that. No one. So why the heck should I listen to anyone who might suggest that player A is better than player B, when he's given me nothing at all to evaluate his opinion on? Or, whatever is given to me is in such a tight framework that I can't evaluate it in a more holistic fashion, to see whether it's consistent and systematic.
We've all got our own personal WAR-like system. The two WAR systems out there are simply the best of them all.
Tuesday, January 27, 2015
Answer: current ultimate.
Why? Well, let's say that you think it's a quick and easy tool. What else would you do? You might... well, use RE24 instead of wOBA. That's still part of the WAR framework, simply a different implementation. Maybe you want WPA? Sure, go ahead. Maybe you prefer Dewan to Lichtman? Sure. But all of that is still part of the same framework. That's why WAR is the ultimate tool: it allows you to swap in/out your various components. You can even choose to have a different scale for the fielding spectrum. You can even change the replacement level to something higher or lower. And still, you'd be using WAR.
So, go ahead, and treat Fangraphs and Baseball Reference as something "quick and dirty". But I will promise you that whatever you will do will be even quicker and dirtier.
What WAR does is give you a framework, and it makes it very easy for everyone to have their own implementation. Don't like what you see? Well, you are being given a systematic, consistent framework on which you can build your own house. Go ahead and do it, and give us an open house to look at it.
Or, complain that somehow these free WAR homes are not good enough and... well... keep wandering the streets with nowhere to sleep. The reality is that we all have some sort of WAR home. You just maybe don't know all the rooms in your own house, and the rooms keep changing, depending on which players come to visit. WAR at Fangraphs and Baseball Reference is merciless to its player guests.
Saturday, January 17, 2015
I received an email, which I will post in its entirety (including the lack of introductions, etc):
Every single weight you have seems arbitrary to me. Where is the proof?
(0.72xNIBB + 0.75xHBP + 0.90x1B + 0.92xRBOE + 1.24x2B + 1.56x3B + 1.95xHR) / PA
Do you have some regression test of all the data in baseball that revealed these weights to a 95% statistical significance? What is the dependent variable? Runs Scored? After every year, would these weights not change after running the test?
Running a regression test isn't complicated. It's first year undergrad engineering stuff. Why people accept your weights as doctrine with zero statistical evidence is astonishing.
Soooooo..... where do I begin? He obviously did not read The Book, where I go into detail as to how this was developed. And he obviously did not look at the excerpt on The Book site. He did not google wOBA, as the first link brings you to this Fangraphs page which lists the wOBA weights year by year going back to 1871.
Instead.... he made up non-facts and treated them as facts (aka strawman). Then used those non-facts as arguments to form his opinion. In the end, we have a summary opinion with no evidence, which is the definition of bullsh!t. And there's no reason to argue with bullsh!t, which is why I instead took advantage of his bullsh!t email to at least show a few links about wOBA to those who may not be aware.
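For those meeting the formula for the first time, here is a minimal sketch of how the weights quoted in the email are applied. The weights are the ones in the email (they change by season, as the Fangraphs page shows); the function name and the season line are hypothetical.

```python
# Weights exactly as quoted in the email; real weights vary year to year.
WOBA_WEIGHTS = {
    "NIBB": 0.72,
    "HBP": 0.75,
    "1B": 0.90,
    "RBOE": 0.92,
    "2B": 1.24,
    "3B": 1.56,
    "HR": 1.95,
}

def woba(event_counts, pa):
    """Weighted sum of the counted events, divided by plate appearances."""
    weighted = sum(WOBA_WEIGHTS[event] * count
                   for event, count in event_counts.items())
    return weighted / pa

# Hypothetical season line, for illustration only.
season = {"NIBB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 3, "HR": 25}
print(round(woba(season, pa=650), 3))
```

Events with no weight (outs, for example) contribute nothing to the numerator and only show up through PA.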
Monday, January 05, 2015
Lee continues tracking this for us, which I love. It really fills the gap between R and RBI.
Lee: can you post the R/event, RBI/event, and RAS/event, where event is each of 1B, 2B, 3B, HR, BB, HB, RBOE, others?
Wednesday, November 12, 2014
I posted this on Bill James' forum, so I'll just repost it here.
Note that there is nothing new here at all. It's the same thing I've been saying since I first started championing WAR. But perhaps saying it this way adds more clarity to the process.
Tuesday, October 28, 2014
Poz has a long piece on Bill James. He quotes Bill on his view of WAR:
Well, my math skills are limited and my data-processing skills are essentially nonexistent. The younger guys are way, way beyond me in those areas. I’m fine with that, and I don’t struggle against it, and I hope that I don’t deny them credit for what they can do that I can’t.
But because that is true, I ASSUMED that these were complex, nuanced, sophisticated systems. I never really looked; I just assumed that the details were out of my depth. But sometime in the last year I was doing some research that relied on these WAR systems, so I took a look at them, and … they’re not very impressive. They’re not well thought through; they haven’t made a convincing effort to address many of the inherent difficulties that the undertaking presents. They tend to get so far into the data, throw up their arms and make a wild guess. I don’t know if I’m going to get the time to do better of it, or if it will be left to others, but … we’re not at anything like an end point here. I assumed that these systems were a lot better than they actually are.
There are things I agree with and things I do not.
1. I do agree that WAR is not impressive, or at least not impressive looking. That's the beauty of its design. For example, look at what WAR is for pitchers at its core:
IP/9 x (lgERA + 1 - ERA) / 10
If your pitcher has a 3.00 ERA in a league of 4.00, and he has 225 IP, you get this:
225/9 x (4 + 1 - 3) / 10 = 5 wins
(That divide by 10 is simply the runs to win converter.)
And here's a little secret: this was invented by... Bill James! In his classic(*) article on the MVP race with Clemens and Mattingly, he goes through the machinations, including doing that "+1" bit, which is actually the most important part of this equation. Without the "+1" part, it becomes Wins Above Average, which is how Pete Palmer presented it in The Hidden Game. The +1 part turns it into Wins Above Replacement.
(*) Most of his articles are classic, so, I'm not really narrowing down the list.
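The core pitcher formula above can be written as a small function. This is only a sketch: the function and parameter names are hypothetical, and the defaults of 10 runs per win and the "+1" bump are the ones used in the worked example.

```python
def pitcher_war(ip, era, lg_era, replacement_bump=1.0, runs_per_win=10.0):
    """IP/9 x (lgERA + bump - ERA) / runs_per_win.

    With replacement_bump=1.0 this is wins above replacement;
    with replacement_bump=0.0 it collapses to wins above average.
    """
    return ip / 9 * (lg_era + replacement_bump - era) / runs_per_win

# The example from the post: 3.00 ERA, league of 4.00, 225 IP.
print(pitcher_war(225, 3.00, 4.00))                        # wins above replacement
print(pitcher_war(225, 3.00, 4.00, replacement_bump=0.0))  # wins above average
```

Setting the bump to zero shows how much of the value in the example comes from simply taking the ball for 225 innings.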
2. As for the not thought through, I do not agree with Bill at all. They are actually incredibly thought through. Again, just as an example, the distinction between Starting Pitchers and Relief Pitchers is huge. This is something that baseball people inherently understand, but that those of us studying the data kind of dismissed or ignored for the longest time.
We just couldn't explain that a 3.50 ERA by a starting pitcher was far better than a 3.50 ERA by a relief pitcher, and it goes beyond just volume of innings. Keith Woolner was one of the first to bring this up over a decade ago, and others followed suit, me included, notably in The Book. This is research that evolved over time to the point where I gave it a rule, the Rule of 17, which basically says that a relief pitcher gets 17% more K, allows 17% fewer HR, allows 17 points fewer in BABIP, and 17% fewer runs (walks are flat).
There's the standard thing we do with park effects, as well as the difference in AL/NL talent, so that "lgERA" is really adjusted for all that. Some even go so far as to look at the actual opponents and their fielders to further adjust that lgERA. (Note: when I say ERA I really mean RA/9, but ERA is so ubiquitous a term. That is another advance, that we focus on runs allowed, not the made-up earned runs.)
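The Rule of 17 described in this point can be sketched as a starter-to-reliever conversion. The function name and the input line are hypothetical, and treating K, HR and BB as per-PA rates is an assumption made for the illustration.

```python
def as_reliever(k_rate, hr_rate, bb_rate, babip, ra9):
    """Apply the Rule of 17 to a starter's line to get his expected line in relief."""
    return {
        "k_rate": k_rate * 1.17,    # 17% more strikeouts
        "hr_rate": hr_rate * 0.83,  # 17% fewer home runs
        "bb_rate": bb_rate,         # walks are flat
        "babip": babip - 0.017,     # 17 points lower BABIP
        "ra9": ra9 * 0.83,          # 17% fewer runs
    }

# Hypothetical starter line, for illustration only.
starter = dict(k_rate=0.20, hr_rate=0.030, bb_rate=0.08, babip=0.300, ra9=3.50)
for stat, value in as_reliever(**starter).items():
    print(stat, round(value, 3))
```

Read in this direction, a starter with a 3.50 RA/9 would be expected to come in near 2.9 in relief, which is why the same 3.50 from a reliever is the weaker performance.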
3. The wild guess could be something that's true, but I wouldn't say it's a wild guess so much as it's a necessary guess, an educated guess, a guess to move the discussion forward. One example is BABIP, which we really don't know how to split up very well, or at least not in a way that we can explain well enough. If I say that regressing Kershaw's 2014 BABIP will be based on his BABIP in 2013 and 2012 and 2011, that looks really confusing, even if I tell you that it is simply to understand his 2014 performance on its own. It's really, really hard. So, I just say: split the difference and assume his responsibility for the BABIP is halfway between his observed performance and league average.
Another one we have to handle is relief pitchers and leverage. Again, to move the discussion forward, we credit the reliever not with the Leverage Index that he actually faced, but rather halfway between that and the (by definition) league average of 1. It's part of a concept called chaining, that if that reliever wasn't there, some other reliever would have taken his place. But much like Ozzie Smith's fielding is leveraged at SS (he's involved in more plays than in LF) or Rickey Henderson's hitting is leveraged at leadoff (and so he gets FAR more PA than the average hitter), we can't completely discount the talent associated with the leveragable opportunities.
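Both of the educated guesses in this point are the same operation: credit the player halfway between what was observed and the league baseline. A minimal sketch, with a hypothetical function name and hypothetical input values (only the league-average Leverage Index of 1 comes from the text):

```python
def split_the_difference(observed, league_baseline):
    """Credit the player with the midpoint of observed and league baseline."""
    return (observed + league_baseline) / 2

# BABIP: hypothetical pitcher at .270 in a hypothetical .300 league.
credited_babip = split_the_difference(0.270, 0.300)

# Leverage: hypothetical closer who faced an LI of 1.8; league average is 1 by definition.
credited_leverage = split_the_difference(1.8, 1.0)

print(round(credited_babip, 3), round(credited_leverage, 2))
```

Halfway is a deliberately blunt choice. The framework lets anyone swap in a different weight without changing anything else.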
***
So let me just say that for purposes of making sure the metric is not a black box, to make sure the metric is accessible, to make sure that anyone could calculate their own version of WAR, the framework is flexible enough to allow that to happen.
We do not want it complex, or (too) nuanced, or too sophisticated. We want it so that anyone can build a house, and WAR gives you that blueprint. The potential saberist out there is now empowered and is given a path to build an even better house. The foundation is there.
We can say that a house is not impressive, or we can say that a house is incredibly impressive. Either way, WAR has been able to cut through the idea that we need something complex to be able to explain something as complex as baseball.
Tuesday, September 23, 2014
I sent this to Bill James in response to a discussion he's having on his blog.
With regards to a "run" credited to a pitcher: it's actually shared with his fielders, but just that the pitcher is the primary owner. This would be similar to a "run" credited to a batter, when it's shared with the guys who follow him. But we evaluate batters, for the most part, on their components. In essence, we're not giving the batter full credit for the runs they actually scored, when evaluating their overall performance. To that end, if two pitchers had the same RA/9, but one did it with a .240 BABIP and the other did it with .340 BABIP, we might be more inclined to think that the guy with the .240 BABIP had more help from his fielders.
Sunday, September 21, 2014
There's a thread here asking about negative wRC+ and wRC.
The point that is missing is the context of runs. You can have 9 guys that are all league average, and this team will score 4.5 runs, which is 0.5 runs created per player. You can replace one guy with a pitcher-as-batter, and keep the other 8 guys the same. This team will score 4.0 runs. Since the 8 guys were each worth 0.5 runs created when the 9th player was someone like them, we reason that they are STILL worth 0.5 runs when a pitcher-as-batter is in their midst.
And if the 8 guys are worth 0.5 runs created, that makes them worth 4 runs. Since the team of them plus the pitcher-as-batter scored 4 runs, we deduce the pitcher-as-batter created 0 runs... even though he did actually get on base, and score runs, and drove in runners.
And if you had Ben Sheets batting, this team would score 3.9 or 3.8 runs, and so, he'd deliver "negative" runs.
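The reasoning in the last three paragraphs is a marginal calculation: hold the other eight hitters at their league-average credit and give the ninth whatever is left. A minimal sketch, using the run totals from the text (the function name is hypothetical, and 3.9 is one of the two Sheets figures mentioned):

```python
def ninth_batter_runs(team_runs, others=8, runs_per_average_batter=0.5):
    """Runs created by the ninth batter, holding the other eight at league average."""
    return team_runs - others * runs_per_average_batter

print(round(ninth_batter_runs(4.5), 2))  # nine league-average hitters
print(round(ninth_batter_runs(4.0), 2))  # typical pitcher-as-batter
print(round(ninth_batter_runs(3.9), 2))  # Ben Sheets
```

The negative value is a direct consequence of fixing the other eight at 0.5 runs each, not a claim that the hitter literally removed runs from the scoreboard.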
***
And this is the hard part: if you had 9 Ben Sheets batting, they would certainly score more than 0 runs, and certainly not negative runs.
So, you really have to establish the ASSUMPTIONS for the stat. One set of assumptions leads to Sheets being negative runs. Another leads to Sheets being positive.
And one set of assumptions will lower the RC of the OTHER batters depending on who is their teammate, while the other assumption tries to maintain independence.
Choose your poison, but don't tell the other person that he's wrong. You aren't arguing results, but rather the reasonableness of the assumptions.