Tangotiger Blog

#1 MGL 2015/09/24 (Thu) @ 14:54

I started to read that paper and write an ongoing critique and then my eyes glazed over. I don’t think I am really interested in their work. It could be excellent work, I don’t know. I don’t have the time to slog through the paper. I doubt it will get any traction.

#2 Tangotiger 2015/09/24 (Thu) @ 16:30

I remember one version of the positional adjustments in openWAR used the same premise that Pete Palmer used.

I’ve argued about that for 10-15 years at least as a bad premise. I don’t know if they changed that in their latest iteration.

#3 studes 2015/09/24 (Thu) @ 20:08

They didn’t change it in the final paper. In fact, if I’m reading it correctly, they apply a position adjustment to the hitting stats, but not to the fielding stats. This was one of several things that had me scratching my head.

#4 Tangotiger 2015/09/24 (Thu) @ 21:01

When it comes to leaderboards, I want the best-hitter leaderboard to show the best hitters, not the “best hitter for a SS”, and so on. So, we’d naturally see tons of corner OF and 1B there.

Similarly, this is what a leaderboard for fielders should look like:
http://tangotiger.net/scout/index5.php
A bunch of infielders on top, and a bunch of LF/1B on the bottom.

This is how we think, and this is how we see players.

The stuff from The Hidden Game? One of the few missteps by Pete, but a misstep that has survived far too long. A few of you out there know how to use it, but too many people out there make mistakes with it. It’s not worth trying to salvage when we have a better path.

#5 beanumber 2015/09/25 (Fri) @ 14:11

@MGL: I’m sorry you found this to be tough going. Are there particular points I could help clarify?

@studes/Tangotiger: I see your point. I guess we thought about WAR as inherently tied to position. That is, your value as a player is inherently wrapped up in the defensive position you play. This is important because there is a scarcity imposed by the rules of the game. So even though the offensive leaderboards may be dominated by OF and LF, it wouldn’t be correct to say that those players are the most valuable overall, right? You can’t just make a team composed entirely of LFs and expect to have their WARs add up to team wins, right?

We were also computing RAA values on a per plate appearance level, which makes it easier to think about defensive position when batting, because again every hitter is playing a specific defensive position when he is batting.

If you didn’t have this correction, then every pitcher would have a large negative RAA contribution from their batting, but that isn’t fair, right? This would mean that pitchers in the NL would have no chance to compete against pitchers in the AL, let alone position players.

Now, if you want to argue that we should only evaluate pitchers on their pitching ability, then that’s a different argument, and I would argue that the number you’re talking about isn’t WAR. The fact is, pitchers hit, field, and run the bases too, and all those contributions matter, even if they are small relative to their pitching contributions.

I agree that to look at RAA_batting and ignore the other components of WAR would be to miss something, because you are right that our RAA_batting figures do take into account the player’s defensive position. But again, that is to account for the scarcity issue that in our minds makes defensive position an important component of WAR. If you just want to evaluate batting independent of everything else then we already have far simpler tools to do that.

#6 Guy 2015/09/25 (Fri) @ 14:28

#5: The error is in treating the position adjustment as offensive value, when it is actually defensive value. The greater scarcity at SS—and thus, lower offensive production—is a function of the position’s defensive demands. Surely it is no harder to hit because you are going to play SS in the next half inning, right? Similarly, a good-hitting SS is not a “better hitter” than a LF with the same offensive production—he is an equal hitter who provides superior defensive value.

If you add the position adjustment to position-specific fielding, you will have a reasonable estimate of a player’s defensive value. OpenWAR is instead effectively adding the position adjustment to hitting and calling the total “offense.” That makes no sense…..

#7 Tangotiger 2015/09/25 (Fri) @ 15:16

Ben:

What do you do when the offensive production of the LF is lower than that of the CF?

In your process, you force the overall production of the LF (off+def) to be EQUAL to that of CF.

But, since we know that virtually no team (outside of the Pirates) has a better fielder in LF than CF, then we know that the average CF is a better fielder than the average LF.

So, you end up with a faulty premise that avg LF = avg CF.

#8 studes 2015/09/25 (Fri) @ 15:38

Hey Ben, thanks for dropping by. I have to admit that I don’t understand the resampling methodology. We’re having another conversation about this in another thread, but do you re-sample the player in question, or the replacement baseline? If you re-sample the player, are you then negating the situational impact you first gave him by using RE24?

#9 Tangotiger 2015/09/25 (Fri) @ 15:44

Whether you resample the player or resample the baseline, the end result, the difference between the two, will remain (basically) the same.

So, you can get around the idea of “I only care what the player did” by focusing on the baseline comparison point.

Having said that, studes’ question is legitimate.

#10 Tangotiger 2015/09/25 (Fri) @ 15:52

> So even though the offensive leaderboards may be dominated by OF and LF, it wouldn’t be correct to say that those players are the most valuable overall, right?

Nobody is suggesting that, overall, the best players are such players.

#11 Peter Jensen 2015/09/25 (Fri) @ 18:33

Ben - Thanks for joining in the discussion to help clarify some points about your metric. I am concerned about your use of the hit location data for the defensive portion of OpenWAR. Nowhere in your paper do you acknowledge that the MLB hit locations are where the ball is picked up rather than where balls that are hits land or pass through the infield. That makes the data vastly inferior to the data used by MGL for UZR. You also do not seem to adjust the data to compensate for the errors in the field diagrams used by MLB stringers to enter the data. Nor do you field adjust the data for each player to compensate for actual differences in the dimensions of the field. I also found it impossible to resolve the differences in Figure 1 from figure 2 both showing probability contour lines. Figure 1 shows a correct representation for outfielders with probability shown as an oval with the major axis perpendicular to the line from the plate to the fielder’s start position. Figure 2 shows an incorrect probability pattern of an oval with the major axis oriented along the line from the plate to the player. Why the change? Also the probability contours are very close to concentric in Figure 2, but clearly are not in Figure 1. It is also unclear from your paper if a fielder that doesn’t successfully make the play on a ball that is fielded by a teammate (whether for an out or not) is nevertheless docked his position’s probability of making that play.

I also don’t understand why you lump all types of base advancement together as you explain in the supplement. Clearly advancing on a hit is different than advancing on a passed ball and both are different than advancing on a stolen base. MLB Gameday data clearly differentiates these different plays.

Similarly, I don’t understand your statement in the supplement that MLB doesn’t give the hit ball type on every hit ball. They do, in the event description where it can be parsed out.

“Thus, openWAR is not an attempt to reverse-engineer any of the existing implementations of WAR. Rather, it is a new, fully open-source attempt to estimate player performance on the scale of wins above replacement.”

This statement from section 1.3 is admirable as is your intention to make OpenWAR reproducible and provide estimates of error. However, it is disappointing that you seem to use it as an excuse to ignore the reasoning of why previous implementations may have parts that actually make more sense than yours. It does not make much sense to have a reproducible methodology that no one wants to reproduce.

#12 Steve C 2015/09/26 (Sat) @ 16:36

I could be off here, but don’t catchers that have been converted to 1B or 3B hit better after dropping the rigors of catching? So there ought to be some small offensive component to positional adjustments.

Something else to consider is the transformation Hanley went through when he signed to be a LF. He showed up to spring training huge. He can’t do that and reap the offensive value if not for playing a less difficult position.

#13 Peter B 2015/09/26 (Sat) @ 17:47

Considering that Hanley did not succeed offensively or defensively after moving to the outfield, I’m not sure his perceived weight gain holds much, er, weight.

Catcher might be a different story, but I’m skeptical on all other positions.

#14 Tangotiger 2015/09/26 (Sat) @ 19:09

Yes, catchers and DH hit better when they play one of the 7 fielding positions.

#15 beanumber 2015/09/30 (Wed) @ 14:34

#6. I think that we are simply adding the positional adjustment in a different place. That is, our RAA_batting figures are *relative to the league average hitter at that defensive position*. Similarly, our RAA_fielding figures are *relative to the league average fielder at that defensive position*. No additional positional adjustments are necessary.

It seems like you want to do the same thing as we do for fielders, but then you want to evaluate hitters *relative to the league average hitter (at any position)*. Because of this choice, it is then necessary to add a positional adjustment to account for what defensive position they played.

So I think it will come out more or less the same in the end. It might even be as simple as noting that A + B + C = (A + C) + B. I might argue that our way is more elegant, in particular because the positional adjustments come directly from a regression model, rather than having to be estimated on their own.
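The regrouping described in this comment can be sketched with toy numbers. Everything below is hypothetical (the player, the run values, the variable names) and is not taken from openWAR. It only illustrates that moving the positional adjustment between the batting and fielding buckets leaves the total unchanged while changing what each bucket means.

```python
# Hypothetical run values for one shortstop, in runs above average (RAA).
bat_vs_league = 5.0      # A: batting relative to the league-average hitter
field_vs_position = 3.0  # B: fielding relative to the average fielder at SS
position_adj = 7.0       # C: adjustment for playing SS (made-up value)

# Grouping 1: adjustment reported with batting, (A + C) + B.
batting_bucket_1 = bat_vs_league + position_adj
fielding_bucket_1 = field_vs_position

# Grouping 2: adjustment reported with fielding, A + (B + C).
batting_bucket_2 = bat_vs_league
fielding_bucket_2 = field_vs_position + position_adj

total_1 = batting_bucket_1 + fielding_bucket_1
total_2 = batting_bucket_2 + fielding_bucket_2
```

The totals agree only when both groupings use the same adjustment C. What the thread disputes is how each bucket should be read and how C is derived, not the arithmetic.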

You are right in pointing out that it changes how you would interpret the RAA_batting component of WAR.

#16 beanumber 2015/09/30 (Wed) @ 15:20

#11.

> I am concerned about your use of the hit location data for the defensive portion of OpenWAR. Nowhere in your paper do you acknowledge that the MLB hit locations are where the ball is picked up rather than where balls that are hits land or pass through the infield. That makes the data vastly inferior to the data used by MGL for UZR.

True, but MGL pays a lot for that data. It’s not open data, so there is no way that we could have used it for openWAR. We did the best we could with what we could get.

> You also do not seem to adjust the data to compensate for the errors in the field diagrams used by MLB stringers to enter the data.

I’m not sure what you mean. If you have an idea of how we could do this, I’m happy to field a pull request. The relevant code is the “recenter()” function here:

https://github.com/beanumber/openWAR/blob/master/R/GameDay.R#L524

> Nor do you field adjust the data for each player to compensate for actual differences in the dimensions of the field.

Ditto the previous comment.

> I also found it impossible to resolve the differences in Figure 1 from figure 2 both showing probability contour lines. Figure 1 shows a correct representation for outfielders with probability shown as an oval with the major axis perpendicular to the line from the plate to the fielder’s start position. Figure 2 shows an incorrect probability pattern of an oval with the major axis oriented along the line from the plate to the player. Why the change? Also the probability contours are very close to concentric in Figure 2, but clearly are not in Figure 1.

Figure 1 shows the fit for non-parametric model based on a two-dimensional kernel smoother. This model is not constrained to have “circles” so it just kind of is what the data lets it be. The model in Figure 2 is a logistic regression model that is constrained to have a certain geometric shape. So that is why the circles are closer to being concentric.

As for the issue about the “circles” being tall rather than wide—I’m not sure what’s going on there. This could be related to the poor data quality that you mentioned previously, or perhaps the terms in the model are not the best choices. My hunch is that the fielding models could be improved rather easily with some more careful study along these lines, and that is exactly what we were hoping would happen, given the open nature of the project. The CF model is here:

https://github.com/beanumber/openWAR/blob/master/R/getModels.R#L197

and I would be happy to discuss any improvements that might make sense for openWAR v2.0.

> It is also unclear from your paper if a fielder that doesn’t successfully make the play on a ball that is fielded by a teammate (whether for an out or not) is nevertheless docked his position’s probability of making that play.

We talked about this, but I don’t think we did anything about it. The relevant place to put it would be here:

https://github.com/beanumber/openWAR/blob/master/R/makeWAR.R#L374

> I also don’t understand why you lump all types of base advancement together as you explain in the supplement. Clearly advancing on a hit is different than advancing on a passed ball and both are different than advancing on a stolen base. MLB Gameday data clearly differentiates these different plays.

Yes, but figuring out how to deal with them presented a non-trivial data wrangling challenge. We just punted, but again, if you have a better idea I’m happy to field a pull request. We even have an issue open on GitHub for this:

https://github.com/beanumber/openWAR/issues/12

> Similarly, I don’t understand your statement in the supplement that MLB doesn’t give the hit ball type on every hit ball. They do, in the event description where it can be parsed out.

So this would be great! Please send a pull request. It would have come in the “readData.gameday()” function here:

https://github.com/beanumber/openWAR/blob/master/R/GameDay.R#L97

> “Thus, openWAR is not an attempt to reverse-engineer any of the existing implementations of WAR. Rather, it is a new, fully open-source attempt to estimate player performance on the scale of wins above replacement.”

> This statement from section 1.3 is admirable as is your intention to make OpenWAR reproducible and provide estimates of error.

Thanks!

> However, it is disappointing that you seem to use it as an excuse to ignore the reasoning of why previous implementations may have parts that actually make more sense than yours.

We tried to do our best with this, but we wanted to end up with something that was as elegant and logical as possible. Once again, we don’t consider openWAR to be a finished product, but rather an initial attempt that will evolve into something we can hopefully reach a consensus about.

> It does not make much sense to have a reproducible methodology that no one wants to reproduce.

I guess only time will tell!

#17 Tangotiger 2015/09/30 (Wed) @ 15:21

Ben, the more important point is Tango/7.

#18 Peter B 2015/09/30 (Wed) @ 15:40

Ben, regarding this:

> You also do not seem to adjust the data to compensate for the errors in the field diagrams used by MLB stringers to enter the data.

> I’m not sure what you mean. If you have an idea of how we could do this, I’m happy to field a pull request. The relevant code is the “recenter()” function here:
>
> https://github.com/beanumber/openWAR/blob/master/R/GameDay.R#L524

It looks like “recenter()” uses universal adjustments for x, y and scale. Does Gameday scale all the fields uniformly now? They didn’t back in 2007-8, and I haven’t seen an update to this issue since: http://www.hardballtimes.com/using-gameday-to-build-a-fielding-metric-part-1/

#19 beanumber 2015/09/30 (Wed) @ 15:46

#8.

It depends on what question you are asking. If you are asking “Who is the better player?”, then I think the resampling methodology gives you a reasonable sense of how you might compare players with different point estimates for WAR. It assumes that each player’s true talent level is fixed, but provides an estimate of what a reasonable range of outcomes are for that player in similar hypothetical seasons.
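As a rough illustration of what "a reasonable range of outcomes in similar hypothetical seasons" could look like, here is a generic bootstrap sketch. It is not the openWAR code, and the paper's resampling scheme may differ in its details. The per-plate-appearance run values and the function names are invented for the example.

```python
import random

def resample_totals(per_pa_raa, n_seasons=1000, seed=0):
    """Bootstrap season totals: draw PAs with replacement, then sum.

    per_pa_raa is a list of hypothetical per-plate-appearance run values.
    """
    rng = random.Random(seed)
    n = len(per_pa_raa)
    totals = []
    for _ in range(n_seasons):
        season = rng.choices(per_pa_raa, k=n)  # sample with replacement
        totals.append(sum(season))
    return totals

def percentile_interval(totals, lower=0.025, upper=0.975):
    """Return the empirical lower and upper quantiles of the totals."""
    ordered = sorted(totals)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

# Made-up per-PA values: mostly small negatives (outs), a few big positives.
example_pas = [-0.25] * 70 + [0.45] * 20 + [1.40] * 10
totals = resample_totals(example_pas)
interval = percentile_interval(totals)
```

Because the resampled seasons are drawn from the player's observed plate appearances, the interval is centered on what he actually did, which is the assumption that later comments in the thread question.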

If you are asking, “who had the better season?” then the point estimates for their respective WARs already tell you that. However, MGL argues that there is ZERO uncertainty in these point estimates, which is not true. There is uncertainty associated with the models that you fit in order to arrive at those point estimates. What we argue is that that uncertainty is small, because the models are fit to hundreds of thousands of observations. But it is certainly greater than zero.

But there is a third source of uncertainty, which is the situational uncertainty. Since we used play-by-play data, we argue that this uncertainty is not present in openWAR. However, if you are using linear weights, then you do have this uncertainty. This is what Colin Wyers was working on with WARP when he got hired by the Astros:

http://www.baseballprospectus.com/article.php?articleid=21656

#20 beanumber 2015/09/30 (Wed) @ 15:55

#18. Interesting.

According to the 2012 MLBAM specs:

“The x and y coordinates are mapped against the MLBAM field graphics, each a 250 pixel square where the upper left corner (deep left field) is at 0.00,0.00, home plate is at approximately 125.00,199.00 and second base is at approximately 125.00,148.00.”
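A minimal sketch of a universal recentering built on those two reference points follows. It is not the openWAR `recenter()` function. It assumes home plate sits at x = 125 (directly below second base in the graphic) and a regulation 90-foot infield, and the function and constant names are made up for illustration.

```python
import math

# Assumed pixel coordinates, following the MLBAM description quoted above.
HOME_X, HOME_Y = 125.0, 199.0      # home plate (approximate)
SECOND_X, SECOND_Y = 125.0, 148.0  # second base (approximate)

# Home plate to second base is 90 * sqrt(2) feet on a regulation infield.
FEET_HOME_TO_SECOND = 90.0 * math.sqrt(2.0)
FEET_PER_PIXEL = FEET_HOME_TO_SECOND / (HOME_Y - SECOND_Y)

def recenter_sketch(x_px, y_px):
    """Map pixel coordinates to feet with home plate at the origin.

    The y axis is flipped so that positive y points toward center field.
    """
    x_ft = (x_px - HOME_X) * FEET_PER_PIXEL
    y_ft = (HOME_Y - y_px) * FEET_PER_PIXEL
    return x_ft, y_ft
```

A single origin and scale for every park is exactly the simplification questioned in #18. A per-park version would need separate constants for each field diagram, and those are not given here.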

#21 MGL 2015/09/30 (Wed) @ 15:56

Does “Open Source” mean that the code itself has to be in the public domain or just the exact methodology (and then people can code it any way they want)?

#22 beanumber 2015/09/30 (Wed) @ 16:01

#21. The code itself.

And actually, the term “reproducible research” means even more than that. From our paper:

“Buckheit and Donoho (1995) asserted that a scientific publication in a computing field represented only an advertisement for the scholarly work – not the work itself. Rather, “the actual scholarship is the complete software development environment and complete set of instructions which generated the figures” (Buckheit and Donoho, 1995).”

So under this definition, not only do you have to publish your code and your data, but also the programs you use have to be open source.

It may seem like a high bar, but the lack of reproducibility in science has become a huge problem.

#23 Guy 2015/09/30 (Wed) @ 17:19

> I think that we are simply adding the positional adjustment in a different place. That is, our RAA_batting figures are *relative to the league average hitter at that defensive position*. Similarly, our RAA_fielding figures are *relative to the league average fielder at that defensive position*. No additional positional adjustments are necessary.

My #6 and Tango’s #7 are fundamentally the same issue, just stated differently. As Ben says, openWAR effectively places the position adjustment in the offensive category, rather than considering it as part of defense. Ben, however “elegant” you may consider that to be, it really is misleading in some important ways, and will be a very serious obstacle in terms of use of your metric. There are two serious problems with your approach.

1) As Tango notes, your defensive metric suggests that each position contributes equal fielding value. That is simply not true, as can be demonstrated in many different ways. You can say you are measuring within-position fielding value, but that just begs the question of each position’s average value. It’s not unlike normalizing offensive value for each lineup position, and then reporting that #3 and #8 hitters had equal offensive production!

2) Your offensive values will be artificially compressed, understating the true range of values created by hitters. I understand you are reporting how well player X hit “for a shortstop,” or “for a left fielder,” but hitters do not in fact bat in either of those positions. They stand in the batters box, and their hits, walks and strikeouts all have the same run value regardless of where they may stand when playing defense. When we say a player “hits OK for a shortstop,” what we are really saying is “this player is a so-so hitter, but he provides a lot of additional value to the team by playing shortstop.” If that weren’t true, why in the world would we tolerate his mediocre bat?

Some would add a third critique, arguing that it’s a mistake to use differences in offensive production to calculate position adjustments (which is what you are effectively doing). I don’t agree with that—I think offensive production is a reasonable proxy, and the answers you get are basically the same as other methodologies. You just need to change your accounting methods, so the position adjustment is treated correctly as defensive value.

#24 Tangotiger 2015/09/30 (Wed) @ 17:58

> You can say you are measuring within-position fielding value, but that just begs the question of each position’s average value. It’s not unlike normalizing offensive value for each lineup position, and then reporting that #3 and #8 hitters had equal offensive production!

Well said!

Ben, can you accept the point that the avg LF is not necessarily equal to the avg CF in every single season in the existence of MLB? Accept that and we can actually have a productive discussion on this issue.

#25 Tangotiger 2015/09/30 (Wed) @ 19:07

I’m sorry for sounding harsh. I just need Tango/7 answered.

#26 studes 2015/09/30 (Wed) @ 20:22

#19. Ben, since you use RE24 in your calculations, I assume you are more interested in “who had the better season” instead of “who is the better player.” If so, I agree with that.

When you use RE24, you’re rewarding batters who hit better in high-leverage situations and vice versa. However, when you re-sample the data (and I don’t understand how the re-sampling works) are you in effect randomizing the situational hitting? By doing that, aren’t you essentially turning situational hitting into linear weights and re-centering the data around linear weights instead of the actual situational hitting?

#27 beanumber 2015/10/01 (Thu) @ 10:11

#25. See #23. #7 is equivalent to #6.

#23. (1) When you say “contributing value”, relative to what? Relative to the league average? Relative to replacement level? Or relative to some posited standard? Note that while the average RAA values do come out to zero, the average WAR values do not, since it depends on the distribution of talent among the replacement players.

(2) I see your point. I will think about it and discuss with my co-authors. All we would have to change would be one line of code:

https://github.com/beanumber/openWAR/blob/master/R/getModels.R#L152

My memory is that this is how we had it originally, but then all pitchers in the NL come out as terrible players because they have grossly negative offensive RAA values. So what do you do about that? Give them some compensatory value since they are pitchers? How do you decide how much?

Also, I disagree with your assertion that hitters do not bat in their defensive positions. They do! If they didn’t, then of course, our method would make no sense, but they do!

#28 Peter Jensen 2015/10/01 (Thu) @ 10:14

studes #26 - There are three aspects to a hitter’s production during a specific season. There is the hitter’s distribution of hits and outs. There is the distribution of baseout situations which the batter faces when he comes to bat. And thirdly, there is the interaction of the two: the specific distribution of the hitter’s hits and outs in the specific distribution of baseout situations. The value added approach looks at this third aspect, the specific distribution of hits and outs in the specific distribution of baseout situations. Linear weights looks at the specific distribution of a hitter’s hits and outs in a random distribution of baseout situations derived from the distribution of baseout situations in which each type of batting event occurred during the year for the entire league. Re-sampling, as used by this study, takes a batter’s specific distribution of hits and outs and randomly reassigns them to the specific baseout situations that he personally faced over the year. So re-sampling does not turn situational hitting all the way into linear weights, but it also does not leave it completely preserved as situational hitting like the value added approach does. As others have already written, re-sampling in this study does not seem to serve the authors’ stated purpose of creating an error range for a hitter’s performance, nor does it serve any other useful purpose that I can envision.
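The distinction can be sketched with a toy batter. The run values, situation labels, and function names below are invented and are not real RE24 data. The reassignment is modeled as a shuffle of the batter's own events across his own situations, which is a simplifying assumption, since the paper's scheme may sample differently.

```python
from itertools import permutations

# Hypothetical run values by (base-out situation, event); not real RE24 data.
RUN_VALUE = {
    ("empty_0out", "single"): 0.40, ("empty_0out", "out"): -0.25,
    ("loaded_2out", "single"): 1.60, ("loaded_2out", "out"): -0.75,
}

# One toy batter: the situations he faced and what he did in each, in order.
situations = ["empty_0out", "empty_0out", "loaded_2out", "loaded_2out"]
events = ["out", "out", "single", "single"]

def value_added(situations, events):
    """Sum run values with each event kept in the situation where it happened."""
    return sum(RUN_VALUE[(s, e)] for s, e in zip(situations, events))

def batter_specific_weights(situations, events):
    """Average each event over the batter's own situations, then sum."""
    n = len(situations)
    return sum(RUN_VALUE[(s, e)] for s in situations for e in events) / n

def mean_over_reassignments(situations, events):
    """Average total over every way of reassigning events to situations."""
    totals = [value_added(situations, perm) for perm in permutations(events)]
    return sum(totals) / len(totals)
```

In this toy case the batter's singles both came with the bases loaded, so the value added total exceeds the average over reassignments. That average matches the batter-specific weights, meaning the timing of his hits is averaged away while the mix of situations he faced is kept.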

#29 Tangotiger 2015/10/01 (Thu) @ 10:20

Ben:

Let me rephrase to be more specific:

In your system, will the total WAR handed out to all LF *always* be equal to that handed out to all CF?

#30 Guy 2015/10/01 (Thu) @ 11:01

> Also, I disagree with your assertion that hitters do not bat in their defensive positions. They do! If they didn’t, then of course, our method would make no sense, but they do!

In fact, they don’t bat in defensive positions. MLB rules tell managers to include defensive position on their lineup cards as a “courtesy” but it has zero significance. A player can move from LF to RF during the game, or even be asked to pitch an inning, but his place in the lineup remains unchanged. Indeed, there really are no official “positions” in baseball other than pitcher and catcher; the other 7 guys can stand anywhere they want, as long as they are in fair territory.

Obviously, by convention we know that players tend to stand in particular places (though this is changing!), and that the best fielders are usually told to stand in particular places (SS, C, 2B) where their fielding skill is best leveraged. But again, that’s similar to lineup position for hitters. We wouldn’t normalize hitting by lineup position (even though hitters actually are assigned a lineup position), because that would create the absurd illusion that #3 hitters are equal to #8 hitters. So why normalize by fielding position, when we know position assignment is similarly dependent on fielding skill?

More fundamentally, the difference you are measuring here is obviously a difference of value when players are in the field. Most teams have players in AAA who could outhit their SS and their C. Why don’t they bring these guys up and put them in the lineup? Because it would be a defensive disaster, of course. So I don’t really see the point of this debate.

#31 Peter Jensen 2015/10/01 (Thu) @ 11:32

Ben - Thanks for taking the time to answer my questions in your post #16. The study published in the Hardball Times referred to by Peter B. in Post 18 is mine and outlines why adjusting the MLB hit locations to specific fields is necessary. MLB may have the general locations for home plate and second base to which you refer in your post, but the actual application to specific field diagrams used by stringers varies from park to park. You may also find useful parts two and three of that Hardball Times series which describe in detail my creation of a fielding metric built around corrected MLB hit locations and using value added run values.

For evidence of the need for adjustment you need look no further than the field diagrams used in your paper. Do the foul lines meet at home plate at a 90 degree angle? No they don’t and neither do they in many of the field diagrams used by the stringers, but in some field diagrams they do. Also the proportions of the outfield to the infield are incorrect and differ in each park diagram so separate scalars need to be calculated for the infield and outfield portions of each park. A third minor concern is the pitcher’s rubber is shown in the wrong position on all diagrams affecting the true “y” ball positions of all balls near the middle of the infield.

“As for the issue about the “circles” being tall rather than wide—I’m not sure what’s going on there. This could be related to the poor data quality that you mentioned previously, or perhaps the terms in the model are not the best choices.”

I would not blame poor data quality since your raw data plots show contours greater in width than depth. Any outfielder can tell you that you get a better read on balls hit directly at you so your range on balls hit to either side is going to be greater than those hit behind or in front of you.

“Yes, but figuring out how to deal with them presented a non-trivial data wrangling challenge. We just punted, but again, if you have a better idea I’m happy to field a pull request.”

Separating out the various non batting events is non trivial but it is doable. Unfortunately, I don’t program in R, but I would be happy to share my parsing code for non-batter events (SB,CS,WP,PB etc.) written in VBA with you. Shane Jensen has my email address.

The problem is more serious than just a coding problem. Advancement on hit balls is very dependent on the type and position of the hit ball. You currently are giving all the extra advancement to the runner. But a closer look at runner advancement would give much of the extra advancement runs to the batter. Here is where MLB’s “where the ball is picked up” hit locations may actually be more useful than BIS’s “where the ball lands” hit locations.

> Similarly, I don’t understand your statement in the supplement that MLB doesn’t give the hit ball type on every hit ball. They do, in the event description where it can be parsed out.

> So this would be great! Please send a pull request.

Same problem with me not being an R coder. However the process is the same as the one you used in your code to parse the DESC file for non batting events. You just need to use the phrases used for various hit ball events. Again, an example of how I do this in VBA would be in my parsing code.

#32 studes 2015/10/01 (Thu) @ 17:03

> So re-sampling does not turn situational hitting all the way into linear weights, but it also does not leave it completely preserved as situational hitting like the value added approach does.

Thanks, Peter. I guess I don’t understand why re-sampling doesn’t essentially turn RE24 into linear weights. I would think the only variation would be due to the batter facing different situations than typical, so you’d be creating linear weights based on batter-specific situations faced instead of league situations faced.

Which is kind of interesting. But is that their true intent? Am I misinterpreting?

#33 Peter Jensen 2015/10/01 (Thu) @ 18:43

> so you’d be creating linear weights based on batter-specific situations faced instead of league situations faced.

Yes, that is correct. No, I don’t think that was their intent.

#34 atpkinein 2015/10/01 (Thu) @ 19:44

is there a reason the bootstrap is used and not a jackknife? i dont necessarily understand the exact theoretical implications of each, but resampling with replacement seems like it would be possible to come up with seasons that couldnt possibly exist. for instance, on the low end because the player would get benched or on the high end because lucky homeruns wouldnt get repeated.

#35 atpkinein 2015/10/01 (Thu) @ 19:52

maybe i should clarify my last point. i dont think that a player with 35 HRs would be as likely to have a symmetric distribution around those 35 HRs as a player with 15 HRs, because the player with 35 HRs likely had better luck. does the resampling with replacement create seasons centered and symmetric on, for instance, the HR numbers?

#36 Guy 2015/10/02 (Fri) @ 10:03

I think Peter’s summary of the resampling methodology is spot-on. As Ben admits in comment #19, it treats a player’s actual performance as though it were a measure of true talent, and then tells us the range of possible seasonal outcomes for such a player. But since his performance is not an accurate measure of true talent, it’s not clear how or why this would ever be useful. In any case, it sounds like everyone agrees (including Ben) that resampling does *not* give us an estimate of measurement error, and thus does not tell us how likely it is that one player provided more actual value than another.

*

After changing the accounting for the position adjustment, I’d say the next most important fix for openWAR is probably the problem of controlling for batter and pitcher handedness. This likely skews their results a fair amount. Just doing some back-of-envelope estimates, it looks to me like this would roughly turn an average LHH into a 93 RC+ hitter.

This is a mistake because handedness is intrinsic to a player, not an external condition for which we want to control. As always, we want to know what would happen under the same conditions with a statistically average player at the plate. In MLB, that means a virtual player who hits LH about 42% of the time and RH about 58% of the time. To the extent that a LHH enjoys the platoon advantage more than a RHH, that’s part of that player’s skill set, not a favorable “condition” which any other player in the same situation would inherit and benefit from.

Bryce Harper has had the platoon edge 71% of the time this season, far more than a RHH would. Is that an advantage for Harper? Sure it is. But so is being 6’-3” and 220 lbs. Should we control for hitters’ height and weight too? (All bow before Jose Altuve, lord and master of MLB.) What about their speed? Eye-hand coordination? Obviously, the answer is no. And handedness is just as much a part of a player’s skill set as are those qualities.
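
To make the baseline point concrete, here is a rough sketch with made-up numbers. The wOBA-like values by batter hand and the assumption that the 42/58 split applies to plate appearances against right-handed pitchers are illustrative only, not measured figures and not openWAR’s actual model.

```python
# Assumed share of plate appearances taken by left-handed batters
LH_SHARE = 0.42

# Hypothetical league-average production against a right-handed pitcher,
# split by batter hand (made-up wOBA-like values)
AVG_VS_RHP = {"L": 0.330, "R": 0.310}

def controlled_baseline(batter_hand):
    """Baseline when handedness is treated as a condition to control for:
    the batter is compared only to batters of the same hand."""
    return AVG_VS_RHP[batter_hand]

def blended_baseline():
    """Baseline for the 'virtual average player' who bats left 42% of
    the time and right 58% of the time."""
    return LH_SHARE * AVG_VS_RHP["L"] + (1 - LH_SHARE) * AVG_VS_RHP["R"]

# A left-handed batter who hits exactly like the average LHB against RHP
lhb_actual = 0.330

credit_controlled = lhb_actual - controlled_baseline("L")
credit_blended = lhb_actual - blended_baseline()

print("credit with handedness controlled:", round(credit_controlled, 4))
print("credit against blended baseline:  ", round(credit_blended, 4))
```

With handedness controlled, this batter gets zero credit for the platoon edge; against the blended baseline he is credited for it, which is the treatment argued for above.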

#37 Tangotiger 2015/10/02 (Fri) @ 10:35

In my question to Ben:

In your system, will the total WAR handed out to all LF *always* be equal to that handed out to all CF?

If my understanding of the system is correct, then the answer is “yes”.

This is exactly what Pete Palmer did in The Hidden Game, something that I’ve considered and rejected.

As I said, if a group of LF hit worse than a group of CF, you can set them all to “0” for an “internal” comparison, so that an average fielding LF = 0 and an average fielding CF = 0 and an average hitting LF = 0 and an average hitting CF = 0.

But when you compare LF to CF, you need yet another adjustment.

You simply cannot have the overall LF = overall CF. This doesn’t reflect reality in any way.

Therefore, the sum of all LF WAR cannot by definition be forced to equal that of CF.
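
A toy sketch of the accounting difference, using made-up players and made-up adjustment values (nothing here comes from openWAR or from any real season; `palmer_style` and `defense_side` are hypothetical names):

```python
# (name, position, batting runs vs. league-average hitter,
#  fielding runs vs. average fielder at the same position) -- all made up
players = [
    ("LF_a", "LF", 12.0, 3.0),
    ("LF_b", "LF", 4.0, -3.0),
    ("CF_a", "CF", 6.0, 4.0),
    ("CF_b", "CF", -2.0, -4.0),
]

# Assumed positional adjustments in runs, applied on the defense side
# (illustrative values only)
POSITION_ADJ = {"LF": -7.5, "CF": 2.5}

def position_mean_batting(pos):
    """Average batting runs of everyone at a position."""
    vals = [bat for _, p, bat, _ in players if p == pos]
    return sum(vals) / len(vals)

def palmer_style(player):
    """Batting measured against the position's own average hitter."""
    _, pos, bat, field = player
    return (bat - position_mean_batting(pos)) + field

def defense_side(player):
    """Batting measured against the league, with the positional
    adjustment added to the fielding side."""
    _, pos, bat, field = player
    return bat + (field + POSITION_ADJ[pos])

def total_by_position(value_fn):
    totals = {}
    for player in players:
        totals[player[1]] = totals.get(player[1], 0.0) + value_fn(player)
    return totals

palmer_totals = total_by_position(palmer_style)
defense_totals = total_by_position(defense_side)
print("position-relative batting:", palmer_totals)
print("defense-side adjustment:  ", defense_totals)
```

In the first accounting every position sums to zero by construction, so total LF value is forced to equal total CF value. In the second, the position totals are free to differ.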

#38 Guy 2015/10/06 (Tue) @ 10:55

Here’s a good example of the problem we were discussing earlier, in a tweet from one of the openWAR creators:

SITW @StatsInTheWild
Top 5 fielders: Pillar, Cespedes, Guyer, Hardy, J., Gordon, A. https://gjm112.shinyapps.io/openWAR/#openWAR

When your metric tells you that 3 of the top 5 fielders in MLB are corner outfielders, you’ve got a problem.

Also hoping that Ben will find time to respond to the other issues raised here (understanding that he may need to spend some time actually teaching).

#39 Tangotiger 2015/10/06 (Tue) @ 11:31

That’s the problem when you just put out scales like that. You really need to get ahead of that, and that’s why you put the positional adjustment on the defense side, not offense side.

Here’s how Fangraphs orders its players on Defense:
http://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=300&type=8&season=2015&month=0&season1=2015&ind=0&team=0&rost=0&age=0&filter=&players=0&sort=20,d

CF Kiermaier
SS Andrelton
SS Hech
SS Crawford
SS/2B Addison Russell
SS Ahmed

That makes a lot more sense.

What’s disappointing is that we’ve talked about this for years. We’ve considered the alternatives, we’ve hashed out the issues, we’ve gone through all of this. This is, basically, settled law.

If we were given NEW arguments, new evidence, a new way of thinking, then I’d have something new to think about.

Instead, I’m just rehashing discussions from 10 years ago. We can’t keep looking back at those who simply refuse to follow the better path.

#40 J-Doug 2015/10/10 (Sat) @ 16:48

So glad for this thread. Wish we could still get email alerts.

One of the great things about open source is that one can fork openWAR and resolve many of the controversial choices, such as RE24 resampling and applying positional adjustments to offensive performance, neither of which is defensible based on the last fifteen-to-twenty years of research into player value. Those should be simple modifications.

#41 Guy 2015/10/11 (Sun) @ 10:43

I learned via a Twitter exchange with Greg Matthews that openWAR generates a new replacement level for each position in each season (based on performance of non-starters at that position). Maybe everyone else already knew that, but I didn’t. As you can imagine, the result is very unstable (and must often yield crazy results when used mid-season). So, for example, Harper loses about one full win compared to Trout because the average-vs-bench player gap is apparently much larger in CF than RF this year. Another issue to fix.

#42 Tangotiger 2015/10/11 (Sun) @ 10:58

Can someone post the replacement level for each position for each year, since, I dunno, 1998?


Thursday, September 24, 2015

OpenWAR
