Showing posts with label Fielding. Show all posts

Tuesday, March 01, 2011

Comments on Bill James Gold Mine 2010, pt. 2

2. Defensive Win Shares and Loss Shares

James has revamped Win Shares over the last couple of years to include Loss Shares. I think this is a very good thing, although I look forward to when (if?) the entire methodology is published. Without the full explanation, it's dangerous to comment about isolated details, but James' essay on "Explaining Defensive Win Shares to a Dead Sportswriter" is tough to ignore. My Twitter-friendly take on it: He's going to have trouble explaining it to a lot of people, not just dead sportswriters.

Again, it's impossible to evaluate the method while knowing so little about it, but James makes this extraordinary statement:

Making outs increases the team's responsibility to play defense. When you make more outs, that increases the team's responsibility to play defense. Therefore, if two players are the same in the field but one of them makes more outs, the one who makes fewer outs has to come out ahead when you compare the player's defense contribution to his defensive responsibility.

Lest you think that was just a slip, he doubles down:

While we are in the habit of thinking of offense and defense in baseball as un-connected, they are in fact not un-connected. There is a very important connect between them, which is the rule that for every out you make on offense, you must record an out on defense.

Bill James is obviously a very intelligent man, and you a very intelligent reader, so I am hesitant to respond to this--the response should write itself. Limiting myself to a paragraph or less, I suppose it is technically true that each out on offense is matched by a defensive out, barring walkoffs and rainouts and the like. But there is no causation between the two. The rules of the game require three outs per inning and nine innings per team. Each team makes 27 outs regardless of the rate at which they use them (think OBA) or any other factor.

An individual who makes outs at a higher rate than some comparison player does not increase the number of outs that his defense must record. The defense must record 27 outs regardless of what an individual does at bat. What does happen is that by consuming excess outs, the individual batter leaves fewer outs to be consumed by the other eight members of his lineup, and fails to generate additional plate appearances for them.

James later seems to suggest that the revamped DWS-LS system assigns the same responsibility to field to each position, regardless of where it stands on the defensive spectrum. He then states his objection to offensive-based positional adjustments, and so it seems as if the stuff about making outs might be a backdoor way of applying positional adjustments. It's unclear, though, and still doesn't follow logically.

James’ discussion of positional adjustments also seems to gloss over the use of defense-based positional adjustments or the fact that most of us who still use offensive positional adjustments do so because we believe they provide a ballpark estimate of the defensive differences between the positions. When I use an offensive positional adjustment, I'm not saying that I think a shortstop with a 5 RG is a better hitter than a first baseman with a 5 RG. What I am saying is that the difference in aggregate offensive performance between shortstops and first basemen (when considered carefully and over a long period of time) approximates the inherent difference in defensive value.

You are certainly free to reject that argument (and many sabermetricians that I respect very much do just that), but please recognize that the sabermetrician using an OPADJ is likely not making the claim that a player's offensive contribution is altered by his fielding position.

More important than my own positional adjustment folly is an apparent failure by James to recognize that the positional adjustments that are now used most prominently in the community (generally Tango's, which have made their mark on the PADJs used in WAR figures from both Chone and Fangraphs) are based on estimates of the defensive difference between positions, sometimes informed by offensive averages. Furthermore, the sources do not lump the positional adjustment into the offensive ranking--they break everything (offense, fielding, baserunning, position, etc.) into smaller components, which are then summed to produce RAA, WAR, or some other total value metric.

Again, it is possible that I have misunderstood James' point, or that he has done a poor job of expressing himself, and that DWS is completely logical. However, I think it is going to take a much more thorough explanation of the system to give people that read the Gold Mine piece a lot of confidence in his methodology.

3. Strikeout rate

One of the most thought-provoking essays is "Whiff 7", which discusses the phenomenon of strikeout rates continuing to reach all-time highs. James argues that there is no end to this in sight under current conditions, as teams have an incentive to find power pitchers but no comparable incentive to find batters that avoid striking out. He also argues that the standard deviation of power (he doesn't use that terminology) has decreased over time, and so league homer rates have gone up while the top individual performers hit about as many homers as they did in previous eras.

James then offers some suggestions of rule changes that would slow or reverse the trend. It's an interesting piece, and it didn't prod me to respond to it directly, but rather to make a tangential and mostly unrelated point about how we measure strikeout rates--a wholly unoriginal and stale one at that.

I have for a long time advocated using K/PA rather than K/IP as the measure of pitcher strikeout proficiency (I’m not claiming this is unique, as others have carried that banner with much more vigor and coherent arguments than I have offered). Through no effort of mine, the use of K/PA has increased in the sabermetric community, with sites like The Hardball Times and Fangraphs prominently utilizing K/PA.

As an example of how the different denominators can change perception, consider the point that most long-term successful pitchers have at least average strikeout rates. This is a point that the average fan still mystifyingly misses a great deal of the time. Take Greg Maddux for example. Maddux is apparently seen by some as a non-strikeout pitcher. Here is a table with his K/9 versus the league average, with KAA being strikeouts above average per inning:



For his career, Maddux struck out 6.1 per nine, while the league average was 6.4. He struck out 206 fewer batters than an average pitcher would have in the same number of innings. Without seeing the same figure for a lot of pitchers, it's hard to contextualize that, admittedly.

Suppose that instead you look at Maddux through K/PA:



Now Maddux' strikeout rate is essentially average--he struck out 17% of opposing batters, the same as the league average. Maddux' career rate is lower (it's actually 16.5% to 16.6%), but just barely so, and by this metric he only recorded 22 fewer strikeouts than average.

In Maddux' peak years (I think 1992-98 stand out), he was above-average even by K/9--+90 KAA, while he was an even more robust +196 when K/PA is the standard.

This is not intended to recast Maddux as a strikeout fiend--certainly he was not, even at his best. Still, Maddux' strikeout rate is more impressive when viewed in light of the number of opposing batters he actually faced rather than in terms of innings pitched, which really is just a measure of the percentage of outs a pitcher gets via the K (this is obscured by displaying strikeouts per 9 innings rather than strikeouts per 27 outs).

In addition to K/9, there are several other per-inning pitching ratios in common usage--H/9, W/9, HR/9, WHIP. What all of those have in common is that they are ratios of bad things (offensive successes) to good things (outs recorded). K/9 is a ratio of really good things (outs recorded by strikeout) to another set of plain old good things that includes the really good things (total outs recorded). As such, it's best viewed as a measure of a pitcher's reliance on strikeouts.
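To make the denominator effect concrete, here is a minimal sketch of strikeouts above average (KAA) computed both ways. The function names and the pitcher's line are hypothetical, chosen only for illustration; the league rates (6.4 K/9 and 16.6% K/PA) are the career league averages quoted above for Maddux. The hypothetical pitcher faces only four batters per inning, which is exactly the kind of efficiency that K/9 penalizes.

```python
def kaa_per_innings(k, ip, lg_k9):
    """Strikeouts above average when innings are the opportunity unit."""
    return k - lg_k9 * ip / 9


def kaa_per_pa(k, pa, lg_k_pa):
    """Strikeouts above average when batters faced are the opportunity unit."""
    return k - lg_k_pa * pa


# Hypothetical pitcher line (not real data): 150 K in 225 IP against 900 batters.
k, ip, pa = 150, 225.0, 900
lg_k9, lg_k_pa = 6.4, 0.166  # league rates quoted in the text

print("KAA by K/9: ", round(kaa_per_innings(k, ip, lg_k9), 1))
print("KAA by K/PA:", round(kaa_per_pa(k, pa, lg_k_pa), 1))
```

The same pitcher comes out roughly ten strikeouts below average per inning but almost exactly average per batter faced, simply because he needed fewer batters to get through each inning.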

Monday, December 06, 2010

Statistical Meanderings, 2010

This is about as close as I get to writing a Jayson Stark-style piece throughout the course of the year. Since I hate that format, hopefully there will be something of greater interest than sheer trivia here. Most of the statistics mentioned come from my End of Season Stats and are explained in that post:

* Last year the AL/NL scoring gap in terms of R/G was the largest it had been since 1998; this year, at .12 (4.45 to 4.33) it was the narrowest it had been since 1990 (4.30 to 4.20). The overall scoring average of 4.38 was the lowest for the majors since 1992 (4.12); 1992 was also the last time that either of the leagues individually had as low a scoring rate.

The offensive difference between the AL and NL was largely due to a difference in league batting averages. As a group, the AL and NL had nearly identical walk rates (.095 and .096 walks/at bat) and isolated power (.147 to .144), but the AL BA was five points higher (.260 to .255). The NL slugged just .399, the first sub-.400 league figure since the 1993 NL.

* I list two different winning percentage estimators in my team report. EW% is based on actual runs scored and allowed, while PW% is based on runs created and runs created allowed. Teams whose actual W% were very similar to both of the estimates included (W%, EW%, PW%): Atlanta (.562, .567, .564), Cincinnati (.562, .567, .564), Florida (.494, .501, .495), and Texas (.556, .564, .557).

An interesting group of teams is those whose PW% tracked their actual W% much better than EW% did. These are teams that may be over/underrated for 2011 by those that put a great deal of stock in Pythagorean record as an indicator. Such people are largely strawmen, but regardless, some of the teams in this group are Baltimore (.407, .386, .411), the Cubs (.463, .447, .468), Pittsburgh (.352, .324, .351), and St. Louis (.531, .564, .542). I'll leave it to the reader to find the more conventional Pythagorean watch teams, those whose EW% and PW% are in general agreement and diverge from actual W%.
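For readers who want to see the mechanics, here is a rough sketch of a run-based winning percentage estimator. It uses the classic fixed-exponent Pythagorean form, which is an assumption on my part for illustration and not necessarily the exact formula behind EW% or PW% in the reports; the function name and the team totals are hypothetical. The only difference between the two estimates is the input: actual runs scored and allowed for one, runs created and runs created allowed for the other.

```python
def pythagorean_wpct(runs_for, runs_against, exponent=2.0):
    """Classic Pythagorean estimate of W% (fixed exponent is an assumption)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# Hypothetical team totals, not taken from the 2010 reports.
actual_runs = (750, 700)    # runs scored, runs allowed
created_runs = (735, 710)   # runs created, runs created allowed

print("EW%-style estimate:", round(pythagorean_wpct(*actual_runs), 3))
print("PW%-style estimate:", round(pythagorean_wpct(*created_runs), 3))
```

A team whose two estimates disagree has scored or allowed a different number of runs than its component statistics would suggest, which is the situation described for Baltimore, the Cubs, Pittsburgh, and St. Louis.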

* Last year, SF games were the lowest scoring in MLB at 7.83 RPG, which was the lowest figure since the 2003 Dodgers. In 2010, 7.83 RPG would have ranked just third-lowest, as Seattle (7.48) and San Diego (7.69) each exhibited a lower scoring context. The Mariners' 7.48 still couldn't touch the 2003 Dodgers at 6.98, but it was the lowest RPG for an AL team since the 1981 Yankees (7.14). Of course, Seattle's RPG was lowered by the .97 park factor, but even after park-adjusting the figure to 7.71, it was still the lowest AL scoring context since the 1989 Angels (7.70, not park-adjusted).

Commenting on Seattle's offensive ineptitude can be considered hitting after the whistle at this point, but allow me to indulge. Their 3.17 runs/game was the lowest since the 1981 Blue Jays averaged 3.10. Seattle's 2.95 R/G at home was the lowest since 1972, when both the Padres (2.71) and Angels (2.76) scored less. While Safeco had a large home/road split in 2010, the five-year PF is .97--a pitcher's park, yes, but not an extreme one.

They were more respectable on the road, averaging 3.38 R/G, a mark which the Pirates (3.14) managed to keep from being even the lowest in 2010 (although outside of the Pirates' showing, it was the fewest since the 1994 Pirates scored 3.20 away from home). I did not run the numbers relative to league average, but it probably wouldn't do too much to help Seattle; while the AL's average of 4.45 R/G is low relative to recent seasons, it's still a perfectly normal league scoring level in historical context.

* Unfortunately, we never got to see the playoff matchup between New York and Tampa Bay. While the concerns about the Rays running wild in such a series were likely overstated, it is true that the Yankees struggled at controlling the stolen base game. The 85.2% SBA (stolen base success rate) against them was easily the highest in MLB, with the Red Sox (80.1%) next. The Cardinals led baseball with only 58.9% of opposition attempts successful.

* It will come as no surprise that Pittsburgh had a terrible defense in 2010. The degree of their anti-dominance may be a little jarring though: last in BA (by ten points, .283), last in OBA (by eight points, .347), last in SLG (by fourteen points, .451). The only team offense that exceeded any of the Pirates' allowed figures was Toronto, which slugged .456.

The Pirates were also last in innings/start (by .11 innings, 5.38), starters' eRA (by a whopping .62 runs, 5.86), and DER (.659). Their bullpen ranked only fifth-worst in eRA (4.82), and their modified fielding average was third-worst (.962). All of this predictably resulted in allowing 5.4 R/G (more than any team managed to score).

* In 2009, playoff teams averaged +72 runs above average on offense and just +44 on defense. In 2010, the teams exhibited more balance, as you can see:



I'd usually snark about defense winning championships at this point.

* You're probably aware that the long-term trend in MLB, pretty much dating all the way back to 1871, has been for fielding averages to increase. For the most part this holds, but there was an odd blip in 2009. The all-time high ML mFA is .9704, set in 2007. In 2008, the mFA rounded to four decimal places was the same but actually was a bit lower. In 2009, however, mFA dropped all the way to .9669, the lowest since 2001. The decline of .36% was the largest in the post-war era.

In 2010, mFA rebounded to .9693, an increase of .25%. That is still the lowest average (excluding 2009) since 2004. I am not claiming that fielding average is an important metric, or that there is a meaningful explanation for the fluctuations, but in looking at league fielding totals it caught my eye.
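The percentage changes quoted above are relative changes from one season's mFA to the next. A quick check, as a sketch: it assumes the 2008 figure can be taken as .9704 (the text says it rounds to that), and the helper name `pct_change` is just a label for this example.

```python
def pct_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


mfa_2008 = 0.9704  # assumed: rounds to the same four decimals as 2007
mfa_2009 = 0.9669
mfa_2010 = 0.9693

print("2008 to 2009:", round(pct_change(mfa_2008, mfa_2009), 2), "%")
print("2009 to 2010:", round(pct_change(mfa_2009, mfa_2010), 2), "%")
```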



* Major league teams had a .559 W% at home in 2010, the highest mark since 1978 (.573). 93% of major league teams (28/30) had better records at home than on the road, which sounds like a lot, but while high it isn't extraordinary. (San Diego had the same record home and away). The average for 1961-2010 is 83%, but as recently as 2007-08, 29/30 teams have had better home records. In both 1978 and 1989 all teams had better records at home.

Much was made about the Pirates' .210 road W% (17-64), the worst since the identical showing by the 1963 Mets. Also notable was Detroit's home/road split of .642/.358, which was of equal magnitude to that of the Pirates and was the largest by a .500 or better team since the 1996 Rockies (.679/.346). The Rockies and Braves chipped in to give 2010 four of the 23 highest differentials since 1961.

* Cleveland fans seem to be pretty happy with new closer Chris Perez, and given his performance (7th in the AL among relievers with 20 RAR) that is understandable, but it would be a mistake to assume that he's proven himself as the long-term answer at the end of the game. He allowed a low .234 %H, so his 3.72 dRA is well above his 2.10 RRA or 2.86 eRA. The batted ball metrics are even less impressed--4.45 cRA, 4.74 sRA.

* Jonathan Papelbon pitched 67 innings with a 4.24 RRA, which results in 5 RAR; Scott Atchison pitched 60 innings with a 4.16 RRA, for 5 RAR as well. Of course, the similarities end there, as Papelbon's peripherals were much better than Atchison's, but it's never a good thing when your pricey closer is no more effective than the seventh man out of the pen.

* Bobby Jenks has been non-tendered, and obviously I have no insight to offer on his health or his PitchF/x data or anything like that. What I can tell you is that his peripherals were pretty good in 2010: 2.89 dRA (his %H was very high at .365), 3.34 cRA, 2.91 sRA. If he's healthy, he might be a good buy.

* Chad Qualls gave up a massive .397 %H; he actually looks serviceable in dRA (4.26) and the batted ball metrics (4.33 and 3.89).

* Trevor Hoffman was terrible in his swan song, but at least he was consistent across the board in RA estimators: 5.95 RRA, 5.82 eRA, 5.89 dRA, 5.45 cRA, 6.10 sRA. Ryan Madson was consistent in a good way: 2.47, 2.80, 2.88, 2.83, 2.90.

* Last year I pointed out that Francisco Rodriguez didn't pitch very well in the first year of his big contract, so I feel obligated to point out that he was pretty good in his 57 innings in 2010: 2.34 RRA, 3.16 eRA, 3.10 dRA, tied for fifteenth among NL relievers with 16 RAR. Of course, his off-the-field performance took a corresponding nose dive...

* Who is Wilton Lopez? I probably saw fewer Houston games than those of any other team this season, so I never saw him pitch. The 27 year-old Nicaraguan rookie ranked among the ten most valuable relievers in the NL (not considering leverage) with a 2.02 RRA in 67 innings and solid, consistent peripherals (3.41 eRA, 3.24 dRA, 3.24 cRA, 3.22 sRA). He was shelled last season in 19 innings, his strikeout rate (6.7) leaves a lot to be desired, and for the season as a whole he wasn't trusted by Brad Mills, with a below-average Leverage Index. He did inherit .49 runners/appearance, but sometimes a high IR/G goes hand-in-hand with a mop-up role. As if that wasn't enough cold water, his minor league numbers don't look like anything special from a quick glance. It was a nice season in any event.

* I don't list Inherited Runs and Bequeathed Runs Saved on the reports themselves, but if you download the spreadsheets, they are included. The AL leaders in IRSV were Matt Thornton (5.7), Randy Choate (5.1), and Joaquin Benoit (5.0). Dan Wheeler (4.0) also had a particularly good showing from Tampa's pen. Eddie Bonine trailed the AL at -8.4.

Among AL relievers, Dusty Hughes got the least support from subsequent relievers, as 15/25 of his bequeathed runners scored (7.2 BRSV). Lance Cormier benefited the most, as only 1/30 bequeathed runners came around to score (-8.1 BRSV).

In the NL, Wilton Lopez (9.2), Javier Lopez (9.2), and Santiago Casilla (8.7) were the leaders in stranding runners; Lopez allowed just 1/33 to score. A pair of Dodgers were the trailers: George Sherrill (-5.7) and Ramon Troncoso (-10.5 on 22/37).

Apparently Ronald Belisario was one of their victims, as he got the least support from subsequent relievers (4.6 BRSV on 12/24). Joe Thatcher was the most fortunate, as his Padre penmates prevented all 35 runners he bequeathed from scoring (-10.5).

* The flip side to the last bullet point is bequeathed runs saved for starting pitchers. In the AL, Rich Harden got the most support (2/20, -4.2) followed by Jake Westbrook and Jon Lester (both -3.7). Jason Vargas got the least help (12/20, 6.0).

In the NL, Jonathan Sanchez (3/25, -4.9) and a pair of Braves (Derek Lowe, -4.3 and Tommy Hanson, -3.0) were the best-supported by their pens. Scott Olsen (10/16, 5.0), Chris Narveson (4.8), and Kevin Correia (4.6 despite pitching for San Diego with their excellent bullpen) were the least supported.

* Presented without comment: Max Scherzer 37 RAR, Edwin Jackson 25. Clayton Richard 31 RAR, Jake Peavy 17.

* I usually only include pitchers with 15 starts on the starters report, but I had to throw Stephen Strasburg into the mix. Among NL pitchers with 15 starts (plus himself), he ranked twelfth in RA, fourth in eRA, and first in dRA, cRA, and sRA. Obviously that was in just 68 innings, but it was fun while it lasted.

* It's a shame there is no LVP award, as it would be a runaway in both leagues--Ryan Rowland-Smith, -26 RAR in the AL and Charlie Morton -31 in the NL. Rowland-Smith was 1-10 with a 7.83 RRA in 109 innings, and none of his peripheral RAs were much better (6.37 sRA was his best). He was last in the AL in all of the run averages, plus QS% (20).

Morton's season was more respectable, as he pitched 80 innings and gave up a .367 %H, which means his dRA (5.50) and batted ball RAs (5.10 and 4.56) weren't terrible. Teammate Zach Duke was next on the RAR trailer list (-20) and matched Morton with -42 RAA thanks to being allowed to pitch 159 innings.

* Cliff Lee and Jon Lester are very close in most of the categories I list, in addition to both having names that start with "L" and being left-handed. Lee pitched 4 1/3 more innings, with essentially the same RRA (3.55 to 3.52), eRA (3.19), and cRA (3.55 to 3.52). Lee was a little better in dRA (3.08 to 3.26), Lester better in sRA (3.77 to 3.36). Lee's %H was .300 to Lester's .295; Lee made 106 pitches/start, Lester 105; Lee made 64% quality starts, Lester 63%. Both get credit for 21 RAA, while Lee gets one more RAR (51 to 50).

* The bottom 12 starting pitchers in the AL in RAR are: Ryan Rowland-Smith, Brian Bannister, David Huff, Scott Kazmir, Rich Harden, Josh Beckett, Scott Feldman, Jamie Shields, Nick Blackburn, Jeremy Bonderman, Tim Wakefield, and AJ Burnett. The NL's bottom dozen is a much more conventional list of lousy pitchers: Charlie Morton, Zach Duke, Kyle Lohse, Nate Robertson, Manny Parra, Jeff Suppan, Paul Maholm, Bud Norris, Kevin Correia, Craig Stammen, Dave Bush, and John Ely.

* We all know that Texas owes its success largely to pitchers going deep in games, right? Wait, they only averaged 5.87 IP/S, ninth-lowest in the majors? Well, surely that must be because of some bad starts from Rich Harden or something.

Well, here are the P/S (counting relief appearances as half-starts) for the Rangers' starters with 15 or more starts (including starts made for other teams in the case of Cliff Lee): Lee 106, Wilson 104, Lewis 103, Feldman 95, Harden 93, Hunter 85.

Tampa Bay, their playoff opponents: Price 107, Garza 101, Shields 100, Davis 96, Niemann 89.

A few other teams: White Sox: Danks 106, Peavy 100, Buehrle 100, Floyd 97, Garcia 88.

Boston: Lackey 109, Lester 105, Matsuzaka 105, Beckett 103, Buchholz 100, Wakefield 85

Oakland: Gonzalez 102, Cahill 100, Mazzaro 99, Sheets 98, Braden 95, Anderson 95

Angels: Weaver 109, Santana 108, Saunders 100, Pineiro 100, Kazmir 98

Detroit: Verlander 113, Scherzer 106, Porcello 96, Galarraga 95, Bonderman 93

Of course, facts aren't particularly important when you need to rhetorically twist a playoff series into a morality play.

* Earlier in the season, I posted about the number of no-hitters thrown in 2010 and how it compared to expectation based on both the historical frequency of no-hitters and a theoretical probability of a no-hitter. The details are in that post and will not be repeated here.

There were six no-hitters pitched in 2010 out of 4,924 possible games (double-counting games since of course each pitcher has an opportunity to throw a no-hitter). Given the long-term frequency of no-hitters figured by Tom Flesher (.06%), we'd have expected 2.954 no-hitters. The Poisson distribution yields this expected distribution:



I also offered a distribution based on the theoretical probability derived from the overall major league BA (.2573) with 2.734 expected no-hitters (.056%):



Whichever model you choose, there's nothing particularly shocking about six no-hitters in one season, something that has about a 6% probability of happening by chance. If I wanted to get cute, I'd point out that 6% is about once every twenty years, and the last big no-hitter season was in 1990, but that would be sabermetric malpractice.
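The Poisson figures are easy to reproduce from the expected counts given above. This is a minimal sketch using only the standard library; the function names are my own labels, and the two expected values (4,924 games times .06%, and 2.734 from the BA-based model) come straight from the text. It reports both the probability of exactly six no-hitters and of six or more, since either could reasonably be what one means by "six in a season".

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_tail(k, lam):
    """P(X >= k), computed as one minus the mass below k."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


games = 4924
models = {
    "historical frequency": games * 0.0006,  # about 2.954 expected
    "BA-based theory": 2.734,                # expected count quoted in the text
}

for label, lam in models.items():
    print(f"{label}: expected {lam:.3f}, "
          f"P(exactly 6) = {poisson_pmf(6, lam):.3f}, "
          f"P(6 or more) = {poisson_tail(6, lam):.3f}")
```

Under either expected value the chance of six or more lands in the mid-to-high single digits of percent, in line with the rough 6% figure.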

* Speed Score leaders and trailers by position (with the caveat that these are based on just 2010 data when Bill James intended them to be based on multiple years):

C: Miguel Olivo (6.1)/Chris Snyder (.7)
1B: Kevin Youkilis (5.5)/Adrian Gonzalez (1.0)
2B: Sean Rodriguez (6.2)/Luis Valbuena (1.7)
3B: Evan Longoria (5.4)/Wilson Betemit (.9)
SS: Rafael Furcal (7.8)/Juan Uribe (2.2)
LF: Carl Crawford (9.1)/Pat Burrell (1.0)
CF: Dexter Fowler (9.1)/Torii Hunter (2.5)
RF: Will Venable (8.6)/Ryan Ludwick (1.9)
DH: Johnny Damon (5.5)/Willy Aybar (.7)

* RGs for Yankee hitters with 300+ PA: 6.8, 6.1, 5.9, 5.8, 5.6, 5.4, 5.2, 4.3, 4.1. Without doing any formal checks, I have to assume that's one of the more balanced league-leading lineups you are likely to see. The Yankees led in team RG with 5.14; the Red Sox were second at 5.08. Their breakdown for 300+ PA hitters was: 7.6, 6.7, 6.5, 6.2, 5.9, 5.1, 4.9, 4.9, 4.2. The range is only about one run wider, but it's more top-heavy.

* Ryan Braun and Prince Fielder had different shapes to their production (Braun had a .305 BA and .289 SEC while Fielder was at .263/.409), but the outcomes were very similar. Braun created 110 runs while making 434 outs for 6.45 RG, while Fielder created 109 runs in 427 outs for 6.49 RG; both were at +36 HRAA. Position adjustments put Braun several runs ahead in the categories where they are included, but it would be tough to find two stars on a team better matched.

* Park-adjusted stats: Matt LaPorta .222/.307/.364, 3.84 RG, 0 RAR. Justin Smoak .215/.305/.365, 3.87, 0. These two make an interesting pairing since both were the centerpiece of a trade of a former Indians left-handed ace. Of course it has been two years since LaPorta was traded to Cleveland while Smoak was traded this summer, but they both will need to improve on those performances to avoid the trades becoming Santana II and III. Personally, I think Smoak is a much better bet--he's younger and LaPorta was nagged by several injuries during 2010.

* The Indians got nearly identical offensive production out of their two primary center fielders, Trevor Crowe and Michael Brantley. Crowe hit .252, Brantley .247. Each had a .299 OBA. Crowe slugged .334, Brantley .328. Add in steals and they both created 3.42 runs/game and were essentially replacement level (1 RAR each). This was actually an improvement over Grady Sizemore, who hit even worse (.212/.264/.291) in his 137 plate appearances.

* Alex Rodriguez had the worst full-time season of his career in 2010. His hitting relative to the league average was a little worse in 1997, but he was playing shortstop at the time and had 40+ more plate appearances than he did in 2010. This is not meant as a condemnation of A-Rod in any way, as he was still a valuable asset and after all is 35 years old.

However, I was surprised by the lack of media excitement about this. Certainly it was pointed out that he was not his old self, but usually anything negative about the man is blown completely out of proportion, and the media could have had a field day if they'd framed the story in a certain light. Why didn't they? Speculation about motivation aside, RBI offer a statistical explanation. Rodriguez batted in 125 runs, most since his MVP campaign in 2007, and the sixth-highest total of his career.

At the same time, Rodriguez' ratio of RBI to RC (using no park adjustment in the latter) was the highest it had ever been at 125/92 = 1.38. His previous full-season high had come in 1999 (111/102, 1.09). For his career, A-Rod's RBI and RC are very close (1855 RBI, 1831 RC), and so he is not a hitter that has an RBI > RC tendency.

Among batters with 300 or more PA, only Pedro Feliz (40/26, 1.51) and Willy Aybar (43/30, 1.45) had a higher ratio than Rodriguez. Ichiro turned in the majors' lowest ratio (43/91, .47). A-Rod's relative lack of production was masked by his RBI count, and if it has anything to do with preventing a media freakout, I'm personally pleased it was.

Matt Klaassen of Fangraphs has written about ARod's RC/RBI ratio as well.

* I hadn't noticed how poorly AL shortstops hit until the Silver Sluggers were announced. When I heard "Alexei Ramirez", I was a little surprised. Of course, then I looked at the numbers a little more closely, and it's ugly. These are all AL players with 300+ PA who were primarily shortstops:



None of them managed to match even the AL average RG, which leads to this amusing chart of Silver Slugger winners together with their HRAA:



Admittedly, using average as a baseline is a cheat for shock value in this case.

* Another hitter of historical note whose 2010 wasn't up to his own previous standards was Albert Pujols, at least if you believe some accounts. It has been seemingly common to claim that 2010 was Pujols' worst season, but I beg to differ.

Looking at unadjusted (for league or park) RG, it was only Pujols' fourth-worst season, as he had lower RC rates in 2001, 2002, and 2007. Factoring in league and park, I have his ARG at 206, which was actually a smidge better than his pre-2010 career mark of 204, and puts the season well ahead of the three years already discussed and in the same general pool as 2004-2006. Factor in that Pujols set a career high with 690 PA, and my fielding-less WAR pegs it as the fifth-best season of his career--right in the middle. And still a season strong enough to be worthy of NL MVP honors.

Sunday, August 22, 2010

Rudimentary Team Fielding Metrics

If you divide baseball into offense, pitching, and fielding, there's no question that fielding is the one I spend the least amount of time on as a sabermetrician. Just look at the labels on the side of the page; as of this writing I have 61 posts labeled "Offense", 24 labeled "Pitching", and just 3 labeled "Fielding". This even understates it a little, since none of the fielding posts include any new ideas put forth by me, and because a lot of what is classified as "Offense", like run estimators, is equally applicable to the defense, albeit only to the defense as a whole.

It's not that I don't think fielding is important to winning ballgames. It's not that I think sabermetrics has got fielding all figured out. It's just not a topic that I have ever had much to contribute towards.

One of the major reasons for this is the same reason that I don't do any work with Pitch F/x data--I don't really understand it well enough to come up with anything useful. I love algebra and probability and statistics and most calculus; I hate geometry and trigonometry and those calculus problems in which you try to figure out the volume of a cylinder rotated around the y-axis. Have a problem which requires use of the quadratic equation, the binomial distribution, or partial differentiation? Sign me up. But the minute you start tossing around polar coordinates or angles, I'm just as math-averse as the average old-school sportswriter.

This limitation can be quite an inhibition when it comes to being on the cutting-edge of fielding or pitch analysis, so I stick with the topics in which one can safely avoid any angles or sines. I was reading an article the other day in which "3-space" was mentioned, and it was the first time in reading a sabermetric piece that I could relate to the guy who says "I like baseball, not math."

None of that is intended to in any way suggest that the research being done in those areas is less important than the things I write about. It's more of an apology for the rudimentary nature of the team fielding metrics that follow.

The impetus for this post is that I wanted to add a couple of team fielding metrics to my end-of-season stats, just to make it clear that I realize fielding is part of the game. The philosophy with those stats has always been to stick to either official categories or things that are easy enough to find otherwise (like doubles and triples allowed, or inherited runners). So any of the advanced fielding metrics are already disqualified from inclusion, and even if they weren't it would be pointless because I would just be copying someone else's work so to speak.

So, limiting the scope of categories available just to the official and semi-official categories, what can one do about team fielding? Obviously there's Defensive Efficiency Record, which is very important even in the PBP fielding age. There's team Fielding Average, which is not particularly useful but is still widely cited in the mainstream. You could do something with double plays, passed balls, or stolen base percentage and after that the pickings are fairly slim. I've passed on double plays because they are highly correlated with the groundball tendencies of the pitching staff, and to look at them without that context would be misleading at best.

As a result, I'm including just three categories: DER, a modified fielding average, and a rate of wild pitches and passed balls:

(1) Battery Mishap Rate (BMR)

It is hardly a novel idea to combine wild pitches and passed balls; while I was working on this post, by chance I stumbled upon Bill James describing the distinction between WP and PB as the "silliest distinction in the records" in 1988. I agree with him, so BMR is simply the ratio of WP and PB to baserunners, multiplied by 100:

BMR = (WP + PB)/(H + W - HR)*100

A battery mishap can occur without a baserunner (a mishandled third strike that allows the batter to reach) but baserunners make more sense as the opportunity factor than anything else. The highest team BMR of the last twenty years (1990-2009) was 6.0, by the 1993 Marlins (a wonderful combination of a knuckleballer with an expansion team); next is the 1990 Yankees (5.8, without any such easy excuse). The lowest rate was 1.5 by the '92 Padres, and the average was 3.4, both overall and in 2009. In 2009, the lowest BMR was 2.0 (BAL); the highest was 5.4 (KC).
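For anyone who wants to run BMR on their own team totals, here is a minimal Python sketch of the formula above. The function name and the sample season line are made up for illustration and do not come from any real team.

```python
def battery_mishap_rate(wp, pb, h, w, hr):
    """Wild pitches plus passed balls per 100 baserunners (H + W - HR)."""
    baserunners = h + w - hr
    if baserunners <= 0:
        raise ValueError("need a positive number of baserunners")
    return (wp + pb) / baserunners * 100


# Hypothetical team totals, chosen only to exercise the formula
sample_bmr = battery_mishap_rate(wp=50, pb=10, h=1400, w=550, hr=150)
print(round(sample_bmr, 1))
```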

(2) Modified Fielding Average (mFA)

Fielding Average has many issues, foremost of which is that it is built on the silly distinction between a hit and an error. Still, it's not going anywhere and it won't hurt anything to list it on a spreadsheet.

A really easy alteration to traditional FA is to remove strikeouts, since they are generally easy putouts with little opportunity for errors. In fact, the most common mishap on a strikeout is a wild pitch or a passed ball, and thus not scored as an error at all. So we can define kFA (strikeout-adjusted FA) for a team as (PO + A - K)/(PO + A - K + E).

I think there's another modification that's simple but justified, and I wouldn't be surprised if someone has already proposed it, although I couldn't find anything in a quick search. Consider this theoretical inning:

1. 6-3
2. E5
3. 6-4 fielder's choice
4. 5-4 fielder's choice

Three putouts, one error, three assists = .857 FA

And this one:

1. fly to 8
2. fly to 9
3. E4
4. 6 unassisted fielder's choice

Three putouts, one error, no assists = .750 FA

The first team (call it Team A) is credited with a better FA (ostensibly a lower error rate) than the second (Team B), but does this really make sense? Each team recorded three outs and made one error. In the first case, plays were completed by assists, while in the second all plays were made unassisted.

It's possible to make the case that plays involving assists take more skill, generally, than those that don't. But even someone taking that position would have to admit that there are many cases in which there is no meaningful distinction (such as the fielder's choices with and without assists). In some cases, like a first baseman with bad knees flipping to the pitcher, or a rundown involving more players than necessary, the assist is actually indicative of a poorer fielding outfit.

The practice of including assists in fielding average appears to me to be a reflexive application of the same formula that one would use for individuals. There's no reason why the same formula must be used for teams as well. Considering the number of players that handle the ball obscures the fact that the goal of a team in the field with respect to errors (which isn't really the goal at all) is to make as few errors as possible while recording outs, not collecting chances.

So I offer a modified FA for teams:

mFA = (PO - K)/(PO - K + E)
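The two theoretical innings make a handy check on the formulas. The sketch below is only an illustration (the function names are mine); it computes traditional FA, kFA, and mFA for both innings, with no strikeouts in either.

```python
def fielding_average(po, a, e):
    """Traditional FA: (PO + A) / (PO + A + E)."""
    return (po + a) / (po + a + e)


def k_fa(po, a, e, k):
    """Strikeout-adjusted FA: strikeout putouts removed from the chances."""
    return (po + a - k) / (po + a - k + e)


def m_fa(po, e, k):
    """Modified team FA: assists and strikeouts both removed."""
    return (po - k) / (po - k + e)


# First inning: three putouts, three assists, one error
# Second inning: three putouts, no assists, one error
first = {"po": 3, "a": 3, "e": 1, "k": 0}
second = {"po": 3, "a": 0, "e": 1, "k": 0}

for inning in (first, second):
    fa = fielding_average(inning["po"], inning["a"], inning["e"])
    mfa = m_fa(inning["po"], inning["e"], inning["k"])
    print(round(fa, 3), round(mfa, 3))
```

Traditional FA separates the two innings (.857 versus .750), while mFA rates them identically at .750, since each team recorded three outs and made one error.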

One thing I should note is that there's a decent case to be made for looking at the complement of FA--making errors the numerator rather than putouts. Since all the numbers are clustered in a small range in the upper .900s, they look better on paper clustered in a small range less than .050.

For most teams, using mFA makes very little difference. For 1990-2009, the correlation between kFA and mFA is +.994. Over that period, the average team has a ratio of .51 assists per (PO - K), ranging from .43 ('02 MIN) to .59 ('03 LA). mFA correlates better with DER, but not significantly so. The teams with high ratios would generally have been teams that got more groundballs, and as a driving factor for why kFA and mFA diverge, there is a messy and intertangled relationship between mFA, DER, and overall team defense (including pitching).

Even if one believes that plays involving assists should be given extra weight, do you really think the appropriate weight on assists is one, double-weighting those plays in establishing the opportunity factor for errors? mFA weights them at zero; perhaps it would make more sense to use, say, .3, but one seems excessive in any event.

One might ask why BIP is not the denominator. The drawback to using BIP is that a team's error rate would be reduced by allowing a hit. It makes more sense to combine errors and hits as failures and compare them to BIP--which is exactly what DER does.

The average mFA over the period and in 2009 was .967. The highest mFA was .981 by Seattle in 2003; they ranked third in standard FA. The top nine teams in FA rank are also the top nine in mFA, although in different order. The lowest mFA was .951 by the 1992 Dodgers--that's what happens when Jose Offerman plays short and makes 42 errors. That Dodgers team was second-to-last in traditional FA; the opposite combination is true for the 2009 Nationals.

The team whose ranking improves the most by using mFA is the 1992 Tigers (.981 FA, .969 mFA). They recorded just 693 strikeouts, the fewest of any team in the period in a non-strike season. The team with the biggest drop in ranking is the 2003 Cubs (.983 FA, .965 mFA), and they struck out more batters than any team in the period. Strikeouts pad the putout total and obscure the true error rates of fielders when they are included in fielding average.

Here are the 2009 team figures, with ML rank in FA and mFA, sorted by difference in ranks. Positive differences indicate teams that rank higher in mFA than in FA:



(3) Defensive Efficiency Record (DER)

Given that Bill James' DER is the most widely used measure of team fielding, you'd think his original formula would be easy to find online, or in one of the STATS or Baseball Info Solutions publications James contributed to. You'd be wrong. I'm sure it's out there somewhere, but it's not easy to find, so I saved time by rummaging through the closet to dig out one of my Abstract copies.

DER is the percentage of balls in play that are converted into outs, and James used two estimates to establish the numerator of plays made. One used putouts as its starting point; the other begins with plate appearances. The second is now used by most analysts, as the data is more accessible and it is also for all intents and purposes the complement of BABIP, the metric whose behavior is at the heart of DIPS theory.

I'll use that second estimate exclusively as well, but for completeness, the first formula is:

PM1 = PO - K - DP - 2TP - CS - ofA

This estimate assumes that a putout occurs on a batted ball unless it's a strikeout, or multiple outs are recorded on the same play (DP, TP), or it is a baserunning out (CS, ofA).

PM2 = PA - K - H - W - HB - .71E

The second estimate assumes that every batter is out on a ball in play unless he reaches safely (H, W, HB) or on an error (ROE is estimated to be 71% of total errors), or he strikes out.

PM is then figured as the average of PM1 and PM2, and DER follows:

DER = PM/(PM + H - HR + .71E)

I have gone with the PA form, as many others have--it takes a lot more effort to run down team outfield assists, and the two estimates are always very close.
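As a rough sketch of the PA form, the Python below computes PM2 and then DER using only that estimate. The .71 reached-on-error share is the figure quoted above; the function names and the sample team line are hypothetical.

```python
ROE_SHARE = 0.71  # share of errors assumed to put a batter on base (from the text)


def plays_made_pa(pa, k, h, w, hb, e):
    """PM2: plate appearances that ended with an out on a ball in play."""
    return pa - k - h - w - hb - ROE_SHARE * e


def der(pa, k, h, w, hb, hr, e):
    """DER using only the PA-based plays-made estimate."""
    pm = plays_made_pa(pa, k, h, w, hb, e)
    return pm / (pm + h - hr + ROE_SHARE * e)


# Hypothetical team totals, not taken from any real season
sample_der = der(pa=6200, k=1100, h=1400, w=550, hb=50, hr=150, e=100)
print(round(sample_der, 3))
```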

For 1990-2009, here are the correlations between each of these metrics, plus Run Average, Unearned Run Average, and W%. I'm not offering this table up as being analytically important, and some of the correlations are silly--BMR with DER, for instance. The computer spits them all out, though, so I might as well list them:



As always, you have to be careful when interpreting correlations of this sort, and not put too much stock in them. mFA and kFA have weaker correlations with W% and RA than FA, but that is not unexpected and tells us next to nothing about their performance as measures of fielding. Removing strikeouts isolates fielding results, but removes valuable information about how good the team was at defense overall (defense defined as pitching + fielding).

Monday, January 09, 2006

Win Shares Walkthrough, pt. 7 (Conclusion)

Distributing Fielding Win Shares to Individuals
Just as for hitting and pitching, we will assign fielding win shares to individuals by calculating claim points, and giving them the same percentage of the position’s win shares as they have of the position’s claim points. Each position, with the exception of second and short, has a different claim point formula. I will figure the claim points for the Braves’ starter at each position, and their win shares.

At catcher, the formula for claim points is:
cCP = PO + 2*(A - CS) - 8*E + 6*DP - 4*PB - 2*SB + 4*CS + 2*RS
where RS = (TmERA - CERA)*INN/9
RS is runs saved, based on Catcher’s ERA. I do not have Catcher’s ERA for 1993, so I have just set all of the catchers’ RS equal to zero. The Braves’ primary catcher, Damon Berryhill, recorded 570 putouts, 52 assists, 6 errors, 2 double plays, 6 passed balls, 62 steals, and 28 caught stealing. His claim points are 570 + 2*(52 - 28) - 8*6 + 6*2 - 4*6 - 2*62 + 4*28 = 546. Just as in the offensive and defensive portions of the process, we zero out any negative claim points. Braves’ catchers totaled 990 claim points, giving Berryhill 546/990 of the 7.528 win shares for catchers, or 4.15.
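The Berryhill arithmetic can be reproduced with a short Python sketch. The function names are my own, RS is left at zero as in the walkthrough, and the 990 and 7.528 figures are simply taken from the text.

```python
def catcher_claim_points(po, a, e, dp, pb, sb, cs, rs=0.0):
    """Catcher claim points; negative totals are zeroed out."""
    cp = po + 2 * (a - cs) - 8 * e + 6 * dp - 4 * pb - 2 * sb + 4 * cs + 2 * rs
    return max(cp, 0.0)


def share_of_position(cp, position_cp, position_ws):
    """A player's win shares: his share of the position's claim points."""
    return cp / position_cp * position_ws


berryhill_cp = catcher_claim_points(po=570, a=52, e=6, dp=2, pb=6, sb=62, cs=28)
berryhill_ws = share_of_position(berryhill_cp, position_cp=990, position_ws=7.528)
print(berryhill_cp, round(berryhill_ws, 2))
```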

At first base, we have:
1bCP = PO + 2*A - 5*E
Sid Bream had 627 PO, 62 A, and 3 E for 627 + 2*62 - 5*3 = 736 of the team’s 1638 total claim points, giving him 736/1638 of the 2.644 shares, or 1.19.

At second base, we introduce the concept of Range Bonus Plays. RBP are credited to any player whose Range Factor ((PO+A)*9/INN) is higher than the team average at the position, and are figured as:
RBP = (RF - PosRF)*Inn/9
We only credit RBP for players whose RF exceeds the positional average; therefore, there are no negative figures. Then we have this formula for CP at 2B and SS:
2bCP = ssCP = PO + 2*A - 5*E + 2*RBP + DP
Atlanta second basemen had a range factor of 5.320. Mark Lemke recorded 329 PO, 442 A, 14 E, and 100 DP in 1299 innings. His range factor therefore was (329+442)*9/1299 = 5.341. This is higher than the team average range factor at second base, so he gets (5.341-5.320)*1299/9 = 3 RBP. Then he has 329 + 2*442 - 5*14 + 2*3 + 100 = 1249 of the 1401 claim points, giving him 9.34 of the 10.476 win shares at second.
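Here is a small Python sketch of the Range Bonus Plays and middle-infield claim point formulas, run on Lemke's line. The function names are mine, and the code carries the unrounded RBP (about 3.1) where the walkthrough rounds to 3; the claim points still round to 1249.

```python
def range_bonus_plays(po, a, innings, position_rf):
    """RBP: plays above the team's positional range factor, never negative."""
    rf = (po + a) * 9 / innings
    return max(rf - position_rf, 0.0) * innings / 9


def middle_infield_claim_points(po, a, e, dp, rbp):
    """Claim points at 2B and SS; negative totals are zeroed out."""
    return max(po + 2 * a - 5 * e + 2 * rbp + dp, 0.0)


lemke_rbp = range_bonus_plays(po=329, a=442, innings=1299, position_rf=5.320)
lemke_cp = middle_infield_claim_points(po=329, a=442, e=14, dp=100, rbp=lemke_rbp)
print(round(lemke_rbp, 1), round(lemke_cp))
```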

Jeff Blauser, at shortstop, had 189 PO, 426 A, 19 E, and 86 DP in 1323 innings, for no range bonus plays. This gives him 189 + 2*426 - 5*19 + 2*0 + 86 = 1032 of the 1204 claim points, giving him 6.65 of the 7.753 win shares at short.

At third base, the formula is:
3bCP = PO + 2*A - 5*E + 2*RBP
Terry Pendleton had 128 PO, 319 A, and 19 E in 1392 innings for 5 RBP. He has 128 + 2*319 - 5*19 + 2*5 = 681 of the 708 third base claim points, for 7.33 of the 7.62 win shares at third base.

For outfielders, we divide the RF by 3, since there are three outfield positions. Center fielders will generally have a higher range factor than the guys in the corners, and James notes that one function of RBP is to give more of the credit in the outfield to the center fielder. Then we apply this formula:
ofCP = PO + 4*A - 5*E + 2*RBP
Otis Nixon was the Braves’ primary center fielder, recording 308 PO, 4 A, and 3 E in 998 innings for 66.4 RBP. His claim points are 308 + 4*4 - 5*3 + 2*66.4 = 442 out of 1190, for 6.71 of the 18.068 outfield win shares.

To find Fielding Win Shares, we simply add win shares credited at each position for a given player. One Brave, Bill Pecota, wound up with win shares at three positions (second, third, and outfield), although his total is just .38. The fielder with the most FWS for Atlanta was Mark Lemke with 9.34, all at second base.

My take: I don’t really have any opinion here; the system seems sound as far as I can tell, but of course the important stuff was done when we credited some of the defense to fielding and then the fielding to each position.

Putting it All Together
Win Shares for a player are just the sum of their batting, pitching, and fielding win shares. Then a rounding process is used. The Win Shares are rounded to whole numbers which must sum to 3 times the team win total. You could also display Win Shares unrounded, but James says that the difference between, say, 30 and 31 WS is very small to begin with and to display decimal places implies more accuracy than is actually there. He would prefer to keep the property that the team total sums to 3 times team wins.

So Bill’s rounding process is to round all numbers down to integers, and sum them. Then he orders the players by their remainders, and gives one win share at a time to the player with the highest remainder not yet used, until the players’ win shares sum up to the proper team number.

For example, suppose there was a team with 5 players that earned 25 win shares:
A had 10.005
B had 3.764
C had 0.963
D had 5.468
E had 4.800
Rounding down, A has 10, B has 3, C has 0, D has 5, and E has 4. That gives 22 WS, three short. Player C has the highest remainder, so he gets one WS. Then comes player E, who also gets one. Player B is third on the list, and also gets one. Now the team total is 25 and we stop with these final figures:
A has 10
B has 4
C has 1
D has 5
E has 5
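The rounding procedure is what is usually called a largest-remainder method, and the five-player example can be checked with the Python sketch below. The function name is hypothetical, and the handling of tied remainders (earlier-listed players first) is my assumption, since the walkthrough does not address ties.

```python
import math


def round_win_shares(raw, team_total):
    """Round down, then hand out leftover win shares by largest remainder."""
    rounded = {player: math.floor(ws) for player, ws in raw.items()}
    shortfall = team_total - sum(rounded.values())
    if shortfall < 0 or shortfall > len(raw):
        raise ValueError("team total is inconsistent with the raw win shares")
    # Highest fractional remainder first; ties keep their listed order (assumption)
    by_remainder = sorted(raw, key=lambda p: raw[p] - rounded[p], reverse=True)
    for player in by_remainder[:shortfall]:
        rounded[player] += 1
    return rounded


raw_ws = {"A": 10.005, "B": 3.764, "C": 0.963, "D": 5.468, "E": 4.800}
final_ws = round_win_shares(raw_ws, team_total=25)
print(final_ws)
```

Players C, E, and B have the three largest remainders, so each picks up the extra win share, matching the final figures listed above.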

My take: In my spreadsheet, I keep the fractional numbers, but it is true that small differences are not significant. However, I don’t really see any reason to further reduce the accuracy by rounding it off. Bill’s position is to display imprecision and acknowledge that it is imprecise. My position is to display precision and acknowledge that it is imprecise. Just a difference of opinion, and not in any way a flaw in the method.

We can now compare the Win Shares that I found for the 1993 Braves to the actual Win Shares. There are certain to be some differences as I did not know exactly what years Bill used to set the park factors, nor did I have clutch hitting data, nor did I use precisely the same RC formula, nor did I have holds for a couple of relievers, nor did I have catcher ERA. Outside of those things, though, I believe I had all of the data needed.

My results did not match perfectly. I had the offense at 130.1, while Bill had 130.2. He had the pitching at 129.4, I had it at 127.8. He had the fielding at 52.4, I had it at 54.1. These differences are not insignificant, so it is possible that there is an error in my spreadsheet, but I have not been able to find it. Anyway, I will list the team in order of Win Shares that I came up with and put James’ in parentheses:
Blauser 29(29)
Justice 27(29)
Maddux 26(25)
Gant 24(25)
Glavine 20(20)
Avery 19(19)
Pendleton 19(16)
McMichael 17(17)
Smoltz 16(16)
Lemke 16(15)
Nixon 15(13)
McGriff 14(16)
Sanders 9(11)
Berryhill 9(8)
Bream 8(8)
Bedrosian 6(7)
Howell 6(6)
Mercker 5(6)
Olson 5(5)
Stanton 5(5)
Wohlers 4(4)
Smith 4(4)
Pecota 2(2)
Belliard 2(2)
Cabrera 2(1)
Klesko 1(2)
Lopez 1(1)
Jones 1(0)

Again, I am not quite sure what has caused the errors with the pitchers. Some of it is the data differences, but the offense/defense split should not have been affected (unless it was by different park factors). For hitters, I used a different RC formula since I did not have the clutch hitting data, so that is probably the largest factor contributing to the errors.

Final Thoughts on Win Shares
There will be no “My Take” section here because that’s what this whole part is. I have already listed my concerns/questions/disagreements/quibbles with the Win Shares method, and I will not rehash all of those here.

Instead, I will just state that personally, I don’t have a lot of use for the end result, Win Shares, but I will concede that there may well be useful stuff inside the process. For example, the idea of evaluating the team’s fielding and then breaking that down to individual credit may prove to be very useful. Bill claims to have had some new insights into fielding stats and I don’t doubt him; it’s just that I don’t keep up as much as I should with fielding evaluation so I’m not the right person to evaluate that and comment about it.

I do, however, believe that the hitting and pitching components are not a step forward. They do not allocate absolute wins; well, they do, but they don’t do it correctly. So what are you left with? You are left with a runs above replacement (a very low replacement), with the scale nuked. I don’t find this very useful.

One aspect of Win Shares that most other systems do not incorporate is reducing the player’s rating if the team underperforms their expectations. If you create 100 runs on a team that creates 700, but only scores 650, your RC will be adjusted down. If you play on a team that, based on its R and RA, should win 55% of its games but only wins 52%, your runs will be worth less in terms of wins. These things are disliked by some people, but they are perfectly defensible in a value method. There may be some room for disagreement on whether to apply the adjustments proportionally to a player’s production, as James does, or whether to distribute them proportional to playing time. But these adjustments in theory are fine.

But you don’t have to go through the Win Shares system to apply similar adjustments. You can apply similar adjustments to an individual’s RC, or his WAR, or whatever. You could even find the team marginal runs/win (derived from the WS system), and use this as the RPW converter to convert RAR to WAR, or RAA to WAA. But there’s no need in my opinion, when evaluating hitters, to go through the WS process.

For pitchers, ideally it would be nice to have a different baseline depending on their reliance on the fielders, but it should clearly be individually-based, not team-based, so the mechanism in Win Shares won’t get you there. Even if you understood why it is designed the way it is.

So in short, I don’t think Win Shares is necessary for hitters, I don’t think it’s necessary for pitchers, and I think it may provide some insights into fielding evaluation but can’t really tell you. I don’t think it would get nearly as much attention as it does if it was not invented by Bill James, but I also think that Bill James is clearly the biggest name in sabermetrics so I do not begrudge him this.

What I do resent is people who accept WS because it comes from Bill James without questioning it themselves. I’m not thinking of any specific people; I have just seen, in various discussions, sentiment to the effect of “well, Bill spent a lot of time on it, it must make sense.”

There is nothing wrong with using WS in an analysis or talking about how the system could be improved, etc. as long as you recognize the strengths and weaknesses of the method and recognize how they might affect the analysis you are using them for. This of course is true to some extent for all sabermetric measures.

Tuesday, January 03, 2006

Win Shares Walkthrough, pt. 6

Distributing Fielding Win Shares to Positions

Win Shares takes a different approach from other fielding evaluation methods in that it first assigns a value to each position, then splits that up among the men who played that position. This allows James to use data for the team as a whole, rather than try to estimate how many strikeouts there were when a particular player was in the field.

Each position has four criteria which are used to assign Win Shares. A “claim percentage” is derived from the sum of these four scales divided by 100. Each position has different criteria and different weightings assigned to them. I will call the four criteria N, X, Y, and Z in order to keep the quantity of abbreviations to a minimum.

At catcher, the four criteria are Caught Stealing Percentage, Error Percentage, Passed Balls, and Sacrifice Hits. The 50 point scale is CS% (meaning that this criterion will compose 50% of the rating). CS% = CS/(SB + CS) for the team as a whole. The Braves allowed 121 SB and 53 CS, for a CS% of 53/(53 + 121) = .305. N = 25 + (CS% - LgCS%)*150. The NL average was .3149, so the Braves’ N is 25 + (.305-.3149)*150 = 23.52. I should point out now that all of the scales at each position have a minimum value of 0 and a maximum value of the number of points the criterion is assigned (50 in this case).

Error Percentage for a catcher is E% = 1 - (cPO + cA - TmK)/(cPO + cA - TmK + cE). This removes the putout credited for a strikeout from the catchers’ total. The Braves catchers had 1056 PO, 89 A, and 13 E, while the team recorded 1036 strikeouts. So their E% is 1-(1056+89-1036)/(1056+89-1036+13) = .107. X = 30 - 15*E%/LgE%. NL catchers had an E% of .084, so the Braves’ X is 30-15*.107/.084 = 10.89 on a 30 point scale.

The Passed Ball criterion incorporates something we will use in many fielding formulas, the Team League Putout Percentage (TLPO%). TLPO% = Tm(PO - K)/Lg(PO - K), or in other words is the percentage of the total putouts in the field in the league recorded by our team. The Braves had 4365 PO, while the league had 60854 PO and 13358 K, so the TLPO% = (4365-1036)/(60854-13358) = .070.

Y = (LgPB*TLPO% - TmPB)/5 + 5. The Braves had 13 PB versus 199 for the league, so their Y was (199*.07-13)/5 + 5 = 5.19 on a 10 point scale.

The final criterion is based on team sacrifice hits allowed and is Z = 10 - TmSH/(LgSH*TLPO%)*5. Atlanta allowed 77 SH while the league allowed 1110, so the Z = 10 - 77/(1110*.07)*5 = 5.05 on a 10 point scale.

The claim percentage for Atlanta catchers will be, just as it will be at all positions, (N + X + Y + Z)/100. This is (23.52+10.89+5.19+5.05)/100 = .447. An average team would score .500.
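The four catcher scales, including the floor of zero and the ceiling at each scale's point value, can be sketched in Python as follows. The function and parameter names are hypothetical, and the inputs are the rounded figures quoted above, so the result agrees with the text only to about three decimals.

```python
def clamp(value, max_points):
    """Keep a scale between 0 and the points assigned to that criterion."""
    return min(max(value, 0.0), max_points)


def catcher_claim_pct(cs_pct, lg_cs_pct, e_pct, lg_e_pct,
                      tm_pb, lg_pb, tm_sh, lg_sh, tlpo):
    n = clamp(25 + (cs_pct - lg_cs_pct) * 150, 50)   # caught stealing
    x = clamp(30 - 15 * e_pct / lg_e_pct, 30)        # error percentage
    y = clamp((lg_pb * tlpo - tm_pb) / 5 + 5, 10)    # passed balls
    z = clamp(10 - tm_sh / (lg_sh * tlpo) * 5, 10)   # sacrifice hits
    return (n + x + y + z) / 100


braves_c = catcher_claim_pct(cs_pct=.305, lg_cs_pct=.3149, e_pct=.107,
                             lg_e_pct=.084, tm_pb=13, lg_pb=199,
                             tm_sh=77, lg_sh=1110, tlpo=.07)
print(round(braves_c, 3))
```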

At first base, the criteria are Plays Made, Error %, “Arm Rating”, and errors by shortstops and third basemen. To find Plays Made, first a very complicated estimate of unassisted putouts by first basemen is made:
Est1BUnPO = (1bPO - .7*pA - .86*2bA - .78*(3bA + ssA) + .115*(RoF + SH) - .0575*BIP)*2/3 + (BIP*.1 - 1bA)*1/3
BIP = IP*3 - K, and is an estimate of Balls in Play. The Braves’ first basemen had 1423 PO and 130 A, while P, 2B, 3B, and SS had 228, 496, 331, and 476 A respectively. We earlier found RoF as 1287, and the BIP is 4314. So the Braves’ Estimated UA PO at first is (1423 - .7*228 - .86*496 - .78*(331 + 476) + .115*(1287 + 77) - .0575*4314)*2/3 + (4314*.1 - 130)*1/3 = 177.9

We also need to find what is called the LHP+/-, the number of balls in play against left-handed pitchers above what you would expect from the league average. The formula is:
LHP+/- = TmBIP(lefties) - (LgBIP(lefties)/LgBIP*TmBIP)
Atlanta lefthanders pitched 582 innings with 349 Ks for 1397 BIP. The NL had 5918 IP from lefties with 3655 Ks for 14099 BIP, while the total league BIP was 47494. Therefore, the Braves’ LHP+/- is 1397-(14099/47494*4314) = 116.4

Then N = ((Est1BUnPO + 1bA + .0285*LHP+/-) - Lg(Est1BUnPO + 1bA)*TLPO%)/5 + 20. The league first basemen had 3114 estimated unassisted putouts and 1670 assists, so the Atlanta N = (177.9 + 130 + .0285*116.4 - (3114+1670)*.07)/5 + 20 = 15.27 on a 40 point scale.
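Since the first base Plays Made scale chains three formulas together, a short Python sketch may help in following the arithmetic. The function names are my own, and every input (including the team BIP of 4314 and the RoF of 1287) is taken as given from the walkthrough rather than recomputed.

```python
def est_1b_unassisted_po(po_1b, a_1b, a_p, a_2b, a_3b, a_ss, rof, sh, bip):
    """Two-thirds putout-based estimate plus one-third assist-based estimate."""
    putout_based = (po_1b - 0.7 * a_p - 0.86 * a_2b - 0.78 * (a_3b + a_ss)
                    + 0.115 * (rof + sh) - 0.0575 * bip)
    assist_based = bip * 0.1 - a_1b
    return putout_based * 2 / 3 + assist_based * 1 / 3


def lhp_plus_minus(tm_bip_left, lg_bip_left, lg_bip, tm_bip):
    """Balls in play against lefties above the league-average expectation."""
    return tm_bip_left - lg_bip_left / lg_bip * tm_bip


def first_base_n(est_unpo, a_1b, lhp, lg_est_unpo, lg_a_1b, tlpo):
    """Plays Made scale at first base, held between 0 and 40 points."""
    raw = ((est_unpo + a_1b + 0.0285 * lhp)
           - (lg_est_unpo + lg_a_1b) * tlpo) / 5 + 20
    return min(max(raw, 0.0), 40.0)


est = est_1b_unassisted_po(1423, 130, 228, 496, 331, 476, 1287, 77, 4314)
lhp = lhp_plus_minus(1397, 14099, 47494, 4314)
n_1b = first_base_n(est, 130, lhp, 3114, 1670, 0.07)
print(round(est, 1), round(lhp, 1), round(n_1b, 2))
```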

The E% at all positions other than catcher is figured as E/(PO + A + E). The Braves’ 1B made 9 errors, for an E% of 9/(1423+130+9) = .0058. X = 30 - 15*E%/LgE%, so with a LgE% at 1B of .0085, Atlanta gets 30 - 15*.0058/.0085 = 19.76 claim points on a 30 point scale.

The Arm rating is figured as Arm = 1bA + .5*ssDP - pPO - .5*2bDP + .015*LHP+/-. Braves 2B and SS had 108 and 97 DP, while the pitchers had 119 PO, giving an Arm of 130 + .5*97 - 119 - .5*108 + .015*116.4 = 7.25. Y = (Arm - LgArm/T)/5 + 10, where T = the number of teams in the league. The LgArm was 27.32 per team, so the Braves get (7.25-27.32)/5 + 10 = 5.99 points on a 20 point scale.

Z = 10 - 5*(3bE + ssE)/(Lg(3bE + ssE)*TLPO%). NL third basemen made 357 errors and the shortstops made 389. Braves third basemen and shortstops each made 19, so the Z is 10 - 5*(19 + 19)/((357+389)*.07) = 6.36 on a 10 point scale. The claim% at first base is (15.27+19.76+5.99+6.36)/100 = .474

At second base, the criteria are team DP, Assists, E%, and Putouts. N = 20 + (TmDP - ExpDP)/3. We already found the Braves’ ExpDP of 133.1, and they actually turned 146, giving a N of 24.3 on a 40 point scale.

The Assists rating is found as:
X = ((2bA - 2bDP) - (Lg(2bA - 2bDP)*TLPO% - 1/35*LHP+/-))/6 + 15. The Braves 2B had 364 PO, 496 A, 15 E, and 108 DP. The NL 2B had 4863 PO, 6776 A, 236 E, and 1412 DP. So they have an X of ((496-108)-((6776-1412)*.07 - 1/35*116.4))/6 + 15 = 17.64 on a 30 point scale.

Y = 24 - 14*2bE%/Lg2bE%. Atlanta’s E% at second is .0171 versus .0199 for the league, for a Y of 24-14*.0171/.0199 = 11.97 on a 20 point scale.

To find the putout criteria, we first find expected 2B PO by this formula:
Exp2bPO = Tm(PO - K)*Lg2bPO/Lg(PO - K) + 1/13*(W - Lg(W/IP)*TmIP) + 1/32*LHP+/-. The team PO-K is 3329 while the league is at 47496. The Braves walked 480 batters, while the league average was .3502 W/IP. This gives:
3329*4863/47496 + 1/13*(480 - .3502*1455) + 1/32*116.4 = 342.2.
Z = 5 + (2bPO - Exp2bPO)/12, giving 5 + (364-342.2)/12 = 6.82 on a 10 point scale. The claim% at 2B is (24.3+17.64+11.97+6.82)/100 = .607

At third base the criteria are Assists, Errors Above Average, Sacrifice Hits, and Double Plays. We first find Exp3bA = TmA*Lg3bA/LgA + 1/31*LHP+/-. Braves 3B had 131 PO, 331 A, 19 E, and 32 DP against the league totals of 1676, 4414, 357, and 374. The Braves had 1769 total assists and the league had 24442. Therefore, the Exp3bA = 1769*4414/24442 + 1/31*116.4 = 323.3. N = 25 + (3bA - Exp3bA)/4, giving 25 + (331-323.3)/4 = 26.93 on a 50 point scale.

Exp3bE = (3bPO + 3bA)/LgFA@3B - (3bPO + 3bA). The league FA at third base is .945 (figured as (PO + A)/(PO + A + E)), giving the Braves (131+331)/.945 - (131+331) = 26.9 expected errors. X = 15 + (Exp3bE - 3bE)/2 or 15 + (26.9 - 19)/2 = 18.95 on a 30 point scale.

The Sacrifice Hit criteria uses what I will call Sacrifice Hit Rating, or SH/(G + L) = SH/(W + 2*L). The Atlanta SHR is 77/(104 + 2*58) = .35 against a league average of .326. Y = 10 - SHR/LgSHR*5, or 10 - .35/.326*5 = 4.63 points on a 10 point scale.

Expected DP at third base are found very simply as ExpDP*Lg3bDP/LgDP, or 133.1*374/2028 = 24.55. Z = (3bDP - Exp3bDP)/2 + 5 or (32-24.55)/2 + 5 = 8.73 on a 10 point scale. The Claim% at 3B is (26.93+18.95+4.63+8.73)/100 = .592

For shortstops, the criteria are Assists, Double Plays, E%, and Putouts. First we find ExpssA = TmA*LgssA/LgA + 1/100*LHP+/-. Atlanta shortstops had 217 PO, 476 A, and 19 E versus league totals of 3647, 6930, and 389, giving an expectation of 1769*6930/24442 + 1/100*116.4 = 502.7. N = (ssA - ExpssA)/4 + 20 or (476-502.7)/4 + 20 = 13.33 on a 40 point scale.

X = 15 + (TmDP - ExpDP)/4 = 15 + (146-133.1)/4 = 18.23 on a 30 point scale.

Y = 20 - 10*ssE%/LgssE% = 20 - 10*.0267/.0355 = 12.48 on a 20 point scale.

Expected PO at shortstop are found by ExpssPO = Tm(PO - K)*Lg(ssPO/(PO - K)) + 1/14*(W - LgW/IP*TmIP) - 1/64*LHP+/- or 3329*3647/47496 + 1/14*(480-.3502*1455) - 1/64*116.4 = 251.7. Then Z = 5 + (ssPO - ExpssPO)/15 = 5 + (217 - 251.7)/15 = 2.69 on a 10 point scale. Therefore, the Claim% at shortstop = (13.33+18.23+12.48+2.69)/100 = .467.

For outfielders, the criteria are Putouts, the team’s Defensive Efficiency Record, “Arm Elements”, and E%. Outfield putouts are first expressed as a percentage of team putouts less strikeouts and assists (assists generally come on groundballs). I will call this Putout Rating, POR = ofPO/(TmPO - TmA - TmK). Braves outfielders recorded 1055 PO, 19 A, 21 E, and 5 DP while the league had 15361, 480, 337, and 87. So the ATL POR is 1055/(4365-1769-1036) = .6763. N = 20 + 100*(POR - LgPOR) so with a league POR of .6663, it is 20 + 100*(.6763-.6663) = 21 on a 40 point scale.

The second criterion is very easy to calculate, using CL-1 from way back in the process when we were dividing defense between the pitchers and the fielders. X = CL-1*.24 - 9, which for Atlanta is 134.82*.24 - 9 = 23.36 on a 30 point scale.

The third criterion, “Arm Elements”, compares the team sum of outfield assists and DP less SF to the league total of the same, discounted at the TLPO%:
Y = ((ofA + ofDP - TmSF) - Lg(ofA + ofDP - SF)*TLPO%)/5 + 10. Since the Braves allowed 39 SF and the league 701, their Y is ((19+5-39)-(480+87-701)*.07)/5 + 10 = 8.88 on a 20 point scale.

Finally, Z = 10 - 5*E%/LgE%, which is 10 - 5*.0192/.0208 = 5.38 on a 10 point scale. The OF Claim% is therefore (21+23.36+8.88+5.38)/100 = .586.

We are now ready to distribute the fielding win shares to each position. Each position has an “intrinsic weight”, which we will abbreviate IW. These weight the claim percentages at each position by the importance of that position. The IWs are: C = 38, 1B = 12, 2B = 32, 3B = 24, SS = 36, and OF = 58. We take, for each position (Claim% - .200)*IW, and sum these:
C = (.447-.200)*38 = 9.39
1B = (.474-.200)*12 = 3.29
2B = (.607-.200)*32 = 13.02
3B = (.592-.200)*24 = 9.41
SS = (.467-.200)*36 = 9.61
OF = (.586-.200)*58 = 22.39
These sum up to 67.11. So catchers get 9.39/67.11 = 14% of the team’s 54 FWS, or 7.56. Doing this for all positions (and not rounding the numbers):
C = 7.523, 1B = 2.644, 2B = 10.476, 3B = 7.616, SS = 7.753, OF = 18.068
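The positional split can be sketched in Python as below. The names are mine, and the inputs are the claim percentages rounded to three decimals and a team total of 54, so the output lands near the rounded 7.56 for catchers rather than reproducing the unrounded figures exactly.

```python
INTRINSIC_WEIGHTS = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}


def distribute_fielding_ws(claim_pct, team_fws, floor=0.200):
    """Split team fielding win shares by (Claim% - .200) * intrinsic weight."""
    points = {pos: (claim_pct[pos] - floor) * weight
              for pos, weight in INTRINSIC_WEIGHTS.items()}
    total = sum(points.values())
    return {pos: team_fws * pts / total for pos, pts in points.items()}


braves_claims = {"C": .447, "1B": .474, "2B": .607,
                 "3B": .592, "SS": .467, "OF": .586}
by_position = distribute_fielding_ws(braves_claims, team_fws=54)
for pos, ws in by_position.items():
    print(pos, round(ws, 2))
```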

My take: As I said earlier, fielding analysis is not something I am really qualified to pontificate about. I will leave it to the Tangos and the Mike Emeighs and the MGLs, etc. to debate the merits of the method. I will instead focus on its similarity to Defensive Winning Percentage.

DW% was used by James in his early Abstracts to evaluate fielding and then combined with OW% to give a total player rating. DW% was not used after the 1984 book. My first reaction when I saw the Defensive Win Shares formula was “it’s a revised DW%”.

Just like in WS, each position had four criteria, rated on scales that added up to 100. The criteria have changed over the years, sometimes based on better data being available (for example, James used to use opposition SB/G to rate catchers, whereas now we know SB and CS and can find the percentage) or based on new research and ideas of how to evaluate fielding (James used A/G for 1B, but now estimates unassisted putouts as well). But many of the criteria are the same or similar.

Another feature of the system was that each position had an intrinsic weight. These were 10 at C, 3 @ 1B, 8 @ 2B, 6 @ 3B, 11 @ SS, 4 @ LF, 6 @ CF, and 5 @ RF. These sum up to 53, which is the value in games given to fielding (both wins and losses) for a 162-game season. Dividing 53 by 162 gives .327, i.e. the system considers fielding to make up 32.7% of defense. Win Shares puts fielding, for an average team, at 32.5%. If you consider the outfield as a unit and scale these to 200 (the total of the intrinsic weights in Win Shares), you have:
C = 37.7(38); 1B = 11.3(12), 2B = 30.2(32), 3B = 22.6(24), SS = 41.5(36), OF = 56.6(58)
The numbers in parentheses are the WS intrinsic weights. As you can see, both systems say that fielding is approximately 32.5% of defense and weight the positions similarly (shortstop is the only position with a significantly different weight).
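The rescaling of the DW% weights is simple enough to verify in a few lines of Python. This is only a sketch; the dictionary and function names are mine, and the three outfield weights (4, 6, and 5) are combined into a single OF entry as described above.

```python
DW_WEIGHTS = {"C": 10, "1B": 3, "2B": 8, "3B": 6, "SS": 11, "OF": 4 + 6 + 5}
WS_WEIGHTS = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}


def rescale(weights, target_total=200):
    """Scale a set of weights so that they sum to target_total."""
    total = sum(weights.values())
    return {pos: w * target_total / total for pos, w in weights.items()}


scaled_dw = rescale(DW_WEIGHTS)
for pos in DW_WEIGHTS:
    print(pos, round(scaled_dw[pos], 1), WS_WEIGHTS[pos])
```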

I must reach the same conclusion as my first glance: Fielding Win Shares is an updated Defensive Winning Percentage. This is not necessarily a bad thing; perhaps the original system was very good to begin with, and it has been improved by better data, better estimates, and presumably more research. And of course a huge difference is the fact that DW% looks at each fielder individually while WS starts by crediting the team, and distributes value to the players from there. But the similarities between the two systems, separated by twenty years, are still striking, at least to me.

Again, just as in the stage where responsibility was split between pitching and fielding, there is an explanation of how, but not why. Why is the intrinsic weight at shortstop 36? Why are team sacrifice hits allowed weighted double a third baseman’s double plays over expectation? Etc. These questions are not answered, nor even acknowledged by James. That is not to say that he did not think of them himself, as I’m sure he did--just that we have no way of knowing what the thought process behind the system was, and are left to puzzle over it ourselves.

Along the same line, there are differences in how the ratings are formed at each position. Most positions are given a rating for errors based on their error percentage. But at third base it is based on errors above average. These sorts of things seem like inconsistencies within the system, but if there is a good reason for them, we have not been told what it is.

Aside from the fielding nature of the method itself, the subtraction of .200 from each claim percentage hammers home that the system is giving out absolute wins on the basis of marginal runs. 50% of the league average in runs scored, with a Pythagorean exponent of 2, corresponds to a W% of .200. It is for this reason that in old FanHome discussions others and I said that WS had an intrinsic baseline of .200 (James changed the offensive margin line to 52%, which corresponds to about .213).

In an essay in the book, James discusses this, and says that the margin level (i.e. 52%) “is not a replacement level; it’s assumed to be a zero-win level”. This is fine on its face; you can assume 105% to be a zero-win level if you want. But the simple fact is that a team that scored runs at 52% of the league average with average defense will win around 20% of their games. Just because we assume this to not be the case does not mean that it is so.
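
The baseline figures quoted above follow from the Pythagorean formula with an exponent of 2. A minimal sketch, assuming average defense (runs allowed equal to the league average); the function name is my own:

```python
def wpct_at_offensive_margin(margin, exponent=2.0):
    """Expected W% for a team scoring `margin` times the league-average runs
    while allowing exactly the league average (Pythagorean estimate)."""
    return margin**exponent / (margin**exponent + 1.0)

for margin in (0.50, 0.52):
    print(f"{margin:.0%} of league runs -> W% of {wpct_at_offensive_margin(margin):.3f}")
```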

Win Shares would not work for a team with a .200 W%, because the team itself would come out with negative marginal runs. If it doesn’t work at .200, how well does it work at .300, where there are real teams? That’s a rhetorical question; I don’t know. I do know that there will be a little bit of distortion everywhere.

In discussing the .200 subtraction, James says “Intuitively, we would assume that one player who creates 50 runs while making 400 outs does not have one-half the offensive value of a player who creates 100 runs while making 400 outs.” This is either true or not true, depending on what you mean by “value”. The first player has one-half the run value of the second player; 50/100 = 1/2, a mathematical fact. The first player will not have one-half the value of the second player if they are compared to some other standard. From zero, i.e. zero RC, one is valued at 50 and one is valued at 100.

By using team absolute wins as the unit to be split up, James implies that zero is the value line in win shares. Anyone who creates a run has done something to help the team win. It may be very small, but he has contributed more wins than zero. Wins above zero are useless in a rating system; you need wins and losses to evaluate something. If I told you one pitcher won 20 and the other won 18, what can you do? I guess you assume the guy who won 20 was more valuable. But what if he was 20-9, and the other guy was 18-5?

You can’t rate players on wins alone. You must have losses, or games. The problem with Win Shares is that they are neither wins nor wins above some baseline. They are wins above some very small baseline, re-scaled against team wins. If you want to evaluate WS against some baseline, you will have to jump through all sorts of hoops because you first must determine what a performance at that baseline will imply in win shares. Sabermetricians commonly use a .350 OW%, about 73% of the average runs/out, as the replacement level for a batter. A 73% batter though will not get 73% as many win shares as an average player. He will get less than that, because only 21% (73% - 52%) of his runs went to win shares, while for an average player it was 48%. So maybe he will get .21/.48 = 44%. I’m not sure, because I don’t jump through hoops.
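
The back-of-the-envelope figure above can be written out as a small sketch. It only reproduces the rough .21/.48 arithmetic, not the actual Win Shares calculation, and the function name is hypothetical:

```python
def marginal_share(rate, margin=0.52):
    """Portion of a hitter's production (relative to league average = 1.0)
    that lies above the Win Shares offensive margin."""
    return max(rate - margin, 0.0)

replacement = marginal_share(0.73)   # a .350 OW% hitter, about 73% of average
average = marginal_share(1.00)       # a league-average hitter
ratio = replacement / average

print(f"Replacement-level hitter earns roughly {ratio:.0%} of an average hitter's win shares")
```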

Bill could use his system, and get Loss Shares, and have the whole thing balance out all right in the end. But to do it, you would have to accept negative loss shares for some players, just as you would have to accept negative win shares for some players. Since there are few players who get negative wins, and they rarely have much playing time, you can ignore them and get away with it for the most part. But in the James system, you could not just wipe out all of the negative loss shares. Any hitter who performed at greater than 152% of the league average would wind up with them, and there are (relatively) a lot of hitters who create seven runs a game.

James writes in the book that with Win Shares, he has recognized that Pete Palmer was right after all in saying that using linear methods to evaluate players would result in only “limited distortions”. And it’s true that a linear method involves distortions, because when you add a player to a team, he changes the linear weights of the team. This is why Theoretical Team approaches are sometimes used. But the difference between the Palmer system and the James system is that Palmer takes one member of the team, isolates him, and evaluates him. James takes the entire team.

So while individual players vary far more in their performance than teams, they are still just a part of the team. Barry Bonds changes the linear weight values of his team, no doubt; but the difference might only be five or ten runs. Significant? Yes. Crippling to the system? Probably not. But when you take a team, particularly an unusually good or bad team, and use a linear method on the entire team, you have much bigger distortions.

Take the 1962 Mets. They scored 617 and allowed 948, in a league where the average was 726. Win Shares’ W% estimator tells me they should be (617-948+726)/(2*726) = .272. Pythagoras tells us they should be .304. That’s a difference of 5 wins. WS proceeds as if this team will win 5 fewer games than it probably will. Bonds’ LW estimate may be off by 1 win, but that is for him only. It does not distort the rest of the players (they cause their own smaller distortions themselves, but the error does not compound). Win Shares takes the linear distortion and thrusts it onto the whole team.
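
Here is the Mets example as a sketch. The linear estimator is the one written out above; the .304 Pythagorean figure is taken as quoted rather than recomputed, and the 162-game schedule used to convert the gap into wins is my assumption.

```python
def ws_linear_wpct(runs, runs_allowed, league_runs):
    """Linear W% estimate implied by Win Shares' marginal-runs approach."""
    return (runs - runs_allowed + league_runs) / (2 * league_runs)

linear = ws_linear_wpct(617, 948, 726)
pythag_quoted = 0.304     # figure quoted in the text, taken as given
games = 162               # assumed schedule length for converting W% to wins

gap_in_wins = (pythag_quoted - linear) * games
print(f"Linear W% = {linear:.3f}, gap vs. quoted Pythagorean = {gap_in_wins:.1f} wins")
```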

Finally, the defensive margin of 152% corresponds to a W% of about .300, compared to .213 for the offense. The only possible cutoffs which would produce equal percentages are .618/1.618 (the golden ratio). That is not to say that they are right, because Bill is trying to make margins that work out in a linear system, but we like to think of 2 runs and 5 allowed as being equal to the complement of 5 runs and 2 allowed. In Win Shares, this is not the case. And it could be another reason why pitchers seem to rate too low with respect to batters (and our expectations).
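
The .618/1.618 claim can be checked numerically. This sketch assumes a Pythagorean exponent of 2 and that the defensive margin is always one plus the offensive margin (as with .52/1.52); the function names are mine.

```python
import math

def offense_wpct(x):
    """W% of a team scoring x times league-average runs with average defense."""
    return x**2 / (x**2 + 1)

def defense_wpct(x):
    """W% of a team allowing (1 + x) times league-average runs with average offense."""
    return 1 / (1 + (1 + x) ** 2)

# The margins balance when x * (1 + x) = 1, i.e. x = (sqrt(5) - 1) / 2.
x = (math.sqrt(5) - 1) / 2

print(f"James' margins: offense {offense_wpct(0.52):.3f}, defense {defense_wpct(0.52):.3f}")
print(f"x = {x:.3f}: offense {offense_wpct(x):.3f}, defense {defense_wpct(x):.3f}")
```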

Finally, one little nit-picky thing: why do expected putouts by second basemen and shortstops go up as walks go up? Obviously, more walks means more runners on first who may be put out at second on fielder’s choices, or steal attempts, or double plays, but so do singles and hit batters. Am I missing something really obvious here?

Tuesday, December 20, 2005

Win Shares Walkthrough, pt. 4

Splitting Defensive Win Shares between Pitching and Fielding

This process involves seven “Claim Point” formulas, which when combined give an estimate of the percentage of defense attributable to pitchers. Each of these claim points is classified as either a “pitching” claim point, i.e. one that is attributable to the skill of the pitchers, or a “fielding” claim point that is attributable to the fielders, with the exception of the first. The first formula is based on the team’s Defensive Efficiency Record, which is an estimate of the percentage of balls hit into play against them that are turned into outs. This one, called CL-1, counts for both pitching and fielding. We first find the DER = (BF - H - W - K - HB)/(BF - HR - W - K - HB). The Braves faced 6015 batters, and allowed 1297 hits, 480 walks, 22 hit batters, 101 homers, and 1036 strikeouts. So their DER is (6015 - 1297 - 480 - 1036 - 22)/(6015 - 101 - 480 - 1036 - 22) = .7267. We then find adjDER as 1 - (1 - DER)/PF(S). The Braves’ PF(S) was .995, giving an adjusted DER of 1 - (1 - .7267)/.995 = .7253. The claim points are found by:
CL-1 = 100 + (adjDER - LgDER)*2500
The NL DER was .7114, so the Braves’ CL-1 is 100 + (.7253 - .7114)*2500 = 134.75
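
The CL-1 steps above can be sketched as follows, using the Braves' figures from the text. The function names are my own. Carrying full precision instead of the rounded .7253 moves the result by a few hundredths (about 134.79 rather than 134.75).

```python
def der(bf, h, w, k, hb, hr):
    """Defensive Efficiency Record: share of balls in play turned into outs."""
    return (bf - h - w - k - hb) / (bf - hr - w - k - hb)

def adjusted_der(raw_der, park_factor):
    """Park-adjust DER: adjDER = 1 - (1 - DER) / PF(S)."""
    return 1 - (1 - raw_der) / park_factor

def cl1(adj_der, lg_der):
    """Claim point 1, shared by pitching and fielding."""
    return 100 + (adj_der - lg_der) * 2500

braves_der = der(bf=6015, h=1297, w=480, k=1036, hb=22, hr=101)
braves_adj = adjusted_der(braves_der, park_factor=0.995)
braves_cl1 = cl1(braves_adj, lg_der=0.7114)

print(f"DER = {braves_der:.4f}, adjDER = {braves_adj:.4f}, CL-1 = {braves_cl1:.2f}")
```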

The second criterion is a pitching claim point, based on the strikeout rate. First we find the team’s strikeouts per game as K*9/IP = KG. The Braves had 6.408, and we easily convert this to CL-2:
CL-2 = (KG + 2.5)/7*200
For the Braves, (6.408 + 2.5)/7*200 = 254.51

The third criterion is a pitching claim point, based on walks compared to the league average. The formula is:
CL-3 = Lg(W + HB)/IP * TmIP - W - HB + 200
In words, we find the league average of walks and hit batters per inning, multiply by the team’s innings, and subtract the team in question’s W and HB to find out how many fewer than average they allowed, then add this to 200. The league average was .3782 and the Braves pitched 1455 innings with 502 W+HB, so:
.3782*1455 - 502 + 200 = 248.28

The fourth criterion is another pitching claim point, based on home runs allowed. We find the number of homers fewer than expected, multiply by 5 and add to 200.
CL-4 = (LgHR/IP*TmIP - HR/PF(HR))*5 + 200
The league average was .0964 HR/IP, and the Braves allowed 101 homers with a 1.019 PF(HR), so:
(.0964*1455 - 101/1.019)*5 + 200 = 405.73

The fifth criterion is the first of two that are for fielding only, and it is based on the rate of errors and passed balls. This is put together like this:
CL-5 = (Lg(E + .5*PB)/IP*TmIP - E - .5*PB) + 100
The league average of errors and half of PB per inning was .0974, while the Braves committed 108 errors and 13 passed balls, resulting in:
(.0974*1455 - 108 - .5*13) + 100 = 127.22
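
The four simpler claim points (CL-2 through CL-5) are each a one-line formula. A sketch with the Braves' inputs from the text; the function and parameter names are my own:

```python
def cl2(k, ip):
    """Strikeout claim points (pitching): (KG + 2.5) / 7 * 200."""
    kg = k * 9 / ip
    return (kg + 2.5) / 7 * 200

def cl3(lg_w_hb_per_ip, ip, w, hb):
    """Walk and hit-batter claim points (pitching), counted against league average."""
    return lg_w_hb_per_ip * ip - w - hb + 200

def cl4(lg_hr_per_ip, ip, hr, pf_hr):
    """Home run claim points (pitching), with homers park-adjusted."""
    return (lg_hr_per_ip * ip - hr / pf_hr) * 5 + 200

def cl5(lg_err_per_ip, ip, e, pb):
    """Error and passed-ball claim points (fielding)."""
    return (lg_err_per_ip * ip - e - 0.5 * pb) + 100

IP = 1455  # Braves' innings pitched
braves = {
    "CL-2": cl2(1036, IP),
    "CL-3": cl3(0.3782, IP, 480, 22),
    "CL-4": cl4(0.0964, IP, 101, 1.019),
    "CL-5": cl5(0.0974, IP, 108, 13),
}
for name, value in braves.items():
    print(f"{name}: {value:.2f}")
```

Because nothing is rounded along the way here, CL-2 can differ from the figure in the text in the last digit.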

The sixth criterion is the most complex and compares the team’s double plays to the expected number of double plays (a fielding claim). First, calculate the percentage of non-HR hits that are singles in the league as Lg%S = S/(H - HR). Then make an estimate of Runners on First Base (RoF) for the team and the league:
RoF = (H - HR)*Lg%S + W + HB - SH - WP - BK - PB
In the 1993 NL, 77.8% of non-HR hits were singles. The Braves allowed 77 sacrifice hits, 46 wild pitches, and 9 balks, giving:
(1297 - 101)*.778 + 480 + 22 - 77 - 46 - 9 - 13 = 1287.49
The league has a RoF estimate of 19790. Expected DPs is the league DP per RoF, times the team RoF, times the ratio of team assists per inning to league assists per inning (this is used as an estimation of the opposing hitters’ groundball tendencies):
ExpDP = Lg(DP/RoF)*TmRoF*(A/IP)/Lg(A/IP)
The NL turned 2028 double plays, and the Braves recorded 1769 assists versus 24442 for the league (in 20284 innings). So the Braves had 1769/1455 = 1.216 assists/inning versus 24442/20284 = 1.205 for the league. Put it all together:
2028/19790*1287.49*1.216/1.205 = 133.14
CL-6 is just the excess double plays multiplied by 4/3, plus 100:
CL-6 = (DP - ExpDP)*4/3 + 100
For the Braves, who turned 146 DP, (146 - 133.14)*4/3 + 100 = 117.15
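
The double play claim point chains three formulas together, so a sketch may help. The inputs are the Braves and NL figures from the text and the function names are mine. The assist ratio is not rounded to 1.216/1.205 here, so the results differ from the text by about two hundredths.

```python
def runners_on_first(h, hr, lg_pct_singles, w, hb, sh, wp, bk, pb):
    """Estimated runners on first base (RoF)."""
    return (h - hr) * lg_pct_singles + w + hb - sh - wp - bk - pb

def expected_dp(lg_dp, lg_rof, tm_rof, tm_a, tm_ip, lg_a, lg_ip):
    """League DP per RoF, scaled by team RoF and by the team's relative
    assist rate (a stand-in for groundball tendency)."""
    return lg_dp / lg_rof * tm_rof * (tm_a / tm_ip) / (lg_a / lg_ip)

def cl6(dp, exp_dp):
    """Double play claim points (fielding)."""
    return (dp - exp_dp) * 4 / 3 + 100

rof = runners_on_first(h=1297, hr=101, lg_pct_singles=0.778,
                       w=480, hb=22, sh=77, wp=46, bk=9, pb=13)
exp_dp = expected_dp(lg_dp=2028, lg_rof=19790, tm_rof=rof,
                     tm_a=1769, tm_ip=1455, lg_a=24442, lg_ip=20284)
braves_cl6 = cl6(146, exp_dp)

print(f"RoF = {rof:.2f}, ExpDP = {exp_dp:.2f}, CL-6 = {braves_cl6:.2f}")
```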

The seventh and final criterion is simply 405 times the team’s winning percentage:
CL-7 = 405*W%
For the Braves (104-58, .642), 405*.642 = 260

The percentage of defense attributable to pitching is the sum of the pitching claim points, plus 650, divided by the sum of all claim points (with CL-1 double counted because it is credited to both pitchers and fielders) plus 1097.5.
Pitch% = (CL-1 + CL-2 + CL-3 + CL-4 + CL-7 + 650)/(2*CL-1 + CL-2 + CL-3 + CL-4 + CL-5 + CL-6 + CL-7 + 1097.5)
For the Braves:
(134.75 + 254.51 + 248.28 + 405.73 + 260 + 650)/(2*134.75 + 254.51 + 248.28 + 405.73 + 127.22 + 117.15 + 260 + 1097.5) = .7026

Therefore, we will assign 70.26% of the Braves’ 182 defensive win shares to the pitchers, or 128. The Field% = 1 - Pitch%, of course, and PWS = Pitch%*DWS while FWS = Field%*DWS. The Braves’ fielders get 54 win shares collectively.
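
Putting the seven claim points together, the split can be sketched like this. The claim point values are the rounded ones computed above, and the function name is my own.

```python
def pitch_pct(cl):
    """Pitching share of defense; `cl` maps claim point number (1-7) to its value.
    CL-1 appears twice in the denominator because it counts for both sides."""
    pitching = cl[1] + cl[2] + cl[3] + cl[4] + cl[7] + 650
    everything = 2 * cl[1] + cl[2] + cl[3] + cl[4] + cl[5] + cl[6] + cl[7] + 1097.5
    return pitching / everything

braves_cl = {1: 134.75, 2: 254.51, 3: 248.28, 4: 405.73, 5: 127.22, 6: 117.15, 7: 260}
DWS = 182  # Braves' defensive win shares

share = pitch_pct(braves_cl)
pws = share * DWS          # pitching win shares
fws = (1 - share) * DWS    # fielding win shares

print(f"Pitch% = {share:.4f}, PWS = {pws:.0f}, FWS = {fws:.0f}")
```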

There are a couple of constraints placed on these figures, but it doesn’t appear as if they are relevant today. The first is that the Pitch% must be between 60 and 75%, and the second, which takes precedence over the first, is that a team must have between .16375 and .32375 FWS/game.

My Take: I do not know exactly the logic behind these steps, because Bill does not explain what that is (specifically I mean; obviously, the formulas are given and we know which criteria are credited to pitchers and which to fielders, but outside of that, we don’t know much), but I do not like this step. If Win Shares truly represents a step forward in measuring fielding, then the step that determines how much of a team’s defense the fielders deserve credit for is pretty darn important.

First let me explain the general logic behind the scales. Each CL-x formula has an average, which represents the amount of weight it is given. For example, the average CL-1 is 100, meaning it is weighted by 100. Let me just make a list of the averages:
CL-1 = 100
CL-2 = 200
CL-3 = 200
CL-4 = 200
CL-5 = 100
CL-6 = 100
CL-7 = 202.5
Most of these are easy to figure out, because they just take some difference from league average and add it to the number. Obviously, if the team is average, zero plus that number will equal that number. The only exceptions are CL-2, which assumes an average team will have 4.5 KG, and CL-7, which is just 405*.500 = 202.5.

If you plug those averages into the Pitch% formula, you will get .675, meaning that Win Shares assumes that an average defense is 67.5% attributable to pitching, which is in line with the 2/3 approximation that some sabermetricians use.
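
For anyone who wants to verify the .675 figure, here is the arithmetic as a sketch, using the league-average claim points listed above (the variable names are illustrative):

```python
# League-average value of each claim point, as listed above.
avg = {"CL-1": 100, "CL-2": 200, "CL-3": 200, "CL-4": 200,
       "CL-5": 100, "CL-6": 100, "CL-7": 405 * 0.500}

pitching_keys = ("CL-1", "CL-2", "CL-3", "CL-4", "CL-7")

numerator = sum(avg[key] for key in pitching_keys) + 650
# CL-1 is counted twice in the denominator (once for pitchers, once for fielders).
denominator = sum(avg.values()) + avg["CL-1"] + 1097.5

avg_pitch_pct = numerator / denominator
print(f"Average Pitch% = {avg_pitch_pct:.3f}")
```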

For all I know, all of these formulas could be very well-founded and work. The only problem is, James does not explain the weightings or how he reached them. In such a crucial stage of the process, there is basically no justification offered other than “it works”. I can accept that it may give a reasonable estimate, but do not expect me to adopt your system unless I have some hard data or reasoning to back it up.

One thing that puzzles me is the willy-nilly mix of counting numbers and rates. CL-1 uses a rate, DER. CL-2 uses a rate, KG. CL-3 uses counting, walks above average. CL-4 uses counting, homers above average. CL-5 uses counting, errors above average. CL-6 uses counting, double plays above average. CL-7 uses a rate, winning percentage. This is completely flummoxing. Why does a pitching staff’s K ability get expressed as a rate, while their control ability gets expressed as a count? If a full season is played, these things should even out, but what about strike seasons? What if you try to use Win Shares in the middle of a season? This part won’t work. It takes a full season for the variances from expectation for walks, double plays, etc. to get to the same proportion used in the formulas as the variances for the rates. KG is a number between 0 and 27 whether it is April 1 or the last day of the season. But how many errors above average can you possibly be after ten games?

I can’t express just how bizarre I think this is. Win Shares will not work if you use them in the middle of a season, because these formulas will not work. They will be comparing apples and oranges.
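
To make the apples-and-oranges problem concrete, here is a sketch comparing the Braves' full-season CL-2 and CL-3 with a hypothetical team that has pitched exactly one tenth of that season at identical rates. The one-tenth scenario is my own illustration, not something from the book.

```python
LG_W_HB_PER_IP = 0.3782  # league walks + hit batters per inning, from the text

def cl2(k, ip):
    """Rate-based: depends only on strikeouts per nine innings."""
    return (k * 9 / ip + 2.5) / 7 * 200

def cl3(w_hb, ip):
    """Count-based: depends on the raw number of W + HB saved versus average."""
    return LG_W_HB_PER_IP * ip - w_hb + 200

full = {"ip": 1455, "k": 1036, "w_hb": 502}
tenth = {key: value / 10 for key, value in full.items()}  # hypothetical partial season

for label, t in (("full season", full), ("one tenth", tenth)):
    print(f"{label}: CL-2 = {cl2(t['k'], t['ip']):.2f}, CL-3 = {cl3(t['w_hb'], t['ip']):.2f}")
```

The strikeout claim is unchanged by the shorter season, while the walk claim's deviation from 200 shrinks to a tenth of its full-season size, so the two no longer carry the relative weights the formulas intend.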

There are other questions to be asked too. For one thing, strikeouts are not compared to the league average. This makes absolutely perfect sense--after all, if everybody in the league has a KG of 9, that is four fewer outs in the field than a league where everybody has a KG of 5, even if no teams deviate from the average. So I understand this step. But why aren’t walks treated the same way? After all, if there is a league like the late 40s AL with a billion walks, won’t everybody in that league allow a lot of runs due to walks, which are not controlled at all by the fielders? If one league has an average of 400 walks/team, and a second league has an average of 500 walks/team, the second league’s pitchers are all allowing a lot of runs without involving the fielders at all. And the same argument goes for home runs. In a “three true outcomes” game, in which every play is a homer, walk, or strikeout, you don’t need fielders. You can do the old Satchel Paige legend and have them sit on the mound.

This would cause problems in the Win Shares system, though, because one can argue against that by saying that we cannot know whether a certain walk rate is good or not unless we evaluate it against the context (read league average). 4 walks/game is a solid performance in the 1949 AL where the average is 4.5, but atrocious in the 1880 NL where the average is 1.1. But is this not true for Ks as well? Walter Johnson was a great strikeout pitcher in his day, but his strikeout rates look like Nate Cornejo’s compared to Nolan Ryan’s. Part of the problem, then, is that the pitching/fielding split is kept constant over time. If you had a league with a very low strikeout and walk rate, you would want to increase the fielding share, but still credit those pitchers that excelled in strikeouts and walks.

My point is that there are two aspects to the K/W/HR rates. One is obvious: how does it compare to the other pitchers in the league? The second is more subtle: what do the absolute rates say about the importance of pitching in this league? The W and HR claim formulas address the first question; the K formula addresses the second. Ideally, both questions would be answered. The first question might change the percentage of team defense attributed to pitching; the second would address the percentage that is assumed to be the case for an average team. In the "Three True Outcomes" game, fielding is zero. In a league in which there is a mix of the three true outcomes and balls in play, but all the pitchers allow them at the exact same rate, there is no difference which pitchers you have, and so pitching must be zero.

James says that even with the use of .52/1.52 instead of .5/1.5, pitchers seem to rate too low. And perusing Win Shares lists, one tends to agree with him. According to Win Shares, as best as I can tell, the last pitcher ranked as the top player in the league was Steve Carlton in 1972. In the most recent year published in the book, 2001, the majors’ top rated pitcher was Randy Johnson with 25.70 Win Shares. The top five position players in each league easily exceeded this. The AL WS leader was Jason Giambi at 35.81, but fifth place Jim Thome came in at 29.44. The NL WS leader was Barry Bonds at 52.22, but Gary Sheffield came in fifth at 28.32.

One reason for the low ranking of pitchers could be a flaw in other sabermetric methods. For example, when we use Run Average or ERA together with a baseline to rank pitchers, we credit all of the marginal impact to the pitcher. But of course we also recognize that the defense has played a part in preventing those runs. So we may well be overrating the pitcher’s impact. While that is likely a factor, I think there is a much more basic reason for the relatively low rankings of pitchers: the fact that the pitchers’ share is determined at the team level, and not the individual level.

Win Shares holds that excellence in getting strikeouts and preventing walks and homers shows that more credit should be given to the pitchers. But of course the K, W, and HR skills vary wildly among individuals on a staff. When the Diamondbacks’ 2001 Win Shares were split up, Randy Johnson and Curt Schilling both were excellent in those areas to help bolster the team’s pitching share. But soft-tossers like Mike Morgan and Greg Swindell were on that team too, bringing it down. The pitching share is obviously different depending on who is on the mound. And different from era to era.

But Win Shares, by apportioning the Win Shares to the pitching staff and then to individuals, uses the total team performance, and therefore cannot properly credit Johnson and Schilling. I would suggest that it could do more to properly credit them by emphasizing the three true outcomes in the pitcher claim point process which distributes the team pitching WS to the individual pitchers. But as we will see, the criteria there are R, ER, IP, W, L, SV, and HLD: essentially, nothing that will allow Johnson to gain tons of points for not relying on his defense (outside of the positive effect excellence in the three true outcomes has on ERA, Wins, etc.). What I am suggesting are additional bonuses for doing things that increase the team’s pitchers’ share. I am not suggesting that this approach would be ideal, because I think the best thing to do would be to allocate the percentage differently for each pitcher. However, the “bonuses” might be the easiest way for these factors to be incorporated into the Win Shares framework.

One final little thing is that the pitching/fielding split excludes pitchers from receiving fielding win shares. There is, I think, a good case to be made for doing this, given that pitchers’ runs allowed totals include whatever defensive impact they had. However, I am not sure how this would impact comparisons of the pitching/fielding breakdown to other estimates of the breakdown done by other people who may have included pitchers’ fielding in the total share for fielding (of course, the fielding impact of pitchers is very small compared to other positions, at least as far as I know).

Component ERA in Win Shares
Component ERA (ERC) is used for a very small portion of the Win Shares method of distributing pitching win shares to individuals, but the formula is complex and so I put it in a separate section. ERC is an estimate of what a pitcher’s ERA should be given his component stats (IP, H, W, HR, etc.). It is based on the Runs Created model of runs = A*B/C where A = baserunners, B = advancement, and C = PA.
A = H + W + HB
B = ((H - HR)*1.255 + HR*4)*.89 + (W + HB - IW)*.56
C = BF
You then take A*B/C to estimate runs allowed. Runs allowed are divided by innings and multiplied by 9 to estimate run average. Then, if the RA estimate is >=2.24, subtract .56 to get ERC. If it is less than 2.24, multiply by .75 to get ERC.

However, in the Win Shares methodology, Bill just adds back the earned run part, so we can ignore it. Therefore, I will simplify the formula and call it RAC (for Component RA). RAC = A*B*9/(BF*IP)

I will run through this for Mike Stanton, who on the basis of recording a save or hold will get credit for “save equivalent innings” and therefore will have RAC play into our evaluation of his performance. Stanton pitched 52 innings, allowing 51 hits, 29 walks (7 intentional), zero hit batters, and 4 homers while facing 236 batters. Therefore:
A = 51 + 29 + 0 = 80
B = ((51 - 4)*1.255 + 4*4)*.89 + (29 + 0 - 7)*.56 = 79.06
His RAC is 80*79.06*9/(236*52) = 4.64, significantly better than the 6.06 RA he actually allowed.
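
The component formula and the ERC conversion described earlier can be sketched together. The constants come from the formulas above, Stanton's line is the one just given, and the function names are my own.

```python
def component_ra(h, w, hb, iw, hr, bf, ip):
    """RAC: Runs Created-style estimate of runs allowed per nine innings."""
    a = h + w + hb                                                 # baserunners
    b = ((h - hr) * 1.255 + hr * 4) * 0.89 + (w + hb - iw) * 0.56  # advancement
    return a * b * 9 / (bf * ip)

def erc_from_ra(ra_estimate):
    """Convert the RA estimate to ERC: subtract .56, or multiply by .75 below 2.24."""
    if ra_estimate >= 2.24:
        return ra_estimate - 0.56
    return ra_estimate * 0.75

stanton_rac = component_ra(h=51, w=29, hb=0, iw=7, hr=4, bf=236, ip=52)
print(f"Stanton RAC = {stanton_rac:.2f}")
```

Since Win Shares adds the unearned part back in, only `component_ra` matters for what follows; `erc_from_ra` is included just to show the two-branch conversion.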

My Take: The ERC formula is pretty straightforward assuming use of Runs Created. Of course, I would prefer to see a more accurate run estimator used as the basis, but that’s not happening. 1.255 is an approximation of the average number of TB per non-HR hit. I am not exactly sure why Bill multiplies estimated TB by .89 and walks by .56, while in the regular RC formula TB are weighted around 1 and walks around .26.

The subtraction of .56 is also a little confusing, since I believe research shows that assuming a multiplier all the way is more accurate (approximately 90% of runs are earned). If anything, one would expect the ERA and RA for very good pitchers to be closer (linearly) than those for bad pitchers, so multiplying by 75% for low RAs would make it worse. Of course, this is of no consequence in Win Shares because the unearned part is added back in.