Tangotiger Blog

Tuesday, December 19, 2017

Carry by Park

By Tangotiger

In the second of three interesting articles, the author talks about the "carry" of the ball (over two separate articles). And he confirms what I found, and others found years ago: essentially, 1 extra foot of carry results in 3% more HR. He also makes a good point regarding the fences being "closer": since the fences are not circular, a hitter will more likely hit a HR if he hits the ball closer to the line than if he hits it straightaway. While this is obvious, I like the description of saying that the hitter himself can "move" the fences closer. You can also think of "carry" based on the spin rpm and axis. You can think of shooting pool, where if you hit the ball too far off-center, then no matter how high the "exit velocity" of your cue stick is, you won't get any speed along that attack angle.
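To put rough numbers on that rule of thumb, here is a minimal sketch. It assumes the 3%-per-foot figure can be read either as compounding per foot or as adding linearly; the post doesn't say which, and the function name and the 100-HR baseline are made up for illustration.

```python
# Rule of thumb from the post: about 3% more HR per extra foot of carry.
HR_GAIN_PER_FOOT = 0.03


def expected_hr(baseline_hr, extra_carry_ft, compound=True):
    """Scale a baseline HR total for extra feet of carry.

    Whether the 3% compounds per foot or adds linearly is an assumption;
    both readings are offered so the difference can be seen.
    """
    if compound:
        return baseline_hr * (1 + HR_GAIN_PER_FOOT) ** extra_carry_ft
    return baseline_hr * (1 + HR_GAIN_PER_FOOT * extra_carry_ft)


if __name__ == "__main__":
    # Hypothetical baseline of 100 HR, with 1, 3, and 5 extra feet of carry.
    for feet in (1, 3, 5):
        print(feet,
              round(expected_hr(100, feet), 1),
              round(expected_hr(100, feet, compound=False), 1))
```

For a few feet of carry, the two readings barely diverge.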

() Comments • • Batted_Ball

Sunday, June 18, 2017

Sabercast Lesson Number 1: do not try to do too much too fast

By Tangotiger

One of the new wave of saberists out there, Andrew, does some excellent work. One of the things he's doing is taking "estimated wOBA" and applying it in place of actual outcomes, and then using that to forecast future RA/9. Except. Well, except that the best predictor, even better than FIP, is simply K minus BB per PA. That's right. Completely ignore batted balls, whether hard hit or not, by angle or not. Ignore basestealing, ignore everything. Except K and BB.

His main issue is that he basically used a chart like this:

[Chart: wOBA by exit speed and launch angle]

And simply added up those values for every batted ball. Prima facie, this is entirely reasonable. You have a high pop up? Let's count that as close to a 0 wOBA (i.e. out). You have a 28 degree 115mph shot? Let's count that as close to a 2.000 wOBA (i.e., HR). And I would say 99% of researchers would do exactly that.

But, what if I tell you that the ENTIRETY of a player's batted ball profile can be determined by the frequency of his barrels? That is, rather than assign a value to every batted ball, let's only assign one value, the same value, to just those 6% of the balls that fall into the "barrels" category?

You'd think I'm crazy, right? Look how strong that relationship is, looking at barrels to wOBA the following year (not wOBA on batted balls, but overall wOBA including BB and K!)

Well, Andrew just demonstrated that it's better to discard 100% of the batted balls, than to include all of them. (Voros is smiling.) And I'm saying, let's at least START by discarding 94% of the batted balls, and focus on the 6% hit at the ideal speed+angle.

Then, you can start adding a bit more. You can add the near-barrels, those well-struck balls that just miss being barrels. You can add the flares and the burners. And so on. Once you do this, then you'll be in a much better position to forecast the future.

(1) Comments • 2017/06/18 • Batted_Ball

Monday, May 09, 2016

Positioning the outfielders

By Tangotiger

Pirates are reeling them in.

Just to make sure this isn’t one of those instances where a team is all talk in the spring and no walk in the regular season, Mike Petriello’s got the data, and nobody’s playing a more shallow center field relative to last year than McCutchen. In 2015, McCutchen lined up 316 feet from home plate, on average. This year? He’s averaged a touch under 300 feet (data current as of May 2).

(14) Comments • 2016/05/19 • Batted_Ball • Fielding

Saturday, May 07, 2016

Flaw in Shift-Data Analysis

By Tangotiger

Everyone's doing it. And just now, Ben over at BIS through Bill's site posted a counter to Pizza's recent article.

However, there's a flaw in these analyses, one that we've been pointing out for years. And one that Bill James also pointed out in a rather lengthy post a few years ago: you cannot limit the analysis to only those times when a ball is put in play. The question is very simple, very very simple: What is the batter's overall production when he is shifted? That's the question. That's the singular simple question. And to answer that you have to (a) know when he's shifted BEFORE seeing the results and (b) include all the outcomes, including BB, SO, HR.

You cannot say that because a batter walked, struck out, or homered, the shift had no effect either way. Au contraire. Given that the batter is responding to the stimulus of a new defensive alignment, EVERYTHING is on the table. How he approaches that situation is at the very heart of the question.

Therefore, every single analysis you see... every single one, without exception... if it doesn't reference a player's overall production, and that means including BB, SO, and HR, its conclusion can simply be discarded.

***

It also gets worse because some of these trackers are only marking a shift if they think the shift affected the result. That is, a flyball hit to the LF is not being marked as "shift" data, because the reasoning is that regardless of whether there was a shift or not, the result would have been the same. That's silly on its face, and it's silly underneath the surface. I don't know if these trackers are still doing this, but this is wrong.

(14) Comments • 2016/05/12 • Batted_Ball • Fielding

Monday, May 02, 2016

Convergence of scouting and performance analysis

By Tangotiger

For some 15 years, I've been talking about the convergence of scouting and performance analysis. That is what sabermetrics really is. Sabermetrics is not just about "stats". Scouting information actually provides the necessary prior to evaluate the resulting performances. If you had zero history on Felix and Armando, and you were told to watch their perfect games, a scout might be able to distinguish differences that the resulting performances (0 for 27) did not. As I noted back in 2009, "The idea is to create a model that is complex and comprehensive enough as to make both performance and scouting data obsolete." That is, the convergence of scouting and performance analysis is when they both disappear. As much as it's not possible for that to be true, the goal is to work toward that end.

And William provides some excellent data points toward that end. In terms of "extrapolation", I would have used Alan's Trajectory Calculator as a (very strong) prior. It's inconceivable that you could have a 100mph 8-second popup that travelled 100 feet. The hitter would have to have intentionally gone for a huge uppercut. Setting that small issue aside (frankly, on data that won't exist anyway, so it doesn't really matter how you model the parts of a system that can't happen), we get into the fun stuff:

When used in conjunction with Steamer, the impact of exit velocity was still significant, both statistically and practically. We can expect a hitter to outperform his Steamer projected wOBA by roughly three points for each mph of previous-season exit velocity. (We found a similar effect when using either ZiPS projections or an average of ZiPS and Steamer.)

Note that since they are using wOBA (which includes BB and SO), what they are forecasting is not just future batted balls, but all plate appearances. Which may SEEM wrong, or at least weird, but when you think about it: guys who hit harder have an effect on how a pitcher will pitch to them. You wouldn't just want to forecast a hitter's wOBA on Contact based on his exit speed, but his wOBA on ALL his plate appearances. Naturally, you could break it up so you can see wOBA on Contact and wOBA on non-contact, as well as the FREQUENCY with which he makes contact in the future. By looking at future wOBA, all that gets rolled into one.
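A minimal sketch of how that finding might be applied. It assumes the "three points per mph" is measured against some reference exit velocity, such as the league average; the quoted passage doesn't spell that out, and the function name and sample numbers are hypothetical.

```python
# From the quoted finding: roughly three points of wOBA (0.003) per mph.
WOBA_PER_MPH = 0.003


def adjust_projection(projected_woba, exit_velo_mph, reference_velo_mph):
    """Nudge a projected wOBA using previous-season exit velocity.

    reference_velo_mph is an assumed baseline (e.g. league average);
    the source does not state what the mph are measured against.
    """
    return projected_woba + WOBA_PER_MPH * (exit_velo_mph - reference_velo_mph)


if __name__ == "__main__":
    # Hypothetical hitter: .330 projection, 92 mph against an 89 mph reference.
    print(round(adjust_projection(0.330, 92.0, 89.0), 3))
```

Because the target is overall wOBA, the adjustment applies to the whole projection rather than to a contact-only component.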

We've known about the importance of hang time ever since Robert Dudek's simple, yet groundbreaking article, way back in the premier issue of Hardball Times. William continues in that tradition, but also takes advantage of known data points. This is what sabermetrics is all about.

(4) Comments • 2016/05/03 • Ball_Tracking • Batted_Ball

Tuesday, April 19, 2016

Reliability of Batted Ball Exit Speed

By Tangotiger

Pizza gives us the results of his correlation. Note that since Pizza split his data based on time, we end up with a bias, which is most easily seen with the pitcher data. Setting that aside, I'll present his data, with one extra column:

BIP    r        Regression Amount
10     0.350    19
20     0.527    18
30     0.635    17
40     0.679    19
50     0.732    18

As he showed, the more trials you have, the higher the correlation. This is a given (setting aside systematic bias). If you had a billion trials, you'd have r almost exactly 1. That's why reporting correlation without reporting the number of trials is useless. So, good job by Pizza in showing us this progression.

The extra column I added is the Regression Amount, which is simply (1-r)/r * number of trials. We hope and expect that the regression amount is constant, regardless of the number of trials. And, lo and behold, it is! You basically just add 18 batted balls at league average exit velocity, and, voila, you have an estimate of the hitter's true talent level of exit speed.
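The arithmetic can be checked with a short sketch. The (BIP, r) pairs are Pizza's figures from above; the function names, the 89 mph league average, and the sample hitter are hypothetical, chosen only to show the mechanics.

```python
def regression_amount(r, n_trials):
    """Trials of league-average performance to add: (1 - r) / r * n."""
    return (1 - r) / r * n_trials


def true_talent_estimate(observed_mean, n_trials, league_mean, ballast=18):
    """Blend the observed mean with `ballast` trials at the league mean."""
    return (observed_mean * n_trials + league_mean * ballast) / (n_trials + ballast)


# (BIP, r) pairs reported in the post.
PIZZA_DATA = [(10, 0.350), (20, 0.527), (30, 0.635), (40, 0.679), (50, 0.732)]

amounts = [round(regression_amount(r, n)) for n, r in PIZZA_DATA]

if __name__ == "__main__":
    print(amounts)
    # Hypothetical hitter: 95 mph over 18 batted balls, 89 mph league average.
    print(true_talent_estimate(95.0, 18, 89.0))
```

With 18 batted balls observed and 18 added, the estimate lands halfway between the hitter's observed average and the league average.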

If you remember, we did something similar back in the early PITCHf/x days for a pitcher's THROWING speed. And if I remember right, the regression amount we added was... 1 pitch. That is, a pitcher's true talent throwing speed is almost instantly known.

Which of course makes sense, since there's no variable between the pitcher and his release that we really have to consider. With a batter, he has to respond to the pitcher, and he may not hit the ball squarely, or on the same plane. So, there's a gap between his base talent level and his results.

The closer what you do is to what you deliver, the less regression you need. In addition, the wider the overall spread in talent, the less regression you need. When you have one guy averaging an exit speed of 90mph and another 75mph, that's a wide spread to begin with. It's much quicker to establish that correlation than if everyone exited at 88-90mph.

(26) Comments • 2016/04/21 • Ball_Tracking • Batted_Ball

Friday, February 05, 2016

wOBA on contacted balls: (almost) halfway between Batting Average and SLG

By Tangotiger

[Chart from @darenw and @StatCast]

This is a terrific chart from @darenw and @StatCast. I'd make each line proportionate to the frequency, so it really stands out in terms of Cabrera's swing plane matching his production level. That is, you'd like to see that where he gets the best production is also when he swings it the most.

Anyway, so Daren shows both batting average and HR, which is one of the very few times I like batting average. Batting average directly corresponds to successful contact, and HR directly corresponds to "perfect" contact.

Missing in there are doubles+triples, which is where wOBA comes in. However, given that both BA and HR are key pieces of data that you must show, the question is if we want to introduce a third piece of data, and whether that third piece should be wOBA or SLG. Let me make the case for SLG (even though my preference is wOBA).

Batting average has a "1" for 1B, 2B, 3B, HR. SLG has values of 1, 2, 3, 4, respectively. If you take 60% of batting average and 40% of SLG, you get this: 1, 1.4, 1.8, 2.2.

wOBA is 0.9, 1.25, 1.6, 2.0. If you multiply all that by 1.11 for scaling purposes, you get: 1.00, 1.39, 1.78, 2.22. What does this mean? In effect, wOBA (on contacted balls) is 60% batting average and 40% SLG. You can therefore introduce SLG, instead of wOBA, and you'll get to convey the information you need.
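The comparison can be laid out in a few lines. The weights are the ones quoted above; the dictionary names are only for illustration.

```python
# Per-hit values quoted in the post.
BA_WEIGHTS = {"1B": 1, "2B": 1, "3B": 1, "HR": 1}
SLG_WEIGHTS = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
WOBA_WEIGHTS = {"1B": 0.9, "2B": 1.25, "3B": 1.6, "HR": 2.0}

# 60% batting average + 40% SLG, hit type by hit type.
blend = {hit: 0.6 * BA_WEIGHTS[hit] + 0.4 * SLG_WEIGHTS[hit] for hit in BA_WEIGHTS}

# wOBA weights multiplied by 1.11 so that a single is worth about 1.
scaled_woba = {hit: 1.11 * weight for hit, weight in WOBA_WEIGHTS.items()}

if __name__ == "__main__":
    for hit in BA_WEIGHTS:
        print(hit, f"{blend[hit]:.2f}", f"{scaled_woba[hit]:.2f}")
```

The two columns stay within a few hundredths of each other for every hit type, which is the sense in which wOBA on contact is "60% batting average and 40% SLG".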

I should point out however that you CANNOT IGNORE SF, not for the purposes of the chart above. While batting average is an official stat, and so, we can't redefine it, "batting average on contacted balls" is not an official stat. So, we get to control the denominator. You can't throw away sac flies from contacted balls in the chart above. It's part of the frequency.

As well, reaching on error: those should count as well. After all, if you have a high exit velocity on a downward trajectory, that might increase the chance of error. Again, that's directly tied in to success. At some point, and soon, we'll think of "errors" being "bad" for offense as one of the most confusing things we'd have ever considered.

() Comments • • Ball_Tracking • Batted_Ball

Monday, January 25, 2016

When is having a low average score not always a bad thing?

By Tangotiger

Suppose you are an outfielder that is positioned very well. You are involved in fairly aggressive shifting. Sharp flyballs or liners get caught at a higher rate on your team than on other teams as a result. Because the balls are hit hard, you likely would not have run too much. Maybe these balls are caught at an average travel distance of 30 feet.

Suppose you are an outfielder that always plays the same spot all the time. Your team doesn't believe in shifting, not even by batting hand. You won't catch most of those liners, but you will catch all those lazy flyballs that you jog 60 feet to catch.

Which outfielder did better? This is the concern when you think of average distance travelled as necessarily a good thing. Now, we don't necessarily live in such extreme conditions, so all that means is that there are layers of nuance to account for. All to say: be careful how you interpret data, like in this article.

These presentations are critical first steps, but they are the beginning. We need a lot of sifting before we can come to an opinion.

(9) Comments • 2016/01/28 • Ball_Tracking • Batted_Ball • Fielding

Tuesday, January 05, 2016

Performance by batted ball locations

By Tangotiger

Good stuff from Jeff. Having read Jeff for many years, I think Jeff would likely characterize himself as NOT a stathead. Taking that as a given, I would also say that Jeff is an extreme saberist.

Jeff has the first key ingredient, and that is, he is a subject matter expert. What he thinks about, what he uncovers, what he looks for, these are things that only a true baseball fan would even think about. A non-baseball statistics expert wouldn't necessarily think about such things. And a pure stathead might stumble upon it. But a non-stathead baseball fan? Yes, that's the kind of thing he thinks about.

Jeff has a second key ingredient, and that's to be able to translate the idea into something that can be organized into various components. And once you have those two things, all it takes is to roll up your sleeves and look for the right data in the right way. Then you get saber-magic. And you get nuggets like this.

And that Jeff is a writer of the quality of Joe Posnanski makes him an extremely readable saberist, part of the Bill James family. Obviously, no one rises to Bill's level, but Jeff has all the little things that Bill has, like comparing Ken Griffey Jr to Willie Mays.

(2) Comments • 2016/01/06 • Batted_Ball

Monday, November 30, 2015

Blast from the past: Batted Ball FIP

By Tangotiger

This was one of my favorite threads (start at post 8), notably because of Brian C's involvement a little later on, who does a tremendous job with the analysis and its presentation. I bring up this thread because of Jeff's recent article.

() Comments • • Batted_Ball

Tuesday, August 11, 2015

Bias of BIS stringers?

By Tangotiger

UPDATE: see comment 7.

***

As I was reading this article, I was thinking: "human stringers are going to have a higher correlation than the camera/Doppler with ISO". And, it's true.

***

Interlude/update: I just read the comments, and Rally said the same thing:

When I see stronger correlations with ISO, SLG, etc. to hard hit%, the first question that pops into my head is how much scoring bias is here. If you’re trying to decide whether a borderline hit was hard or medium, I’d guess that the one that falls for a double is more likely to be scored hard hit than the same ball caught by an outfielder.

***

The author reports that the BIS stringer data has an r-squared of .70 (r=.84) between ISO and human-tracked "contact strength". ISO you will note is SLG minus batting average (basically, extra base hits, with extra weight for HR).

The author also reports that the correlation between batted ball speed and ISO is r=.62.

So, does this mean that how a human establishes "contact strength" is better than how a camera/Doppler/algorithm does it? No. It's basically evidence that a human stringer is more likely to mark a batted ball with higher "contact strength" if it went for extra bases than if it was caught.

Possible biases could be with ground balls. You can have a three-hopper going to the SS with an exit velocity of 95mph. But if it's a routine out at 1B, what are the chances that the stringer is going to mark that as hard-hit? And similarly, a ball launched at 40-45 degrees at 95mph won't be marked as hard-hit as often as those launched at 15-20 degrees at 95mph.

We have to accept one thing: the exit velocities being reported (at least through Trackman/StatCast) are the gold standard. (Sportvision is the silver standard.) The human tagging of a play as hard-hit or not is inferior. And so, if say a human tags 20% (I'm making up the number) of exit velocities of 70-75mph as "hard hit", and furthermore, among those 20% are gap hits, that's a bias. A human bias.

In order to validate BIS (or any human stringer), we need to see the correlation of the BIS data and the outcomes (1b, 2b, 3b, hr, out) against the Gold (or Silver) standard (exit speed and launch angle). Then the bias will be apparent. And it'll explain the correlations noted above.

(8) Comments • 2015/08/12 • Ball_Tracking • Batted_Ball

Sunday, August 09, 2015

Bias in the published v unpublished StatCast data

By Tangotiger

Great job by Henry (and Tony in the linked article) in trying to figure out what data is missing, and how.

() Comments • • Ball_Tracking • Batted_Ball

Wednesday, July 08, 2015

Infield Hit Rate by Speed Score

By Tangotiger

A cool chart that illustrates the relationship between speed and getting infield hits.

(4) Comments • 2015/07/09 • Baserunning • Batted_Ball

Saturday, May 09, 2015

Flyball distance as function of batted ball speed

By Tangotiger

Courtesy of our buddy Alan Nathan. He's showing there's a maximum distance by speed, presumably because of launch angle and backspin. Just a very lovely chart.

() Comments • • Ball_Tracking • Batted_Ball

Tuesday, May 05, 2015

Moustakas beats the shift, but you wouldn’t know it

By Tangotiger

Terrific piece by Chris.

Chris does what others don't, and that is, look at ALL PA, not just the groundballs. As Chris shows, because Moustakas faces a severe shift, it allows him to go the other way and hit liners. Those liners the other way are a result of the shift. It's a cost to the defense of the shift. Don't say some sh!t like "oh, it's a liner, so, it's irrelevant if there was a shift". No, that's not how this works. You gotta look at it holistically.

() Comments • • Batted_Ball • Playing_Approach

Batted Ball data at Fangraphs

By Tangotiger

David rolls it out, and Tony gives us some details.

(3) Comments • 2015/05/06 • Batted_Ball

Wednesday, March 18, 2015

Trajectory platoon

By Tangotiger

Good stuff from Shane, as he follows up on some findings in The Book and looks at things in a more granular manner. He's basically showing that when same-type pitchers/batters face off, you get extreme results, at the expense of line drives. But when opposite-type players face off, they cancel themselves out, resulting in a few more line drives. It's a nice way to show the effect.

() Comments • • Batted_Ball • Platoon

Thursday, June 12, 2014

Matt Cain and his ever-stagnant low BABIP

By Tangotiger

Lewis makes a nice statement here:

I fear we are becoming far too quick to identify outlier pitchers as exceptions to DIPS norms rather than understanding them to be manifestations of typical population variance.

I don't know who this "we" is, but the main point is interesting in its description. The basic idea is that we don't have true exceptions, or even true outliers. What we do have is simply a distribution of talent. With K/PA, that distribution of talent is quite wide in MLB. With BABIP, that distribution of talent is quite narrow. In either case, it's a distribution, not a bunch of players in one spot, and then a few exceptions on the tails.

This is why we apply Regression Toward The Mean. We have a reasonable idea as to the width of the distribution of BABIP talent in MLB. Given the observed BABIP and the number of BIP those observations are based on (plus whatever park factors we have, and if you can use FB/GB/LD tendencies, all the better), we can make an estimate as to what each pitcher's BABIP talent is.

And when it comes to an extreme case like Cain, we're going to move that needle somewhat toward the population mean, and we move it less, the more data we have. What we end up with therefore is still a distribution, no outliers, no exceptions.

I don't know that it helps to be this technically correct. But to the extent that we shouldn't be lazy about it, I guess I agree.

(10) Comments • 2014/06/23 • Batted_Ball • Pitchers

Friday, April 04, 2014

Ode to Joey Votto

By Tangotiger

David is a bit of a fan.

() Comments • • Batted_Ball

Sunday, March 02, 2014

“DIPS revisited” revisited

By Tangotiger

This article from Max reminded me of this article by MGL ten years ago. It looks like Max is on the right path. I didn't look at all the particulars, so hopefully the Straight Arrow readers will critique it.

() Comments • • Batted_Ball
