Tangotiger Blog

Friday, February 01, 2019

Park factor v sequencing

In this post, Hareeb makes this observation:

As it turns out, properly removing park factor noise (wRC+) is more important than capturing sequencing (Runs Scored).

I never really thought about it, but it seems like an insightful observation. Could we have figured that out without doing a regression?  Let's see.  I've never done this before, so let's see where it takes us.

As Hareeb reminds us, a high runs scored total comes from:

  1. high offensive talent (think true talent wOBA + true talent baserunning)
  2. timing of good events
  3. run-friendly parks?

Which is the most impactful?  We can try to make a decent estimate.  Let's take them one at a time.

Spread in team talent (offense and defense) 

  • One standard deviation in observed win% is about .072. Removing the random variation over 162 games (about .039), we can infer that one SD of true talent win% is .060.
  • And since offense = defense, then we can estimate that one SD of win% attributed to offense is 1/root2 of .060, or .042.
  • And since 10 runs ~ 1 win, then one SD of true talent run scoring per game is 0.42
  • So over 162 games, that's 1 SD = 68 runs of true talent (or 1 SD = 82 runs of observation)
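The chain of steps in the list above can be sketched in a few lines of Python. This is only a rough check, not a definitive method: the .072, the offense = defense split, and the 10 runs per win come from the post, while treating the random part as binomial luck for a .500 team is my assumption about how the .060 was inferred. The variable names are made up for illustration.

```python
import math

GAMES = 162
OBSERVED_SD_WIN_PCT = 0.072   # from the post
RUNS_PER_WIN = 10             # from the post

# Assumption: random variation is binomial luck for a .500 team over 162 games.
random_sd_win_pct = math.sqrt(0.5 * 0.5 / GAMES)

# Observed variance = talent variance + random variance, so subtract in quadrature.
talent_sd_win_pct = math.sqrt(OBSERVED_SD_WIN_PCT**2 - random_sd_win_pct**2)

# Offense and defense contribute equally, so each gets 1/root2 of the spread.
offense_talent_sd = talent_sd_win_pct / math.sqrt(2)
offense_observed_sd = OBSERVED_SD_WIN_PCT / math.sqrt(2)

# Convert win% per game to runs over a full season.
talent_runs = offense_talent_sd * RUNS_PER_WIN * GAMES
observed_runs = offense_observed_sd * RUNS_PER_WIN * GAMES

print(f"true talent SD win%: {talent_sd_win_pct:.3f}")
print(f"true talent runs: {talent_runs:.1f}, observed runs: {observed_runs:.1f}")
```

Carried without rounding, the true talent figure lands about a run above the 68 in the post, which rounds at each step. That does not change the conclusion.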

Spread in sequencing

  • Roughly speaking, one SD of random variation of wOBA over 162 games is: 0.5 x root(38PA x 162G) = 39, which we can scale to runs by x0.8 = 31 runs
  • If we add the random variation of wOBA to the true talent of the team, in quadrature, we get one SD = about 75 (root of 68^2+31^2)
  • That is still short of the 82 runs we observe. The gap, again in quadrature (root of 82^2 - 68^2 - 31^2), is about 34 runs, which is probably the effect of sequencing.  I don't necessarily like this "leftover" approach, but we just need a decent starting point
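Here is a minimal sketch of the sequencing arithmetic, assuming the sources of variation are independent so that their SDs combine in quadrature (root of the sum of squares). The 0.5, 38 PA, 0.8, 68, and 82 are the post's numbers; the variable names are hypothetical.

```python
import math

PA_PER_GAME = 38
GAMES = 162
SD_PER_PA = 0.5        # the post's rough per-PA figure
RUNS_SCALE = 0.8       # the post's scaling to runs
TALENT_RUNS = 68       # one SD of true talent, from the post
OBSERVED_RUNS = 82     # one SD of observed runs, from the post

# Random variation of wOBA over a season, scaled to runs.
random_woba_runs = SD_PER_PA * math.sqrt(PA_PER_GAME * GAMES) * RUNS_SCALE

# Talent and random variation combine in quadrature.
talent_plus_random = math.hypot(TALENT_RUNS, random_woba_runs)

# Whatever is left of the observed spread gets attributed to sequencing.
sequencing_runs = math.sqrt(
    OBSERVED_RUNS**2 - TALENT_RUNS**2 - random_woba_runs**2
)

print(f"random wOBA: {random_woba_runs:.1f} runs")
print(f"talent + random: {talent_plus_random:.1f} runs")
print(f"sequencing (leftover): {sequencing_runs:.1f} runs")
```

The leftover is sensitive to rounding in the inputs, so treat it as "somewhere in the low to mid 30s" rather than a precise estimate.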

Spread in parks

  • One SD in park factors is probably 5%, which means that with ~ 4.5 x 162 = 729 runs, 5% is 36 runs
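The park number is a one-liner, but it helps to put it next to the other two. A quick sketch, using the post's guess of 5% for one SD in park factors and its figures of 34 and 31 runs for sequencing and random variation (the dictionary and its keys are just illustrative names):

```python
RUNS_PER_GAME = 4.5
GAMES = 162
PARK_SD = 0.05   # the post's guess for one SD in park factors

season_runs = RUNS_PER_GAME * GAMES
park_runs = PARK_SD * season_runs

# One SD of each source, in runs per season. The last two are the post's figures.
spreads = {
    "parks": park_runs,
    "sequencing": 34,
    "random wOBA": 31,
}

# List the sources from largest to smallest spread.
for source, runs in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source}: {runs:.0f} runs")

largest = max(spreads, key=spreads.get)
```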

Sooooo... the spread in parks, the spread in sequencing, and the spread in random variation are.... all very similar (36, 34, and 31 runs), with parks taking the slight lead!  At least using this approach.

Hareeb points out that:

  • wOBA + minimizing park effect = wRC+
  • wOBA + park + sequencing = Runs Scored

And since wRC+ beat out Runs Scored, that means neutralizing park effects has more impact than capturing sequencing!  A brilliant observation.  And given my approach, I would have expected something pretty close to that (though not necessarily to that magnitude).

Fantastic, I learned something new!

