Tangotiger Blog

A blog about baseball, hockey, life, and whatever else there is.

Basketball

Thursday, June 18, 2020

Floating Replacement Level

This discussion is easier to think about in hockey terms: When Sidney Crosby goes down, his 22 minutes get picked up by the other 11 forwards (1 minute each) and the 13th forward (11 minutes). So basically, Crosby gets replaced by 50% an average Penguins forward and 50% the bubble player.

On the other hand, when the 12th forward goes down, his 11 minutes gets picked up totally by the 13th forward. Same thing with the six defenders, or with the goalie.
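
The chaining arithmetic in the two cases above can be sketched in a few lines. This is only an illustration: the function name `replacement_mix` and its parameters are made up here, and it assumes the regulars each absorb a fixed number of extra minutes while the bubble player takes whatever is left.

```python
def replacement_mix(minutes_lost, regulars, extra_per_regular):
    """Split a missing player's minutes between the regulars and the bubble player.

    Returns (share picked up by regulars, share picked up by the bubble player).
    Assumes each regular absorbs a fixed number of extra minutes (hypothetical model).
    """
    to_regulars = min(minutes_lost, regulars * extra_per_regular)
    to_bubble = minutes_lost - to_regulars
    return to_regulars / minutes_lost, to_bubble / minutes_lost


# Crosby: 22 minutes, 11 other forwards take 1 minute each, 13th forward takes the rest.
print(replacement_mix(22, regulars=11, extra_per_regular=1))  # (0.5, 0.5)

# 12th forward: 11 minutes, all of it picked up by the 13th forward.
print(replacement_mix(11, regulars=11, extra_per_regular=0))  # (0.0, 1.0)
```

The first number is how much of the replacement is "average teammate" and the second is how much is "bubble player", which is the floating part of the replacement level.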

When it comes to baseball, the concept of chaining would also apply, BUT NOT AS MUCH, as Patriot describes very well here (look for the section titled Chaining).  In hockey, players are much more fluid in terms of giving out playing time.  There are 120 minutes to give out to the defenders.  When one guy goes down, everyone below him steps up a bit, getting a couple more minutes, and the 7th player slides into the 6th slot.  With baseball, it's somewhere between goalie and defender: not as rigid as a goalie, but not as fluid as a defender.  You could slide someone up the batting lineup, but you wouldn't necessarily slide the regular 2B to SS.  It would be too unfamiliar.

And so, in the Crosby example, where you could argue it's basically half way between average and bubble, in baseball, it's going to be much closer to the bubble line, even for the top-end player.  And so we kinda take the lazy way out and apply 100% the bubble player.  But don't think that's RIGHT.  It's just EASY and close enough.  Be careful in applying the concept to other sports like hockey or basketball.

If you want a thought exercise: if your active roster was 40 players or 100 players in MLB, NHL, or NBA, would you take the LAST player as the bubble player?  No.  Then we can see how the easy way we applied on a 25-player roster is the WRONG way. It won't be close enough to right.  It'll be close enough to wrong.

So, you just have to be careful to understand WHY we made the choices we made, and see how it can apply to your circumstances.

Thursday, April 02, 2020

Do fans prefer small or large post-seasons?

I asked that question of NHL, NBA, NFL, CFL, and Euro soccer fans. And to gauge their interest in a tiny to wide open post-season, I offered stark choices: either 2 teams, or 75-80% of all the teams. No middle-ground. When the chips are down, are you a small-playoff or big-playoff fan?

I’ll start with NFL. I first asked to consider an 18-game season (which is more than the current 16, but it’s been talked about forever and it’s in-line with the CFL). By a 70/30 margin, those fans preferred a 2-team playoff (in other words, play right away for the Super Bowl) to a 24-team playoff. In other words, by going to 18 games, the fans did not have an appetite for an extended playoff season.

However, when I suggested an 8-game season, the tables were reversed: By a 60/40 margin, those fans preferred a 24-team playoff to a 2-team playoff. That is, when the regular season is too short, the fans would like an extended playoff season.

Logically though, 100% of fans should have preferred the 24-team playoff. After all, that would suggest another 4 or 5 rounds of playoffs, meaning that the bottom teams would play 8 games, while the rest of the teams would play 9 to 13 games, depending how far they go into the post-season. If you have an appetite for an 18-game regular season, why would you not want an 8-to-13 game regular+post season?

Anyway, so the midpoint is 12 games: if you have a 12-game regular season, fans are just as likely to prefer a 2-team post-season as a 24-team post season.

***

The NBA fans showed a similar split: 65% of fans prefer a 2-team post-season, after an 82-game regular season, while 57% of fans prefer a 24-team post-season after a 36-game regular season. The midpoint where fans are split down the middle we would infer as a 52-game regular season.

***

The NHL fans are much hungrier for the post-season, maybe owing to its history. At one point, they had 16 of 21 teams make the playoffs. As more teams have been added, the 16 became a mainstay.

So, 55% prefer a 2-team to a 24-team playoff with an 82-game schedule, while 63% prefer 24 to 2 with a 36-game schedule. Fans are split down the middle with an inferred 68-game regular season.

***

For Euro soccer fans, things are QUITE different. With a 38-game season (34 to 38 is the standard there), 73% prefer 2 teams. With an 18-game season, still 58% prefer 2 teams. Which logically makes no sense at all.

For example, suppose we construct a season such that the first 19 games are one game played against each of the other 19 teams in the league. Then, after that happens, the top 10 teams play one game against each other, while the bottom 10 teams play one game against each other. We’ve now constructed a 28-game regular season schedule.

And we add a provision that a win in the second half counts twice as much as a win in the first half. In other words, we get that playoff feel, but every team gets to play the same number of games. Wouldn’t THIS be preferred to stopping after 19 games, and simply awarding the championship to one of the top 2 teams?

***

The 9-team CFL fans were offered no playoffs at all, just award the Grey Cup to the top team after 18 games, or 8 of the 9 teams making the post-season. 59% preferred an auto Grey Cup. For a 12-game regular season, 54% preferred an 8 team post-season. The midpoint is a 14-game season.
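
The post does not say how the midpoints were inferred. One plausible reading, and it is only an assumption, is a straight line between the two poll results for each league. The helper `split_point` below is a made-up name for that interpolation. It lands in the same neighborhood as the quoted midpoints (roughly 11, 51, 69 and 14 games against the quoted 12, 52, 68 and 14), not exactly on them.

```python
def split_point(long_games, long_pct_small, short_games, short_pct_small):
    """Season length at which 50% would prefer the small playoff.

    Straight-line interpolation between two poll results (an assumed method).
    The pct arguments are the percent preferring the SMALL post-season.
    """
    slope = (long_pct_small - short_pct_small) / (long_games - short_games)
    return short_games + (50 - short_pct_small) / slope


# (long season, % small playoff, short season, % small playoff), from the polls above
polls = {
    "NFL": (18, 70, 8, 40),
    "NBA": (82, 65, 36, 43),
    "NHL": (82, 55, 36, 37),
    "CFL": (18, 59, 12, 46),
}

for league, args in polls.items():
    print(league, round(split_point(*args), 1))
```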

Monday, July 29, 2019

We’re all Statheads; we just choose our own stats

"We're all Statheads; we just choose our own stats"

-- Cory Schwartz

I was reminded of what Cory likes to say when I made a flippant remark, in a not-so-subtle guise of a poll, to the point that hockey's plus/minus is better than totally useless.  I figured being better than totally useless was an easy bar for a baby to crawl over.  Unfortunately, a healthy one-third disagreed, and those who disagreed were more vocal about it.  Some like the usually loquacious MannyElk went with the direct "delete this".  CJ tried to be more nuanced. And everyone and his brother at EvolvingWild just disagreed.

Score Differential

In baseball, it's natural for us to look at team run differential, and make that the core to our metrics.  Indeed, that's the core to WAR found at Baseball Reference, extended down at the player level.

In hockey, you can follow a similar idea, look at goal differential, and make that the core to our metrics.  (And similar for basketball, football, soccer.  And while I know nothing about cricket, I'll say: cricket too.)  And naturally, if you are looking at it at the team level, you'd want the sum-of-the-parts to equal the whole.

Now, it's NOT NECESSARY that you apply the sum of parts theory.  After all, to do that, you'll have to decide how to handle Random Variation.  There's only so much you can identify at the player level, and so, there's going to be a gap in our knowledge.  Bill James for example in Win Shares simply plows through it, and insists on it.  I on the other hand am content to say "I dunno", and create a timing bucket.  Regardless though, the key point is that all the runs are accounted for.  They may not be accounted for at the PLAYER level (which Bill would insist upon), but at least I can account for the existence of all of them.

And the same would work in hockey.  If a team scores 4 goals and allows 2 goals, we should account for 4 goals scored and 2 allowed. And while we'd like to have all six goals assigned to players, I am content to say "I dunno", and create some sort of unknown bucket.  This could be timing, or random variation, or simply data that is too hard to assign to players.

If you do not do this, if you don't account for all the goals, then you are simply telling the reader: "trust me, I know what I'm doing, and it doesn't add up, because I don't need it to add up".  You can of course do this.  But you are creating an unnecessary hurdle.  Rather, it's simpler to just acknowledge the gap.  In the above case, you may say a team scored 4 and allowed 2, but your process says the team is going to be assigned 2.7 goals scored and 3.2 goals allowed, because, "the process".  That's not the best way to sell something.
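
A small sketch of what acknowledging the gap looks like in practice. The player-level credits here are hypothetical numbers chosen only so that they sum to the 2.7 and 3.2 in the example, and `unknown_bucket` is an illustrative name, not an existing metric.

```python
def unknown_bucket(team_total, player_credits):
    """Goals the team actually recorded that the player-level process did not assign."""
    return team_total - sum(player_credits)


# Hypothetical player credits summing to 2.7 goals scored and 3.2 goals allowed.
scored_credits = [1.2, 0.9, 0.6]
allowed_credits = [1.5, 1.0, 0.7]

# The team actually scored 4 and allowed 2, so the buckets make up the difference.
print(round(unknown_bucket(4, scored_credits), 1))   # 1.3
print(round(unknown_bucket(2, allowed_credits), 1))  # -1.2
```

With the bucket reported next to the player totals, every goal is accounted for, even if not at the player level.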

Extending Differentials

Now, goal differential is a core metric at the team level.  And extending it at the player level is also a core metric.  Hockey complicates things because of the man-advantage scenarios. And that players don't play with everyone.  And the number of goals is low to begin with.  Which is why we talk about adjustments.  This is common in baseball, where we can adjust our core metric, like say wOBA or ERA, based on the scoring environment or other influences.

Hockey's plus/minus is already in the currency we want for a core stat: goal differential.  However, there are other plus/minus stats you can do.  You can do it for all shots, which of course includes goals.  So as to not mix anything, I'll call goal differential NetGoals.  And so we also have NetShots.  NetShots is of course at its core, NetGoals plus NetNongoalShots.  In other words, if you are going to praise NetShots and deride NetGoals, what you are saying is that NetNongoalShots is pivotal.  That including NetNongoalShots is what makes or breaks NetShots.  That relying on NetGoals is totally useless, even with adjustments.  And that is an untenable position.

Merging and Unraveling

You can also try to argue that since we have both NetGoals and NetNongoalShots, we therefore no longer need to focus on them as components, that we can simply look at NetShots.  Or some sort of weighting of the two, but still, amalgamated into one metric.  This is like arguing that if you have wOBA, you don't need OBP.  Or you don't need K/PA.  Au contraire, the components are the key.  And that's because the weightings of the various components are not a given.  It is often necessary to keep them separate, because the weightings are dependent on the number of trials.

RBIs are totally useless if you already have wOBA and RE24.  That's because you can get to RBI through those two metrics.  But if you don't have RE24, then RBIs (and Runs Scored) do have some non-useless value.  They are not totally useless.  The timing of events is important.  And distinguishing between goals and non-goal-shots is important.  And how you distinguish between goals and non-goal-shots is not a constant.

The key thing that I follow in my metrics is "how".  How did this happen, why did this happen, how do we explain this happened.  I don't roll my stats up into one number to let it sit there and... sit there.  The metric has to be able to be unraveled back to its components.  And I have to be able to explain it all in English (or French if I'm feeling confident). That's how I construct my metrics.  You don't have to do it this way of course. The world is a big place.

Monday, January 28, 2019

How do we know how many WAR to give out to nonpitchers and pitchers?  Or goalies?  Or QB?  Or?

If you notice on Fangraphs, they hand out 57% of the WAR to nonpitchers and 43% to pitchers. This is actually the split that I determined some 15 years ago. Baseball Reference hands out at around 59/41, presumably based on a similar technique that Straight Arrow reader Rally Monkey came up with. I don't know how much Win Shares gives out, but I think it's around 64/36?

How did I come up with 57/43? We have to know the spread in TRUE TALENT. The problem is we don't actually know the spread, so we need to infer it. And we infer it based on observing what has actually happened, and removing the Random Variation that pollutes all observations. And when you go down that road, we end up with a standard deviation of a talent distribution that is roughly a ratio of 4:3 for nonpitchers and pitchers.

If you tried to do this for the NHL, the spread is going to be roughly 60/30/10 for forwards, defensemen, goalies.

I've never done it for the other sports. However, what you will typically find (not always, and not so strict) is that player salary is a decent approximation for the split. Again: not always; not so strict. But it's a decent guidepost. And where it deviates, then you will find a market inefficiency.

Wednesday, January 16, 2019

Introduction to WAR, part 1 of n

While this is for the NHL, it borrows heavily from MLB and NBA, and has a heavy number of citations.  It's very thoughtfully laid out and should be required reading.

Disclosure: The Evolving Wild twins (might have been just one of them now that I think about it) asked me to take a look at it beforehand, and I provided some feedback. 

(1) Comments • 2019/01/17 • History Basketball Hockey

Saturday, January 12, 2019

What is the replacement pool in the NHL?

I'm going to follow a similar process that I did for the NBA (a league that I know almost nothing about... Curry... Lebron... Lin still there?... that's about it). The idea there is straightforward enough: the regulation game is 48 minutes and with 5 players on the court, that's a total of 240 minutes. With 30 teams (still 30 I hope), that means we need 7200 minutes for 30 NBA teams, each game.

Take the average minutes per game for each player, as well as their total minutes played. Sort by total minutes, then sum their minutes per game, until you get to 7200. 

The idea being that all you need are those guys if no one is hurt. They have enough talent to play 7200 minutes. Everyone else is part of the replacement pool.

Got that? No? Go back and re-read, I've got 5 minutes left here. Back? Good, let's keep going.

NHL is similar, though if someone wants to break it up into forwards and defensemen, I agree with you, and you can follow the same steps. Using data at Hockey Reference, we have a total of 754902 minutes by skaters for the 2017-18 season, or an average of 297 minutes per game. With 31 teams, that's 9207 minutes needed.

So we sort our players from most minutes played (Doughty, 2201) to least (Valk, 3 minutes). There are 890 players. Anyway, Doughty had 26.8 minutes per game, and the player with the next most minutes played (Suter) also had 26.8, and on and on we add. Again, sort on total minutes, sum on average minutes. That's important. With the 532 players with the most total minutes, we end up with 9205 minutes played per game across 31 teams. So we can draw our line at players with at least 680 minutes played.

Their total minutes played is 665,777, or 88%. And so, that's what I would suggest comprise the normal NHL roster. The 12% of minutes played is what our replacement pool is going to cover.
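
The sort-on-total, sum-on-average procedure can be sketched as below. The league totals are the ones quoted above; the five-player list is a made-up toy, and `regular_pool` is an illustrative name. One assumption to flag: this version keeps adding players until the need is fully covered, whereas the count above stops a hair short (9205 of 9207).

```python
def regular_pool(players, minutes_needed):
    """players: list of (name, total_minutes, minutes_per_game).

    Sort on total minutes, then sum minutes per game until the
    per-game need is covered. Everyone left over is the replacement pool.
    """
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    running = 0.0
    regulars = []
    for name, total_minutes, per_game in ranked:
        if running >= minutes_needed:
            break
        regulars.append(name)
        running += per_game
    return regulars, running


# League-wide need, from the figures in the post.
skater_minutes = 754902
teams, games = 31, 82
per_team_game = round(skater_minutes / (teams * games))  # 297
needed = per_team_game * teams
print(per_team_game, needed)  # 297 9207

# Share of minutes played by the 532 regulars.
print(round(665777 / skater_minutes * 100))  # 88

# Toy league (hypothetical numbers) needing 60 minutes per game.
toy = [("A", 2000, 25.0), ("B", 1800, 22.0), ("C", 1500, 20.0),
       ("D", 600, 15.0), ("E", 300, 12.0)]
print(regular_pool(toy, 60))  # (['A', 'B', 'C'], 67.0)
```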

I don't remember what it came out to for the NBA that time, but 12% seems reasonable for NHL. You could probably argue that you don't need 532 players, but maybe 496 (16x31) if everyone was healthy?

Anyway, that's one way to get there.

EV, PP, PK adds a wrinkle, and we'd want to split the pool between F and D

Friday, December 28, 2018

How to create WAR for any sport

While WAR is wins above REPLACEMENT, the most important part of WAR is the comparison to AVERAGE. Indeed, the replacement step is both an afterthought, and in some respects, unnecessary.

I'll use baseball and hockey as examples, but any sport will work the same way, whether basketball, volleyball, or cricket.

What you want to do is measure all the aspects of a player's performance relative to the league average. Not for the position, but the player, unless that position is very (VERY) distinct, like pitcher in baseball or goalie in hockey. Infield/outfield, and defenseman/forward are not distinct enough. Catcher might be, but we'll let that go for this thread.

So, figure out all the components. Hitting, running, fielding, pitching, scoring, passing, checking. As long as you got all the components, you are good. Measure the player however you want to, and compare to the league average. 

You want to measure in the currency that you can measure in, meaning bases, outs, runs, goals. And eventually, you want to convert into wins. For baseball you can use a standard 10:1 runs to win converter and in hockey 6:1 goals to wins. But, we can get into another thread how to get it more dynamic.

Once you have all that, you will have an Individualized Won-Loss Record for a player, or what I call The Indis. You simply add it up. For hockey, it might look like this:

  • 2.0 wins, 0.5 losses, scoring
  • 1.5 wins, 1.0 losses, passing
  • 0.5 wins, 1.0 losses, checking
  • 0.5 wins, 0.5 losses, positioning

You'd probably want to break it up into EV, PP, PK, and if you have more components, then by all means, include those.

Anyway, so now you have The Indis of:

  • 4.5 wins, 3.0 losses, total

And you can stop right here. Notice I haven't even talked about replacement level. Like I said: afterthought. But people love lists and single dimensions, and so, we need a way to convert that to a single dimension.

Replacement level in MLB is around .300 and probably .250 or .200 in NHL. That too is yet another thread. For the sake of illustration, I'll use .333.

  • 4.5 wins, 3.0 losses, total
  • 2.5 wins, 5.0 losses, replacement level for 7.5 individualized "games"

=========

  • +2 wins above replacement
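
The illustration above, written out as a sketch. The component records and the .333 replacement level are the illustrative ones from this post; the function name `war_from_indis` and the dict layout are just one way to hold them, not an established implementation.

```python
def war_from_indis(components, replacement_pct):
    """components: {name: (wins, losses)}, each measured relative to league average.

    Returns (total wins, total losses, wins above replacement).
    """
    wins = sum(w for w, _ in components.values())
    losses = sum(l for _, l in components.values())
    games = wins + losses                       # individualized "games"
    replacement_wins = games * replacement_pct  # what a replacement player wins in those games
    return wins, losses, wins - replacement_wins


indis = {
    "scoring": (2.0, 0.5),
    "passing": (1.5, 1.0),
    "checking": (0.5, 1.0),
    "positioning": (0.5, 0.5),
}

wins, losses, war = war_from_indis(indis, replacement_pct=1 / 3)
print(wins, losses, round(war, 1))  # 4.5 3.0 2.0
```

Note that everything up to `wins, losses` is The Indis. The replacement level only enters in the last two lines of the function.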

That's how it works. This is the framework. Don't try to get cute and try to create an "offense above replacement". You will be wrong. Not as a matter of opinion, but a matter of fact. You CAN say "offense above the offense generated by a replacement level PLAYER". That's as far as you can take it.

But like I said, anything after The Indis is an afterthought.

If there is one thing I did wrong when I rolled out WAR on my blog some 12 years ago, it was that I did not pause at The Indis level, and went straight to WAR. That's because that's what I needed at the time. Had I known WAR would take off the way it did, then I would have ensured that intermediate step would be more forceful. And it would reflect that the replacement step is just a secondary optional step.

The replacement step IS required if comparing a position player to a pitcher, or a skater to a goalie, or two players of uneven playing time, like a starting pitcher and relief pitcher, or simply an injured player to a full time player. THEN we need it. Because we eventually want to translate that into some sort of dollar value, or any kind of value.

Have fun!

***

UPDATE in response to a comment below:

I literally spent 5 minutes writing that, so it was just a stream of consciousness how-to. So, yes, there’s about two hours worth of things I didn’t write!

But to your point: what you want to do is create “game spaces”.  Let me explain it in hockey and in baseball.

In hockey, about 10% of the game space goes to goalies, 30% to defensemen and 60% to forwards.  With 82 games, you assign 8.2 games to G, 24.6 to D, and 49.2 to F.  For each player at each position, you give him his share of those game spaces based on how much ice time he had.  It gets a bit more complicated because of EV, PP, PK.
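
A sketch of the hockey game-space split. The 10/30/60 shares and the 82 games come from the paragraph above; the individual forward's ice time is a hypothetical number, and the EV/PP/PK complication is ignored here.

```python
GAMES = 82
HOCKEY_SHARES = {"G": 0.10, "D": 0.30, "F": 0.60}


def position_game_space(games, shares):
    """Game space handed to each position group."""
    return {pos: games * share for pos, share in shares.items()}


def player_game_space(position_space, player_minutes, position_minutes):
    """A player's slice of his position's game space, in proportion to ice time."""
    return position_space * player_minutes / position_minutes


space = position_game_space(GAMES, HOCKEY_SHARES)
print({pos: round(g, 1) for pos, g in space.items()})  # {'G': 8.2, 'D': 24.6, 'F': 49.2}

# Hypothetical forward: 1,500 minutes out of his team's 14,760 forward minutes
# (82 games x 60 minutes x 3 forwards on the ice).
print(round(player_game_space(space["F"], 1500, 14760), 2))  # 5.0
```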

In baseball, about 4/7ths of the game space goes to nonpitchers and 3/7ths to pitchers.  For nonpitchers you give them their share based on PA, though things get a bit complicated with subs (and DH).  For pitchers you base it on innings, or more accurately, leveraged-innings.

Basketball is the easiest, because it’s just a pure share by court time, no positional restrictions.  Or at least I don’t think so.  I don’t follow basketball.

Tuesday, June 12, 2018

Is replacement level important for the WAR framework to work?

No!

Some twelve years ago on the old Book Blog, we would talk about off-season contract signings. And to do that, we needed to (a) forecast multi year and (b) have a single-dimensional metric to match... the single-valued contract number. You know all that stuff about "you can't put everything in one number"? Eventually, you HAVE to, either as a decision (yes or no) or as an agreement (a number, whether you are buying a pair of shoes or signing a mortgage). So, we had to come up with a number. And thus, the WAR framework that I championed took flight on that blog.

And it caught on, a bit too quickly. Because with the single number, it was too easy to miss what it was doing under the hood. The ENTIRE FRAMEWORK is built on "wins above average". The PROPER representation for that is The Individualized Won-Loss Record, or The Indis.

If for example I asked you to come up with your own representation of the W-L record for the 1983 Expos, you'd come up with something pretty close to what I have here. I don't know how far off you'd be from say Dawson's 10-1 record that I have, but I'm pretty sure that if a dozen of you provided your estimate, it would average to within 1 win of that (somewhere between 9-2 and 11-0). Probably.

The conversion from that TWO DIMENSIONAL record into the ONE DIMENSIONAL record (his WAR) is a step that is not necessary for the WAR Framework. Which is weird for me to say: that the last step of WAR is not needed for WAR. What that last step does is simply establish the "zero point", the point at which "nothing changes no matter how much more, or less, I have".

If there was one failing with the WAR that I championed a dozen years ago, it is that I didn't have the two-dimensional construction in place. At the time, I just needed WAR to evaluate contracts, and so, I needed the one-dimensional value. In effect, I didn't show what was under the hood, and just showed the car. And my attempts to try to bring into place the two-dimensional version have not been met with much enthusiasm. But the reality is that as WAR takes hold of other sports like NHL, NBA, and soccer, it becomes imperative that the two-dimensional construction takes hold. Otherwise, how and why I took the shortcut to get to the one-dimensional construction may get lost.

Saturday, November 25, 2017

How many games do you need to forecast future games in the NBA?

This is what I did:

  1. I took each team's games, ordered from 1 to 82 by date, and discarded the last 2
  2. I created 8 sets of 10 games each for each team, by the above order
  3. I matched, for each team, each pair of consecutive sets.  So, for the Bulls, I paired sets 1+2, sets 2+3, sets 3+4 .. sets 7+8.  I did this for every team.
  4. I ran a correlation.  I got a coefficient of 0.5244 correlating set 1 onto set 2
  5. I then repeated the process, but this time it was 3 consecutive sets
  6. This time, I took set 3, subtracted 0.5244 times set 2, and ran a correlation of set 1 against the "remainder" (set 3 with the set 2 coefficient applied).  I got a coefficient of 0.1419.  In other words: set3 = set2*0.5244 + set1*0.1419
  7. I continued to the next set.  This time, we got a coefficient of 0.1345 for the new set.
  8. Afterwards, I ended up close to 0 for the next 3 sets and stopped there (0, 0.0449, 0).  The two 0 were actually slight negatives, but that makes no sense, so, I capped it at 0.

In other words, the total of the coefficients was just over 0.84.  But by "forcing" a sequence like this, we can better isolate the time sequence.

We can force a fit of 0.46^Set, where Set is 1 through 7.
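
As a quick check on how well the forced fit tracks the stepwise coefficients, here is a small sketch. The observed values are the six coefficients listed in the steps above (most recent set first, negatives capped at 0); the fit uses the 0.46^Set form for Set = 1 through 7.

```python
# Stepwise coefficients from the list above, most recent 10-game set first.
observed = [0.5244, 0.1419, 0.1345, 0.0, 0.0449, 0.0]

# Forced geometric fit: 0.46^Set for Set = 1..7.
fitted = [0.46 ** k for k in range(1, 8)]

print(round(sum(observed), 4))  # 0.8457
print(round(sum(fitted), 4))    # 0.8481

# Side by side, to see where the smooth curve departs from the noisy estimates.
for k, fit in enumerate(fitted, start=1):
    obs = observed[k - 1] if k <= len(observed) else None
    print(k, obs, round(fit, 4))
```

The totals land almost on top of each other. What the fit changes is how the weight is distributed across the sets, not the total.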

(12) Comments • 2017/11/26 • Basketball

Sunday, June 11, 2017

What is the true home site advantage of each sports team?

I don't know what the true answer is yet.  But my expectation (very strong expectation for whatever that is worth) is that it is far tighter in NFL and NBA than in MLB and NHL.

What do I mean by that?  Well, each NBA game tells you a great deal more about the teams than each NHL game.  So, if we suspect that the Celtics have a .620 home site advantage and the Bruins have a .550 home site advantage, then we'd guess something like .610-.630 for Celtics and .525-.575 for Bruins.

Now, if we bring in the Patriots, we know more on a per-game basis for the Patriots home site advantage than the Celtics.  However, because we have so many more NBA games, our uncertainty level will get reduced far faster with NBA than NFL. 

More to come as I try to compose my thoughts on this matter...

Friday, January 20, 2017

Phil v ELO, men v women chess

Phil is at it again, this time going through the obvious incompatibility with NCAA v NBA (making you wonder why he is even bothering, but it is actually a foreshadowing) to then look at men v women chess ratings.

I'll offer Phil even more relevant cases: MLB AL v NL, 2005-present (but since I cherry picked 2005, you should start in 2004).  You can look at NBA West v East.  And closer to home: CFL West v East, and NHL West v East.

It offers a good look at the "leakage" issue that Phil brings up, especially since inter-conference games are much fewer in MLB than NBA.  IIRC, I think I looked at the CFL, and it seemed that the proportion of games was just about right, such that their win% represented their true talent levels.  I think.

Anyway, would love to see the Phil-touch applied here.

Sunday, September 11, 2016

WPA in NBA

I'm quite enjoying this article by Shane Jensen (of SAFE fielding metric in baseball) et al.

The intro had the effect of making me think he was being "analytic political".  You can tell that because you can make virtually the same argument and come to the opposite conclusion.  But, by the end of the intro, he was so clear and honest that he redeemed himself.

I'm reading the part with the three images, and I think figure 1A is so completely outlandish as a comparison point that it should never have been introduced.  I mean NO ONE thinks like this, so why treat it as some sort of benchmark?  Anyway, if we just focus on 1B and 1C, that's where the payoff is.  Except there's no reason to do the smoothing by proximity points like he's doing.  When you have something complicated (like wOBA by exit speed, launch angle, spray angle), then yes, that becomes almost necessary (until we can create a function, which is my plan).  And that is how I have been doing it in Statcast Lab.

But in this particular case, with just two simple parameters, time and lead, you can simply draw a smooth line for each lead, and the only thing you make sure is that the higher the lead the better chance of winning. Plotting like a heat map like they did is what I call "mathematical gyrations".  Whenever you can, make the solution as simple as possible... but no simpler.

I consider my comments more of a technical nitpick, because the overall conclusions won't change much if at all.  I'm now on section 2.2 and will update this blog post after I finish that.

***

Ok, this hasn't been addressed, but this is, to me, the main problem with all these plus/minus metrics.  There is such a HUGE amount of "sharing" in hockey and basketball, unlike baseball, that the "opportunity space" isn't so clear cut.  If you put the top 5 NBA players on the same team, the whole will be less than the "sum of the parts".  There is a diminishing returns aspect.

But, then we get back to "retrospective" v "predictive".  The title of the paper itself is a problem, because it talks about "chances": Estimating an NBA player’s impact on his team’s chances of winning.  You can read that both retrospectively and predictively.  Since in the intro they are quite clear that this is retrospective, they then don't need to consider the issue of diminishing returns.  They are taking, as a given, the context.  And if a top NBA player plays with great players, then it's possible that his impact is being muted, compared to a top NBA player who plays with nobodies.  OR, it's possible there is a leverage aspect that a top NBA player NEEDS at least one other top player in order to fully shine.

I don't know the answer.  And this issue is also beside the point of this paper.  I just want the Straight Arrow readers to be aware that this issue, seemingly semantic, would have required a (potentially) drastically different approach, if the topic was about predictive.

Onto section 3.

***

Tremendous work on the "leverage profiles".  This was exactly what I was thinking about with my note above.  I love research papers that anticipate my questions and answer them.

***

"A natural question to ask about any player evaluation metric is how stable it is year-to-year."

And this is my fear.  Since the thrust of their evaluation was based on retrospective, now turning to the predictive, WHILE NOT CHANGING THEIR METHODOLOGY is a curious choice, and the semantic point I just made now becomes a central point.  Let's see how they handle it, if at all.

In many ways, this is like FIP and FutureFIP.  FIP is retrospective.  FutureFIP is predictive.  If you wanted me to talk about the past, I'd use FIP.  If you wanted me to talk about the future, I'd use FutureFIP.  It would be therefore foolish to compare a predictive metric to FIP.  

Back to reading...

***

"We observe that the correlation between 2012–2013 and 2013–2014 Impact Score is 0.242, indicating a rather moderate positive trend.... Because it is context-dependent, we would not expect the year-to-year correlation for Impact Scores to be nearly as high as the year-to-year correlation for PER (correlation of 0.75), which attempts to provide a context-agnostic assessment of player contribution."

Right, exactly.  So, in order to compare apples to apples, the score/lead should be removed from consideration in order to compare to something like PER.  The context-neutral Impact Score can actually be better than PER.  But, we don't know that!  Which is why this paper is screaming for that work to be done.  Well, maybe they do it.  Let me keep reading...

***

The paper is beautiful in asking the right questions.  If the authors of the paper provide some of the aggregated data, I'd love to put my little spin on it.

One of the things that is intriguing is the 5-man combo.  If I'm reading it right, the Curry-led combo has a net impact of +.183 wins per 48 minutes, or a .683 win%.  By the way, this is what they should be showing.  They show an "impact score" and minutes, but the scale that everyone cares about is win%.  I guess it's the difference between what you see in a research paper and what you see in a blog.

They talk about that unit facing the unit with the worst impact score.  Except I can't tell WHAT their score is.  And I can't tell what the final matchup win expectancy is, nor what log5 would suggest.  They do show the results as a distribution (not the mean).  But even that is problematic, because their scale is so wide that when you see Figure 14, you hardly see it as a huge difference.
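
For reference, this is the kind of thing I mean by putting it on the win% scale and running it through log5. The .683 comes from the +.183 figure above. The opposing unit's .350 is purely a made-up placeholder, since the paper does not make that unit's score readable.

```python
def log5(p_a, p_b):
    """Chance that A beats B, given each side's win% against an average opponent."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


# Net impact per 48 minutes, shifted onto the win% scale.
curry_unit = 0.5 + 0.183
print(round(curry_unit, 3))  # 0.683

# Hypothetical win% for the worst unit (NOT from the paper).
worst_unit = 0.350
print(round(log5(curry_unit, worst_unit), 3))
```

Against a .500 opponent, log5 simply returns the unit's own win%, which is a handy sanity check on the formula.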

***

So, this paper has all the ingredients you need to have one of the best saber-level research pieces of the year.  We just need to change the recipe a bit.

Monday, May 23, 2016

Who’s hurting math?  Case of Lebron and Dahntay Jones

So, it appears that we have a player who was signed for one game.  His ONE-GAME salary was 8819$.  But his ANNUALIZED salary would be... well, I don't know what it would be.  Somewhere around 1 million$.  This chart suggests it would be 1.5 million$ for a veteran.  In order to get a one-game salary of 8819, that would have to be based on a ONE DAY salary.  That is, if you have 170 DAYS, then getting paid 8819 in one day is an annualized pay of 8819x170 = $1,499,230.  And if you go to that link I showed you, it matches almost perfectly.  If he's actually getting paid $8818.75, then it matches perfectly.

So, when it comes to fines, it says that the fine is 1/110th of your salary.  But, clearly that's based on your ANNUALIZED salary, and not on your one-day salary.  Which means the fine would have to be for over 13 thousand dollars, more than he was actually paid!  I suppose there's a cap on the fine.
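Here is the arithmetic as a quick sketch, assuming the 170-day season and the 1/110th rule exactly as described above. The variable names are mine, and any cap on the fine is unknown, so none is modeled.

```python
DAYS_IN_SEASON = 170   # pay days assumed in the post
FINE_DIVISOR = 110     # fine is 1/110th of salary

one_day_pay = 8819     # what he was actually paid, in dollars

# Annualize the one-day pay, then apply the fine rule to that figure
annualized_salary = one_day_pay * DAYS_IN_SEASON
fine_on_annualized = annualized_salary / FINE_DIVISOR

# For comparison: the fine if it were based on what he was actually paid
fine_on_actual = one_day_pay / FINE_DIVISOR

print(f"Annualized salary:       ${annualized_salary:,}")
print(f"Fine on annualized pay:  ${fine_on_annualized:,.2f}")
print(f"Fine on actual pay:      ${fine_on_actual:,.2f}")
print(f"Fine exceeds actual pay: {fine_on_annualized > one_day_pay}")
```

The gap between the two fine amounts is the whole question: one is bigger than his paycheck, the other is about eighty bucks.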

Therefore, I'd like to see clarification on that, both that the fine is based on annualized salary, and secondly, if there's any cap in place.

Saturday, March 19, 2016

What is the current true talent level of home court advantage in NBA?

I'm shocked it could be like this.  Can someone explain it?

Tuesday, March 08, 2016

NBA shot charts

Code generously being shared.  If I remember right, Todd is the guy behind Gambletron2000.

(3) Comments • 2016/03/09 • Basketball

Thursday, December 03, 2015

How many more shots should Curry be taking?

Everything is about a tradeoff.  So, could the Warriors be even better than perfect?

(10) Comments • 2015/12/04 • Basketball

Tuesday, October 13, 2015

Odds of the home team winning

Terrific chart.

Friday, October 09, 2015

NBA CARMELO forecasting system

From the gang at 538.

(7) Comments • 2015/10/11 • Basketball

Tuesday, September 22, 2015

Arguing to argue, but not to conclude anything

I liken the discussion of comparing players decades apart to basically just everyone yapping, without really trying to get to an answer.  Joe lays out the foundation for Bryce v Ted Williams and I chimed in with a thing or two.

It's interesting how you get these discussions in baseball, but you don't get them in hockey or football.  Howie Morenz may have been voted the #1 player of 1901-1950, but we all understand that the game is far different today.  And at some point, even Bobby Orr and Wayne Gretzky will get relegated to their own era, especially when all players in a few decades will weigh at least 200 pounds.

It may be fun, but that's pretty much all you can do with these discussions: have fun.

Monday, August 17, 2015

Roland Beech and Dean Oliver

I haven't been keeping up on the comings and goings of the data analyticians in the NBA.  This sure looks pretty interesting.

(2) Comments • 2015/08/18 • Basketball
