THE BOOK--Playing The Percentages In Baseball

A blog about baseball, hockey, life, and whatever else there is.

Blogging

Friday, November 09, 2012

Why are polls so inaccurate?

By .(JavaScript must be enabled to view this email address), 03:46 PM

There must be tons of bias in the polling data.  If we had completely unbiased data, then one standard deviation would simply be 0.5/sqrt(N).  Since polls will typically get, say, 2500 respondents, one SD is .01, or 1 point.  Even for small polls, with say only 625 respondents, one SD is 2 points.
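To make the arithmetic concrete, here is a minimal sketch of that formula.  The function name `poll_sd` is mine, and it assumes a simple random sample with a roughly 50/50 split, which is exactly the "completely unbiased" case described above.

```python
import math

def poll_sd(n, p=0.5):
    """Standard deviation of a sample proportion from n unbiased respondents."""
    # With p = 0.5 this reduces to 0.5 / sqrt(n).
    return math.sqrt(p * (1 - p) / n)

for n in (2500, 625):
    # Multiply by 100 to express the SD in percentage points.
    print(n, "respondents:", round(100 * poll_sd(n), 1), "points")
```

Quadrupling the sample size only halves the SD, which is why even small polls should land within a couple of points if sampling error were the whole story.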

But as we saw with Nate Silver and Wang/Princeton, they were off by 4 to 5 points using all the polls compared to the people who voted.  The theory may have suggested they should be off by 1 or 2 points, but they were off by 4 or 5 points (at the 1 SD level).

Given also how much movement there is in the polls, shouldn’t we in fact be multiplying the polls’ uncertainty level by some factor like x3 or x5 or something to account for the bias?

What am I missing?

(25) Comments • 2012/11/13 Blogging

How did Princeton Election Consortium do state-by-state?

By .(JavaScript must be enabled to view this email address), 09:35 AM

Professor Wang was kind enough to provide me the data, so I repeated the same process as I did with Nate.  I get a standard deviation of 5.4 for the difference between PEC’s Margin of Victory (MoV) and the actual. And for the 11 states where the MoV was under 6 points, PEC’s standard deviation was only 2.0, which is great. In both cases, the numbers are slightly worse than Silver’s.

Data is below:

Read More

(24) Comments • 2012/11/12 Blogging

How did Nate Silver do state-by-state? Take Two

By .(JavaScript must be enabled to view this email address), 12:23 AM

Thanks to Alan for getting the data (xls).  I show the Margin of Victory (MoV) that Nate forecasted, the actual MoV, the difference between the two, and the number of standard deviations (z-score) each difference was relative to Nate’s stated margin of error for each state.

The standard deviation of the difference was 3.7 points, which is also Nate’s stated average margin of error.  The standard deviation of the z-scores should be close to 1, and it was (0.92).

For the 10 states that Nate had as “close” (MoV forecasted of under 6 points), the standard deviation of the difference (forecasted minus actual MoV) was only 1.4 (with an SD of the z-score of only 0.5).  This suggests a bias (or huge amount of luck).

Data is below:

Read More

(14) Comments • 2012/11/12 Blogging

Wednesday, November 07, 2012

How did Nate Silver do state-by-state?

By .(JavaScript must be enabled to view this email address), 06:22 PM

UPDATE (again): my data source is still out-dated.  Soooooo… consider this thread as a rough draft.  Once data has been certified, then I’ll update this thread.

UPDATE: Someone noted that the data source I used was not accurate.  I have updated the data to reflect the updated source.  It’s still the same source of data, but the old data I used was not complete.

http://uselectionatlas.org/RESULTS/data.php?year=2012&datatype=national&def=1&f=1&off=0&elect=0


***

Nate was the most confident with Hawaii, where he had Obama’s margin over Romney at 34%.  Based on the source I used, Obama won Hawaii by 43%.  That’s not really good, but, at that point, who cares how much you win by.  That this was the fourth-worst prediction Nate made is really immaterial.  His worst was CT, which he had at +14% for Obama, who instead won it by +3%.  A lot closer than he suggested.

I decided therefore to break up his forecasts into five groups: a margin of victory of:
12%+ for Obama
6.0-12% for Obama
rest
6.0-12% for Romney
12%+ for Romney

He had 12 states with a margin of victory (MoV) of at least 12% for Obama.  The average predicted MoV was 21% for these states.  The actual MoV was 22%.

He had 18 states with a MoV of at least 12% for Romney.  The average predicted MoV was 23% and the actual was 23%.

For the “leaning Obama” group as described above, the four states had an average predicted MoV of 9%, and the actual was 9%.

For the “leaning Romney” group as described above, the five states had an average predicted MoV of 8%, and the actual was 11%.

For the ten tossup states, the average predicted was +3% for Obama and actually +4%.  The really impressive showing here was that the standard deviation of the differences was only 1.2% for the tossup states, whereas he had 4.0% standard deviation for all 50 states.

It’s either that Nate worked especially hard on these states, or voter turnout was more representative of all voters in these states.  For example, maybe Romney voters in Hawaii don’t bother going to vote, because they see the results, and they know it’s Obama country.

There’s a certain amount of luck that Nate ended up getting all the states.  For example, he had the margin of victory for Colorado at 2.5%, but it was actually 4.8% (meaning that it was 2.3% more than expected).  It could have just as easily gone the other way (i.e., 2.3% less than expected), and that would require a recount.

Nate had his forecasted standard deviation at around 3 to 4 points for each state, and it was 4 points overall.  So, Nate was a bit too aggressive in his certainty, and therefore, was too aggressive in predicting the chance of success for each state.  He had Virginia for example as a 79% chance of a win for Obama, despite only forecasting a 2% margin of victory.

So, I think there’s a possible place for improvement in his model there.  Either that, or, as I said, he nailed the MoV of the tossup states so well that he could be as aggressive as he was.  In that case, his uncertainty needs to shoot up for the non-tossup states.
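The Virginia example can be checked with a quick sketch.  This assumes the forecast error on the margin is normally distributed, which is my simplification and not necessarily how Nate's model works; the function names are also mine.

```python
from statistics import NormalDist

def win_probability(margin, sd):
    """Chance the actual margin ends up above zero, assuming normal errors."""
    return 1 - NormalDist(mu=margin, sigma=sd).cdf(0)

def implied_sd(margin, probability):
    """SD that would turn a forecast margin into the stated win probability."""
    return margin / NormalDist().inv_cdf(probability)

# A 2-point forecast margin under a 3-point and a 4-point SD.
for sd in (3, 4):
    print("SD", sd, "->", round(win_probability(2, sd), 2))

# The SD that a 79% chance on a 2-point margin would imply.
print("implied SD:", round(implied_sd(2, 0.79), 1))
```

Under this assumption, a 2-point margin with a 3 to 4 point SD comes out at roughly a 70% to 75% chance, and getting to 79% needs an SD of about 2.5 points, which is tighter than the 4 points observed across all states.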

The data I used is below.

Read More

(35) Comments • 2012/11/09 Blogging

Can you devise a system where more people vote?

By .(JavaScript must be enabled to view this email address), 05:31 PM

Bill James was talking about creating various regions of subregions and so on.  You have a precinct of 47 people, you have 47 precincts in a ward, you have 47 wards in a region, you have 47 regions in a district, and you have 47 districts.  None of them necessarily have to be done geographically.

The idea is that if there’s only 47 people in your precinct, your vote is going to matter.  Then, knowing that, then maybe your precinct will act as a tipping point for your ward.  But then, do we have too many degrees of separation?

Maybe you have 121 people in each precinct, and instead of 5 layers like Bill is proposing, you’d have only 4 layers.  That would seem to work maybe?

I dunno.  Anyway, I like the thrust of Bill’s idea, and I’m interested to hear if you guys have one, or if this has been discussed elsewhere.
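For what it's worth, the two setups cover about the same number of people.  A minimal sketch, assuming every group at every layer is exactly full (the function name is mine):

```python
def voters_covered(group_size, layers):
    """Total voters if every group at every layer holds exactly group_size members."""
    return group_size ** layers

print(voters_covered(47, 5))    # Bill's 5 layers of 47
print(voters_covered(121, 4))   # 4 layers of 121
```

Both come out in the low 200 millions (about 229 million and 214 million), so dropping a layer costs you a bigger precinct but not much coverage.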

(30) Comments • 2012/11/30 Blogging

Is MLB playoff format a necessary evil?

By .(JavaScript must be enabled to view this email address), 02:32 PM

Here’s an interesting comparison between NBA and MLB from David.

Because of the setup of a basketball game and a basketball series, you are FAR more likely to get the two best teams into the NBA Final than you would get the two best teams into the World Series.  In order to improve that for MLB, you’d have to have fewer playoff rounds (and maybe even more regular season games).  Is this necessarily a desirable outcome?

What is it that people want?  Drama?  Or drama that involves two good-maybe-great teams?  Or drama that involves two great teams?  Realizing of course that the more high drama points you want, the fewer drama episodes you will actually get.

To put this in non-sports terms: would you rather have only the Batman/Nolan trilogy, or would you rather have the five X-Men movies plus two Fantastic Four movies plus Green Lantern plus all five of the Superman movies plus the two Hulk movies plus Thor plus all the non-Nolan Batman movies (i.e., every Superhero movie outside of Avengers, Iron Man, and Spiderman)?

That is: quality v quantity?

(17) Comments • 2012/11/09 Sabermetrics, MLB_Management, Blogging, Other Sports, Basketball

Median v Average: Nate v Princeton Election Consortium

By .(JavaScript must be enabled to view this email address), 11:29 AM

Over at Princeton, they also hit all 49 states, with Florida a tossup (as it actually is).  At the bottom of page two, you will see what they forecasted, with the total at 303.

Nate also had the exact same states.  But he has the “average” at 313.  I think Nate shouldn’t be reporting “average” as his big number like he shows.  He is showing a wide array of numbers, and we can see the better chart a little further down, where the “most likely” is there.  He had a bit over 330 as the most likely, which he’s going to hit if Florida goes his way.  Then he has some number just over 300, which he’s going to hit if Florida doesn’t go his way.  So, the mode makes more sense, as a number to show on this chart.  And overall, in terms of even-odds bet, the median is what should be the other big number to show.  The average is also good to show, as he does, but I would like to see the other numbers be the big numbers.
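Here is a small sketch of why the three numbers can disagree.  The distribution below is entirely made up for illustration (it is not Nate's), with most of the weight on a "Florida goes his way" total and a "Florida doesn't" total.

```python
def summarize(dist):
    """dist maps electoral-vote totals to probabilities that sum to 1."""
    mean = sum(ev * p for ev, p in dist.items())
    mode = max(dist, key=dist.get)
    # Median: the smallest total whose cumulative probability reaches 50%.
    cumulative = 0.0
    median = None
    for ev in sorted(dist):
        cumulative += dist[ev]
        if cumulative >= 0.5:
            median = ev
            break
    return mean, median, mode

# Hypothetical probabilities, chosen only to show a two-humped distribution.
hypothetical = {281: 0.10, 303: 0.35, 332: 0.45, 347: 0.10}
mean, median, mode = summarize(hypothetical)
print(round(mean, 2), median, mode)
```

With these made-up weights, the mode and median both sit on an outcome that can actually happen, while the mean lands between the two humps on a total that no combination of states is likely to produce.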

Anyway, what I really appreciate about Nate and the Princeton guys and those of their ilk is their non-gasbaggery.  I was listening to Sean Hannity on the way in last night pre-election, for the pure pleasure of how much he can spin things, and he had some guy saying how he doesn’t believe the polls, how much he knows his county and his state (Ohio), and how his county is a harbinger, and how there’s so many people voting GOP, he knows Romney would win. 

Nate and the others like him appeal to my sense of logic and rationality, and I thank them for that. They make this ridiculous 18-month process bearable.  A process that sees billions of dollars spent, only to have us right back to where we actually are…. 50% of the time.

(1) Comments • 2012/11/07 Blogging

Tuesday, November 06, 2012

Poll: In today’s election, this is how I voted (or would have voted):

By .(JavaScript must be enabled to view this email address), 08:30 PM

(25) Comments • 2012/11/09 Blogging

Sunday, November 04, 2012

Documenting the criticisms of Nate Silver

By .(JavaScript must be enabled to view this email address), 03:06 PM

Excellent job by Colby Cosh in this article.

(36) Comments • 2012/11/07 Blogging

Thursday, November 01, 2012

Where is JCP&L in NJ?

By .(JavaScript must be enabled to view this email address), 07:43 PM

Here is a snapshot at 4:19PM and 4:40PM, as reported on their site, but collected by me.  It would be great if a journalist at the Star Ledger or Bergen Record would do this work every 20 minutes, so we can track this.  I wouldn’t mind doing it, but… I have no power!  I’m leaving the office soon, so I leave it to those with the will and the way to keep this going.

Then, maybe a web developer can merge this data, and show us a nice map.

In any case, if you are in Sussex, Passaic, or Burlington, things are improving.  Ocean and Hunterdon, on the other hand, are getting worse, as there are now MORE outages than there were 20 minutes ago.

11/1/2012 16:19  11/1/2012 16:40
       Affected         Affected     DIFF  County-CTV
          5,799            5,399    (400)  BURLINGTON (NJ)
         10,227           10,227        -  ESSEX (NJ)
         41,452           42,273      821  HUNTERDON (NJ)
          9,301            9,302        1  MERCER (NJ)
         71,840           71,841        1  MIDDLESEX (NJ)
        234,286          234,602      316  MONMOUTH (NJ)
        146,014          146,113       99  MORRIS (NJ)
        162,641          163,709    1,068  OCEAN (NJ)
          8,399            7,967    (432)  PASSAIC (NJ)
         29,773           29,773        -  SOMERSET (NJ)
         40,590           39,724    (866)  SUSSEX (NJ)
         20,788           20,788        -  UNION (NJ)
         29,117           29,145       28  WARREN (NJ)
                                      636  TOTAL

The state chart also doesn’t seem to be that optimistic:

838,544 10:50 AM 
838,544 11:10 AM
840,274 11:31 AM
829,801 11:52 AM
829,660 12:12 PM
???,??? 12:3? PM
823,842 12:54 PM
831,393 1:15 PM 
831,326 1:36 PM 
826,617 1:55 PM 
822,109 2:15 PM
823,656 2:36 PM
814,303 3:59 PM
810,227 4:19 PM
810,863 4:40 PM

We can cherry-pick the numbers, but the worst-case scenario is that about 19,000 customers had power restored over 4 hours and 48 minutes (11:52 AM to 4:40 PM), a rate of roughly 4,000 customers per hour.  The best-case scenario from the above chart is about 30,000 customers restored over 4 hours and 48 minutes (11:31 AM to 4:19 PM), a rate of roughly 6,000 customers per hour.

I’m guessing the reporting of the numbers is very sketchy.  In any case, at 5,000 customers per hour, with roughly 810,000 customers still out, full restoration would take roughly seven days.
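For anyone who wants to redo the cherry-picking, here is a minimal sketch using four readings from the state chart above.  The elapsed times are computed from the timestamps rather than typed in, and the 5,000-per-hour pace is just the midpoint guess from the text; the names are mine.

```python
# State-wide readings copied from the chart: (hour, minute, customers out), 24-hour clock.
READINGS = {
    "11:31 AM": (11, 31, 840_274),
    "11:52 AM": (11, 52, 829_801),
    "4:19 PM": (16, 19, 810_227),
    "4:40 PM": (16, 40, 810_863),
}

def restored_per_hour(start, end):
    """Net customers restored per hour between two readings."""
    h0, m0, out0 = READINGS[start]
    h1, m1, out1 = READINGS[end]
    hours = ((h1 * 60 + m1) - (h0 * 60 + m0)) / 60
    return (out0 - out1) / hours

worst = restored_per_hour("11:52 AM", "4:40 PM")
best = restored_per_hour("11:31 AM", "4:19 PM")

# Days to clear the latest outage count at an assumed 5,000 customers per hour.
days_left = READINGS["4:40 PM"][2] / 5_000 / 24

print(round(worst), round(best), round(days_left, 1))
```

That gives a bit under 4,000 per hour at worst, a bit over 6,000 at best, and a little under seven days at the assumed pace.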

(23) Comments • 2012/11/06 Blogging

Are political commentators similar to sports commentators?

By .(JavaScript must be enabled to view this email address), 07:29 PM

Mark Coddington says this about political commentators, and I think you can find parallels with sports commentators:

The other objection political journalists/pundits have to Silver’s process is evident here, too. They don’t just have a problem with how he knows what he knows, but with how he states it, too. Essentially, they are mistaking specificity for certainty. To them, the specificity of Silver’s projections smack of arrogance because, again, their ways of knowing are incapable of producing that kind of specificity. It has to be an overstatement.

In actuality, of course, Silver’s specificity isn’t arrogance at all — it’s the natural product of a scientific, statistical way of producing knowledge. Statistical analyses produce specific numbers by their very nature. That doesn’t mean they’re certain: In fact, the epistemology has long been far more tentative in reaching conclusions than the epistemology of journalism. As many people have noted over the past few days, a probability is not a prediction. Silver himself has repeatedly called for less certainty in political analysis, not more. But that split between specificity and certainty is a foreign concept to the journalistic epistemology.

 

(35) Comments • 2012/11/08 Blogging

Sunday, October 28, 2012

Broadway Reviews

By .(JavaScript must be enabled to view this email address), 03:47 AM

A couple of interesting sites that collect reviews of Broadway shows from regular folks and critics.

http://www.stagegrade.com/productions/822
http://www.broadwaybox.com/reviews/theater/spider-man_turn_off_the_dark_musical_reviews.aspx

(10) Comments • 2012/10/31 Blogging

Wednesday, October 24, 2012

Responding to the R Word

By .(JavaScript must be enabled to view this email address), 10:58 AM

A wonderful response:

In fact it has taken me all day to figure out how to respond to your use of the R-word last night.

I thought first of asking whether you meant to describe the President as someone who was bullied as a child by people like you, but rose above it to find a way to succeed in life as many of my fellow Special Olympians have.

Then I wondered if you meant to describe him as someone who has to struggle to be thoughtful about everything he says, as everyone else races from one snarkey sound bite to the next.

Finally, I wondered if you meant to degrade him as someone who is likely to receive bad health care, live in low grade housing with very little income and still manages to see life as a wonderful gift.

Because, Ms. Coulter, that is who we are – and much, much more.

 

(2) Comments • 2012/10/24 Blogging

Tuesday, October 23, 2012

How athletes view gay marriage legislation

By .(JavaScript must be enabled to view this email address), 11:37 PM

Looks like hockey players are picking up all the slack.

(8) Comments • 2012/10/24 Blogging

Friday, October 19, 2012

Nate Silver on Daily Show

By .(JavaScript must be enabled to view this email address), 01:14 AM

Nate did a great job, so check it out.

(11) Comments • 2012/10/22 Blogging

Wednesday, October 17, 2012

Candy Crowley

By .(JavaScript must be enabled to view this email address), 05:43 PM

I thought she did an excellent job last night.

As for the Libya fact-check: I thought she exhibited a very even-handed approach, in first confirming what Obama said, but then also giving Romney a hand by clarifying what Romney could have worded better.  The audience applauded HER, for both fact checks.  I think any opinion that says she sided with one side or the other on this issue is extremely biased.  You could see as she was saying it that she wanted to “even up the call”, like any good referee tries to do.  Anyway, I was impressed (a) that she had the b@lls to do a fact check on the spot and (b) that she had the good sense and quick timing to even up the call and move on.

Romney: his meme of “he got the first answer, so I get the last word” is really off-putting.  Candy explained that after they each get to answer once, then there’s no “turns” in the free-exchange portion.  It’s not one rebuttal each.

Obama and Romney: the talking over one another, and talking over Candy was terrible and off-putting. 

Anyway, the solution is pretty simple.  They were showing the clock counting back from 2:00 in green, and when it got to 0:10 it went yellow and when it went to 0:05 it went red.  When it got to zero, it stayed black!  There was no continuing of the clock, to see how much over they went.  There’s no disincentive to ignoring the clock.  Now, there’s no way that you’d want to cut off their mike. 

But I would say that you earn a penalty box!  They keep a running clock of how much overage time they have, and once they get to two minutes overage, they cannot rebut on the next question.

It’s clear that neither side actually wants to have the debate, as the way both of them would “pivot” and answer something else entirely is totally bogus. It’s from the Sarah Palin school of debating.

***

Posting rules:
1. If you say anything bad about one guy, you have to say something at least half as bad about the other guy. 
2. You can say whatever good you want about one guy without having to be balanced.
3. I’ll delete any post I want for whatever reason and you must accept that I may delete your post and you can’t complain about it.

If you can accept these posting rules, have at it.  If you are too sensitive to deal with these rules, I suggest you bypass this thread and have a try at a presidential debate, where there are no such rules.

(42) Comments • 2012/10/19 Blogging

Can you prove meat cost money?

By .(JavaScript must be enabled to view this email address), 02:11 PM

Phil continues his assault on the laziness of regression to explain things that it just can’t explain.

If you need a two-line takeaway: there is so much noise, that it’s hard for the signal to come through.  And that we can barely see the signal doesn’t mean that the signal has only little impact.

If you need a graphic takeaway: if you are in a crowd, and someone kicks you in the b@lls, it’s tough to figure out who kicked you.  But, that you can’t tell who did it doesn’t mean that you won’t be crouched into a fetal position for what seems like an eternity.

There was a good bit on Johnny Carson once.  This comedian comes on, and tells the story of these two farmers disputing who should keep an egg that landed on the line of their property.  So, one farmer says: “Listen, we’ll do a test.  I kick you in the nuts, and then you’ll kick me in the nuts.  Whoever is left standing, gets the egg.”  So the first farmer kicks the other in the nuts, and the second farmer screams out, and holds onto himself for nearly a minute.  But he never collapses to the ground.  The second farmer says: “uuhhhh….ok…. my turn”.  The first farmer responds: “Keep the darn egg” and walks away.

(6) Comments • 2012/10/17 Blogging

Wednesday, October 10, 2012

“Punishing the whistleblower” playbook: suspend the kid being bullied

By .(JavaScript must be enabled to view this email address), 11:21 PM

Just a weird weird story.  Kid gets bullied, fights back, gets suspended, suspension reduced somewhat.  Then, prepares for an interview about bullying, and is then attacked by bullies with the camera crew there!

http://news.yahoo.com/blogs/lookout/teen-allegedly-bullied-television-interview-bullying-140353690.html

(6) Comments • 2012/10/12 Blogging

Tuesday, October 09, 2012

Romney’s thought of the day: kids don’t count

By .(JavaScript must be enabled to view this email address), 04:49 PM

I love me some Romney-bashing, especially if he hasn’t been programmed to deal with new situations.  He actually doesn’t want to talk to Nick News?

Comments • Blogging

Friday, October 05, 2012

Why do Obama’s odds of winning keep going up?

By .(JavaScript must be enabled to view this email address), 04:17 PM

The ideal forecast is one where the odds of winning remain flat for the life of the forecast.  If your forecast for Obama winning on Nov 5 is 90%, then the best forecasting system would have his odds of winning on Oct 5, Sept 5, Aug 5 and so on also at 90%.  That would be the ultimate in forecasting.  A really good forecasting system would have him hover around 90% throughout the life of the forecast.  So, sometimes up to 92% other times down to 87% and so on.

But a typical forecasting system won’t give you that.  That’s because of uncertainty.  This is just like baseball, where if you are up by 1 after 1 inning, your chance of winning the game is 65%, but after 8 innings, being up by 1 means your chance of winning is 87%.

When I look at Nate’s forecast (right sidebar), this is what’s happening.  He shows Obama’s odds at 60% in early June, steadily rising after that, and currently sitting at 87%.  So, back in June, we were in the 1st inning, and Obama had a 1-run lead.  And now we’re entering the 9th inning, and Obama still has that 1-run lead.

Basically, neither side has actually gained much of anything.  They’re each trading outs and each trading runs.  Or if you want a football analogy, Obama is running out the clock.

You will note that Nate also shows the “nowcast” numbers.  What that shows is if the election was held now, Obama would have a 97% chance of winning.  This would be equivalent to ending a baseball game after 8 innings, or ending a football game two minutes early.  Nate has removed all the uncertainty and possibilities of the future.

The basic point is that if Obama keeps at his pace, and Romney keeps at his pace, then the odds of Obama winning are 97%.  But because the forecast is still at 87%, there’s a possibility that Obama will give up a HR, or let Romney get a leadoff triple.

So back to the question I have: Obama’s odds keep going up precisely because there’s nothing really going on in the game.  Which you wouldn’t know if you listened to all the gasbags going crazy at every word uttered by either side.
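The baseball numbers can be roughly reproduced with a toy model.  This is a sketch under my own assumption, not a real win-expectancy table: treat the change in run differential over each remaining inning as independent noise, so the uncertainty shrinks with the square root of the innings left.

```python
from math import sqrt
from statistics import NormalDist

def win_probability(lead, innings_left, sd_per_inning):
    """Chance of holding a lead, assuming normal, independent inning-to-inning swings."""
    return NormalDist().cdf(lead / (sd_per_inning * sqrt(innings_left)))

# Calibrate the per-inning swing so that a 1-run lead with 8 innings left is worth 65%.
sd_per_inning = 1 / (sqrt(8) * NormalDist().inv_cdf(0.65))

for innings_left in (8, 4, 1):
    p = win_probability(1, innings_left, sd_per_inning)
    print(innings_left, "innings left:", round(100 * p), "%")
```

Calibrated to the 65% figure, the same 1-run lead with one inning left comes out in the mid-80s, close to the 87% quoted above.  The lead never changed; only the amount of game left did, which is the whole point about the forecast drifting upward.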

(25) Comments • 2012/10/06 Blogging