
Tangotiger Blog


Tuesday, April 25, 2017

MLBAM xml/gd files

By Tangotiger 03:18 PM

With regard to the .xml or "gd" files available to the public on http://gd2.mlb.com:

Note that these files have been deprecated and are no longer in use in any of our MLB products. However, we will continue to support them for the foreseeable future, so as to ensure a continued resource for researchers and other consumers.

End points will reflect the data consistent with how we currently use the data for other business purposes, as well as our other products and services, without any translation or assurances of backwards compatibility beyond formatting and presentation of data.


#1    Kyle Boddy 2017/04/25 (Tue) @ 18:19

Going forward, where is the preferred and supported location to get raw target tracking data?


#2    Tangotiger 2017/04/25 (Tue) @ 22:34

Baseball Savant is the public facing site for typical users and research-heavy users, like myself.


#3    Rally 2017/04/28 (Fri) @ 10:20

Bill Petti has a THT article on using the Savant query:
http://www.hardballtimes.com/research-notebook-new-format-for-statcast-data-export-at-baseball-savant/

One question I had is this:

Don’t mean to complain because this is great stuff, and I’m grateful that MLB is willing to share so much data. But one thing I noticed is the umpire field is null in the downloads. Last year the umpire ID was there.

Is that an oversight or an intentional removal?
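
For anyone who wants to check which fields come back empty in a downloaded export, here is a minimal sketch using only the Python standard library. The sample rows, the column names, and the set of strings treated as "null" are all assumptions made up for illustration; they are not taken from an actual Savant download, so adjust them to whatever your file contains.

```python
import csv
import io

# Strings treated as "no value". This set is an assumption; extend it
# to match whatever your export actually writes for missing data.
NULL_TOKENS = {"", "null", "NULL", "NaN", "nan"}


def all_null_columns(csv_text):
    """Return the names of columns whose every value is empty or a null token."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    candidates = set(fieldnames)
    for row in reader:
        for name in list(candidates):
            value = (row.get(name) or "").strip()
            if value not in NULL_TOKENS:
                # One real value is enough to rule the column out.
                candidates.discard(name)
    # Report in header order so the output is easy to compare with the file.
    return [name for name in fieldnames if name in candidates]


# Hypothetical two-row sample, not real data.
SAMPLE = """pitch_type,release_speed,umpire
FF,95.1,null
SL,86.4,null
"""

print(all_null_columns(SAMPLE))
```

To run it on a real file, read the file's text and pass it to `all_null_columns`. Note that a file with a header but no data rows reports every column as empty, since no value ever rules a column out.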


#4    Tangotiger 2017/04/28 (Fri) @ 14:10

That would be… neither!

Let me give you a bit of background, so that maybe it’ll cover some other questions that pop up.

In the old world, Daren was deriving his data from… I don’t know how many different data sources.  Let’s call this number… quatlu.  So he stitched this data together.  It’s a lot of quatlu to stitch together.

What I’ve been working on in the offseason is to tie in all these data sources.  Not only that, but to resolve any kind of inconsistency, make it go through extra quality checks, etc.  This is the data warehouse.  Think of it as analogous to Retrosheet, but with tracking data.

I identified all the columns that I needed and generated the warehouse.  And after each iteration, Daren and the rest of the team would say “can you add this field?”.  So the whole thing ballooned to have 2x the number of columns I started with.

UmpireID was never part of the requests!  It’s really that simple.  So, when Daren went to flip the switch, he noticed a few gaps.  That was one. There were a few others.

So, the next iteration, which, I dunno when that will be, say 2 to 4 weeks, will have more data in there.  It’s just part of the process of this transition, while spending half our time doing other work unrelated to this.  So we end up with a delivery window that is twice what we wanted.

Anyway, so that’s how it all works.

