Parse Catalog Faster · Issue #48 · ropensci/gutenbergr · GitHub

Parse Catalog Faster #48

Open

The all_metadata step of parse_rdfs.R is very, very slow. This makes debugging tedious. Some of this slowness might be unavoidable (we're parsing a lot of data), but try to optimize if possible.

The Project Gutenberg docs imply that there's a single XML/RDF file available, but I don't see it. That would presumably be much faster to parse.

Assignees

jonthegeek
