spark() method should be able to use other methods · Issue #2039 · Yelp/mrjob · GitHub
spark() method should be able to use other methods #2039
Closed
@coyotemarin

Description

A job that defines the spark() method can't serialize self, because self.stdin, self.stdout, etc. are unserializable. That means we can't serialize its methods either (no rdd.flatMap(self.some_method)).

We should sandbox the job prior to running its spark() method, just like we do in the Spark harness.
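
To illustrate the problem, here is a minimal sketch using only the standard library. It makes several assumptions: `ToyJob` is a hypothetical stand-in and not mrjob's actual job class, its `sandbox()` is a simplified imitation of the idea, and plain `pickle` stands in for whatever serializer Spark uses. Pickling a bound method requires pickling `self`, so a text stream attribute is enough to make it fail; swapping the streams for in-memory buffers makes it succeed.

```python
import io
import pickle


class ToyJob:
    """Hypothetical stand-in for a job class (not mrjob's real class)."""

    def __init__(self):
        # TextIOWrapper is the type of sys.stdin/sys.stdout in a normal
        # interpreter, and it cannot be pickled.
        self.stdin = io.TextIOWrapper(io.BytesIO())
        self.stdout = io.TextIOWrapper(io.BytesIO())
        self.stderr = io.TextIOWrapper(io.BytesIO())

    def sandbox(self):
        # Assumed behavior: replace the streams with in-memory buffers,
        # which pickle can handle.
        self.stdin = io.BytesIO()
        self.stdout = io.BytesIO()
        self.stderr = io.BytesIO()
        return self

    def some_method(self, line):
        return line.split()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


job = ToyJob()
before = can_pickle(job.some_method)  # pickling the method pickles self too

job.sandbox()
after = can_pickle(job.some_method)

# Once sandboxed, the bound method round-trips and is still callable.
restored = pickle.loads(pickle.dumps(job.some_method))
words = restored("a b c")
```

In this sketch `before` is `False` and `after` is `True`, which is the behavior the issue asks for when something like `rdd.flatMap(self.some_method)` is used inside spark().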
