Bug Found when using hadoop with localFileSink · Issue #132 · Netflix/suro · GitHub

Bug Found when using hadoop with localFileSink #132


Closed
zhenchuan opened this issue Sep 22, 2014 · 2 comments


@zhenchuan

With Hadoop's configuration file on the classpath, even the nested localFileSink will use Hadoop's remote file system.

Digging into the code, I found that there is no check on whether to use the local file system or not.

The FileWriterBase constructor should be changed by adding a localFileSystem flag to control this.

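    public FileWriterBase(String codecClass, Logger log, Configuration conf, Boolean localFileSystem) {
        this.conf = conf;

        try {
            // A missing flag keeps the old behavior: use the file system configured in conf.
            if (localFileSystem == null) {
                localFileSystem = false;
            }
            // getLocal() always returns the local file system; get() follows the default
            // file system in conf, which Hadoop config files on the classpath can change.
            fs = localFileSystem ? FileSystem.getLocal(conf) : FileSystem.get(conf);
            fs.setVerifyChecksum(false);
            if (codecClass != null) {
                codec = createCodecInstance(codecClass);
                log.info("Codec:" + codec.getDefaultExtension());
            } else {
                codec = null;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }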

I'll send a pull request for this ASAP.

@metacret
Contributor

The FileWriterBase constructor is called from the TextFileWriter constructor or the SequenceFileWriter constructor. The Configuration conf is created by new Configuration(), which should default to the local file system. If it is still pointing to the HDFS file system, that is what I missed.

Suro has no reason to run directly against a remote file system. So, instead of adding a localFileSystem boolean flag, feel free to send a PR with the fix FileSystem.getLocal(conf).

@zhenchuan
Author

Thanks for your review. FileSystem.getLocal(conf) will be clearer if Suro does not use a remote file system.
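
To make the difference between the two calls concrete, here is a minimal sketch that only models the selection logic. It does not use Hadoop at all: the `Map` stands in for `Configuration`, and `FsSelectionSketch`, `resolveScheme`, and the `hdfs://namenode:8020` address are hypothetical names made up for illustration. It assumes the default file system is read from the `fs.defaultFS` key (`fs.default.name` in older Hadoop releases) and falls back to `file:///` when nothing overrides it.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy model of the file system choice; not Hadoop's actual implementation.
class FsSelectionSketch {
    // Assumed fallback when no site configuration sets fs.defaultFS.
    static final String DEFAULT_FS = "file:///";

    // forceLocal = true models FileSystem.getLocal(conf): always local.
    // forceLocal = false models FileSystem.get(conf): follows fs.defaultFS.
    static String resolveScheme(Map<String, String> conf, boolean forceLocal) {
        if (forceLocal) {
            return "file";
        }
        String uri = conf.getOrDefault("fs.defaultFS", DEFAULT_FS);
        return URI.create(uri).getScheme();
    }

    public static void main(String[] args) {
        // No Hadoop config files on the classpath: nothing overrides the default.
        Map<String, String> empty = new HashMap<>();

        // A core-site.xml on the classpath would effectively add an entry like this.
        Map<String, String> withSite = new HashMap<>();
        withSite.put("fs.defaultFS", "hdfs://namenode:8020"); // hypothetical address

        System.out.println(resolveScheme(empty, false));    // file
        System.out.println(resolveScheme(withSite, false)); // hdfs
        System.out.println(resolveScheme(withSite, true));  // file
    }
}
```

The second case is the situation reported in this issue: once the site configuration is picked up, the `get`-style lookup resolves to the remote file system, while the `getLocal`-style lookup stays local regardless of configuration.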

iPinYou added a commit to iPinYou/suro that referenced this issue Sep 23, 2014