-
Notifications
You must be signed in to change notification settings - Fork 9.1k
MAPREDUCE-7282. Move away from V2 commit algorithm #2349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAPREDUCE-7282. Move away from V2 commit algorithm #2349
Conversation
* Default to v1 commit algorithm * log at WARN on job set up when v2 is used * use a special log for that so people who know what they are doing can tell it to be quiet Change-Id: I9922cb6b86997e027870c6a445b715eb3ff5a39e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updated PR @steveloughran!
...ce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
Outdated
Show resolved
Hide resolved
log.warn("The v2 commit algorithm is deprecated;" | ||
+ " please switch to the v1 algorithm"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should use the word deprecated. That implies that this algorithm will be removed in a future release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switching to your text
The version 2 algorithm is deprecated and no longer the default | ||
as task commits were not atomic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, remove "deprecated".
If a first task attempt fails part-way | ||
through its task commit, the output directory could end up | ||
with data from that failed commit, alongside the data | ||
from any subsequent attempts. | ||
|
||
See https://issues.apache.org/jira/browse/MAPREDUCE-7282 | ||
|
||
Although no-longer the default, this algorithm is safe to use if | ||
all task attempts for a single task meet the following requirements | ||
-they generate exactly the same set of files | ||
-the contents of each file are exactly the same in each task attempt | ||
|
||
That is: | ||
1. If a second attempt commits work, there will be no leftover files from | ||
a first attempt which failed during its task commit. | ||
2. If a network partition causes the first task attempt to overwrite | ||
some/all of the output of a second attempt, the result will be | ||
exactly the same as if it had not done so. | ||
|
||
To avoid the warning message on job setup, set the log level of the log | ||
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm | ||
to ERROR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this section should be moved to the end of the Algorithm 2 section below. You can add (see below for details) to the end of the line that says why algorithm v2 in no longer the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@jbrennan333 what do you think we should say instead of deprecated? "not recommended". I was thinking of adding a link to the JIRA and changing the issue text there to clarify
cleanup of job will delete the whole job attempt dir so that's the maximum time that a partitioned TA may commit work. There's no risk of some VM pausing for 3 hours, restarting and an in progress TA completing its work and overwriting the final output. This is good. |
(Yetus failure is from no new tests) |
@steveloughran It's hard to think of a terse warning for this. I think your comment above gets close. Maybe something like "The v2 commit algorithm assumes that the content of generated output files is consistent across all task attempts - if this is not true for this job, the v1 commit algorithm is strongly recommended." |
Change-Id: I2a069cc633e24c559311d69b5c0064aaa88a4f3b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update @steveloughran. The code changes look good to me, but we need to resolve whether the default should be actually be changed - see @daryn-sharp's -1 in the Jira.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks @steveloughran for the work. But I'm not sure whether we should discuss in the mail list.
Once the manifest committer #2971 is in, the abfs and gcs stores will get something faster than v1 but with its task failure semantics
In this world, we can just leave people using v1/v2 alone, and for (spark) jobs in azure and google cloud say "use the manifest committer". |
tell it to be quiet
Supercedes #2320