HDFS-16664. Use correct GenerationStamp when invalidating corrupt block replicas #4568
base: trunk
Conversation
💔 -1 overall
This message was automatically generated.
The other test failure, TestFsck#testFsckCorruptWhenOneReplicaIsCorrupt, is relevant:
💔 -1 overall
This message was automatically generated.
Yetus is failing because no new unit test was added:
Note that this change was unit tested via the existing test "TestDecommission#testDeleteCorruptReplicaForUnderReplicatedBlock". The only caveat is that this unit test behaves differently when backporting HDFS-16064 to older Hadoop versions, which is what caused this bug to be discovered in the first place. See the section "Why does unit test failure not reproduce in Hadoop trunk?" in HDFS-16664 for additional details on the difference in behavior. In summary:
- I need additional time to investigate this difference in behavior; if possible I will try to make the unit test behavior more consistent.
- I can also add some unit testing for the new code in CorruptReplicasMap.
I have confirmed that the change in unit test behavior that uncovered this bug is related to a change in this condition, which was made as part of https://issues.apache.org/jira/browse/HDFS-15200. In trunk the block invalidation does not get postponed because "dfs.namenode.corrupt.block.delete.immediately.enabled" defaults to true. I have added some additional test coverage for this change.
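To illustrate that condition, here is a minimal sketch using the Hadoop Configuration API; the property name is the one quoted above, while the class name and surrounding scaffolding are just assumptions for the example and not code from this PR. Setting the property to false restores the postponed-invalidation behavior that the backports exhibit.

```java
import org.apache.hadoop.conf.Configuration;

public class CorruptBlockDeleteConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Defaults to true in trunk (HDFS-15200): corrupt replicas are invalidated
    // immediately rather than postponed until the next block report.
    // Setting it to false reproduces the older, postponed behavior seen in backports.
    conf.setBoolean("dfs.namenode.corrupt.block.delete.immediately.enabled", false);
    System.out.println("Immediate corrupt-block deletion enabled: "
        + conf.getBoolean("dfs.namenode.corrupt.block.delete.immediately.enabled", true));
  }
}
```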
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
I am ignoring the 2 checkstyle violations for the following reasons:
This is an existing comment (line 285 in 60433bf). Since I am not modifying this comment (i.e. those lines of code) in any way, I think it's better that I don't touch it.
"testDeleteCorruptReplicaForUnderReplicatedBlockInternal" is the existing method "testDeleteCorruptReplicaForUnderReplicatedBlock", which was renamed. Since the method was already merged, I don't think it's necessary for me to reduce the number of lines in the method.
Unit test failures:
All the unit test failures are caused by:
I strongly suspect this is not related to my change and is instead related to the runtime environment of the tests; the unit tests will perhaps pass if re-run.
🎊 +1 overall
This message was automatically generated.
-1 (sorry for being so direct). This is a completely unnecessary change. It was mentioned in your last JIRA, HDFS-16064, by yanbin.zhang that you found the wrong root cause in that JIRA, and you didn't coordinate with or reply to them. We have never seen this issue in our downstream production clusters of 1000+ nodes over so many years. You should first try to experience this issue in the real world rather than just depending on a UT. This change seems unnecessary. If it were a real production bug, someone in the community would have experienced it. It is not as if you were reading the code one day and identified a bug like this. Block management is very critical, and unnecessary changes always lead to more critical bugs...
Hey Xiaohui, appreciate you taking the time to review this :) Just wanted to clarify a few things:
I did reply to yanbin.zhang in the JIRA; you can see my comment just under theirs. I would also add:
I understand that there is no prior report of this issue from the community. That being said, the behavior is reproducible (as per the testing details in this PR/JIRA), and it results in block invalidation requests (i.e. DNA_INVALIDATE) sent to Datanodes failing due to an incorrect GenerationStamp. I would argue that sending an incorrect GenerationStamp in a DNA_INVALIDATE request is a bug, because it causes the request to fail when it could have succeeded had the correct GenerationStamp been used.
I would ask that you more clearly articulate why this change is unnecessary. Do you believe that sending an incorrect GenerationStamp in a DNA_INVALIDATE request, causing the request to fail, is not a bug? Or do you believe that this reproducible behavior should not be addressed simply because it has not been frequently reported as production-impacting by the community?
@XiaohuiSun1 Do you have any follow-up comments on this PR? Just wanted to give you a chance to reply before I reach out to a Hadoop committer for review.
Concrete Reproduce Steps
More abstractly, the conditions to reproduce the issue are:
In this case the "GenerationStamp not matched" exception will continue to occur for each invalidation request sent to the Datanode with the corrupt replica, up until that Datanode sends its next block report and the block replica is properly invalidated. An invalidation request will be sent for each block report from another Datanode containing that block.
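While reproducing, the repeated failures can be confirmed by searching the Datanode log for the quoted message, e.g. `grep "GenerationStamp not matched" <path-to-datanode-log>` (the log path here is a placeholder).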
Description of PR
Under certain conditions the Namenode can send the incorrect generationStamp to a datanode when invalidating a corrupt block replica.
This results in a "GenerationStamp not matched" exception on the Datanode:
See JIRA for additional details: https://issues.apache.org/jira/browse/HDFS-16664
How was this patch tested?
Validated the fix by leveraging the unit test "TestDecommission#testDeleteCorruptReplicaForUnderReplicatedBlock"
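For reference, the test can be run on its own with something like `mvn test -Dtest=TestDecommission#testDeleteCorruptReplicaForUnderReplicatedBlock -pl hadoop-hdfs-project/hadoop-hdfs` (the module path is assumed here; adjust it to your checkout).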
Failed Test - Before this change
Note the inline comments above which illustrate the bug
Successful Test - After this change
Logs:
Using "getCorruptReplicaGenerationStamp" allows the Namenode to get the correct generationStamp for the corrupt block replica
For code changes:
Have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?