HDFS-16582. Expose aggregate latency of slow node as perceived by the reporting node #4323

virajjasani · 2022-05-18T04:27:33Z

Description of PR

When a datanode is reported as slow by another node, we expose the slow node along with the list of nodes that reported it. However, we do not expose the latency of the slow node as measured by each reporting node. Exposing that latency in the metrics would help operators keep track of how far a given slow node lags behind the rest of the nodes in the cluster.

The operator should be able to gather the aggregate latencies of all slow nodes, together with their reporting nodes, from the Namenode metrics.
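
To illustrate the shape of the data an operator could read, here is a minimal sketch that keeps, for each slow node, the aggregate latency perceived by each reporting node and renders it as JSON. The class name `SlowNodeLatencySketch`, the JSON field names (`slowNode`, `reportingNodes`, `node`, `latencyMs`), and the sample host names are assumptions made for illustration only; they are not the classes or the metric format added by this patch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not the code introduced by this PR. */
public class SlowNodeLatencySketch {

  // slow node -> (reporting node -> aggregate latency in ms as perceived by that reporter).
  // TreeMap keeps the output order deterministic.
  private final Map<String, Map<String, Double>> reports = new TreeMap<>();

  /** Records the latency of slowNode as perceived by reportingNode (latest value wins). */
  public void addReport(String slowNode, String reportingNode, double latencyMs) {
    reports.computeIfAbsent(slowNode, k -> new TreeMap<>()).put(reportingNode, latencyMs);
  }

  /** Renders all reports as a JSON array. Node names are assumed to need no escaping. */
  public String toJson() {
    StringBuilder sb = new StringBuilder("[");
    boolean firstNode = true;
    for (Map.Entry<String, Map<String, Double>> e : reports.entrySet()) {
      if (!firstNode) {
        sb.append(",");
      }
      firstNode = false;
      sb.append("{\"slowNode\":\"").append(e.getKey()).append("\",\"reportingNodes\":[");
      boolean firstReporter = true;
      for (Map.Entry<String, Double> r : e.getValue().entrySet()) {
        if (!firstReporter) {
          sb.append(",");
        }
        firstReporter = false;
        sb.append("{\"node\":\"").append(r.getKey())
            .append("\",\"latencyMs\":").append(r.getValue()).append("}");
      }
      sb.append("]}");
    }
    return sb.append("]").toString();
  }

  public static void main(String[] args) {
    SlowNodeLatencySketch sketch = new SlowNodeLatencySketch();
    sketch.addReport("dn3:9866", "dn1:9866", 12.5);
    sketch.addReport("dn3:9866", "dn2:9866", 30.0);
    System.out.println(sketch.toJson());
  }
}
```

With a layout like this, an operator can compare the latency each reporter sees for the same slow node instead of only knowing that the node was flagged.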

How was this patch tested?

Dev cluster and UT.

For code changes:

Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

hadoop-yetus · 2022-05-18T12:50:42Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	1m 2s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 3 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	37m 12s		trunk passed
+1 💚	compile	1m 43s		trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	compile	1m 38s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	checkstyle	1m 26s		trunk passed
+1 💚	mvnsite	1m 48s		trunk passed
+1 💚	javadoc	1m 25s		trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javadoc	1m 52s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 42s		trunk passed
+1 💚	shadedclient	23m 1s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 25s		the patch passed
+1 💚	compile	1m 26s		the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javac	1m 26s		the patch passed
+1 💚	compile	1m 22s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	javac	1m 22s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	1m 0s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 9 new + 23 unchanged - 0 fixed = 32 total (was 23)
+1 💚	mvnsite	1m 27s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javadoc	1m 36s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 24s		the patch passed
+1 💚	shadedclient	22m 42s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	392m 5s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	1m 15s		The patch does not generate ASF License warnings.
		501m 46s

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/1/artifact/out/Dockerfile
GITHUB PR	#4323
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux c2925a8f042c 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `f8d26b2`
Default Java	Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/1/testReport/
Max. process+thread count	3869 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/1/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

virajjasani · 2022-05-18T17:57:17Z

@saintstack @jojochuang @tomscut could you please review this PR?

hadoop-yetus · 2022-05-19T02:40:35Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 41s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 3 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	36m 51s		trunk passed
+1 💚	compile	1m 27s		trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	compile	1m 22s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	checkstyle	1m 6s		trunk passed
+1 💚	mvnsite	1m 30s		trunk passed
+1 💚	javadoc	1m 7s		trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javadoc	1m 32s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 25s		trunk passed
+1 💚	shadedclient	21m 57s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 12s		the patch passed
+1 💚	compile	1m 18s		the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javac	1m 18s		the patch passed
+1 💚	compile	1m 10s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	javac	1m 10s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 50s		the patch passed
+1 💚	mvnsite	1m 14s		the patch passed
+1 💚	javadoc	0m 48s		the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚	javadoc	1m 22s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 12s		the patch passed
+1 💚	shadedclient	21m 37s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	253m 9s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 52s		The patch does not generate ASF License warnings.
		355m 41s

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/2/artifact/out/Dockerfile
GITHUB PR	#4323
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 5ccbe15384ea 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `6e738e6`
Default Java	Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/2/testReport/
Max. process+thread count	3446 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4323/2/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

jojochuang

Looks like a straightforward change. LGTM +1.

Thanks @virajjasani

jojochuang · 2022-05-21T00:04:11Z

...oop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SlowPeerJsonReport.java

+ * [de]serialization easy.
+ */
+@InterfaceAudience.Private
+final class SlowPeerJsonReport {


This class is adapted from the nested class ReportForJson in SlowPeerTracker.

jojochuang · 2022-05-21T00:20:09Z

@tomscut would you like to give it a review too?

tomscut · 2022-05-21T00:31:13Z

> @tomscut would you like to give it a review too?

Thanks @jojochuang for pinging me, I'm looking at this.

tomscut

LGTM.

tomscut · 2022-05-21T01:45:09Z

Thanks @virajjasani for the contribution! Thanks @jojochuang for the review!

… reporting node (#4323) Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Tao Li <tomscut@apache.org>

HDFS-16582. Expose aggregate latency of slow node as perceived by the…

f8d26b2

… reporting node

checkstyle fix

6e738e6

virajjasani mentioned this pull request May 18, 2022

HDFS-16521. DFS API to retrieve slow datanodes #4107

Merged

1 task

tomscut self-requested a review May 20, 2022 01:29

jojochuang approved these changes May 21, 2022

tomscut approved these changes May 21, 2022

tomscut merged commit 93a1320 into apache:trunk May 21, 2022

tomscut pushed a commit that referenced this pull request May 21, 2022

HDFS-16582. Expose aggregate latency of slow node as perceived by the…

ab3a9ce

… reporting node (#4323) Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Tao Li <tomscut@apache.org>
