HDFS-3246: pRead equivalent for direct read path by sahilTakiar · Pull Request #597 · apache/hadoop · GitHub

HDFS-3246: pRead equivalent for direct read path #597


Merged
merged 4 commits on Apr 30, 2019

Conversation

sahilTakiar
Contributor

HDFS-3246: The first several iterations of this patch are posted on the JIRA. This PR is a continuation of that work; it was created to make the code more reviewable. The version of the patch posted here fixes a few minor issues reported by Yetus and adds more Javadocs to the changes in CryptoInputStream to make the code easier to understand.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 32 Docker mode activated.
_ Prechecks _
+1 @author 1 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 73 Maven dependency ordering for branch
+1 mvninstall 1025 trunk passed
+1 compile 971 trunk passed
+1 checkstyle 191 trunk passed
+1 mvnsite 244 trunk passed
+1 shadedclient 1132 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 334 trunk passed
+1 javadoc 179 trunk passed
_ Patch Compile Tests _
0 mvndep 25 Maven dependency ordering for patch
+1 mvninstall 158 the patch passed
+1 compile 893 the patch passed
-1 cc 893 root generated 4 new + 9 unchanged - 0 fixed = 13 total (was 9)
+1 javac 893 the patch passed
+1 checkstyle 180 root: The patch generated 0 new + 115 unchanged - 3 fixed = 115 total (was 118)
+1 mvnsite 229 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 653 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 363 the patch passed
+1 javadoc 165 the patch passed
_ Other Tests _
+1 unit 499 hadoop-common in the patch passed.
+1 unit 123 hadoop-hdfs-client in the patch passed.
-1 unit 4913 hadoop-hdfs in the patch failed.
-1 unit 360 hadoop-hdfs-native-client in the patch failed.
+1 asflicense 46 The patch does not generate ASF License warnings.
12627
Reason Tests
Failed CTEST tests test_test_libhdfs_ops_hdfs_static
Failed junit tests hadoop.hdfs.TestDecommission
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter
hadoop.hdfs.server.datanode.web.webhdfs.TestDataNodeUGIProvider
hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol
hadoop.hdfs.server.datanode.web.TestDatanodeHttpXFrame
hadoop.hdfs.TestDFSShell
hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy
hadoop.hdfs.server.datanode.TestBlockCountersInPendingIBR
hadoop.hdfs.TestDFSStripedOutputStream
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart
hadoop.hdfs.server.datanode.TestDataNodeTcpNoDelay
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory
hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles
hadoop.hdfs.TestByteBufferPread
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement
hadoop.hdfs.server.datanode.TestDataNodeUUID
hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation
hadoop.hdfs.server.datanode.TestDataNodeInitStorage
hadoop.hdfs.server.datanode.TestFsDatasetCache
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux aa981774d031 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 34b1406
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
cc https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/artifact/out/diff-compile-cc-root.txt
CTEST https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/artifact/out/patch-hadoop-hdfs-project_hadoop-hdfs-native-client-ctest.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/testReport/
Max. process+thread count 4649 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/1/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@sahilTakiar
Contributor Author

Rebased patch, fixed remaining cc issues. Not sure why all those tests failed; they all pass locally. Let's see if Hadoop QA hits them again.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 22 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 72 Maven dependency ordering for branch
+1 mvninstall 1078 trunk passed
+1 compile 940 trunk passed
+1 checkstyle 206 trunk passed
+1 mvnsite 228 trunk passed
+1 shadedclient 1173 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 348 trunk passed
+1 javadoc 175 trunk passed
_ Patch Compile Tests _
0 mvndep 22 Maven dependency ordering for patch
+1 mvninstall 169 the patch passed
+1 compile 916 the patch passed
+1 cc 916 the patch passed
+1 javac 916 the patch passed
+1 checkstyle 207 root: The patch generated 0 new + 115 unchanged - 3 fixed = 115 total (was 118)
+1 mvnsite 242 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 701 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 393 the patch passed
+1 javadoc 170 the patch passed
_ Other Tests _
+1 unit 521 hadoop-common in the patch passed.
+1 unit 121 hadoop-hdfs-client in the patch passed.
-1 unit 5220 hadoop-hdfs in the patch failed.
+1 unit 396 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 86 The patch does not generate ASF License warnings.
13230
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.TestMaintenanceState
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/2/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux e685671bf809 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 1f47fb7
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/2/testReport/
Max. process+thread count 3394 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/2/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

}

return n;
} catch (ClassCastException e) {
Member

Here we catch the ClassCastException to check whether the input stream implements the ByteBufferPositionedReadable interface? I don't think that's a good approach, because the catch {...} block needs a full stack trace and is not very efficient.
How about using the following:

if (in instanceof ByteBufferPositionedReadable) {
    // ...
} else {
    throw new UnsupportedOperationException(...);
}

Contributor Author

Yeah, it's probably not the most efficient way to do things, but that's how all the other methods handle the same thing: PositionedReadable, Seekable, etc. Plus, the exception-handling code wouldn't be on the hot path. So in this case I would prefer to keep the code consistent with the rest of the class.

Contributor

hrm. I think we should probably do a follow-up JIRA to fix this, not for performance reasons, but because the try{...} block encompasses a lot of code. Let's say we accidentally screw up something in our encryption config and we get a ClassCastException somewhere inside decrypt. We'll swallow the real exception and claim that positioned read isn't supported, which isn't quite right.

So, I agree an instanceof check up front is probably the clearest from a code perspective and also avoids the above issue.

Contributor

add an instance check; all the FSDataInputStream classes do that

Contributor Author

Ok, makes sense will do. Any objections if I just fix this here rather than filing a separate JIRA? The changes are pretty small / it would probably be less effort than creating, reviewing, and merging a separate patch just for this.

Contributor Author

Fixed
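The fix settled on above can be sketched as follows. This is a minimal illustration, not the patch itself: the interface is redeclared locally as a stand-in for Hadoop's ByteBufferPositionedReadable, and the wrapper class name is hypothetical.

```java
// Sketch of the instanceof guard discussed above; names are illustrative.
import java.io.IOException;
import java.nio.ByteBuffer;

interface ByteBufferPositionedReadable {
  int read(long position, ByteBuffer buf) throws IOException;
}

class PositionedReadGuard {
  private final Object in; // the wrapped stream, type unknown at compile time

  PositionedReadGuard(Object in) {
    this.in = in;
  }

  int read(long position, ByteBuffer buf) throws IOException {
    // Check support up front instead of catching ClassCastException, so a
    // ClassCastException thrown deeper in the read path is not misreported
    // as "positioned read unsupported".
    if (!(in instanceof ByteBufferPositionedReadable)) {
      throw new UnsupportedOperationException(
          "This stream does not support positioned reads with byte buffers.");
    }
    return ((ByteBufferPositionedReadable) in).read(position, buf);
  }
}
```

The up-front check also keeps the try block small, which addresses the exception-swallowing concern raised earlier in the thread.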

try {
decrypt(localDecryptor, localInBuffer, localOutBuffer, localPadding);
buf.position(start + len);
buf.limit(limit);
Member

The limit should be the same as the start + len + Math.min(n - len, localInBuffer.remaining()) ?

Contributor Author

Well the invariant you want to preserve is that buf.put(inputBuf) should only be called with the original value of buf.limit() so that you don't exceed the given limit. Using start + len + Math.min(n - len, localInBuffer.remaining()) as the limit could violate this if n + start > buf.limit().
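The invariant described above can be shown in a small sketch: each put() into the caller's buffer is capped by the buffer's original limit, so the write can never run past the range the caller handed in. Class and method names here are hypothetical, not from the patch.

```java
// Illustrative chunked copy that preserves the caller's original limit.
import java.nio.ByteBuffer;

class ChunkedCopy {
  static void copyChunks(ByteBuffer src, ByteBuffer dst) {
    final int originalLimit = dst.limit(); // invariant: never write past this
    while (src.hasRemaining() && dst.position() < originalLimit) {
      int chunk = Math.min(src.remaining(), originalLimit - dst.position());
      ByteBuffer view = src.duplicate();  // bounded view; original untouched
      view.limit(view.position() + chunk);
      dst.put(view);                      // safe: view.remaining() == chunk
      src.position(src.position() + chunk);
    }
  }
}
```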

* <p>
* In the case of an exception, the values of {@code buf.position()} and
* {@code buf.limit()} are undefined, and callers should be prepared to
* recover from this eventuality.
Member

If an exception is encountered, shouldn't we reset to the original position and limit? Changing the position or limit is not friendly to the upper layer...

Contributor Author

I agree, but I copied this from ByteBufferReadable, so I think we should leave it for now, and if we want to lift this limitation, then we can do so for both ByteBufferReadable and ByteBufferPositionedReadable in another JIRA.

Contributor

Worth noting that the way it's implemented, it seems like on exception, the contents of the buffer are also undefined, right? ie we could have partially overwritten the buffer and then thrown?

Contributor

+1 for saying "buffer state undefined"

Contributor Author

Done

* <p>
* Many implementations will throw {@link UnsupportedOperationException}, so
* callers that are not confident in support for this method from the
* underlying filesystem should be prepared to handle that exception.
Member

We should provide a StreamCapabilities method so that upstream callers can decide whether to use the ByteBuffer pread, as we discussed in the JIRA?

Contributor Author

Yes, the code is there, updated the javadoc to reflect this.

ByteBuffer buf) throws IOException {
int n = 0;
int total = 0;
while (n != -1) {
Member

If n is 0, will we get stuck in an infinite loop? Try using if (n > 0) instead.

Contributor Author

IIUC the HDFS read APIs make the same guarantees as InputStream#read(byte[]), which returns -1 if there is no more data to read.
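The loop semantics described here can be sketched in a few lines: keep issuing positional reads until the stream returns -1 at end of data, matching the InputStream#read(byte[]) contract. The PReader interface below is a hypothetical stand-in for the HDFS positioned-read API.

```java
// Minimal read-fully loop that terminates on the -1 end-of-stream sentinel.
import java.io.IOException;
import java.nio.ByteBuffer;

class ReadFully {
  interface PReader {
    int read(long position, ByteBuffer buf) throws IOException;
  }

  static int readFully(PReader in, long position, ByteBuffer buf)
      throws IOException {
    int total = 0;
    while (buf.hasRemaining()) {
      int n = in.read(position + total, buf);
      if (n == -1) {
        break; // end of stream: the -1 sentinel is what prevents a dead loop
      }
      total += n;
    }
    return total;
  }
}
```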

@@ -316,6 +331,17 @@ void hdfsFileDisableDirectRead(hdfsFile file)
file->flags &= ~HDFS_FILE_SUPPORTS_DIRECT_READ;
}

int hdfsFileUsesDirectPread(hdfsFile file)
{
return !!(file->flags & HDFS_FILE_SUPPORTS_DIRECT_PREAD);
Member

!! ?

Contributor Author

It was copied from hdfsFileUsesDirectRead, but I agree it's hard to understand, so I cleaned it up.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 41 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 23 Maven dependency ordering for branch
+1 mvninstall 1177 trunk passed
+1 compile 1018 trunk passed
+1 checkstyle 211 trunk passed
+1 mvnsite 239 trunk passed
+1 shadedclient 1212 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 355 trunk passed
+1 javadoc 180 trunk passed
_ Patch Compile Tests _
0 mvndep 22 Maven dependency ordering for patch
+1 mvninstall 162 the patch passed
+1 compile 973 the patch passed
+1 cc 973 the patch passed
+1 javac 973 the patch passed
+1 checkstyle 215 root: The patch generated 0 new + 116 unchanged - 3 fixed = 116 total (was 119)
+1 mvnsite 226 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 742 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 373 the patch passed
+1 javadoc 178 the patch passed
_ Other Tests _
+1 unit 508 hadoop-common in the patch passed.
+1 unit 116 hadoop-hdfs-client in the patch passed.
-1 unit 5845 hadoop-hdfs in the patch failed.
+1 unit 404 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 47 The patch does not generate ASF License warnings.
14082
Reason Tests
Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized
hadoop.hdfs.server.mover.TestMover
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/3/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux e97235a00ae8 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 548997d
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/3/testReport/
Max. process+thread count 2968 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/3/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 26 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 61 Maven dependency ordering for branch
+1 mvninstall 1040 trunk passed
+1 compile 934 trunk passed
+1 checkstyle 196 trunk passed
+1 mvnsite 249 trunk passed
+1 shadedclient 1156 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 312 trunk passed
+1 javadoc 165 trunk passed
_ Patch Compile Tests _
0 mvndep 18 Maven dependency ordering for patch
+1 mvninstall 155 the patch passed
+1 compile 873 the patch passed
+1 cc 873 the patch passed
+1 javac 873 the patch passed
-0 checkstyle 183 root: The patch generated 2 new + 114 unchanged - 5 fixed = 116 total (was 119)
+1 mvnsite 211 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 600 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 340 the patch passed
+1 javadoc 160 the patch passed
_ Other Tests _
+1 unit 503 hadoop-common in the patch passed.
+1 unit 126 hadoop-hdfs-client in the patch passed.
-1 unit 4713 hadoop-hdfs in the patch failed.
+1 unit 366 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 40 The patch does not generate ASF License warnings.
12234
Reason Tests
Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/4/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux cdc24f282597 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 90afc9a
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-597/4/artifact/out/diff-checkstyle-root.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/4/testReport/
Max. process+thread count 4894 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/4/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

throws IOException {
checkStream();
try {
int pos = buf.position();
Contributor

nit: I think it would be good to rename pos to 'bufPos' so it's clearer that it refers to the position in the buffer and not the current position in the file.

Contributor Author

Done

@@ -741,6 +830,7 @@ public boolean hasCapability(String capability) {
case StreamCapabilities.DROPBEHIND:
case StreamCapabilities.UNBUFFER:
case StreamCapabilities.READBYTEBUFFER:
case StreamCapabilities.PREADBYTEBUFFER:
Contributor

it seems like preadByteBuffer requires the underlying stream to have this capability, so this should probably delegate to ((StreamCapabilities) in).hasCapability(PREADBYTEBUFFER), right?

(Interestingly, the same goes for a few of the other capabilities like dropbehind, I think. Curious what @steveloughran has to say.)

Contributor

If you forward to another stream, yes, wrap it. There's ongoing work in #575 to do more of this, with a matching PathCapabilities call for filesystems so you can see whether a file you open under a path will have the feature before you open it (#568). These are both background bits of work, so they have languished neglected, but I'd love to see them in. StreamCapabilities is in Hadoop 2.9+ BTW, so while the set of features you can look for evolves, at least the client code can compile without problems (and if you copy the new capabilities, link)

Contributor Author

Fixed
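The delegation agreed on above can be sketched as follows: a wrapping stream advertises the byte-buffer pread capability only when its underlying stream does. The interface and capability constant are illustrative stand-ins for Hadoop's StreamCapabilities and StreamCapabilities.PREADBYTEBUFFER, and the string value is assumed, not taken from the patch.

```java
// Sketch of capability delegation from a wrapper stream to the wrapped one.
class CapabilityDelegation {
  interface StreamCapabilities {
    boolean hasCapability(String capability);
  }

  static final String PREADBYTEBUFFER = "in:preadbytebuffer"; // assumed value

  static boolean hasCapability(Object in, String capability) {
    if (PREADBYTEBUFFER.equals(capability)) {
      // Delegate rather than answer unconditionally: the wrapper can only
      // pread into byte buffers if the wrapped stream can.
      return in instanceof StreamCapabilities
          && ((StreamCapabilities) in).hasCapability(capability);
    }
    return false;
  }
}
```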

return n;
} catch (ClassCastException e) {
throw new UnsupportedOperationException("This stream does not support " +
"positioned read.");
Contributor

probably should specifically say "with byte buffers" or something

Contributor Author

Done


/**
* Decrypt n bytes in buf starting at start. Output is also put into buf
* starting at current position. buf.position() and buf.limit() should be
Contributor

This doc is confusing me a bit (haven't looked at the impl yet). It seems this both reads and writes to the same buf, but the read is happening from the 'start' whereas the write is happening at 'buf.position()'? That seems somewhat unexpected and opens up some questions about whether the output range and input range can overlap

Contributor

Separate nit: "should be unchanged" -> "will be unchanged" or "are not changed". Should be sounds awfully wishy-washy for a postcondition

Contributor Author

I think I copied this from the other #decrypt methods. Regardless, I rewrote this part so hopefully it is easier to understand.

int len = 0;
Decryptor localDecryptor = null;
try {
localInBuffer = getBuffer();
Contributor

Can you add a TODO here that we can likely avoid one of these copies, at least when the byte buffer passed by the user is a direct buffer? It looks like the patch is currently doing:

pread -> user buffer
for each chunk:
  copy from user buffer to tmp input
  decrypt tmp input to tmp output
  copy from tmp output to user buffer

but we could likely decrypt back to the user buffer directly, or decrypt from the user buffer to a tmp, and then write back. (this all assumes that the crypto codecs don't support in-place decryption, which they might)

Contributor

looking briefly through openssl code, it seems like it actually supports in == out encryption, so maybe we could avoid both buffer copies for full blocks, and maybe avoid one buffer copy for the non-bytebuffer case as well by making inBuffer == outBuffer.

Again, probably just something to file for later

Contributor Author

Added a TODO

ByteBuffer localInBuffer = null;
ByteBuffer localOutBuffer = null;
final int pos = buf.position();
final int limit = buf.limit();
Contributor

Instead of saving pos/limit here and restoring them later, would it be easier to duplicate() the bytebuffer? Then you could just set the limit to match 'n' and not worry about it. The loop bounds might become a bit simpler too, e.g. while (buf.remaining() > 0), since you no longer need to consider the passed-in length.

Contributor Author

I played around with this a bit. Calling #duplicate avoids having to save and then reset the original limit and position, but I think you still have to set the limit when processing each chunk; otherwise the call to localInBuffer.put(buf) will throw a BufferOverflowException. The Javadocs of #put say:

     * If there are more bytes remaining in the
     * source buffer than in this buffer, that is, if
     * <tt>src.remaining()</tt>&nbsp;<tt>&gt;</tt>&nbsp;<tt>remaining()</tt>,
     * then no bytes are transferred and a {@link
     * BufferOverflowException} is thrown.
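The quoted contract can be demonstrated directly: ByteBuffer#put(ByteBuffer) throws BufferOverflowException whenever the source has more bytes remaining than the destination, so a per-chunk limit is still needed even with duplicate(). Class and method names below are illustrative.

```java
// Demonstrates the put(ByteBuffer) overflow contract and the limit-capped fix.
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

class PutLimitDemo {
  static boolean overflows(int srcBytes, int dstBytes) {
    ByteBuffer src = ByteBuffer.allocate(srcBytes); // remaining() == srcBytes
    ByteBuffer dst = ByteBuffer.allocate(dstBytes);
    try {
      dst.put(src);
      return false;
    } catch (BufferOverflowException e) {
      return true;
    }
  }

  static void putWithLimit(ByteBuffer src, ByteBuffer dst) {
    // Cap a duplicate view so its remaining() never exceeds dst.remaining();
    // the original src position is left for the caller to advance.
    ByteBuffer view = src.duplicate();
    view.limit(view.position() + Math.min(src.remaining(), dst.remaining()));
    dst.put(view);
  }
}
```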

/**
* Reads bytes using the read(ByteBuffer) API. By using Java
* DirectByteBuffers we can avoid copying the bytes from kernel space into
Contributor

DirectByteBuffer avoids an extra copy into the java heap vs the C heap. It's still copying data out of the kernel to user space either way.

Contributor Author

Fixed

* If the underlying stream supports the ByteBufferReadable interface then
* this method will transparently use read(ByteBuffer). This can help
* improve performance as it avoids unnecessary copies between the kernel
* space, the Java process space, and the C process space.
Contributor

Per above, the kernel->user transition is the same here; it's just avoiding some JVM heap copies.

Contributor Author

Fixed

Seekable, PositionedReadable, ByteBufferReadable,
ByteBufferPositionedReadable, HasFileDescriptor, CanSetDropBehind,
CanSetReadahead, HasEnhancedByteBufferAccess, ReadableByteChannel,
CanUnbuffer, StreamCapabilities {
Contributor

Adding one extra interface shouldn't force reformatting this entire set of lines; this only makes a common backporting conflict point worse. Can you just add the new interface at the end and leave the rest alone? Thanks.

Contributor Author

Done

@sahilTakiar
Contributor Author

Addressed comments, rebased patch.

}
jthr = invokeMethod(env, &jVal, INSTANCE, jFile,
JC_FS_DATA_INPUT_STREAM, "hasCapability", "(Ljava/lang/String;)Z",
jCapabilityString);


whitespace:tabs in line

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 26 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 27 Maven dependency ordering for branch
+1 mvninstall 1023 trunk passed
+1 compile 976 trunk passed
+1 checkstyle 196 trunk passed
+1 mvnsite 245 trunk passed
+1 shadedclient 1163 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 341 trunk passed
+1 javadoc 211 trunk passed
_ Patch Compile Tests _
0 mvndep 59 Maven dependency ordering for patch
+1 mvninstall 161 the patch passed
+1 compile 928 the patch passed
+1 cc 928 the patch passed
+1 javac 928 the patch passed
-0 checkstyle 184 root: The patch generated 3 new + 112 unchanged - 7 fixed = 115 total (was 119)
+1 mvnsite 230 the patch passed
-1 whitespace 1 The patch has 1 line(s) with tabs.
+1 shadedclient 600 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 362 the patch passed
-1 javadoc 59 hadoop-common-project_hadoop-common generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
_ Other Tests _
+1 unit 503 hadoop-common in the patch passed.
+1 unit 118 hadoop-hdfs-client in the patch passed.
-1 unit 6789 hadoop-hdfs in the patch failed.
+1 unit 392 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 45 The patch does not generate ASF License warnings.
Total 14594
Reason Tests
Failed junit tests hadoop.hdfs.TestDFSStartupVersions
hadoop.hdfs.TestDFSStorageStateRecovery
hadoop.hdfs.TestStateAlignmentContextWithHA
hadoop.hdfs.TestReplication
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
hadoop.hdfs.TestFileChecksum
hadoop.hdfs.TestErasureCodeBenchmarkThroughput
hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
hadoop.hdfs.server.mover.TestMover
hadoop.hdfs.TestDatanodeDeath
hadoop.hdfs.TestEncryptionZonesWithKMS
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
hadoop.hdfs.server.balancer.TestBalancerRPCDelay
hadoop.hdfs.server.namenode.TestFSEditLogLoader
hadoop.hdfs.TestFileCreation
hadoop.hdfs.TestEncryptedTransfer
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.server.namenode.TestFileTruncate
hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy
hadoop.hdfs.server.namenode.snapshot.TestSetQuotaWithSnapshot
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
hadoop.hdfs.TestDecommission
hadoop.hdfs.TestRenameWhileOpen
hadoop.hdfs.TestErasureCodingPolicies
hadoop.hdfs.TestFileChecksumCompositeCrc
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
hadoop.hdfs.TestMaintenanceState
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux e977b7bd0023 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / ab2bda5
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/artifact/out/diff-checkstyle-root.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/artifact/out/whitespace-tabs.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/artifact/out/diff-javadoc-javadoc-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/testReport/
Max. process+thread count 4767 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/5/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

Decryptor decryptor = null;
try {
// TODO we should be able to avoid copying chunks between the input buf,
Contributor

I'd rather create a new JIRA for this (now) than leave a TODO in the code. You know that TODO is going to get forgotten about otherwise....

Contributor Author

Removed the TODO and filed HDFS-14417

@sahilTakiar
Contributor Author

Rebased and fixed a few checkstyle issues.

}
jthr = invokeMethod(env, &jVal, INSTANCE, jFile,
JC_FS_DATA_INPUT_STREAM, "hasCapability", "(Ljava/lang/String;)Z",
jCapabilityString);


whitespace:tabs in line

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 31 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 76 Maven dependency ordering for branch
+1 mvninstall 1040 trunk passed
+1 compile 956 trunk passed
+1 checkstyle 186 trunk passed
+1 mvnsite 219 trunk passed
+1 shadedclient 1079 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 339 trunk passed
+1 javadoc 171 trunk passed
_ Patch Compile Tests _
0 mvndep 23 Maven dependency ordering for patch
+1 mvninstall 168 the patch passed
+1 compile 990 the patch passed
+1 cc 990 the patch passed
+1 javac 990 the patch passed
+1 checkstyle 189 root: The patch generated 0 new + 112 unchanged - 7 fixed = 112 total (was 119)
+1 mvnsite 220 the patch passed
-1 whitespace 0 The patch has 1 line(s) with tabs.
+1 shadedclient 644 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 366 the patch passed
-1 javadoc 57 hadoop-common-project_hadoop-common generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
_ Other Tests _
+1 unit 541 hadoop-common in the patch passed.
+1 unit 119 hadoop-hdfs-client in the patch passed.
-1 unit 5027 hadoop-hdfs in the patch failed.
+1 unit 352 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 47 The patch does not generate ASF License warnings.
Total 12788
Reason Tests
Failed junit tests hadoop.hdfs.shortcircuit.TestShortCircuitCache
hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
hadoop.hdfs.server.namenode.TestHDFSConcat
hadoop.hdfs.server.namenode.TestCommitBlockWithInvalidGenStamp
hadoop.hdfs.server.namenode.TestProtectedDirectories
hadoop.fs.viewfs.TestViewFsWithXAttrs
hadoop.fs.contract.hdfs.TestHDFSContractDelete
hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy
hadoop.hdfs.TestDFSMkdirs
hadoop.hdfs.server.namenode.TestSecondaryWebUi
hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
hadoop.TestGenericRefresh
hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized
hadoop.hdfs.TestDatanodeConfig
hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
hadoop.hdfs.TestDistributedFileSystem
hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality
hadoop.hdfs.security.token.block.TestBlockToken
hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus
hadoop.hdfs.tools.TestDelegationTokenFetcher
hadoop.hdfs.TestFileConcurrentReader
hadoop.fs.TestSymlinkHdfsFileContext
hadoop.hdfs.server.namenode.TestDeleteRace
hadoop.hdfs.TestRestartDFS
hadoop.hdfs.server.namenode.TestNameNodeMXBean
hadoop.fs.TestFcHdfsSetUMask
hadoop.hdfs.TestDatanodeRegistration
hadoop.hdfs.server.namenode.TestGenericJournalConf
hadoop.TestRefreshCallQueue
hadoop.fs.viewfs.TestViewFileSystemHdfs
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics
hadoop.hdfs.TestErasureCodingAddConfig
hadoop.hdfs.server.namenode.TestFSDirectory
hadoop.fs.viewfs.TestViewFileSystemWithXAttrs
hadoop.hdfs.TestHdfsAdmin
hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots
hadoop.hdfs.server.namenode.ha.TestHarFileSystemWithHA
hadoop.fs.contract.hdfs.TestHDFSContractAppend
hadoop.hdfs.TestRead
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile
hadoop.hdfs.tools.TestStoragePolicyCommands
hadoop.hdfs.TestFileLengthOnClusterRestart
hadoop.hdfs.server.namenode.TestQuotaWithStripedBlocks
hadoop.hdfs.server.namenode.snapshot.TestRandomOpsWithSnapshots
hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics
hadoop.hdfs.server.namenode.ha.TestHAAppend
hadoop.hdfs.TestDistributedFileSystemWithECFileWithRandomECPolicy
hadoop.hdfs.server.namenode.ha.TestHAMetrics
hadoop.hdfs.server.namenode.TestReencryptionWithKMS
hadoop.hdfs.TestStateAlignmentContextWithHA
hadoop.hdfs.server.namenode.ha.TestHASafeMode
hadoop.hdfs.TestReplication
hadoop.hdfs.TestLeaseRecoveryStriped
hadoop.hdfs.server.namenode.TestAuditLogs
hadoop.hdfs.server.datanode.TestBlockCountersInPendingIBR
hadoop.hdfs.server.namenode.TestFSNamesystemMBean
hadoop.hdfs.server.namenode.ha.TestHAFsck
hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.server.namenode.TestAllowFormat
hadoop.hdfs.server.namenode.TestParallelImageWrite
hadoop.hdfs.server.namenode.TestCheckpoint
hadoop.hdfs.server.namenode.TestAclConfigFlag
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
hadoop.hdfs.TestLease
hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl
hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR
hadoop.hdfs.server.namenode.TestAddStripedBlocks
hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives
hadoop.hdfs.TestFileStatusWithDefaultECPolicy
hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens
hadoop.hdfs.TestReadStripedFileWithDecoding
hadoop.hdfs.server.namenode.ha.TestConsistentReadsObserver
hadoop.hdfs.server.namenode.ha.TestQuotasWithHA
hadoop.fs.TestResolveHdfsSymlink
hadoop.hdfs.TestFileChecksum
hadoop.hdfs.TestSnapshotCommands
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
hadoop.hdfs.server.namenode.TestUpgradeDomainBlockPlacementPolicy
hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy
hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate
hadoop.hdfs.TestErasureCodeBenchmarkThroughput
hadoop.fs.TestEnhancedByteBufferAccess
hadoop.hdfs.web.TestWebHdfsTokens
hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade
hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter
hadoop.fs.viewfs.TestViewFsWithAcls
hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
hadoop.hdfs.server.mover.TestMover
hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
hadoop.hdfs.server.namenode.TestStripedINodeFile
hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash
hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA
hadoop.hdfs.server.namenode.TestNestedEncryptionZones
hadoop.hdfs.server.namenode.TestBackupNode
hadoop.hdfs.server.namenode.ha.TestMultiObserverNode
hadoop.hdfs.server.namenode.TestStorageRestore
hadoop.hdfs.server.namenode.TestFSImage
hadoop.hdfs.TestDatanodeDeath
hadoop.hdfs.server.namenode.snapshot.TestSnapshotStatsMXBean
hadoop.hdfs.TestEncryptionZonesWithKMS
hadoop.hdfs.server.namenode.TestNameNodeRpcServerMethods
hadoop.hdfs.server.namenode.TestFavoredNodesEndToEnd
hadoop.hdfs.TestReplaceDatanodeFailureReplication
hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength
hadoop.hdfs.TestUnsetAndChangeDirectoryEcPolicy
hadoop.hdfs.TestExternalBlockReader
hadoop.hdfs.tools.TestDFSHAAdminMiniCluster
hadoop.hdfs.server.namenode.ha.TestObserverNode
hadoop.hdfs.server.namenode.TestFSImageWithAcl
hadoop.hdfs.server.namenode.TestCacheDirectives
hadoop.hdfs.tools.TestDebugAdmin
hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy
hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
hadoop.hdfs.server.namenode.snapshot.TestSnapshotNameWithInvalidCharacters
hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy
hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader
hadoop.hdfs.server.namenode.TestAuditLoggerWithCommands
hadoop.hdfs.server.namenode.TestFileLimit
hadoop.hdfs.server.namenode.TestStartup
hadoop.hdfs.tools.TestECAdmin
hadoop.hdfs.TestErasureCodingPolicyWithSnapshot
hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes
hadoop.fs.viewfs.TestViewFileSystemLinkFallback
hadoop.hdfs.TestSetTimes
hadoop.fs.viewfs.TestViewFsFileStatusHdfs
hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs
hadoop.fs.viewfs.TestViewFsHdfs
hadoop.fs.TestUnbuffer
hadoop.hdfs.TestFSOutputSummer
hadoop.hdfs.server.namenode.TestFsck
hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs
hadoop.fs.permission.TestStickyBit
hadoop.hdfs.server.namenode.TestFsckWithMultipleNameNodes
hadoop.cli.TestDeleteCLI
hadoop.hdfs.server.blockmanagement.TestDatanodeManager
hadoop.fs.viewfs.TestViewFsAtHdfsRoot
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
hadoop.hdfs.TestFileAppendRestart
hadoop.hdfs.server.namenode.TestSecurityTokenEditLog
hadoop.hdfs.server.namenode.TestNameNodeRecovery
hadoop.hdfs.tools.TestWebHDFSStoragePolicyCommands
hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
hadoop.hdfs.server.balancer.TestBalancerRPCDelay
hadoop.tools.TestJMXGet
hadoop.hdfs.server.namenode.snapshot.TestSnapRootDescendantDiff
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary
hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
hadoop.hdfs.TestErasureCodingMultipleRacks
hadoop.cli.TestCryptoAdminCLI
hadoop.hdfs.security.TestDelegationTokenForProxyUser
hadoop.hdfs.server.namenode.TestMetadataVersionOutput
hadoop.hdfs.TestDFSPermission
hadoop.fs.TestHDFSFileContextMainOperations
hadoop.hdfs.TestInjectionForSimulatedStorage
hadoop.hdfs.TestParallelUnixDomainRead
hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled
hadoop.hdfs.server.namenode.TestINodeFile
hadoop.hdfs.TestWriteRead
hadoop.hdfs.server.namenode.TestFSEditLogLoader
hadoop.hdfs.server.namenode.ha.TestStandbyInProgressTail
hadoop.fs.TestUrlStreamHandler
hadoop.fs.contract.hdfs.TestHDFSContractMkdir
hadoop.hdfs.server.namenode.TestQuotaByStorageType
hadoop.hdfs.server.namenode.TestEditLogAutoroll
hadoop.hdfs.TestDFSAddressConfig
hadoop.hdfs.TestDatanodeReport
hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
hadoop.hdfs.server.namenode.TestLeaseManager
hadoop.hdfs.TestDFSClientFailover
hadoop.fs.TestFcHdfsCreateMkdir
hadoop.cli.TestCacheAdminCLI
hadoop.hdfs.server.namenode.ha.TestNNHealthCheck
hadoop.hdfs.web.TestWebHDFSAcl
hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs
hadoop.fs.shell.TestHdfsTextCommand
hadoop.hdfs.TestEncryptedTransfer
hadoop.hdfs.server.namenode.TestNameNodeRpcServer
hadoop.hdfs.server.namenode.TestLargeDirectoryDelete
hadoop.hdfs.server.namenode.TestNameEditsConfigs
hadoop.hdfs.web.TestHttpsFileSystem
hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM
hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
hadoop.fs.contract.hdfs.TestHDFSContractOpen
hadoop.hdfs.TestModTime
hadoop.hdfs.TestLargeBlock
hadoop.hdfs.server.namenode.TestFileTruncate
hadoop.hdfs.TestDFSUpgradeFromImage
hadoop.fs.TestWebHdfsFileContextMainOperations
hadoop.hdfs.server.namenode.snapshot.TestSetQuotaWithSnapshot
hadoop.hdfs.tools.TestDFSAdminWithHA
hadoop.hdfs.server.namenode.snapshot.TestSnapshotListing
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
hadoop.hdfs.TestEncryptionZones
hadoop.hdfs.server.namenode.TestSnapshotPathINodes
hadoop.hdfs.server.namenode.snapshot.TestSnapshotRename
hadoop.hdfs.server.namenode.TestGetContentSummaryWithPermission
hadoop.fs.contract.hdfs.TestHDFSContractCreate
hadoop.hdfs.TestClientProtocolForPipelineRecovery
hadoop.hdfs.TestBlockTokenWrappingQOP
hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot
hadoop.hdfs.TestGetBlocks
hadoop.hdfs.TestDFSClientExcludedNodes
hadoop.hdfs.tools.TestGetGroups
hadoop.hdfs.TestDecommission
hadoop.hdfs.TestReservedRawPaths
hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
hadoop.hdfs.web.TestWebHdfsFileSystemContract
hadoop.hdfs.TestBalancerBandwidth
hadoop.hdfs.server.namenode.TestMetaSave
hadoop.fs.TestSymlinkHdfsFileSystem
hadoop.hdfs.tools.TestViewFSStoragePolicyCommands
hadoop.hdfs.server.namenode.TestNameNodeXAttr
hadoop.hdfs.server.namenode.ha.TestEditLogTailer
hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter
hadoop.hdfs.server.namenode.TestDeadDatanode
hadoop.hdfs.server.namenode.TestMalformedURLs
hadoop.hdfs.server.namenode.TestXAttrConfigFlag
hadoop.hdfs.TestDataTransferProtocol
hadoop.hdfs.server.namenode.TestNamenodeRetryCache
hadoop.fs.contract.hdfs.TestHDFSContractConcat
hadoop.hdfs.TestGetFileChecksum
hadoop.hdfs.server.namenode.TestAddBlockRetry
hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication
hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
hadoop.hdfs.TestExtendedAcls
hadoop.hdfs.server.balancer.TestBalancer
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
hadoop.security.TestPermissionSymlinks
hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
hadoop.hdfs.TestAppendDifferentChecksum
hadoop.hdfs.server.namenode.TestEditLogRace
hadoop.hdfs.TestListFilesInDFS
hadoop.hdfs.TestStripedFileAppend
hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
hadoop.hdfs.TestFileChecksumCompositeCrc
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks
hadoop.cli.TestXAttrCLI
hadoop.hdfs.server.namenode.metrics.TestNNMetricFilesInGetListingOps
hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement
hadoop.cli.TestAclCLI
hadoop.fs.contract.hdfs.TestHDFSContractSeek
hadoop.hdfs.TestDFSClientRetries
hadoop.hdfs.server.namenode.TestNameNodeStatusMXBean
hadoop.security.TestRefreshUserMappings
hadoop.hdfs.web.TestWebHdfsUrl
hadoop.hdfs.server.namenode.TestNameNodeReconfigure
hadoop.hdfs.server.namenode.snapshot.TestDiffListBySkipList
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForXAttr
hadoop.hdfs.server.namenode.ha.TestGetGroupsWithHA
hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy
hadoop.hdfs.TestMaintenanceState
hadoop.hdfs.TestDFSStripedOutputStream
hadoop.hdfs.web.TestWebHDFSForHA
hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux 025d788337e0 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 1b9ba0e
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/artifact/out/whitespace-tabs.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/artifact/out/diff-javadoc-javadoc-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/testReport/
Max. process+thread count 4556 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/6/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@@ -327,7 +329,7 @@ public void close() throws IOException {
public int read(long position, byte[] buffer, int offset, int length)
throws IOException {
checkStream();
try {
if (in instanceof PositionedReadable) {
Contributor

nit: do you mind negating this check so that you can un-indent the rest of the method? ("early return" style)

Contributor Author

Done
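The "early return" style the reviewer asks for replaces nesting with a guard clause. A minimal C sketch with made-up names (the actual change is in CryptoInputStream's Java, which throws instead of setting errno):

```c
#include <assert.h>
#include <errno.h>

/* Guard-clause ("early return") style: reject the unsupported case up
 * front so the main read path is not nested inside an if-block. The
 * names here are illustrative, not the real CryptoInputStream code. */
static int read_at_position(int supports_pread, int nbytes) {
    if (!supports_pread) {
        errno = ENOTSUP;  /* stand-in for UnsupportedOperationException */
        return -1;
    }
    /* ... main read path, now at a single indentation level ... */
    return nbytes;
}
```

The behavior is identical to the nested version; the payoff is purely that the common path reads top-to-bottom without extra indentation.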

public int read(long position, final ByteBuffer buf)
throws IOException {
checkStream();
if (in instanceof ByteBufferPositionedReadable) {
Contributor

same here (and a couple similar below)

Contributor Author

Done

* should be written to.
* <p>
* After a successful call, {@code buf.position()} will be advanced by the
* number of bytes read and {@code buf.limit()} should be unchanged.
Contributor

"should be" -> "will be"

Contributor Author

Fixed

@@ -40,8 +40,23 @@

// Bit fields for hdfsFile_internal flags
#define HDFS_FILE_SUPPORTS_DIRECT_READ (1<<0)
#define HDFS_FILE_SUPPORTS_DIRECT_PREAD (1<<0)
Contributor

should be 1<<1, right?

Contributor Author

Yeah, thanks for catching that. I probably should not have just blindly copied the line above. I added a regression test for this.
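The bug being fixed is that two flags defined as `(1<<0)` occupy the same bit, so setting one is indistinguishable from setting both. A minimal sketch (the flag names mirror hdfs.c; `has_flag` is a hypothetical helper, not the real code):

```c
#include <assert.h>

/* Each feature flag must own a distinct bit; with the original
 * copy-paste both macros expanded to (1 << 0) and collided. */
#define HDFS_FILE_SUPPORTS_DIRECT_READ  (1 << 0)
#define HDFS_FILE_SUPPORTS_DIRECT_PREAD (1 << 1)

/* Hypothetical helper showing how a bit field is tested. */
static int has_flag(int flags, int flag) {
    return (flags & flag) != 0;
}
```

With distinct bits, enabling direct read no longer silently advertises direct pread as well, which is what the added regression test guards against.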

destroyLocalReference(env, jCapabilityString);
if (ret) {
errno = ret;
return 0;
Contributor

seems a little weird to return 0 but also set errno here -- I see that you're doing it because you want to fall back to false on error, but then maybe errno should not be set

Contributor Author

Yeah. I think the reason for setting errno is that all the other methods set it to indicate that an exception occurred while the method ran. So if ret is non-zero and errno needs to be set, the method still has to return some value, right? Returning false seemed more reasonable than returning true.
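The convention described here, returning a safe fallback value while recording the failure in errno, can be reduced to a short sketch (illustrative names and errno value, not the actual hdfs.c code):

```c
#include <assert.h>
#include <errno.h>

/* Illustrative sketch of the libhdfs-style error convention: on an
 * internal failure, return a safe fallback (false) and set errno, so
 * a caller that cares can distinguish "capability absent" from "the
 * capability check itself failed". */
static int check_capability(int simulate_jni_failure) {
    errno = 0;
    if (simulate_jni_failure) {
        errno = EINTR;  /* stand-in for the errno derived from the exception */
        return 0;       /* fall back to "not supported" */
    }
    return 1;           /* capability present */
}
```

The caller that just wants a boolean can ignore errno; a stricter caller can clear errno first and check it after the call.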

@@ -985,6 +1011,49 @@ int hdfsStreamBuilderSetDefaultBlockSize(struct hdfsStreamBuilder *bld,
return 0;
}

static int hdfsHasStreamCapability(jobject jFile,
Contributor

can you add a brief doc, even though this is just an internal method?

Contributor Author

Done


//Error checking... make sure that this file is 'readable'
if (f->type != HDFS_STREAM_INPUT) {
fprintf(stderr, "Cannot read from a non-InputStream object!\n");
Contributor

seems odd to printf to stderr from a library. Is this our general pattern?

Contributor Author

Yeah, literally everything prints to stderr

"preadDirect: FSDataInputStream#read");
return -1;
}
return (jVal.i < 0) ? 0 : jVal.i;
Contributor

it seems the javadoc says that the method can return 0 for a transient zero-length result, and -1 for EOF. But here we're converting -1 to 0, so the caller can't distinguish between a must-retry '0 length read' and a 'shouldn't retry' EOF, right?

Contributor Author

Yeah, this looks like a latent bug. I double-checked how hdfsRead works without the ByteBuffer optimization and it seems to be doing the right thing. I fixed this method and the pread equivalent to mimic what is done when the ByteBuffer optimization is disabled.
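For reference, the lossy conversion the reviewer flagged reduces to a few lines (names are illustrative, not the actual preadDirect code):

```c
#include <assert.h>

/* Sketch of the lossy mapping: Java's read() returns -1 at EOF and 0
 * for a transient zero-length read, but clamping every negative value
 * to 0 makes those two cases indistinguishable to the C caller. */
static int clamp_read_result(int jval) {
    return (jval < 0) ? 0 : jval;  /* the pre-fix preadDirect behaviour */
}
```

Both -1 (EOF, should not be retried) and 0 (transient, may be retried) collapse to 0, which is exactly the ambiguity raised above; the fix instead follows the convention of the non-ByteBuffer read path.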

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 0 Docker mode activated.
-1 docker 641 Docker failed to build yetus/hadoop:8f97d6f.
Subsystem Report/Notes
GITHUB PR #597
JIRA Issue HDFS-3246
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/7/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

}
jthr = invokeMethod(env, &jVal, INSTANCE, jFile,
JC_FS_DATA_INPUT_STREAM, "hasCapability", "(Ljava/lang/String;)Z",
jCapabilityString);


whitespace:tabs in line

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 31 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 78 Maven dependency ordering for branch
+1 mvninstall 1122 trunk passed
+1 compile 1079 trunk passed
+1 checkstyle 196 trunk passed
+1 mvnsite 235 trunk passed
+1 shadedclient 1144 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 353 trunk passed
+1 javadoc 170 trunk passed
_ Patch Compile Tests _
0 mvndep 24 Maven dependency ordering for patch
+1 mvninstall 162 the patch passed
+1 compile 911 the patch passed
-1 cc 911 root generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9)
+1 javac 911 the patch passed
+1 checkstyle 195 root: The patch generated 0 new + 112 unchanged - 7 fixed = 112 total (was 119)
+1 mvnsite 246 the patch passed
-1 whitespace 0 The patch has 1 line(s) with tabs.
+1 shadedclient 678 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 367 the patch passed
-1 javadoc 68 hadoop-common-project_hadoop-common generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
_ Other Tests _
+1 unit 527 hadoop-common in the patch passed.
+1 unit 125 hadoop-hdfs-client in the patch passed.
-1 unit 4745 hadoop-hdfs in the patch failed.
-1 unit 362 hadoop-hdfs-native-client in the patch failed.
+1 asflicense 55 The patch does not generate ASF License warnings.
Total 12840
Reason Tests
Failed CTEST tests test_test_libhdfs_ops_hdfs_static
Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
hadoop.hdfs.web.TestWebHdfsTimeouts
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux 34dbb28d4503 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 25c421b
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
cc https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/diff-compile-cc-root.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/whitespace-tabs.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/diff-javadoc-javadoc-hadoop-common-project_hadoop-common.txt
CTEST https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/patch-hadoop-hdfs-project_hadoop-hdfs-native-client-ctest.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/testReport/
Max. process+thread count 5178 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/8/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@sahilTakiar force-pushed the HDFS-3246 branch 2 times, most recently from 9c5acf7 to f5c692d on April 10, 2019 21:36
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 29 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 22 Maven dependency ordering for branch
+1 mvninstall 1074 trunk passed
+1 compile 1032 trunk passed
+1 checkstyle 189 trunk passed
+1 mvnsite 223 trunk passed
+1 shadedclient 1108 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 346 trunk passed
+1 javadoc 185 trunk passed
_ Patch Compile Tests _
0 mvndep 23 Maven dependency ordering for patch
+1 mvninstall 161 the patch passed
+1 compile 955 the patch passed
+1 cc 955 the patch passed
+1 javac 955 the patch passed
+1 checkstyle 201 root: The patch generated 0 new + 112 unchanged - 7 fixed = 112 total (was 119)
+1 mvnsite 221 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 707 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 358 the patch passed
+1 javadoc 193 the patch passed
_ Other Tests _
+1 unit 513 hadoop-common in the patch passed.
+1 unit 126 hadoop-hdfs-client in the patch passed.
-1 unit 5113 hadoop-hdfs in the patch failed.
+1 unit 375 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 39 The patch does not generate ASF License warnings.
Total 13045
Reason Tests
Failed junit tests hadoop.hdfs.server.diskbalancer.TestDiskBalancer
hadoop.hdfs.server.namenode.TestFSEditLogLoader
hadoop.hdfs.server.diskbalancer.TestDiskBalancerWithMockMover
hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys
hadoop.hdfs.server.diskbalancer.TestConnectors
hadoop.hdfs.qjournal.server.TestJournalNodeSync
hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.server.namenode.TestAddStripedBlocks
hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand
hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/10/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux 857a9ea309e7 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 8740755
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/10/testReport/
Max. process+thread count 5126 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/10/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 25 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 22 Maven dependency ordering for branch
+1 mvninstall 1018 trunk passed
+1 compile 1175 trunk passed
+1 checkstyle 189 trunk passed
+1 mvnsite 243 trunk passed
+1 shadedclient 1127 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 331 trunk passed
+1 javadoc 169 trunk passed
_ Patch Compile Tests _
0 mvndep 20 Maven dependency ordering for patch
+1 mvninstall 160 the patch passed
+1 compile 1228 the patch passed
+1 cc 1228 the patch passed
+1 javac 1228 the patch passed
+1 checkstyle 176 root: The patch generated 0 new + 112 unchanged - 7 fixed = 112 total (was 119)
+1 mvnsite 202 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 643 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 343 the patch passed
+1 javadoc 181 the patch passed
_ Other Tests _
+1 unit 552 hadoop-common in the patch passed.
+1 unit 117 hadoop-hdfs-client in the patch passed.
-1 unit 5210 hadoop-hdfs in the patch failed.
+1 unit 363 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 46 The patch does not generate ASF License warnings.
Total 13379
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized
hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
hadoop.hdfs.server.namenode.TestFsck
hadoop.hdfs.server.balancer.TestBalancerRPCDelay
hadoop.hdfs.web.TestWebHdfsTimeouts
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/9/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux 66b58ae30bdb 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 8740755
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/9/testReport/
Max. process+thread count 4544 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/9/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@sahilTakiar
Contributor Author

@toddlipcon addressed your comments.

Sahil Takiar added 2 commits April 29, 2019 15:42
* Remove the "try { ... } catch (ClassCastException e)" pattern from
  CryptoInputStream
* Fixed a bug in CryptoInputStream#hasCapability where the method was
  not correctly delegating to the underlying stream
* Further refactoring of CryptoInputStream#decrypt methods to make the
  code easier to understand
* Revised Javadoc for CryptoInputStream#decrypt
* Revised Javadoc for ByteBufferPositionedReadable and
  ByteBufferReadable
* Revised Javadoc for readDirect, preadDirect, hdfsRead, and hdfsPread
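The commit above revises the Javadoc for ByteBufferPositionedReadable, the interface this PR introduces so that positioned reads (pread) can target a ByteBuffer directly. As a rough illustration of its contract (not the actual Hadoop code — the real interface lives in org.apache.hadoop.fs and is implemented by DFSInputStream), here is a minimal in-memory sketch: a read at an absolute position fills the buffer without advancing the stream's own sequential cursor.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

public class PreadSketch {
  // Simplified stand-in for org.apache.hadoop.fs.ByteBufferPositionedReadable.
  interface ByteBufferPositionedReadable {
    // Reads up to buf.remaining() bytes starting at the absolute position;
    // returns the number of bytes read, or -1 at EOF.
    int read(long position, ByteBuffer buf) throws IOException;
  }

  // Toy implementation backed by a byte[], standing in for a real input stream.
  static class InMemoryStream implements ByteBufferPositionedReadable {
    private final byte[] data;
    private final long cursor = 0; // sequential-read position; pread must not move it

    InMemoryStream(byte[] data) { this.data = data; }

    @Override
    public int read(long position, ByteBuffer buf) {
      if (position >= data.length) return -1;
      int n = (int) Math.min(buf.remaining(), data.length - position);
      buf.put(data, (int) position, n);
      return n; // cursor is deliberately untouched
    }

    long getPos() { return cursor; }
  }

  public static void main(String[] args) throws IOException {
    InMemoryStream in = new InMemoryStream("hello, pread".getBytes());
    ByteBuffer buf = ByteBuffer.allocate(5);
    int n = in.read(7, buf); // read 5 bytes at absolute offset 7
    buf.flip();
    byte[] out = new byte[buf.remaining()];
    buf.get(out);
    System.out.println(n + " " + new String(out) + " " + in.getPos());
  }
}
```

The key property the Javadoc changes document is exactly the one shown here: unlike ByteBufferReadable#read(ByteBuffer), the positioned variant is stateless with respect to the stream offset, which is what makes it safe for concurrent preads.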
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 24 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
0 mvndep 62 Maven dependency ordering for branch
+1 mvninstall 1032 trunk passed
+1 compile 967 trunk passed
+1 checkstyle 142 trunk passed
+1 mvnsite 244 trunk passed
+1 shadedclient 1104 branch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 323 trunk passed
+1 javadoc 166 trunk passed
_ Patch Compile Tests _
0 mvndep 20 Maven dependency ordering for patch
+1 mvninstall 154 the patch passed
+1 compile 922 the patch passed
+1 cc 922 the patch passed
+1 javac 922 the patch passed
+1 checkstyle 130 root: The patch generated 0 new + 112 unchanged - 7 fixed = 112 total (was 119)
+1 mvnsite 225 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 671 patch has no errors when building and testing our client artifacts.
0 findbugs 0 Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
+1 findbugs 367 the patch passed
+1 javadoc 203 the patch passed
_ Other Tests _
+1 unit 499 hadoop-common in the patch passed.
+1 unit 151 hadoop-hdfs-client in the patch passed.
-1 unit 4821 hadoop-hdfs in the patch failed.
+1 unit 367 hadoop-hdfs-native-client in the patch passed.
+1 asflicense 41 The patch does not generate ASF License warnings.
Total 12528
Reason Tests
Failed junit tests hadoop.hdfs.TestFileChecksum
hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy
hadoop.hdfs.web.TestWebHdfsTimeouts
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-597/11/artifact/out/Dockerfile
GITHUB PR #597
JIRA Issue HDFS-3246
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc
uname Linux ff710c768207 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 59816df
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-597/11/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-597/11/testReport/
Max. process+thread count 4681 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-597/11/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@sahilTakiar
Contributor Author

Hadoop QA is still hitting these memory issues, so I ran the full hadoop-hdfs test suite on my local machine against this patch. I can reproduce the TestWebHdfsTimeouts failure, although that test fails relatively often in Hadoop QA (it reproduces on trunk as well when run on my local machine). TestBPOfferService hit an OOM issue twice out of several runs, so it seems intermittent.

I looped TestFileChecksum, TestDFSStripedOutputStreamWithRandomECPolicy, and the tests in this patch, but was not able to reproduce the memory issue.

I took a quick look at the heap dump as well and didn't see anything that was related to this patch. Interestingly, most of the memory is consumed by Mockito.

@toddlipcon toddlipcon merged commit 4877f0a into apache:trunk Apr 30, 2019
@toddlipcon
Contributor

Thanks. +1, merged to trunk.

DremioQA pushed a commit to dremio/hadoop that referenced this pull request Sep 4, 2019
Cherry-pick and fix conflicts for the following changes from trunk

HDFS-14267. Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples.
HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0.
HDFS-3246: pRead equivalent for direct read path (apache#597)

Change-Id: I883812e8bb5f0e951a2523336d63022ab61cf737
smengcl pushed a commit to smengcl/hadoop that referenced this pull request Oct 8, 2019
Contributed by Sahil Takiar

(cherry picked from commit 4877f0a)
(cherry picked from commit 71cc468)

Conflicts:
	hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ByteBufferReadable.java
	hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoStreams.java

Change-Id: Ie80bb459e1e20d5ba1b39f6a281f8450765bf007
shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
Call for 'status' or 'kill' does not require Execution plan calculation.

Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Xinyu Liu <xinyuiscool@github.com>

Closes apache#597 from sborya/RemoteAppRunnerStatus and squashes the following commits:

7e1feea0 [Boris S] retry
1c0b3e4f [Boris S] checkstyle
e8d8d517 [Boris S] skipp graph planner for app status command
88f8559 [Boris S] Merge branch 'master' of https://github.com/apache/samza
0edf343 [Boris S] Merge branch 'master' of https://github.com/apache/samza
67e611e [Boris S] Merge branch 'master' of https://github.com/apache/samza
dd39d08 [Boris S] Merge branch 'master' of https://github.com/apache/samza
1ad58d4 [Boris S] Merge branch 'master' of https://github.com/apache/samza
06b1ac3 [Boris Shkolnik] Merge branch 'master' of https://github.com/sborya/samza
5e6f5fb [Boris Shkolnik] Merge branch 'master' of https://github.com/apache/samza
010fa16 [Boris S] Merge branch 'master' of https://github.com/apache/samza
bbffb79 [Boris S] Merge branch 'master' of https://github.com/apache/samza
d4620d6 [Boris S] Merge branch 'master' of https://github.com/apache/samza
410ce78 [Boris S] Merge branch 'master' of https://github.com/apache/samza
a31a7aa [Boris Shkolnik] reduce debugging from info to debug in KafkaCheckpointManager.java
DremioQA pushed a commit to dremio/hadoop that referenced this pull request Oct 24, 2019
Cherry-pick and fix conflicts for the following changes from trunk

HDFS-14267. Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples.
HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0.
HDFS-3246: pRead equivalent for direct read path (apache#597)

Change-Id: I883812e8bb5f0e951a2523336d63022ab61cf737
(cherry picked from commit 01c0d7d)
DremioQA pushed a commit to dremio/hadoop that referenced this pull request Dec 27, 2019
Cherry-pick and fix conflicts for the following changes from trunk

HDFS-14267. Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples.
HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0.
HDFS-3246: pRead equivalent for direct read path (apache#597)

Change-Id: I883812e8bb5f0e951a2523336d63022ab61cf737
(cherry picked from commit 01c0d7d)
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Oct 25, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Oct 28, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri added a commit to acceldata-io/hadoop that referenced this pull request Oct 28, 2024
deepakdamri added a commit to acceldata-io/hadoop that referenced this pull request Nov 20, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri added a commit to acceldata-io/hadoop that referenced this pull request Nov 27, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Nov 28, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Dec 2, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Dec 2, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Dec 2, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Dec 2, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
deepakdamri pushed a commit to acceldata-io/hadoop that referenced this pull request Dec 3, 2024
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
basic02 pushed a commit to basic02/hadoop that referenced this pull request Jan 12, 2025
Contributed by Sahil Takiar

(cherry picked from commit 4877f0a)

Conflicts:
	hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ByteBufferReadable.java
	hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoStreams.java

Change-Id: Ie80bb459e1e20d5ba1b39f6a281f8450765bf007
harshith-212 added a commit to acceldata-io/hadoop that referenced this pull request Mar 5, 2025
shubhluck pushed a commit to acceldata-io/hadoop that referenced this pull request Mar 5, 2025