Default GitHubConnectorResponse to streamed body instead of in-memory buffer by atsushieno · Pull Request #2059 · hub4j/github-api · GitHub

Default GitHubConnectorResponse to streamed body instead of in-memory buffer #2059

Merged
merged 8 commits into from
Mar 18, 2025

Conversation

atsushieno
Contributor
@atsushieno atsushieno commented Mar 13, 2025

Default GitHubConnectorResponse to streamed body instead of in-memory buffer.

This significantly reduces memory usage, especially for large responses such as those from GHArtifact.download().

GitHubConnectorResponse automatically falls back to a buffered body where required.
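To illustrate why streaming lowers memory usage, here is a minimal sketch using only the JDK. `BodyCopySketch` and its methods are hypothetical names for illustration and are not part of github-api; the 8 KiB chunk size is an arbitrary assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of github-api.
final class BodyCopySketch {

    // Buffered approach: the entire body ends up in one in-memory byte array,
    // so memory use grows with the size of the response.
    static byte[] readFully(InputStream body) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        copy(body, buffer);
        return buffer.toByteArray();
    }

    // Streamed approach: only one fixed-size chunk is held at a time,
    // regardless of how large the body is.
    static long copy(InputStream body, OutputStream sink) throws IOException {
        byte[] chunk = new byte[8192]; // chunk size is an arbitrary assumption
        long total = 0;
        int read;
        while ((read = body.read(chunk)) != -1) {
            sink.write(chunk, 0, read);
            total += read;
        }
        return total;
    }
}
```

With the streamed variant, a large download can be written straight to a file or another sink without ever being held in memory as a whole.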

Fixes: #1405

Description

Before submitting a PR:

  • Changes must not break binary backwards compatibility. If you are unclear on how to make the change you think is needed while maintaining backward compatibility, see CONTRIBUTING.md for details.
  • Add JavaDocs and other comments explaining the behavior.
  • When adding or updating methods that fetch entities, add @link JavaDoc entries to the relevant documentation on https://docs.github.com/en/rest .
  • Add tests that cover any added or changed code. This generally requires capturing snapshot test data. See CONTRIBUTING.md for details.
  • Run mvn -D enable-ci clean install site locally. If this command doesn't succeed, your change will not pass CI.
  • Push your changes to a branch other than main. You will create your PR from that branch.

When creating a PR:

  • Fill in the "Description" above with a clear summary of the changes. This includes:
    • If this PR fixes one or more issues, include "Fixes #" lines for each issue.
    • Provide links to relevant documentation on https://docs.github.com/en/rest where possible. If not including links, explain why not.
  • All lines of new code should be covered by tests as reported by code coverage. Any lines that are not covered must have PR comments explaining why they cannot be covered. For example, "Reaching this particular exception is hard and is not a particularly common scenario."
  • Enable "Allow edits from maintainers".

atsushieno and others added 2 commits March 13, 2025 17:11
context: hub4j#1405

GHArtifact.download() now explicitly sets this option to make response stream
non-buffered.
@bitwiseman
Member

@atsushieno
Added pom config to ignore the new default method not being covered.

codecov bot commented Mar 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.69%. Comparing base (7e02afa) to head (350f1bc).
Report is 9 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2059      +/-   ##
============================================
+ Coverage     83.58%   83.69%   +0.10%     
- Complexity     2380     2394      +14     
============================================
  Files           235      235              
  Lines          7262     7278      +16     
  Branches        382      386       +4     
============================================
+ Hits           6070     6091      +21     
+ Misses          956      954       -2     
+ Partials        236      233       -3     


@bitwiseman bitwiseman force-pushed the avoid-buffered-response-stream branch from 218b1e7 to b2641c0 Compare March 15, 2025 08:10
@bitwiseman
Member
bitwiseman commented Mar 15, 2025

@atsushieno
Thanks for this PR.

I made some tweaks to the code to get coverage and handle some edge cases.

Then moved the unbuffered setting down to fetchStream(), so now all stream requesting calls avoid buffering.

Then added a fallback so buffering is avoided only when status code is HTTP_OK (200).

Looking at this, it seems like the only case where we need buffering is for multiple reads of the body stream, which only occur in error, finest logging, and debugging scenarios.

I think it could be possible to move to having unbuffered response streaming be the default for all calls when status == 200, and to specifically use preventBufferedResponseBodyStream() only when calling fetchStream(). Basically, response.useBufferedBodyStream() would be:

public boolean useBufferedBodyStream() {
    return statusCode() != HTTP_OK && !request().preventBufferedResponseBodyStream();
}

Maybe with a static variable that allows forcing buffering for debugging.

I don't expect you to do any of the above. I'm thinking out loud and trying to decide if it's worth trying.
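The rule floated in this comment can be modelled in isolation as follows. This is only a sketch: `BufferingRuleSketch` and `forceBuffering` are hypothetical names, the method takes its inputs as parameters instead of reading them from a response object, and letting the debug flag override everything else is an assumption, since the comment does not say how the static variable would interact with the other conditions.

```java
import java.net.HttpURLConnection;

// Hypothetical stand-alone model of the proposed rule; not the library's code.
final class BufferingRuleSketch {

    // Assumed debugging switch: when true, every response body is buffered.
    static volatile boolean forceBuffering = false;

    // statusCode and preventBuffered stand in for statusCode() and
    // request().preventBufferedResponseBodyStream() from the comment above.
    static boolean useBufferedBodyStream(int statusCode, boolean preventBuffered) {
        if (forceBuffering) {
            return true; // assumption: the debug flag wins over all other inputs
        }
        // Buffer only non-200 responses, and only if the request did not opt out.
        return statusCode != HttpURLConnection.HTTP_OK && !preventBuffered;
    }
}
```

Under this model a 200 response is always streamed, an error response is buffered so its body can be read more than once, and a request that opts out is streamed even on error.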

@atsushieno
Contributor Author

@bitwiseman Thanks a lot for making this changeset way more solid. I saw a bunch of changes, and the set of further changes seems to be heading in the right direction to me.

@bitwiseman bitwiseman force-pushed the avoid-buffered-response-stream branch 4 times, most recently from ecd30c8 to 9a18e00 Compare March 17, 2025 09:31
@bitwiseman
Member

@atsushieno
The PR now switches all successful request calls to use streams instead of byte arrays, automatically switching back to byte array buffering when needed.
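One way such a switch between streaming and buffering could look is sketched below. This is not the merged implementation: `RereadableBodySketch`, `makeRereadable()`, and the behaviour on a second read are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper for illustration; not the library's class.
final class RereadableBodySketch {
    private final InputStream raw;
    private byte[] buffered;      // filled only when buffering is requested
    private boolean rawHandedOut; // true once the raw stream was given to a caller

    RereadableBodySketch(InputStream raw) {
        this.raw = raw;
    }

    // Switch to buffered mode. Only possible before the raw stream is handed out.
    void makeRereadable() throws IOException {
        if (buffered != null) {
            return;
        }
        if (rawHandedOut) {
            throw new IllegalStateException("body stream already handed out");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = raw.read(chunk)) != -1) {
            out.write(chunk, 0, read);
        }
        buffered = out.toByteArray();
    }

    // Buffered mode: a fresh stream over the cached bytes on every call.
    // Streamed mode: the raw stream, exactly once.
    InputStream bodyStream() {
        if (buffered != null) {
            return new ByteArrayInputStream(buffered);
        }
        if (rawHandedOut) {
            throw new IllegalStateException("body stream is not rereadable");
        }
        rawHandedOut = true;
        return raw;
    }
}
```

In this sketch, code paths that need to read the body more than once (error handling or detailed logging, as mentioned earlier in the thread) would call `makeRereadable()` first, while the common success path takes the raw stream directly.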

@bitwiseman bitwiseman force-pushed the avoid-buffered-response-stream branch from 9a18e00 to 53be1b3 Compare March 17, 2025 09:32
@atsushieno
Contributor Author

I noticed that. That would be even better if everything works!

@bitwiseman bitwiseman force-pushed the avoid-buffered-response-stream branch 2 times, most recently from e411991 to bf53879 Compare March 17, 2025 22:43
@bitwiseman bitwiseman force-pushed the avoid-buffered-response-stream branch from bf53879 to 89c2efc Compare March 17, 2025 22:55
@bitwiseman bitwiseman changed the title Add an option to avoid buffered response stream. Default GitHubConnectorResponse to streamed body instead of in-memory buffer Mar 18, 2025
@bitwiseman bitwiseman merged commit fea1001 into hub4j:main Mar 18, 2025
12 checks passed
Development

Successfully merging this pull request may close these issues.

OutOfMemoryError on GHRepository.readZip due to lack of streaming
2 participants