AsyncResponseTransformer.toBlockingStream() infinitely prefetches data. #4989

Closed
nikron opened this issue Mar 5, 2024 · 3 comments
Labels
bug This issue is a bug. needs-triage This issue or PR still needs to be triaged.

Comments

nikron commented Mar 5, 2024

Describe the bug

From reading the code and observing the SDK's behavior with S3, I believe AsyncResponseTransformer.toBlockingStream() produces a stream that will hold all of the data in memory if it is not consumed fast enough. This is a problem when downloading lots of data from S3 faster than it can be written out, and it causes out-of-memory errors. I believe there should be a limit on how much ByteBufferStoringSubscriber will accumulate.

Expected Behavior

When data arrives faster than the input stream is consumed, I don't expect the input stream to buffer the entire remaining amount.

Current Behavior

The input stream can buffer the full amount of data if it is not consumed fast enough.

Reproduction Steps

I haven't tried this, but if my theory is right:

  1. Attempt to download a large object from S3 using a blocking input stream.
  2. Don't consume the input stream.
  3. The full data should reside in memory.

Possible Solution

ByteBufferStoringSubscriber or its delegate should set a maximum number of bytes to store along with a minimum.
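
To make the suggestion concrete, here is a minimal sketch of a subscriber that stops requesting data once a byte limit is reached. It uses only java.util.concurrent.Flow and is not the SDK's ByteBufferStoringSubscriber; the class name BoundedStoringSubscriber, the maxBytes parameter, and the poll() method are all hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Flow;

/**
 * Hypothetical sketch (not SDK code): buffers chunks from a publisher and
 * only requests another chunk while fewer than maxBytes are stored.
 */
class BoundedStoringSubscriber implements Flow.Subscriber<ByteBuffer> {
    private final long maxBytes;
    private final Deque<ByteBuffer> chunks = new ArrayDeque<>();
    private Flow.Subscription subscription;
    private long storedBytes;
    private boolean requestOutstanding;
    private boolean done;

    BoundedStoringSubscriber(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    @Override
    public synchronized void onSubscribe(Flow.Subscription s) {
        subscription = s;
        requestIfRoom();
    }

    @Override
    public synchronized void onNext(ByteBuffer chunk) {
        requestOutstanding = false;
        chunks.add(chunk);
        storedBytes += chunk.remaining();
        requestIfRoom(); // no further request once the limit is reached
    }

    @Override
    public synchronized void onError(Throwable t) {
        done = true;
    }

    @Override
    public synchronized void onComplete() {
        done = true;
    }

    /** Consumer side: removes one chunk (or returns null) and resumes the upstream if there is room. */
    synchronized ByteBuffer poll() {
        ByteBuffer chunk = chunks.poll();
        if (chunk != null) {
            storedBytes -= chunk.remaining();
            requestIfRoom();
        }
        return chunk;
    }

    synchronized long storedBytes() {
        return storedBytes;
    }

    // Requests exactly one chunk at a time, and only while below the limit.
    private void requestIfRoom() {
        if (subscription != null && !done && !requestOutstanding && storedBytes < maxBytes) {
            requestOutstanding = true;
            subscription.request(1);
        }
    }
}
```

Because the limit is checked before a chunk arrives, the stored total can still exceed maxBytes by up to one chunk. A stricter bound would need the upstream to cap chunk sizes or the subscriber to split chunks.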

Additional Information/Context

It's possible that I'm getting out-of-memory errors from downloading too much data from S3 for a different reason, but this is what my debugging has led me to.

AWS Java SDK version used

2.24.1

JDK version used

21

Operating System and version

Ubuntu 22.04

@nikron nikron added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 5, 2024

nikron commented Mar 5, 2024

I might be wrong about this bug report, because ByteBufferStoringSubscriber does try to limit the buffer. However, if a single event exceeds the 4 MB buffer, more than 4 MB can be stored.
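
A small model of that overshoot, assuming the limit is only checked before an event is accepted and never against the event's own size. This is an illustration of the behavior described here, not the SDK's actual logic; ThresholdBuffer and offer() are hypothetical names.

```java
/**
 * Hypothetical model (not SDK code): accepts events while the stored total
 * is below a threshold, without looking at the size of the incoming event.
 */
class ThresholdBuffer {
    private final long thresholdBytes;
    private long storedBytes;

    ThresholdBuffer(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Returns true if the event was stored, false if the buffer is already at or over the threshold. */
    boolean offer(long eventBytes) {
        if (storedBytes >= thresholdBytes) {
            return false; // a real subscriber would stop requesting data here
        }
        storedBytes += eventBytes; // the event's own size is never checked
        return true;
    }

    long storedBytes() {
        return storedBytes;
    }
}
```

With a 4 MB threshold, a 3 MB event followed by a 6 MB event are both accepted, leaving 9 MB stored; only the next offer is refused.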


nikron commented Mar 6, 2024

I think this is more a problem of the CRT client setting the initial read buffer high (80 MB) by default, so lots of medium-sized files can quickly end up in memory. Closing this, as my memory issues went away after setting that much lower. I think the default should probably be much lower than 80 MB.
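
A rough back-of-the-envelope sketch of why this adds up, assuming the read buffer applies to each in-flight download independently (an assumption, not something verified in this thread). ReadWindowEstimate and its method are hypothetical helpers, not part of the SDK.

```java
/** Hypothetical helper (not SDK code) for estimating worst-case buffered bytes. */
class ReadWindowEstimate {
    /**
     * Upper bound on buffered bytes if each in-flight download may buffer up to
     * readWindowBytes, or the whole object when the object is smaller than that.
     */
    static long worstCaseBufferedBytes(long objectSizeBytes, long readWindowBytes, int concurrentDownloads) {
        long perDownload = Math.min(objectSizeBytes, readWindowBytes);
        return Math.multiplyExact(perDownload, (long) concurrentDownloads);
    }
}
```

Under that assumption, 20 concurrent downloads of 50 MB objects with an 80 MB buffer could hold up to 1000 MB, while an 8 MB buffer would cap the same workload at 160 MB.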

@nikron nikron closed this as completed Mar 6, 2024

github-actions bot commented Mar 6, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
