
OOM when using S3TransferManager.uploadDirectory() with 2 million files #5023

Closed · jensvogt opened this issue Mar 16, 2024 · 5 comments
Labels: bug This issue is a bug. · p1 This is a high priority issue

@jensvogt

Describe the bug

When uploading a directory that contains hundreds of thousands of files to S3 using S3TransferManager.uploadDirectory(), the transfer fails with an OutOfMemoryError.

Expected Behavior

S3TransferManager should work regardless of how many files are in the directory tree.

Current Behavior

Exception in thread "Thread-11" Exception in thread "Thread-24" Exception in thread "Thread-65" Exception in thread "sdk-async-response-0-24" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space

Reproduction Steps

package org.example;

import org.apache.commons.lang3.RandomStringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CreateBucketRequest;
import software.amazon.awssdk.services.s3.model.HeadBucketRequest;
import software.amazon.awssdk.services.s3.model.NoSuchBucketException;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.CompletedDirectoryUpload;
import software.amazon.awssdk.transfer.s3.model.UploadDirectoryRequest;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

public class Main {

private static Path parentDir;
private static final String LOCALSTACK_ENDPOINT = "http://localhost:4566";
private static final Region LOCALSTACK_REGION = Region.EU_CENTRAL_1;
private static final String BUCKET = "test-bucket";
private static final int FILE_COUNT = 2000000;
private static final Logger LOG = LoggerFactory.getLogger(Main.class);

private static S3Client s3Client;

private static S3TransferManager s3TransferManager;

private static void setupAws() {
    S3AsyncClient s3AsyncClient = S3AsyncClient.crtBuilder()
            .endpointOverride(URI.create(LOCALSTACK_ENDPOINT))
            .region(LOCALSTACK_REGION)
            .credentialsProvider(ProfileCredentialsProvider.create())
            .forcePathStyle(true)
            .targetThroughputInGbps(10d)
            .maxNativeMemoryLimitInBytes((long) 1024 * 1024 * 1024)
            .maxConcurrency(10)
            .build();
    s3TransferManager = S3TransferManager.builder().s3Client(s3AsyncClient).build();
    s3Client = S3Client.builder()
            .endpointOverride(URI.create(LOCALSTACK_ENDPOINT))
            .forcePathStyle(true)
            .region(LOCALSTACK_REGION)
            .credentialsProvider(ProfileCredentialsProvider.create())
            .build();
}

private static boolean bucketExists(String bucket) {
    HeadBucketRequest headBucketRequest = HeadBucketRequest.builder()
            .bucket(bucket)
            .build();

    try {
        s3Client.headBucket(headBucketRequest);
        return true;
    } catch (NoSuchBucketException e) {
        return false;
    }
}

private static void setupBucket() {
    if (!bucketExists(Main.BUCKET)) {
        s3Client.createBucket(CreateBucketRequest.builder().bucket(Main.BUCKET).build());
    }
}

private static void setupFiles() throws IOException {
    parentDir = Files.createTempDirectory("aws-s3");
    for (int i = 0; i < Main.FILE_COUNT; i++) {
        File file = new File(parentDir.toString() + File.separator + "random" + i + ".txt");
        // try-with-resources closes the writer even if the write fails; the method
        // already declares IOException, so errors propagate instead of being swallowed
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(file))) {
            writer.write(RandomStringUtils.random(256, true, true) + "\n");
        }
    }
}

private static void uploadFiles(String bucket, String prefix, Path sourceDir) {
    CompletedDirectoryUpload completedDirectoryUpload =
            s3TransferManager
                    .uploadDirectory(
                            UploadDirectoryRequest.builder()
                                    .bucket(bucket)
                                    .s3Prefix(prefix)
                                    .source(sourceDir)
                                    .build())
                    .completionFuture()
                    .join();

    if (!completedDirectoryUpload.failedTransfers().isEmpty()) {
        Throwable exception = completedDirectoryUpload.failedTransfers().get(0).exception();
        String msg = String.format("%d files failed to upload during directory upload. First exception: %s",
                completedDirectoryUpload.failedTransfers().size(), exception.getMessage());
        throw new RuntimeException(msg, exception);
    }
}

public static void main(String[] args) throws IOException {

    setupAws();
    LOG.info("AWS initialized");

    setupBucket();
    LOG.info("Bucket created: {}", BUCKET);

    setupFiles();
    LOG.info("Files created: {}", FILE_COUNT);

    uploadFiles(BUCKET, "", parentDir);
    LOG.info("Files uploaded: {}", FILE_COUNT);
}

}
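Note: with a default heap size it can take the JVM a while to run out of memory. Capping the heap reproduces the failure faster; a hypothetical invocation (substitute your build's actual classpath):

java -Xmx512m -cp <classpath> org.example.Main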

Possible Solution

Do not collect a List of CompletableFutures, because this results in a huge in-memory collection when the directory tree contains millions of files. In our case we have a flat directory tree (a single directory with 2 million small XML files). This is the current implementation (a sketch of an alternative follows the snippet):

private void doUploadDirectory(CompletableFuture<CompletedDirectoryUpload> returnFuture,
                                   UploadDirectoryRequest uploadDirectoryRequest) {

        Path directory = uploadDirectoryRequest.source();

        validateDirectory(uploadDirectoryRequest);

        Collection<FailedFileUpload> failedFileUploads = new ConcurrentLinkedQueue<>();
        List<CompletableFuture<CompletedFileUpload>> futures;

        try (Stream<Path> entries = listFiles(directory, uploadDirectoryRequest)) {
            futures = entries.map(path -> {
                CompletableFuture<CompletedFileUpload> future = uploadSingleFile(uploadDirectoryRequest,
                                                                                 failedFileUploads, path);

                // Forward cancellation of the return future to all individual futures.
                CompletableFutureUtils.forwardExceptionTo(returnFuture, future);
                return future;
            }).collect(Collectors.toList());
        }

        // At this point there would be an in-memory list of 2 million CompletableFutures,
        // but execution never gets here: the OOM happens above, because allocating the
        // collection fails with 2 million files.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                         .whenComplete((r, t) -> returnFuture.complete(CompletedDirectoryUpload.builder()
                                                                                               .failedTransfers(failedFileUploads)
                                                                                               .build()));
    }
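For illustration, here is a minimal sketch of one way to avoid materializing the list: track the number of outstanding uploads with a counter and complete the directory-level future when it drains, so each per-file CompletableFuture becomes garbage-collectable as soon as it finishes. This is not the SDK's actual fix; it reuses validateDirectory, listFiles, uploadSingleFile and CompletableFutureUtils.forwardExceptionTo from the snippet above, and the counting scheme is purely illustrative.

private void doUploadDirectory(CompletableFuture<CompletedDirectoryUpload> returnFuture,
                               UploadDirectoryRequest uploadDirectoryRequest) {

    Path directory = uploadDirectoryRequest.source();
    validateDirectory(uploadDirectoryRequest);

    Collection<FailedFileUpload> failedFileUploads = new ConcurrentLinkedQueue<>();
    // Count outstanding uploads instead of collecting their futures. The extra 1
    // guards the enumeration phase so the counter cannot hit zero while the
    // directory stream is still being consumed.
    AtomicLong outstanding = new AtomicLong(1);
    Runnable completeIfDone = () -> {
        if (outstanding.decrementAndGet() == 0) {
            returnFuture.complete(CompletedDirectoryUpload.builder()
                                                          .failedTransfers(failedFileUploads)
                                                          .build());
        }
    };

    try (Stream<Path> entries = listFiles(directory, uploadDirectoryRequest)) {
        entries.forEach(path -> {
            outstanding.incrementAndGet();
            CompletableFuture<CompletedFileUpload> future =
                    uploadSingleFile(uploadDirectoryRequest, failedFileUploads, path);
            CompletableFutureUtils.forwardExceptionTo(returnFuture, future);
            // Once the upload completes, nothing retains a reference to the future.
            future.whenComplete((r, t) -> completeIfDone.run());
        });
    }
    completeIfDone.run(); // release the guard taken before enumeration
}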

Additional Information/Context

Strictly speaking, this is not a memory leak, just a design problem. In v1 the directory tree was collected into an in-memory ArrayList. That has been fixed in v2, which uses a directory stream. Nevertheless, a huge in-memory list of CompletableFutures is now created instead.

AWS Java SDK version used

2.25.10

JDK version used

openjdk version "17.0.10" 2024-01-16
OpenJDK Runtime Environment (build 17.0.10+7-Debian-1deb12u1)
OpenJDK 64-Bit Server VM (build 17.0.10+7-Debian-1deb12u1, mixed mode, sharing)

Operating System and version

Debian 12 (bookworm)

@jensvogt jensvogt added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 16, 2024
@zoewangg zoewangg added p0 This issue is the highest priority p1 This is a high priority issue and removed needs-triage This issue or PR still needs to be triaged. p0 This issue is the highest priority labels Mar 22, 2024
@zoewangg (Contributor)

Hi @jensvogt, we released a fix in 2.25.19, could you try with the latest version?
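For reference, picking up the fix means depending on SDK 2.25.19 or later. Assuming a Maven build, bumping the version via the SDK BOM might look like this:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>bom</artifactId>
      <version>2.25.19</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>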

@zoewangg zoewangg added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Mar 27, 2024
@jensvogt (Author)

@zoewangg Hi, will try the latest version.

@jensvogt (Author)

@zoewangg
AWS SDK v2.25.10 (JProfiler): [screenshot 2024-03-28 07-52-14]
AWS SDK v2.25.19 (JProfiler): [screenshot 2024-03-28 08-10-01]
Excellent work! CompletableFutures are now getting properly GC-ed. The issue can be closed.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Mar 28, 2024
@zoewangg (Contributor)

Nice, thanks for verifying it. Closing the issue.
