Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[client-s3] client is not sending chunks read from fs.ReadSteam when ChecksumAlgorithm is set #3368

Closed
trivikr opened this issue Feb 25, 2022 · 8 comments · Fixed by #3369
Closed
Labels
bug This issue is a bug.

Comments

@trivikr
Copy link
Member

trivikr commented Feb 25, 2022

Describe the bug

client is not sending chunks read from fs.ReadSteam when ChecksumAlgorithm is set

Your environment

SDK version number

@aws-sdk/client-s3@3.53.0

Is the issue in the browser/Node.js/ReactNative?

Node.js

Details of the browser/Node.js/ReactNative version

$ node -v
v16.13.2

Steps to reproduce

import { S3 } from "@aws-sdk/client-s3"; // v3.53.0

import { createReadStream } from "fs";

const logRequestMiddleware = (next) => async (args) => {
  const { body } = args.request;

  let chunkCount = 0;
  body.on("data", (chunk) => {
    console.log(`actual sent chunk #${chunkCount}: '${chunk.toString()}'`);
    chunkCount++;
  });

  return next(args);
};
const logRequestMiddlewareOptions = { step: "deserialize" };

const region = "us-west-2";
const client = new S3({ region });
client.middlewareStack.add(logRequestMiddleware, logRequestMiddlewareOptions);

const fileName = "hello-world.txt"; // File contains "helloworld"
// Because of highWaterMark=5, data will be read in two chunks: 'hello', 'world'
const readableStream = createReadStream(fileName, { highWaterMark: 5 });

let chunkCount = 0;
readableStream.on("data", (chunk) => {
  console.log(`stream sent chunk #${chunkCount}: '${chunk.toString()}'`);
  chunkCount++;
});

const Bucket = "aws-sdk-js-trivikr-flexible-checksums-testing"; // replace with your bucket
const Key = fileName;
const Body = readableStream;
const ChecksumAlgorithm = "CRC32";

try {
  await client.putObject({ Bucket, Key, Body, ChecksumAlgorithm });
} catch (error) {
  console.log({ error });
}

Observed behavior

The data from file is not sent over the wire.

stream sent chunk #0: 'hello'
stream sent chunk #1: 'world'
actual sent chunk #0: '0
'
actual sent chunk #1: 'x-amz-checksum-crc32:AAAAAA==
'
actual sent chunk #2: '
'
{
  error: IncompleteBody: The request body terminated unexpectedly
      at deserializeAws_restXmlPutObjectCommandError (/local/home/trivikr/workspace/temp/node_modules/@aws-sdk/client-s3/dist-cjs/protocols/Aws_restXml.js:8037:24)
      at processTicksAndRejections (node:internal/process/task_queues:96:5)
      at async /local/home/trivikr/workspace/temp/node_modules/@aws-sdk/client-s3/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24
      at async /local/home/trivikr/workspace/temp/node_modules/@aws-sdk/client-s3/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:11:20
      at async StandardRetryStrategy.retry (/local/home/trivikr/workspace/temp/node_modules/@aws-sdk/client-s3/node_modules/@aws-sdk/middleware-retry/dist-cjs/StandardRetryStrategy.js:51:46)
      at async /local/home/trivikr/workspace/temp/node_modules/@aws-sdk/middleware-flexible-checksums/dist-cjs/flexibleChecksumsMiddleware.js:56:20
      at async /local/home/trivikr/workspace/temp/node_modules/@aws-sdk/client-s3/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:6:22
      at async file:///local/home/trivikr/workspace/temp/putObject.brawn.mjs:38:3 {
    '$fault': 'client',
    '$metadata': {
      httpStatusCode: 400,
      requestId: undefined,
      extendedRequestId: '9XGOJ07zW1IHBjTtAyE/+Mv97SzhwzNKTiibWfaSEr0dAP1KPbB4Y44K1Yhh0rfL4xup8M8+35I=',
      cfId: undefined,
      attempts: 1,
      totalRetryDelay: 0
    },
    Code: 'IncompleteBody',
    RequestId: 'A0Z1NB2HD491JASB',
    HostId: '9XGOJ07zW1IHBjTtAyE/+Mv97SzhwzNKTiibWfaSEr0dAP1KPbB4Y44K1Yhh0rfL4xup8M8+35I=',
    '$response': HttpResponse {
      statusCode: 400,
      headers: [Object],
      body: [IncomingMessage]
    }
  }
}

Expected behavior

The request should be successful, and the output should print.

stream sent chunk #0: 'hello'
stream sent chunk #1: 'world'
actual sent chunk #0: '5
hello
'
actual sent chunk #1: '5
world
'
actual sent chunk #2: '0
'
actual sent chunk #3: 'x-amz-checksum-crc32:AAAAAA==
'
actual sent chunk #4: '
'

Additional context

Tests for reference:

@trivikr trivikr added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 25, 2022
@trivikr

This comment was marked as outdated.

@trivikr trivikr removed the needs-triage This issue or PR still needs to be triaged. label Feb 25, 2022
@trivikr
Copy link
Member Author

trivikr commented Feb 25, 2022

Verified that the chunks read in data events for fs.ReadStream are not sent to the server

Repro code:

import { S3 } from "@aws-sdk/client-s3"; // v3.53.0

import { createReadStream } from "fs";

const region = "us-west-2";
const client = new S3({ region });

const fileName = "hello-world.txt"; // File contains "helloworld"
// Because of highWaterMark=5, data will be read in two chunks: 'hello', 'world'
const readableStream = createReadStream(fileName, { highWaterMark: 5 });

const Bucket = "aws-sdk-js-trivikr-flexible-checksums-testing"; // replace with your bucket
const Key = fileName;
const Body = readableStream;
const ChecksumAlgorithm = "CRC32";

try {
  await client.putObject({ Bucket, Key, Body, ChecksumAlgorithm });
} catch (error) {
  console.log({ error });
}

console.log added in node_modules/@aws-sdk/util-stream-node/dist-cjs/getAwsChunkedEncodingStream.js

    let chunkCount = 0;
    readableStream.on("data", (data) => {
        const chunkData = `${(bodyLengthChecker(data) || 0).toString(16)}\r\n${data.toString()}\r\n`;
        console.log(`chunk #${chunkCount++}: """${chunkData}"""`);
        awsChunkedEncodingStream.push(chunkData);
    });
    readableStream.on("end", async () => {
        console.log(`chunk #${chunkCount++}: """0\r\n"""`);
        awsChunkedEncodingStream.push(`0\r\n`);
        if (checksumRequired) {
            const checksum = base64Encoder(await digest);
            const checksumChunk = `${checksumLocationName}:${checksum}\r\n`;
            console.log(`chunk #${chunkCount++}: """${checksumChunk}"""`);
            awsChunkedEncodingStream.push(checksumChunk);
            console.log(`chunk #${chunkCount++}: """\r\n"""`);
            awsChunkedEncodingStream.push(`\r\n`);
        }
        awsChunkedEncodingStream.push(null);
    });

Observed behavior:

chunk #0: """0
"""
chunk #1: """x-amz-checksum-crc32:+esgrQ==
"""
chunk #2: """
"""

Expected behavior:

chunk #0: """5
hello
"""
chunk #1: """5
world
"""
chunk #2: """0
"""
chunk #3: """x-amz-checksum-crc32:+esgrQ==
"""
chunk #4: """
"""

@trivikr
Copy link
Member Author

trivikr commented Feb 25, 2022

Issue probably related to #3338

@trivikr
Copy link
Member Author

trivikr commented Feb 25, 2022

This issue happens because we pass fd while creating readStream in

This was added in support streams created using filehandle.createReadStream in Node.js >=16.11.0

We'll remove it as of now, and think of handling it later.

@trivikr
Copy link
Member Author

trivikr commented Feb 28, 2022

For future reference, this happened as readStream.fd is set in fsCreateReadStream even if it's not set while creating the fs.ReadStream using fs.creatReadStream.

The fd gets set when somewhere when middleware stack is getting resolved, as it's not set when resolveMiddleware is called and is set when middleware is called.

@trivikr
Copy link
Member Author

trivikr commented Feb 28, 2022

While debugging, I noticed that the valud for fd gets set in Body when resolving endpoint is resolved.

const { hostname, protocol = "https", port, path: basePath } = await context.endpoint();

@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant