Auto-compute SHA1 sum for streams #87

Open
cdhowie opened this issue Dec 31, 2019 · 0 comments

cdhowie commented Dec 31, 2019

Related to #32. Applies to uploadPart and uploadFile.

If hash is not passed and data is a stream, the SHA1 can be computed on the fly and appended to the request body, with the header set to X-Bz-Content-Sha1: hex_digits_at_end. It would be nice if the client handled this logic itself.

This change is simpler than it seems at first. I wrote the following transform stream that hashes the content as it passes through, then emits the hash before the stream ends. We are using this in production successfully.

const crypto = require('crypto');
const stream = require('stream');

function makeSha1AppendingStream() {
    const d = crypto.createHash('sha1');

    return new stream.Transform({
        transform(chunk, encoding, cb) {
            d.update(chunk, encoding);
            this.push(chunk, encoding);
            cb();
        },

        flush(cb) {
            this.push(d.digest('hex'));
            cb();
        },
    });
}

It can be used like this (adjust variable names as needed):

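// Only wrap the data when no hash was supplied and data looks like a stream.
if (hash === undefined && typeof data.pipe === 'function') {
    const hashStream = makeSha1AppendingStream();
    // pipe() does not forward errors, so propagate them manually.
    data.on('error', err => { hashStream.emit('error', err); });
    data = data.pipe(hashStream);

    // Tell the server that the SHA1 follows the content in the body.
    hash = 'hex_digits_at_end';
    // A hex-encoded SHA1 digest is 40 characters long.
    contentLength += 40;
}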
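
As a quick sanity check of the transform, here is a minimal sketch that pipes a couple of chunks through it and confirms that the last 40 bytes of the output are the hex SHA1 of everything before them. The transform is repeated so the snippet runs on its own; collect and splitTrailingSha1 are hypothetical helper names introduced only for this illustration, not part of the library.

```javascript
const crypto = require('crypto');
const stream = require('stream');

// Same transform as above, repeated so this snippet is self-contained.
function makeSha1AppendingStream() {
    const d = crypto.createHash('sha1');

    return new stream.Transform({
        transform(chunk, encoding, cb) {
            d.update(chunk, encoding);
            this.push(chunk, encoding);
            cb();
        },

        flush(cb) {
            this.push(d.digest('hex'));
            cb();
        },
    });
}

// Hypothetical helper: pipe Buffer chunks through the transform and
// resolve with the complete output as a single Buffer.
function collect(chunks) {
    return new Promise((resolve, reject) => {
        const out = [];
        stream.Readable.from(chunks)
            .pipe(makeSha1AppendingStream())
            .on('data', c => out.push(c))
            .on('error', reject)
            .on('end', () => resolve(Buffer.concat(out)));
    });
}

// Hypothetical helper: split a body into payload and trailing hex digest.
function splitTrailingSha1(body) {
    return {
        payload: body.subarray(0, body.length - 40),
        sha1: body.subarray(body.length - 40).toString('latin1'),
    };
}

collect([Buffer.from('hello '), Buffer.from('world')]).then(body => {
    const { payload, sha1 } = splitTrailingSha1(body);
    const expected = crypto.createHash('sha1').update(payload).digest('hex');
    console.log(sha1 === expected); // prints true
});
```

The output is always exactly 40 bytes longer than the input, which is why contentLength is bumped by 40 in the snippet above.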

Side note: when streams are used, all retrying and redirect-following should be disabled. Replaying a request is either unsafe, because the stream has already been consumed, or very memory-hungry, because the entire request body has to be buffered in case the request needs to be replayed. We had to pass maxRedirects: 0 to axios, otherwise process memory would balloon (we're uploading files of several hundred MB and this was killing us).
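
To make the side note concrete, here is a rough sketch of what the request config for a streamed upload might look like. buildStreamUploadConfig and its parameter names are hypothetical, made up for this illustration; only maxRedirects, the X-Bz-Content-Sha1 value and the extra 40 bytes come from the description above. Other headers the upload needs are omitted.

```javascript
// Length of the hex-encoded SHA1 digest appended to the request body.
const SHA1_HEX_LENGTH = 40;

// Hypothetical helper: build an axios-style request config for a body
// stream that has been piped through makeSha1AppendingStream().
function buildStreamUploadConfig({ uploadUrl, authToken, contentLength, data }) {
    return {
        method: 'post',
        url: uploadUrl,
        data,
        // Never follow redirects: the stream cannot be replayed, and
        // buffering it for a possible replay is what balloons memory.
        maxRedirects: 0,
        headers: {
            Authorization: authToken,
            // The body is the original content plus the trailing digest.
            'Content-Length': contentLength + SHA1_HEX_LENGTH,
            'X-Bz-Content-Sha1': 'hex_digits_at_end',
        },
    };
}
```

The resulting object could then be handed to axios(config). Any retry logic layered on top would need the same treatment, since a consumed stream cannot be sent a second time.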
