Out of memory when decrypting large files without MDC #1449
Comments
I am not following what kind of files you are trying to decrypt.
I'm trying to decrypt large files provided by a client, and this task needs to be automated on a daily basis. Those files contain PII, so I can't share them (even without decryption keys; sorry). The only special thing about them is that they're encrypted without MDC (tag 9), which means they're picked up by a different code path than MDC-protected messages. Anyway, here's a repro including test keys and files: openpgpjs_1449_repro.zip. Test files with and without MDC were generated using the ancient gpg 1.4.23, because recent versions seem to enforce MDC. An index.js is included for your convenience.
Hey 👋 Thanks for the report. First of all, regardless of this issue, I would strongly suggest re-encrypting with MDC (integrity protection) whenever you control the encryption side. That being said, I'm not opposed to adding streaming decryption for non-MDC-protected messages, as indeed in principle the streaming should apply to everything. Your patch is a good start; to make it slightly more compact, you could take a page from sym_encrypted_integrity_protected_data.js and do something like:

```js
let encrypted = stream.clone(this.encrypted);
if (stream.isArrayStream(encrypted)) encrypted = await stream.readToEnd(encrypted);
const iv = await stream.readToEnd(stream.slice(stream.clone(encrypted), 2, blockSize + 2));
const ciphertext = stream.slice(encrypted, blockSize + 2);
const decrypted = await crypto.mode.cfb.decrypt(sessionKeyAlgorithm, key, ciphertext, iv);
```

rather than having two entirely separate cases. A PR would be welcome 👍
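The core idea in the snippet above is peeling a fixed-size prefix (the IV) off the front of a stream without buffering the remainder. A minimal, library-free sketch of that technique using plain web `ReadableStream`s (`splitStream` is a hypothetical helper for illustration, not part of OpenPGP.js or web-stream-tools):

```javascript
// Hypothetical helper: split the first `n` bytes off a web ReadableStream of
// Uint8Array chunks, returning [prefixBytes, remainderStream]. Only the prefix
// is buffered; the rest of the input is forwarded chunk by chunk.
async function splitStream(stream, n) {
  const reader = stream.getReader();
  const prefix = new Uint8Array(n);
  let filled = 0;
  let leftover = null; // bytes of the chunk that straddles the boundary
  while (filled < n) {
    const { done, value } = await reader.read();
    if (done) break; // input shorter than n bytes
    const take = Math.min(n - filled, value.length);
    prefix.set(value.subarray(0, take), filled);
    filled += take;
    if (take < value.length) leftover = value.subarray(take);
  }
  const remainder = new ReadableStream({
    start(controller) {
      if (leftover) controller.enqueue(leftover);
    },
    async pull(controller) {
      const { done, value } = await reader.read();
      if (done) controller.close();
      else controller.enqueue(value);
    }
  });
  return [prefix.subarray(0, filled), remainder];
}
```

In the real packet class, the slice boundary would be `blockSize + 2` (the OpenPGP CFB IV prefix plus the two resynchronization bytes), and the remainder stream would feed the CFB decryptor directly.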
Can do. But let's agree on testing first. Seeing this issue go undetected by tests makes me think that a new kind of test is required, to make sure there are no accidental `readToEnd`s like this very case. I even tried to botch the relevant code on purpose to see whether any existing test would catch it. Now, looking at streaming.js, I see tests operating on the higher-level API. I have doubts that a new configuration property should be defined just for the sake of testing, but maybe it's ok to add one.
Hey 👋 Sorry, I forgot to respond to this.
Just as a general remark, OpenPGP.js today does not really offer the guarantee that, when streaming, the data is not buffered in memory.
That being said, there are some tests in streaming.js that aim to test that stream-decrypting does not buffer the whole message in memory.
I would prefer not to, as we're trying to get rid of configuration options that are insecure.
Yeah, that could work 👍
Sorry to intrude, I'd just like to know if the problem described here is what I'm experiencing too. I'm trying to decrypt a file while downloading it on the fly:

```js
const url = "some url to encrypted file";
const res = await fetch(url);
const message = await readMessage({
  binaryMessage: res.body
});
console.log("before decrypt");
const decrypted = await decrypt({
  message,
  passwords: 'secret',
  format: "binary",
});
console.log("after decrypt");
// code to consume decrypted.data as ReadableStream
```

Unfortunately, I observe that "before decrypt" is logged practically right away, and "after decrypt" only shortly after the whole file has been downloaded (as I can see in the Network tab of Chrome). I expected the decryption to proceed while the file is still downloading.
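The `// code to consume decrypted.data as ReadableStream` placeholder in the repro above could look like the following sketch, assuming `decrypted.data` is a web `ReadableStream` of `Uint8Array` chunks (the `consume` function and its `handleChunk` callback are hypothetical names, not OpenPGP.js API):

```javascript
// Drain a web ReadableStream of Uint8Array chunks, handing each chunk to a
// caller-supplied callback as it arrives; returns the total byte count.
async function consume(readableStream, handleChunk) {
  const reader = readableStream.getReader();
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.length;
    handleChunk(value); // e.g. append to a file or feed an incremental hash
  }
  return total;
}
```

If decryption truly streams, `handleChunk` should start firing before the download completes; in the buffering case described above, all chunks arrive only at the end.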
Hi @fi4sk0, the problem described here is relevant to a legacy encryption option, so I am not sure whether it applies to your case. Could you check if passing `config: { allowUnauthenticatedStream: true }` to `decrypt` helps?
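Concretely, the suggestion amounts to adding a per-call `config` object to the `decrypt` options from the earlier snippet. A sketch of the options shape (the flag is insecure legacy-compat behavior, hence the warning later in this thread; the surrounding call is elided):

```javascript
// Options object for decrypt(), extending the earlier repro. The config key
// name follows OpenPGP.js; message/passwords are as in the snippet above.
const decryptOptions = {
  passwords: 'secret',
  format: 'binary',
  config: { allowUnauthenticatedStream: true } // opt in to streaming without upfront integrity
};
// const decrypted = await decrypt({ message, ...decryptOptions });
```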
Hey @larabr, that did it, thank you!
Sorry to bother you again @larabr, but I'd like to know the implications of that flag: the way I read the documentation, I understood that although I won't be notified of message modifications during streaming, `decrypt` will still throw after the stream has completed and was recognized as modified. Is this correct?
@fi4sk0 correct, it will still throw when you finish reading out the decrypted data stream.
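The behavior described here, where an integrity failure surfaces only once the stream has been fully read, can be modeled with a plain web `ReadableStream` (this is an illustration of the error-at-end pattern, not OpenPGP.js internals):

```javascript
// A stream that delivers all its chunks normally, then errors on the final
// pull, mimicking an MDC check that can only run after the last byte.
function makeStreamThatFailsAtEnd(chunks) {
  let i = 0;
  return new ReadableStream({
    pull(controller) {
      if (i < chunks.length) controller.enqueue(chunks[i++]);
      else controller.error(new Error('Modification detected.'));
    }
  });
}

// Draining the stream succeeds chunk by chunk, but the final read() rejects.
async function readAll(stream) {
  const reader = stream.getReader();
  const out = [];
  for (;;) {
    const { done, value } = await reader.read(); // rejects at the end
    if (done) break;
    out.push(...value);
  }
  return out;
}
```

The practical consequence: a consumer must not treat already-received plaintext as trusted until the stream has been read to completion without error.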
OpenPGP.js is crashing with OOM issues when I feed it files larger than about 100 MB. Granted, it's working in a resource-constrained environment, but I wouldn't expect memory consumption to be proportional to the input file size in general.
So, I found this line that tries to swallow the entire input stream into memory before passing it to the decryption stage:

```js
await stream.readToEnd(stream.clone(this.encrypted))
```

Naturally, it fails for larger files, and for smaller ones you get all the decrypted data inside one large chunk. It does not care about things like `allowUnauthenticatedStream: true` and `allowUnauthenticatedMessages: true` (not sure which one is applicable).

A limited-use workaround is to encrypt files with MDC enabled, so that they're processed by a different class. In other words, when I encrypt files using `gpg --force-mdc ...`, they don't cause any similar issues when decrypted with this package.

I played around with the source code a little and came up with the following to replace that block of code:
This way, it seems to deliver properly-chunked decrypted output without choking on large files. Does this look like a proper solution?