Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async entry read causes exception when reading AES encrypted zip. #823

Open
rilehudd opened this issue Jan 24, 2023 · 8 comments
Open

Async entry read causes exception when reading AES encrypted zip. #823

rilehudd opened this issue Jan 24, 2023 · 8 comments
Assignees
Labels
async bug encryption zip Related to ZIP file format
Projects

Comments

@rilehudd
Copy link

Describe the bug

When trying to read an encrypted archive's entry's input stream asynchronously, it reports an exception containing message: "Auth code missing from input stream".

Yet, if the same archive is read synchronously it works fine.

Reproduction Code

https://dotnetfiddle.net/RhJM7o

Steps to reproduce

It doesn't seem to happen for every input. But for inputs where it does happen, changing the async call to the synchronous version fixes the problem.

The zip input is in the dotnetfiddle.

The steps to create the zip archive from 7-Zip are:

  1. Using the file provided (appendix A)
  2. 7-Zip -> Add to archive
  3. Ensure params:
  • Archive Format: Zip
  • Compression level: 5 - Normal
  • Compression method: * Deflate
  • Encryption method: AES-256
  • Set password (I set mine to "test")
  1. Run the decompress/decrypt steps outlined in the dotnetfiddle

Expected behavior

For the same input, I expected them both to work (the synchronous and the asynchronous).

Operating System

Windows

Framework Version

.NET 6

Tags

ZIP, Encryption, Async

Additional context

No response

@rilehudd rilehudd added the bug label Jan 24, 2023
@github-actions github-actions bot added async encryption zip Related to ZIP file format labels Jan 24, 2023
@piksel
Copy link
Member

piksel commented Apr 10, 2023

Thanks for the report and for actually using the dotnet fiddle template!
I compacted it somewhat to emphasise that the only difference is the async/sync calls: https://dotnetfiddle.net/5alzjK

ArchiveDiag report for the file (nothing stands out afaikt):
https://archivediag.piksel.se/reports/asyncaes.zip-1681140301148006.html

I have not found I cause yet, but I will look into this as soon I can.

@piksel piksel self-assigned this Apr 10, 2023
@piksel piksel added this to Needs triage in Bugs via automation Apr 10, 2023
@piksel piksel moved this from Needs triage to High priority in Bugs Apr 10, 2023
@piksel
Copy link
Member

piksel commented Apr 19, 2023

Yeah, this is yet another time that the framework bypasses the public methods of CryptoStream for increased performance.
ZipAESStream overrides the Read and ReadAsync methods of CryptoStream, but as you can see from the

Stacktrace
   at ICSharpCode.SharpZipLib.Encryption.ZipAESTransform.TransformFinalBlock(Byte[] inputBuffer, Int32 inputOffset, Int32 inputCount) in /_/src/ICSharpCode.SharpZipLib/Encryption/ZipAESTransform.cs:line 141
   at System.Security.Cryptography.CryptoStream.ReadAsyncCore(Memory`1 buffer, CancellationToken cancellationToken, Boolean useAsync)
   at System.Security.Cryptography.CryptoStream.ReadAsyncInternal(Memory`1 buffer, CancellationToken cancellationToken)
   at System.IO.StreamReader.ReadBufferAsync(CancellationToken cancellationToken)
   at System.IO.StreamReader.ReadToEndAsyncInternal()
   at Program.<<Main>$>g__Repro|0_0(String type)
   at Program.<<Main>$>g__TestAsync|0_2[T](Func`2 a, T input)

, StreamReader just bypasses ReadAsync and calls ReadAsyncCore directly, which bypasses our read logic. That method is internal and cannot be overridden by us.

This is essentially #572 all over again, but with no clear solution.

@piksel
Copy link
Member

piksel commented Apr 19, 2023

As for consumers (including @rilehudd), you can replace the StreamReader with:

var entry = zip.GetEntry("abcdefgh.txt");
var entryStream = zip.GetInputStream(entry);
var entryBuffer = new byte[entry.Size];
using (var ms2 = new MemoryStream(entryBuffer))
{
	byte[] buffer = new byte[81920];
  	int read;
	while ((read = await entryStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
	{
		ms2.Write(buffer, 0, read);
	}

	var output = Encoding.UTF8.GetString(entryBuffer);

      // ... use output ...
}

It's not ideal, for sure, but that is the only async path that avoids the optimisations in .NET that bypasses the AES encryption logic.

@rilehudd
Copy link
Author

Is there a way we can detect this "bypassed" state an compensate for this optimization?
For example detect that ZipAESTransform.TransformFinalBlock has been called from the non-expected path (to implement what CryptoStream is expecting). I'm may not be understanding the feasibility of attempting to compensate for the optimization. Or maybe it is possible but would result in unacceptable complexity. (I wonder what dotnet designers expected implementors to do with them bypassing public methods).

Otherwise, it seems like this might be an optimization that might likely affect many encryption stream implementations. Do you think it makes sense to submit this optimization bug upstream to dotnet?

@rilehudd
Copy link
Author

rilehudd commented Apr 19, 2023

Sorry didn't intend to close this issue (wanted to ensure you see this and have an opportunity to respond so I might see what your thoughts are).

@piksel piksel reopened this Apr 19, 2023
@piksel
Copy link
Member

piksel commented Apr 19, 2023

I think the main issue is that extending CryptoStream is an unintended way to implement decryption and I don't know why it was done this way.
It is currently unrecoverable, since part of the auth code gets consumed by TransformBlock. The ZipAESStream uses a sliding buffer for this that should be possible to move over to the transform.

@rilehudd
Copy link
Author

Thanks for taking the time to take a look at this!

@piksel
Copy link
Member

piksel commented Apr 20, 2023

There is another solution to this. The compression method that is used for the entry is STORE, which means that it's uncompressed. This problem only happens to such entries because we directly return the ZipAESStream to the consumer. If there is some decompression to be done, the stream will be wrapped in the appropriate decompression stream, and the optimized path is not taken. We could add a StoreOutputStream that just passes the data through, but in doing so mitigates this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async bug encryption zip Related to ZIP file format
Projects
No open projects
Bugs
  
High priority
Development

No branches or pull requests

2 participants