Add ZipCrypto reading support #115

Merged
merged 20 commits into from
Jun 23, 2020

Conversation

@BenjaminRi BenjaminRi commented Oct 19, 2019

Added two new functions to the API (resolves #64):

ZipArchive::by_name_decrypt
ZipArchive::by_index_decrypt

These functions behave identically to their existing counterparts without the _decrypt suffix, but additionally take a password as &[u8]. The password is not a &str because the ZipCrypto standard does not define a password encoding, so a UTF-8 string cannot represent every possible password (for details, see the comments in zipcrypto.rs).
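
The encoding point above can be illustrated with a minimal sketch. It assumes a password that was typed on a system using Latin-1; the helper name `is_valid_utf8` is hypothetical and not part of this PR.

```rust
use std::str;

/// Hypothetical helper: true if `password` could also be passed as a `&str`.
fn is_valid_utf8(password: &[u8]) -> bool {
    str::from_utf8(password).is_ok()
}

fn main() {
    // "café" encoded as Latin-1 (ISO 8859-1): 'é' is the single byte 0xE9.
    let latin1_password: &[u8] = &[b'c', b'a', b'f', 0xE9];
    // The same text encoded as UTF-8: 'é' becomes the two bytes 0xC3 0xA9.
    let utf8_password: &[u8] = "café".as_bytes();

    // The Latin-1 bytes are not valid UTF-8, so no `&str` can hold them.
    assert!(!is_valid_utf8(latin1_password));
    assert!(is_valid_utf8(utf8_password));
    // The two encodings are different byte sequences, hence different keys.
    assert_ne!(latin1_password, utf8_password);
}
```

Because the cipher keys are derived from the raw bytes, the two encodings of the same visible text act as two different passwords, which is why a byte slice is the safer parameter type.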

The functions integrate seamlessly with the existing API. Substantial restructuring was necessary to insert the CryptoReader between the other readers. The code has been modified to facilitate the addition of further crypto or decompression routines.
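
As a rough illustration of what "inserting a reader between the other readers" can look like, here is a toy sketch. It is not the code from this PR: `XorReader`, `CryptoLayer` and `open_entry` are hypothetical names, and the XOR "cipher" is only a stand-in for a real algorithm.

```rust
use std::io::{self, Read, Take};

/// Toy stand-in for a real cipher: XORs every byte with one fixed key byte.
struct XorReader<R> {
    inner: R,
    key: u8,
}

impl<R: Read> Read for XorReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        for byte in &mut buf[..n] {
            *byte ^= self.key;
        }
        Ok(n)
    }
}

/// Hypothetical crypto layer: either passes bytes through or decrypts them.
enum CryptoLayer<R> {
    Plaintext(R),
    Xor(XorReader<R>),
}

impl<R: Read> Read for CryptoLayer<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            CryptoLayer::Plaintext(reader) => reader.read(buf),
            CryptoLayer::Xor(reader) => reader.read(buf),
        }
    }
}

/// Limits the raw stream to one entry, then stacks the crypto layer on top.
/// A decompressor and a checksum reader would wrap the returned value.
fn open_entry<R: Read>(raw: R, entry_len: u64, key: Option<u8>) -> CryptoLayer<Take<R>> {
    let limited = raw.take(entry_len);
    match key {
        Some(key) => CryptoLayer::Xor(XorReader { inner: limited, key }),
        None => CryptoLayer::Plaintext(limited),
    }
}

fn main() -> io::Result<()> {
    // Two "encrypted" entry bytes followed by unrelated trailing data.
    let raw = [b'h' ^ 0x2a, b'i' ^ 0x2a, 0xff];
    let mut text = String::new();
    open_entry(&raw[..], 2, Some(0x2a)).read_to_string(&mut text)?;
    assert_eq!(text, "hi");
    Ok(())
}
```

Since every layer only depends on `std::io::Read`, adding another cipher in this sketch would mean adding one enum variant rather than touching the outer layers.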

Note that the API for read_zipfile_from_stream does not yet support ZipCrypto reading. However, this is not hard to add. Also, ZipCrypto writing support has not been added at this point in time. I want to make sure that my work is in line with what the authors and the users expect before I continue working on this.

In addition, a ZipCrypto writer will need cryptographically secure randomness (even though the cipher itself is weak). Where this notoriously hard-to-get randomness should come from is up for discussion.

@BenjaminRi changed the title from "Added ZipCrypto support ( resolves #64 )" to "Add ZipCrypto reading support" on Oct 27, 2019
@BenjaminRi
Author

@mvdnes Any thoughts on this PR? I'm willing to invest time and fix issues if needed.

@BenjaminRi
Author

Merged improvements from master into the pkzip-cipher feature branch. Added a unit test to verify that the ZipCrypto reader implementation is correct. I have been using the feature branch regularly for half a year, and no problems have been found so far. If there is interest in having ZipCrypto on master, I'm still open to feedback and willing to invest time.

@Plecra
Member

Plecra commented Jun 16, 2020

This is brilliant! I hope we can get this merged soon. Would it be possible to support Central Directory Encryption with this implementation?

@rylev and I have recently been given permission to accept PRs, and we should be able to give this a proper review soon.

@BenjaminRi
Author

> Would it be possible to support Central Directory Encryption with this implementation?

Technically, yes. The ZipCrypto algorithm stands on its own and can be used to decrypt any binary stream, since it operates on the std::io::Read trait* (I also considered creating a separate crate just for ZipCrypto, but that's probably overkill). However, I am not sure whether any ZIP software in existence uses the weak ZipCrypto cipher to encrypt the Central Directory (relevant: zlib-ng/minizip-ng#141).

A higher-value task would be to support ZIP files with strong crypto such as AES, which is believed to provide real security. It seems that Central Directory Encryption is often (always?) done with Strong Encryption. The encryption method of choice is almost always AES because all the alternatives in the specification (RC2, RC4, DES, 3DES) are by now deprecated and insecure. AES-encrypted ZIP files are somewhat common.

However, I think Strong Encryption should be treated in a separate issue/PR because it's a separate subject that requires its own considerations. The restructuring of the reader that I've done in this PR can be used to hook any other crypto algorithm into the reader pipeline, so it's a good preparation for further work.

*Also, encryption is almost finished; it just needs to get cryptographically secure randomness from somewhere.
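
The remark above that the cipher can sit on top of any `std::io::Read` can be sketched as follows. This is an independent illustration of the traditional PKWARE stream cipher, not the contents of zipcrypto.rs: the names `SketchKeys`, `SketchReader` and `encrypt_for_demo` are hypothetical, and the handling of the 12-byte encryption header that real ZIP entries carry is deliberately left out.

```rust
use std::io::{self, Read};

/// One CRC-32 step (reflected polynomial 0xEDB88320) without a lookup table.
fn crc32_update(crc: u32, byte: u8) -> u32 {
    let mut entry = (crc ^ u32::from(byte)) & 0xff;
    for _ in 0..8 {
        entry = if entry & 1 != 0 { (entry >> 1) ^ 0xEDB8_8320 } else { entry >> 1 };
    }
    entry ^ (crc >> 8)
}

/// The three 32-bit keys of the traditional PKWARE cipher.
struct SketchKeys(u32, u32, u32);

impl SketchKeys {
    /// Starts from the fixed initial keys and mixes in every password byte.
    fn from_password(password: &[u8]) -> Self {
        let mut keys = SketchKeys(0x1234_5678, 0x2345_6789, 0x3456_7890);
        for &byte in password {
            keys.update(byte);
        }
        keys
    }

    /// Advances the key state with one plaintext byte.
    fn update(&mut self, plain: u8) {
        self.0 = crc32_update(self.0, plain);
        self.1 = self.1.wrapping_add(self.0 & 0xff).wrapping_mul(134_775_813).wrapping_add(1);
        self.2 = crc32_update(self.2, (self.1 >> 24) as u8);
    }

    /// Next keystream byte, derived from the low 16 bits of the third key.
    fn stream_byte(&self) -> u8 {
        let temp = (self.2 as u16) | 2;
        (temp.wrapping_mul(temp ^ 1) >> 8) as u8
    }
}

/// Decrypts whatever the wrapped reader yields (header handling omitted).
struct SketchReader<R> {
    inner: R,
    keys: SketchKeys,
}

impl<R: Read> SketchReader<R> {
    fn new(inner: R, password: &[u8]) -> Self {
        SketchReader { inner, keys: SketchKeys::from_password(password) }
    }
}

impl<R: Read> Read for SketchReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        for byte in &mut buf[..n] {
            *byte ^= self.keys.stream_byte(); // ciphertext -> plaintext
            self.keys.update(*byte); // keys always advance on plaintext
        }
        Ok(n)
    }
}

/// Demo-only inverse operation, used here to produce test input.
fn encrypt_for_demo(password: &[u8], plain: &[u8]) -> Vec<u8> {
    let mut keys = SketchKeys::from_password(password);
    plain
        .iter()
        .map(|&p| {
            let cipher = p ^ keys.stream_byte();
            keys.update(p);
            cipher
        })
        .collect()
}

fn main() -> io::Result<()> {
    let cipher = encrypt_for_demo(b"pw", b"hello zip");
    let mut plain = Vec::new();
    SketchReader::new(&cipher[..], b"pw").read_to_end(&mut plain)?;
    assert_eq!(plain, b"hello zip".to_vec());
    Ok(())
}
```

The round trip only shows that the reader inverts `encrypt_for_demo`; compatibility with archives produced by real ZIP tools would additionally require the header handling that this sketch skips.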

Member

@Plecra Plecra left a comment

LGTM! Could you run a cargo fmt on the changes?

@BenjaminRi
Author

> LGTM! Could you run a cargo fmt on the changes?

Thanks! Do you mean just running cargo fmt on the changes or also additionally merging my branch such that it is up to date with master (26 commits happened in the meantime)?

@BenjaminRi BenjaminRi requested a review from Plecra June 21, 2020 15:52
@BenjaminRi
Author

@Plecra Merged all the improvements from master. Applied cargo fmt.

Member

@Plecra Plecra left a comment

Ah, sorry to be a pain, but I don't want to erase the commit history - I'd rather rebase the changes. I can handle that tomorrow if you'd like 😃

@BenjaminRi
Author

> Ah, sorry to be a pain, but I don't want to erase the commit history - I'd rather rebase the changes. I can handle that tomorrow if you'd like 😃

How does merging erase the commit history? The merge from master pulls in the entire commit history as well.

I used to rebase my changes, but for read.rs the process was so grueling that after a few failed attempts during a difficult rebase operation, I started to do merges only (maybe that is also an indication that this file needs to be split up). The merge in read.rs is not trivial. I'd appreciate it a lot if you could handle it if you think rebasing is the way to go. You should arrive at the same resulting source files as I did after the merge.

Contributor

@rylev rylev left a comment

Some small things. We can merge this, but if you're up for it, it would be great to fix the nitpicks before that.

src/read.rs Outdated
    if file_number >= self.files.len() {
        return Err(ZipError::FileNotFound);
    }
-   let data = &mut self.files[file_number];
+   let ref mut data = self.files[file_number];
Contributor

I think we usually use the previous syntax. Is there a reason this was changed?

Author

No, that was an oversight during the merge. Fixed.
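
For readers unfamiliar with the two spellings discussed in this thread, the following small sketch shows that both produce a mutable reference to the indexed element. The function names and the integer data are made up for illustration.

```rust
/// Borrow with the `&mut` operator on the right-hand side.
fn bump_with_borrow(files: &mut [i32], index: usize) {
    let data = &mut files[index];
    *data += 10;
}

/// Borrow with a `ref mut` binding on the left-hand side.
fn bump_with_ref_pattern(files: &mut [i32], index: usize) {
    let ref mut data = files[index];
    *data += 10;
}

fn main() {
    let mut files = vec![1, 2, 3];
    bump_with_borrow(&mut files, 0);
    bump_with_ref_pattern(&mut files, 1);
    // Both forms modified the element in place.
    assert_eq!(files, vec![11, 12, 3]);
}
```

The behaviour is the same, so the choice is purely about which style a codebase already uses.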

src/read.rs Outdated
self.by_name_internal(name, None)
}

fn by_name_internal<'a>(
Contributor

I'd rather call this by_name_with_optional_password

Author

Okay. It's a bit long, but the name is clearer. Changed it to by_name_with_optional_password and by_index_with_optional_password.

src/read.rs Outdated
-   if data.encrypted {
-       return unsupported_zip_error("Encrypted files are not supported");
+   if password == None {
+       if data.encrypted {
Contributor

I think this would be easier to follow like this:

match (password, data.encrypted) {
  (None, true) => return Err(..),
  (Some(_), false) => password = None,
  _ => {}
}

Author

Back when I wrote this code, I didn't even know that matching on a tuple like this was possible (I only learned that a week ago). It looks elegant, so I applied the change.
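
A self-contained version of the tuple match suggested in this thread might look like the sketch below. `SketchError` and `effective_password` are hypothetical names chosen for illustration; the error returned by the real code is elided in the suggestion and is not filled in here.

```rust
/// Hypothetical error type standing in for the crate's own error.
#[derive(Debug, PartialEq)]
enum SketchError {
    PasswordRequired,
}

/// Decides which password, if any, should be used for an entry.
fn effective_password<'a>(
    password: Option<&'a [u8]>,
    encrypted: bool,
) -> Result<Option<&'a [u8]>, SketchError> {
    match (password, encrypted) {
        // Encrypted entry but no password supplied: report an error.
        (None, true) => Err(SketchError::PasswordRequired),
        // Password supplied for an unencrypted entry: ignore it.
        (Some(_), false) => Ok(None),
        // Remaining combinations need no adjustment.
        (password, _) => Ok(password),
    }
}

fn main() {
    let pw: &[u8] = b"pw";
    assert_eq!(effective_password(None, true), Err(SketchError::PasswordRequired));
    assert_eq!(effective_password(Some(pw), false), Ok(None));
    assert_eq!(effective_password(Some(pw), true), Ok(Some(pw)));
    assert_eq!(effective_password(None, false), Ok(None));
}
```

Matching on the pair makes all four combinations visible in one place, which is harder to see with nested `if` statements.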

src/zipcrypto.rs Outdated
}

impl ZipCryptoKeys {
//Used this paper to implement ZipCrypto algo
Contributor

Can this be part of the module documentation?

Author

Improved & fixed docstrings throughout the entire zipcrypto.rs document. Added module level documentation.

src/zipcrypto.rs Outdated

impl<R: std::io::Read> ZipCryptoReader<R> {
pub fn new(file: R, password: &[u8]) -> ZipCryptoReader<R> {
//Note: The password is &[u8] and not &str because the documentation
Contributor

nit: comments should have a space between the // and the first character

Author

Fixed.

// No password
let file = archive.by_index(0);
assert!(file.is_err());
if let Err(error) = file {
Contributor

You can match inside of the Err as well so there's no need for the match expression below

Author

Turned the if expressions into a cleaner match expression.
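
One way the "match inside the Err" suggestion could look is sketched below, with descriptive panic messages included. The enum `SketchError` and the function `expect_password_required` are hypothetical and only mirror the shape of the test under review.

```rust
/// Hypothetical error type standing in for the crate's own error.
#[derive(Debug)]
enum SketchError {
    PasswordRequired,
    Unsupported(&'static str),
}

/// Passes only if opening the file failed because a password is required.
fn expect_password_required(result: Result<Vec<u8>, SketchError>) {
    match result {
        // The nested pattern checks the variant directly inside `Err`.
        Err(SketchError::PasswordRequired) => (),
        Err(other) => panic!("expected PasswordRequired, got {:?}", other),
        Ok(_) => panic!("expected an error, but the file was opened"),
    }
}

fn main() {
    expect_password_required(Err(SketchError::PasswordRequired));

    // Any other outcome panics with a descriptive message.
    let wrong_error =
        std::panic::catch_unwind(|| expect_password_required(Err(SketchError::Unsupported("demo"))));
    assert!(wrong_error.is_err());
}
```

A single `match` with nested patterns replaces the `assert!`, the `if let` and the inner `match`, and it also covers the unexpected `Ok` case explicitly.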

if let Err(error) = file {
match error {
zip::result::ZipError::PasswordRequired => (),
_ => panic!(),
Contributor

Can we add some descriptive text to these panics?

Author

Added some meaningful panic messages.


{
// Wrong password
let file = archive.by_index_decrypt(0, "wrong password".as_bytes());
Contributor

nit: you can use b"wrong password" instead of as_bytes

Author

Fixed.
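
The nit above relies on byte string literals. A brief sketch, using a hypothetical `password_len` function in place of the real API, shows that both spellings yield the same bytes:

```rust
/// Hypothetical function that, like the new API, takes a byte-slice password.
fn password_len(password: &[u8]) -> usize {
    password.len()
}

fn main() {
    let from_str: &[u8] = "wrong password".as_bytes();
    // A byte string literal has type `&[u8; N]` and coerces to `&[u8]`.
    let literal: &[u8; 14] = b"wrong password";

    assert_eq!(from_str, &literal[..]);
    assert_eq!(password_len(b"wrong password"), password_len(from_str));
}
```

Byte string literals only accept ASCII characters and escapes, so `as_bytes()` remains the way to get the UTF-8 bytes of a non-ASCII string.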

@BenjaminRi
Author

@rylev Thanks for the detailed review. I applied all your suggestions.

@Plecra
Member

Plecra commented Jun 23, 2020

Thanks for following up on that! I've created #157 to discuss more encryption features if you'd like to share your thoughts 😉

@Plecra Plecra merged commit f99cdd0 into zip-rs:master Jun 23, 2020
@BenjaminRi
Author

Fantastic! I'm very happy to see this PR accepted and merged 🎉

Successfully merging this pull request may close these issues.

Open ZIP archive protected by password
3 participants