Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit url #51

Open
evanjs opened this issue Nov 2, 2019 · 11 comments
Open

Audit url #51

evanjs opened this issue Nov 2, 2019 · 11 comments

Comments

@evanjs
Copy link

evanjs commented Nov 2, 2019

url: GitHub, crates.io

cargo-geiger reports numerous usages of unsafe, though only 6 seem to be local to the url crate itself.

Example unsafe usage in decode_utf8_lossy

A majority of the remaining unsafe usages consist of, e.g. unicode-normalization and smallvec (#9)
Absolutely no idea if any of these unsafe cases are appropriate or not.

The url crate seems to be widely used enough to check, at least.

@evanjs evanjs changed the title Audit url Audit url Nov 2, 2019
@JoshMcguigan
Copy link

Just looked at the example unsafe block you linked. I think it is safe, and potentially appropriate, but I believe there is a way it can be re-written in safe code. I'll review the rest tomorrow and submit a PR if needed.

@JoshMcguigan
Copy link

Submitted the PR - servo/rust-url#560

@Shnatsel
Copy link
Member

Shnatsel commented Nov 3, 2019

The comments look very nice and clear to me. Thanks for doing this!

I think one more unsafe block could be trivially rewritten as safe code, I've left a comment on the PR.

This leaves only two iterator implementations that return Some(&str) and I think those can be converted to returning Some(char) instead to avoid the unsafe conversions, with something like this:

mystring.chars().flat_map(|c| if should_percent_encode(c) {percent_encode(c).chars()} else {c})

@JoshMcguigan
Copy link

@Shnatsel thanks for the feedback. I've removed the unsafe block as you suggested.

This leaves only two iterator implementations that return Some(&str) and I think those can be converted to returning Some(char) instead to avoid the unsafe conversions, with something like this..

If I'm understanding correctly, these would be changes which change the public interface?

@Shnatsel
Copy link
Member

Shnatsel commented Nov 3, 2019

Those structs seem to be private, so I think that wouldn't be a breaking change.

@JoshMcguigan
Copy link

https://github.com/servo/rust-url/blob/master/src/form_urlencoded.rs#L125

This is one example of an iterator with an unsafe block. But ByteSerialize is marked as public in the form_urlencoded module, then src/lib.rs has pub mod form_urlencoded. I believe the other case of this is similar. Are you referring to some other structs? Or am I misinterpreting the Rust public/private rules?

@Shnatsel
Copy link
Member

Shnatsel commented Nov 3, 2019

Nah, it's probably me who misinterpreted the rules. Those structs indeed seem to be public, so changing the return type would indeed be a breaking change. Apologies for the confusion.

@evanjs
Copy link
Author

evanjs commented Nov 3, 2019

In addition, I figured we might want to check idna and percent_encoding as well, but wasn't sure how those should be handled.

i.e., I'm assuming we're doing a PR per project (e.g. rust-url/url, rust-url/idna), vs a PR per repo (e.g. rust-url/*)?

@Shnatsel
Copy link
Member

Shnatsel commented Nov 3, 2019

PRs are easier per project. Tracking issues... dunno, per repo probably, because that's what I'd grep for unsafe

@JoshMcguigan
Copy link

For what its worth, my PR covers the full rust-url repo (but not dependencies).

@evanjs
Copy link
Author

evanjs commented Nov 3, 2019

For what its worth, my PR covers the full rust-url repo (but not dependencies).

Yup. Just looked through it a minute ago and realized it was comprehensive.

Regardless, I'll keep this in mind moving forward, and include checklists for bigger projects in case we need to break things up.

Thanks @Shnatsel and @JoshMcguigan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants