Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should arg and build_with be unsafe? #6

Open
epage opened this issue Aug 5, 2018 · 7 comments
Open

Should arg and build_with be unsafe? #6

epage opened this issue Aug 5, 2018 · 7 comments
Labels
breaking-change enhancement Improve the expected question Uncertainty is involved

Comments

@epage
Copy link
Contributor

epage commented Aug 5, 2018

arg and build_with have the ability to break the invariants of the API and should probably be marked as unsafe.

While at it, it should probably be renamed to arg_unchecked, opening us up to a arg that is checked.

@epage epage added enhancement Improve the expected question Uncertainty is involved breaking-change labels Aug 5, 2018
@roblabla
Copy link

Unsafe means "can cause memory safery issues". This is not likely to be the case here. There are a lot of similar cases in the stdlib where doing things incorrectly might break invariants (like implementing the Hash trait or PartialEq trait wrong in a hashmap) and yet those aren't unsafe.

The docs should make sure to mention that though.

@epage
Copy link
Contributor Author

epage commented Sep 10, 2018

Unsafe means "can cause memory safery issues". This is not likely to be the case here. There are a lot of similar cases in the stdlib where doing things incorrectly might break invariants (like implementing the Hash trait or PartialEq trait wrong in a hashmap) and yet those aren't unsafe.

My understanding is you use unsafe to mark that this API entry point can break invariants, possibly as a "I know the input code paths, give me better performance"

Example: https://doc.rust-lang.org/std/intrinsics/fn.unchecked_div.html

From TRPL

The second type of operation that requires an unsafe block is calls to unsafe functions. Unsafe functions and methods look exactly like regular functions and methods, but they have an extra unsafe before the rest of the definition. The unsafe keyword in this context indicates the function has requirements we need to uphold when we call this function, because Rust can’t guarantee we’ve met these requirements. By calling an unsafe function within an unsafe block, we’re saying that we’ve read this function’s documentation and take responsibility for upholding the function’s contracts.
...
By inserting the unsafe block around our call to dangerous, we’re asserting to Rust that we’ve read the function’s documentation, we understand how to use it properly, and we’ve verified that we’re fulfilling the contract of the function.

https://doc.rust-lang.org/book/second-edition/ch19-01-unsafe-rust.html

@roblabla
Copy link

Alright, I'm back on a computer.

So first, I'll take back what I said. Unsafe really means "may use features that can cause undefined behavior". I can't find where this is written clearly, but for instance, from the nomicon (https://doc.rust-lang.org/nomicon/what-unsafe-does.html):

The reason these operations are relegated to Unsafe is that misusing any of these things will cause the ever dreaded Undefined Behavior. Invoking Undefined Behavior gives the compiler full rights to do arbitrarily bad things to your program.

And from the page you linked: https://doc.rust-lang.org/book/second-edition/ch19-01-unsafe-rust.html

it is trickier to get unsafe code correct because the compiler can’t help uphold memory safety.

In safe rust, you cannot run into undefined behavior. In unsafe rust, you have to be very careful about not breaking the invariants, otherwise you may cause undefined behavior.

Now, what is undefined behavior? That's a hard question to answer, as rust does not yet properly identify what is defined and undefined (the Rust Unsafe Guidelines working group are working on it). There is an effort at defining them here: https://doc.rust-lang.org/reference/behavior-considered-undefined.html.

Now, what about unchecked_div ? Unchecked_div is actually an llvm intrinsic. If you misuse it, you will misguide the compiler into undefined behavior, potentially doing anything.

What about your arg function here? Here's the thing: misusing arg cannot cause undefined behavior. If you call arg at the wrong moment, what may happen is that cargo will error out with a help message, and as a result the message iterator will panic. That's it. No UB was invoked.

In general, unless you have a very good reason, I don't think marking functions as unsafe without actually using unsafe capabilities in it isn't a good idea. Here's an example of a way to break invariants of an API that doesn't result in UB, and as such is explicitly not marked as unsafe: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html

It is required that the keys implement the Eq and Hash traits, although this can frequently be achieved by using #[derive(PartialEq, Eq, Hash)]. If you implement these yourself, it is important that the following property holds:

k1 == k2 -> hash(k1) == hash(k2)

In other words, if two keys are equal, their hashes must be equal.

It is a logic error for a key to be modified in such a way that the key's hash, as determined by the Hash trait, or its equality, as determined by the Eq trait, changes while it is in the map. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code.

This is a very old post about what is unsafe, but I think it's relevant: http://huonw.github.io/blog/2014/07/what-does-rusts-unsafe-mean/.

@epage
Copy link
Contributor Author

epage commented Sep 10, 2018

While I understand the concern that an API "requiring" too many unsafe blocks helps nullify the value of unsafe (at least that is what I read out of some of your comments) I disagree about unsafe being restricted only to compiler Undefined Behavior but also applies to API invariants.

I think I'l open an issue on this in the API guidelines.

And just to have all the references in one place:

From https://rust-lang-nursery.github.io/api-guidelines/print.html#getter-names-follow-rust-convention-c-getter

For getters that do runtime validation such as bounds checking, consider adding unsafe _unchecked variants. Typically those will have the following signatures.

@epage
Copy link
Contributor Author

epage commented Sep 10, 2018

I've tried to capture this in an unbiased way at rust-lang/api-guidelines#179

@roblabla
Copy link

roblabla commented Sep 10, 2018

As a separate discussion point: Could something be done to avoid the invariant entirely? Maybe having .arg() panic if it detects a --, and have a dedicated .final_arg() function that is guaranteed to be added at the end when the command is finally built?

@epage
Copy link
Contributor Author

epage commented Sep 10, 2018

As a separate discussion point: Could something be done to avoid the invariant entirely? Maybe having .arg() panic if it detects a --,

Always a good idea to find ways to make an API less error prone.

For trying to detect it, my main concern is not hampering the user from legit use cases. I'm not familiar enough with the different ways parameters could be constructed where this could be an issue. It does help that the problem space is narrowed down to where it might be valid to pass a string to cargo with a -- in it.

Also I've not investigated and documented this but I suspect there are other uses of arg that would invalidate invariants.

have a dedicated .final_arg() function that is guaranteed to be added at the end when the command is finally built?

I've broken this out into #14

@epage epage changed the title Should arg be unsafe? Should arg and build_with be unsafe? Dec 31, 2018
epage added a commit that referenced this issue Oct 6, 2023
…holder

README.md 'Crates Status' icon link
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change enhancement Improve the expected question Uncertainty is involved
Projects
None yet
Development

No branches or pull requests

2 participants