Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make kevent safe, using BorrowedFd for changelist #939

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ids1024
Copy link
Contributor

@ids1024 ids1024 commented Nov 21, 2023

It looks like #488 initially used something like this, but there was concern about the safety if an fd is closed while it is in a kqueue. But it doesn't seem to have been established that this actually violates io safety.

kevent(2) documents that the fd is removed from any kqueues referencing it if it is closed. While that may result in behavior the caller doesn't expect, it shouldn't be unsafe. There should be no way for an fd being inserted into a kqueue then closed to impact a new fd that is opened with the same value, which is generally the concern with IO safety.

I haven't tested this yet other than cargo check from a Linux system, and this would be a breaking API change. But if it can be safe, it should be changed in the next semver breaking version of Rustix.

@notgull
Copy link
Contributor

notgull commented Nov 21, 2023

Which manpage are you using? I think this is guaranteed for FreeBSD, but not other BSDs.

I don't have the issue on hand, but I know that not removing the FD from kqueue has caused I/O safety issues in other projects.

@ids1024
Copy link
Contributor Author

ids1024 commented Nov 21, 2023

The same wording seems to be in all of them.

Calling close(2) on a file descriptor will remove any kevents that reference the descriptor.

https://man.freebsd.org/cgi/man.cgi?query=kevent
https://man.netbsd.org/kevent.2
https://man.openbsd.org/kqueue.2

It's caused safety issues in other projects, or there was concern that it might be unsafe?

Though it may be unsafe to actually use the fd returned in eventlist. It could be closed some time between kevent returning it and the fd being used if you aren't careful. But that shouldn't make kevent() itself unsafe as long as it returns events with a RawFd and documents this.

@notgull
Copy link
Contributor

notgull commented Nov 21, 2023

Okay then. I'm fine with it as long as the sunfish is.

@sunfishcode
Copy link
Member

I haven't fully caught up on this yet, but I do want to raise a potential concern.

In general, it's not sufficient to say "I'm only providing users with RawFds, anything users do with those is their own responsibility". There is no reliable way for users to check a RawFd for validity. So if the only way for users to be safe is to have global knowledge of the program, allowing them to know that no other code closed that fd, then we don't have a safe API.

@ids1024
Copy link
Contributor Author

ids1024 commented Dec 2, 2023

My thoughts on this:

  • Separately from whether or not the function is marked unsafe, accepting BorrowedFd arguments makes it safer to use, and is more convenient in a project that uses safe IO types everywhere. Otherwise it is possible to pass an invalid fd to the kevent call.
  • It is safe to compare a RawFd to a owned or borrowed FD. The result of the comparison may not be correct if the FD was closed and reused before the comparison, but the operation isn't unsound.
    • It is also possible to just ignore the RawFd value, and interpret the event based on the udata
  • Especially in a low-level library like Rustix, unsafe should only be used for things that actually violate safety. Not for things that are easy to misuse. Or for things that return a value that could be used in an unsafe way, with unsafe code. The code using the RawFd then would have an unsafe block around it.

Specifically, I'm was looking at this since it's one of the few unsafe calls in the pure Rust backend of wayland-rs. The code there seems to just use udata(), so this call being unsafe seems unnecessary with that usage.

@sunfishcode
Copy link
Member

Especially in a low-level library like Rustix, unsafe should only be used for things that actually violate safety. Not for things that are easy to misuse. Or for things that return a value that could be used in an unsafe way, with unsafe code.

I do agree with this. There is nuance here though. If we say "the code that gets the RawFd needs an unsafe block", then assuming it isn't doing comparisons, the code that receives the fds needs an unsafe block. The safety condition for that unsafe block depends on the callers of kevent upholding certain conditions. If kevent is a safe function, then the only way to be sure that the conditions are upheld is to be aware of all the kevent callsites in the program that operate on the queue, which isn't always possible. On the other hand, if kevent is an unsafe function, it can document those conditions as part of its own safety condition.

So, it is about things that violate safety. Putting the unsafe on the function is a little surprising, but there appears to be no other place we can put it that both reliably protects against safety violations, and isn't impossible to use.

One possible alternative approach here would be to have an alternate version of kevent which returns a different type which doesn't expose the RawFds. Such a function could be safe, and useful for users that would ignore the RawFds anyway. What would you think of that approach?

@ids1024
Copy link
Contributor Author

ids1024 commented Dec 8, 2023

One possible alternative approach here would be to have an alternate version of kevent which returns a different type which doesn't expose the RawFds. Such a function could be safe, and useful for users that would ignore the RawFds anyway. What would you think of that approach?

That would satisfy the use case of avoiding unnecessary unsafe blocks when the fd isn't even used. Which is perhaps the best approach for safe Rust code using kqueue.

Though it makes this a bit more complicated, and I don't see it as necessary.

(But this could be done with the generic approach used in this PR by returning Event<()>, and having .filter() return EventFilter<()> for it.)

there appears to be no other place we can put it that both reliably protects against safety violations, and isn't impossible to use.

In principle, any function that accepts a RawFd (except to print, compare, etc.) should be unsafe. (Though awkwardly, this is often not true for existing code from before the IO safety RFC.) So there should be an unsafe block when the RawFd is used.

So if I wanted to pass the RawFd returned here to another Rustix function, I'd have an unsafe { BorrowedFd::borrow_raw(..) } call. Which I'd put a // SAFETY: comment over describing how I've made use the fd is still valid. Having to also have the kevent call in an unsafe block seems extraneous.

From a safety standpoint, this is similar to functions returning raw pointers. It's generally expected in Rust for getter functions returning RawFds to be safe, even though use of the return value may require caution about the lifetime it is valid for. Which can be non-trivial to correctly handle.

it can document those conditions as part of its own safety condition.

These considerations can be mentioned in a doc comment either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants