Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

poll / PollFd can no longer take -1 as descriptors #2254

Open
ltratt opened this issue Dec 8, 2023 · 9 comments
Open

poll / PollFd can no longer take -1 as descriptors #2254

ltratt opened this issue Dec 8, 2023 · 9 comments

Comments

@ltratt
Copy link
Contributor

ltratt commented Dec 8, 2023

I'm a heavy user of poll and a standard idiom is to create n entries, but not to use all of them at all times --- poll allows fd to have a value of -1 to say "ignore this entry". This means that one can simplify a lot of code surrounding poll by assuming a constant number of entries at all times, even if some of them aren't doing anything.

The change to use PollFd in 0.27 means that, as far as I can tell, this is no longer possible in nix? Here's an example of this being used in real code so it's genuinely a useful feature.

Because PollFd takes a BorrowedFd I don't think there's an obvious way to change PollFd itself but perhaps poll could usefully take &mut [Option<PollFd>]? That would feel like a fairly idiomatic Rust way of saying "there are n entries, but some of them are unused": it would allow PollFd to maintain its current semantics, but also make poll code that wants to keep a constant number of entries possible.

I appreciate this is a breaking change, but if it's something that might be welcomed, I'd be happy to raise a PR for it!

@SteveLauC
Copy link
Member

SteveLauC commented Dec 9, 2023

Is this feature available on all the platforms? I see that FreeBSD and OpenBSD have this, Linux and POSIX manual do not mention it at all.

Because PollFd takes a BorrowedFd I don't think there's an obvious way to change PollFd itself

We can make the PollFd::new() take a Option<BorrowedFd>, but this is not I/O-safe as it implements AsFd trait and should be a valid file descriptor, so I prefer the second approach:

poll could usefully take &mut [Option<PollFd>]?

I appreciate this is a breaking change,

No worries about braking changes, we will bump our minor version number in the next release

@ltratt
Copy link
Contributor Author

ltratt commented Dec 9, 2023

Linux does support this, but the man page you linked uses the term "negative". I didn't remember this detail, but Linux is more forgiving than the BSDs in this regard:

The field fd contains a file descriptor for an open file. If
this field is negative, then the corresponding events field is
ignored and the revents field returns zero. (This provides an
easy way of ignoring a file descriptor for a single poll() call:
simply set the fd field to its bitwise complement.)

POSIX doesn't mention this either way. So I don't know if all platforms support it but at least most of the major extant platforms definitely support it (I couldn't immediately tell if OS X does or doesn't with a quick search, and I don't have an OS X box to test on).

@SteveLauC
Copy link
Member

SteveLauC commented Dec 9, 2023

Linux does support this, but the man page you linked uses the term "negative".

Oh, I get it, same as the timeout argument

POSIX doesn't mention this either way. So I don't know if all platforms support it but at least most of the major extant platforms definitely support it (I couldn't immediately tell if OS X does or doesn't with a quick search, and I don't have an OS X box to test on).

If we are not sure about what platforms support this feature, then it is hard to change the interface.


NetBSD supports this:

If the value passed in fd is negative, the entry is ignored and revents is set to 0. (POLLNVAL is not set.)

DragonFlyBSD supports this:

fd File descriptor to poll. If fd is equal to -1 then revents is cleared (set to zero), and that pollfd is not checked.

macOS manual does not mention this

@ltratt
Copy link
Contributor Author

ltratt commented Dec 9, 2023

Via Ted Unangst, Darwin also skips negative fds in poll and, interestingly, has this comment in the code:

/* per spec, ignore fd values below zero */

@asomers
Copy link
Member

asomers commented Dec 9, 2023

poll could usefully take &mut [Option]?

Sounds good to me too

@SteveLauC
Copy link
Member

SteveLauC commented Dec 18, 2023

If we change poll() and ppoll() to take &mut [Option<PollFd>], then we have to clone the slice and replace the None with a PollFd that has the fd set to -1:

If Rust can ensure that Some(PollFd) will have the same memory representation as PollFd, then the type of clone can be changed to Vec<Option<PollFd>>

I think null pointer optimization can happen here so a tag is not needed but I am not sure about the memory layout...

pub fn poll<T: Into<PollTimeout>>(fds: &mut [Option<PollFd>], timeout: T) -> Result<libc::c_int> {
    let mut clone: Vec<PollFd> = Vec::new();
    for opt_fd in fds {
        if let Some(fd) = opt_fd {
            clone.push(fd.clone());
        } else {
            clone.push(PollFd::new(
                // SAFETY:
                // No need to worry about the lifetime of this fd as it is not
                // a valid one, and `poll()` or `ppoll()` will skip it so we are
                // safe.
                unsafe { BorrowedFd::borrow_raw(-1) },
                PollFlags::empty(),
            ));
        }
    }

    let res = unsafe {
        libc::poll(
            clone.as_mut_ptr().cast(),
            clone.len() as libc::nfds_t,
            i32::from(timeout.into()),
        )
    };

    Errno::result(res)
}

I am thinking about if we can define a wrapper type here:

#[repr(transparent)]
pub struct MaybePollFd<'fd> {
    pollfd: PollFd<'fd>,
    _fd: std::marker::PhantomData<BorrowedFd<'fd>>,
}

impl<'fd> MaybePollFd<'fd> {
    const INVALID_FD: std::os::fd::RawFd = -1;
    pub fn is_valid(&self) -> bool {
        self.pollfd.pollfd.fd != Self::INVALID_FD
    }

    pub fn set_invalid(&mut self) {
        self.pollfd.pollfd.fd = Self::INVALID_FD;
    }
    
    pub fn new_invalid() -> Self {
        Self {
            pollfd: PollFd::new(unsafe { BorrowedFd::borrow_raw(Self::INVALID_FD)}, PollFlags::empty()),
            _fd: std::marker::PhantomData,
        } 
    }
}

impl<'fd> From<PollFd<'fd>> for MaybePollFd<'fd> {
    fn from(value: PollFd<'fd>) -> Self {
        Self {
            pollfd: value,
            _fd: std::marker::PhantomData,
        }
    }
}

impl<'fd> TryFrom<MaybePollFd<'fd>> for PollFd<'fd> {
    type Error = Errno;
    fn try_from(value: MaybePollFd<'fd>) -> std::result::Result<Self, Self::Error> {
        if value.pollfd.pollfd.fd == -1 {
            // EBADF or EINVAL?
            Err(Errno::EBADF)
        } else {
            Ok(value.pollfd)
        }
    }
}

then poll() and epoll() will take &mut [MaybePollFd], and in @ltratt's use case, you can do:

fn update_pollfds(&mut self) {
    for (i, jobslot) in self.running.iter().enumerate() {
        let (stderr_fd, stdout_fd) = if let Some(job) = jobslot {
            // Since we've asked for stderr/stdout to be captured, the unwrap()s should be
            // safe, though the Rust docs are slightly vague on this.
            let stderr_fd = if job.stderr_hup {
                MaybePollFd::new_invalid()
            } else {
                MaybePollFd::from(PollFd::new(job.child.stderr.as_ref().unwrap().as_fd(), PollFlags::POLLIN))
            };
            let stdout_fd = if job.stdout_hup {
                MaybePollFd::new_invalid()
            } else {
                MaybePollFd::from(PollFd::new(job.child.stdout.as_ref().unwrap().as_fd(), PollFlags::POLLIN))
            };
            (stderr_fd, stdout_fd)
        } else {
            (MaybePollFd::new_invalid(), MaybePollFd::new_invalid())
        };
        self.pollfds[i * 2] = stderr_fd;
        self.pollfds[i * 2 + 1] = stdout_fd;
    }
    self.pollfds[self.maxjobs * 2] = PollFd::new(self.snare.event_read_fd, PollFlags::POLLIN);
}

@ltratt
Copy link
Contributor Author

ltratt commented Dec 18, 2023

@SteveLauC If one can avoid the extra Vec that would be cool. FWIW I have (but currently untested while I change the program that would call this!) locally:

diff --git src/poll.rs src/poll.rs
index 0ad9f40d..6b6d4c34 100644
--- src/poll.rs
+++ src/poll.rs
@@ -197,20 +197,43 @@ libc_bitflags! {
 /// [`PollTimeout::ZERO`] causes `poll()` to return immediately, even if no file
 /// descriptors are ready.
 pub fn poll<T: Into<PollTimeout>>(
-    fds: &mut [PollFd],
+    fds: &mut [Option<PollFd>],
     timeout: T,
 ) -> Result<libc::c_int> {
+    let mut fds_map = map_fds(fds);
     let res = unsafe {
         libc::poll(
-            fds.as_mut_ptr().cast(),
-            fds.len() as libc::nfds_t,
+            fds_map.as_mut_ptr().cast(),
+            fds_map.len() as libc::nfds_t,
             i32::from(timeout.into()),
         )
     };
+    for (i, fd) in fds.iter_mut().enumerate() {
+        if let Some(ref mut fd) = fd {
+            *fd = PollFd {
+                pollfd: fds_map[i],
+                _fd: std::marker::PhantomData,
+            };
+        }
+    }
 
     Errno::result(res)
 }
 
+/// Map a slice of `Option<PollFd>` to `Vec<libc::pollfd>` where `None` is mapped to a `fd` of -1.
+fn map_fds(fds: &[Option<PollFd>]) -> Vec<libc::pollfd> {
+    fds.iter()
+        .map(|x| match x {
+            Some(PollFd { pollfd, _fd: _ }) => *pollfd,
+            None => libc::pollfd {
+                fd: -1,
+                events: 0,
+                revents: 0,
+            },
+        })
+        .collect::<Vec<_>>()
+}
+
 feature! {
 #![feature = "signal"]
 /// `ppoll()` allows an application to safely wait until either a file
@@ -226,19 +249,28 @@ feature! {
 ///
 #[cfg(any(linux_android, freebsdlike))]
 pub fn ppoll(
-    fds: &mut [PollFd],
+    fds: &mut [Option<PollFd>],
     timeout: Option<crate::sys::time::TimeSpec>,
     sigmask: Option<crate::sys::signal::SigSet>
     ) -> Result<libc::c_int>
 {
     let timeout = timeout.as_ref().map_or(core::ptr::null(), |r| r.as_ref());
     let sigmask = sigmask.as_ref().map_or(core::ptr::null(), |r| r.as_ref());
+    let mut fds_map = map_fds(fds);
     let res = unsafe {
-        libc::ppoll(fds.as_mut_ptr().cast(),
-                    fds.len() as libc::nfds_t,
+        libc::ppoll(fds_map.as_mut_ptr().cast(),
+                    fds_map.len() as libc::nfds_t,
                     timeout,
                     sigmask)
     };
+    for (i, fd) in fds.iter_mut().enumerate() {
+        if let Some(ref mut fd) = fd {
+            *fd = PollFd {
+                pollfd: fds_map[i],
+                _fd: std::marker::PhantomData,
+            };
+        }
+    }
     Errno::result(res)
 }
 }

which isn't very different than your first version. Your second version looks much different, and I think better, though!

@SteveLauC
Copy link
Member

SteveLauC commented Dec 29, 2023

It seems that all the OSes supported by Nix have this feature except for Redox:

OS Supported
darwin Yes
linux Yes
android Probably yes, since it uses Linux
freebsd Yes
dragonfly Yes
netbsd Yes
openbsd Yes
illumos Yes
solaris Yes
haiku Yes
redox No

From the source code, poll(2) on Redox is emulated with epoll, which will return an error when encountering a fd of value -1.

pub fn poll_epoll(fds: &mut [pollfd], timeout: c_int) -> c_int {
    let event_map = [
        (POLLIN, EPOLLIN),
        (POLLPRI, EPOLLPRI),
        (POLLOUT, EPOLLOUT),
        (POLLERR, EPOLLERR),
        (POLLHUP, EPOLLHUP),
        (POLLNVAL, EPOLLNVAL),
        (POLLRDNORM, EPOLLRDNORM),
        (POLLWRNORM, EPOLLWRNORM),
        (POLLRDBAND, EPOLLRDBAND),
        (POLLWRBAND, EPOLLWRBAND),
    ];

    let ep = {
        let epfd = epoll_create1(EPOLL_CLOEXEC);
        if epfd < 0 {
            return -1;
        }
        File::new(epfd)
    };

    for i in 0..fds.len() {
        let mut pfd = &mut fds[i];

        let mut event = epoll_event {
            events: 0,
            data: epoll_data { u64: i as u64 },
            ..Default::default()
        };

        for (p, ep) in event_map.iter() {
            if pfd.events & p > 0 {
                event.events |= ep;
            }
        }

        pfd.revents = 0;

        if epoll_ctl(*ep, EPOLL_CTL_ADD, pfd.fd, &mut event) < 0 {
            return -1;
        }
    }

    let mut events: [epoll_event; 32] = unsafe { mem::zeroed() };
    let res = epoll_wait(*ep, events.as_mut_ptr(), events.len() as c_int, timeout);
    if res < 0 {
        return -1;
    }

    for event in events.iter().take(res as usize) {
        let pi = unsafe { event.data.u64 as usize };
        // TODO: Error status when fd does not match?
        if let Some(pfd) = fds.get_mut(pi) {
            for (p, ep) in event_map.iter() {
                if event.events & ep > 0 {
                    pfd.revents |= p;
                }
            }
        }
    }

    let mut count = 0;
    for pfd in fds.iter() {
        if pfd.revents > 0 {
            count += 1;
        }
    }
    count
}

I think I will file a feature request issue to Redox.

Update: Issue filed

@ltratt
Copy link
Contributor Author

ltratt commented Jan 1, 2024

Thank you @SteveLauC -- much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants