Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recvfrom, recvmsg and recvmmsg might return uninitialized addresses when using connection-mode sockets #2177

Open
Jan561 opened this issue Nov 4, 2023 · 8 comments

Comments

@Jan561
Copy link
Contributor

Jan561 commented Nov 4, 2023

When using connection-mode sockets like TCP, the addresses returned by recvfrom, recvmsg and recvmmsg are not initialized by the syscall.

This is how the receive operations create an IPV4-Socketaddr, with a comment which operation is unsound here:

impl SockaddrLike for SockaddrIn {
    unsafe fn from_raw(
        addr: *const libc::sockaddr,
        len: Option<libc::socklen_t>,
    ) -> Option<Self>
    where
        Self: Sized,
    {
        if let Some(l) = len {
            if l != mem::size_of::<libc::sockaddr_in>() as libc::socklen_t {
                return None;
            }
        }

        // Unsound if `addr` has not been initialized by the syscall, e.g. when using TCP.
        //
        // Currently, this is a numbers game: if we are lucky enough that the value at the
        // memory region of `sa_family` is not `AF_INET`, this check will notice the unitialized
        // state of `addr` and return `None`. Otherwise, this will return junk as an initialized address.
        if (*addr).sa_family as i32 != libc::AF_INET {
            return None;
        }
        Some(Self(ptr::read_unaligned(addr as *const _)))
    }
}

The following test will catch this issue (but it is already based on the new API I'm working on):

#[test]
fn test_connection_mode_recv_address() {
    use std::io::{IoSlice, IoSliceMut};

    use nix::sys::socket::*;

    let mut header = RecvMsgHeader::<Option<SockaddrIn>>::new();

    const MSG: &[u8] = "Hello, world!".as_bytes();
    let mut buf: [u8; 13] = [0u8; MSG.len()];

    let addr = "127.0.0.1:6069".parse::<SockaddrIn>().unwrap();

    // Prefill the address buffer with valid data.
    {
        let send_sock = socket(
            AddressFamily::INET,
            SockType::Datagram,
            SockFlag::empty(),
            Some(SockProtocol::Udp),
        )
        .unwrap();

        let recv_sock = socket(
            AddressFamily::INET,
            SockType::Datagram,
            SockFlag::empty(),
            Some(SockProtocol::Udp),
        )
        .unwrap();

        bind(recv_sock.as_raw_fd(), &addr).unwrap();

        sendmsg(
            send_sock.as_raw_fd(),
            Some(&addr),
            &[IoSlice::new(MSG)],
            CmsgEmpty::write(),
            MsgFlags::empty(),
        )
        .unwrap();

        let res = recvmsg(
            recv_sock.as_raw_fd(),
            &mut header,
            &mut [IoSliceMut::new(&mut buf)],
            CmsgEmpty::read(),
            MsgFlags::empty(),
        )
        .unwrap();

        assert!(res.address().is_some());
    }

    // Now we'll use the address same buffer with an connection-mode socket
    {
        let send_sock = socket(
            AddressFamily::INET,
            SockType::Stream,
            SockFlag::empty(),
            Some(SockProtocol::Tcp),
        )
        .unwrap();

        let recv_sock = socket(
            AddressFamily::INET,
            SockType::Stream,
            SockFlag::empty(),
            Some(SockProtocol::Tcp),
        )
        .unwrap();

        bind(recv_sock.as_raw_fd(), &addr).unwrap();
        listen(&recv_sock, 1).unwrap();

        let res = std::thread::scope(|s| {
            s.spawn(|| {
                connect(send_sock.as_raw_fd(), &addr).unwrap();

                send(send_sock.as_raw_fd(), MSG, MsgFlags::empty()).unwrap();
            });

            let fd = accept(recv_sock.as_raw_fd()).unwrap();

            recvmsg(
                fd,
                &mut header,
                &mut [IoSliceMut::new(&mut buf)],
                CmsgEmpty::read(),
                MsgFlags::empty(),
            )
            .unwrap()
        });

        // This assertion will fail in the current implementation (untested)
        assert!(res.address().is_none());
    }
}
@Jan561 Jan561 changed the title recvfrom, recvmsg and recvmmsg might return unitialized addresses when using connection-mode sockets recvfrom, recvmsg and recvmmsg might return uninitialized addresses when using connection-mode sockets Nov 4, 2023
@Jan561
Copy link
Contributor Author

Jan561 commented Nov 4, 2023

This issue is related to #2054

@asomers
Copy link
Member

asomers commented Nov 4, 2023

Unless this can violate some other invariant of SockaddrIn, it isn't unsound. I assume you're referring to the unsoundness that results from reading uninitialized memory? In this case, though the kernel may not have written to the buffer, the compiler cannot possibly know that. So it must treat the memory as initialized. But we should probably add a check for msg_namelen after the syscall returns, however.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 5, 2023

I'm by no means an expert on this topic, but I think this meets the exact definition of "Undefined Behaviour" as the outcome of the family check is, well, undefined.

Take a look what socket2 is doing, they zero out the entire storage before using it for this exact reason: https://github.com/rust-lang/socket2/blob/master/src/sockaddr.rs#L129

My suggestion is to introduce a new trait:

pub unsafe trait SockaddrFromRaw {
    // the libc type
    type Storage;

    unsafe fn from_raw(
        addr: *const Self::Storage,
        len: libc::socklen_t,
    ) -> Self
    where
        Self: Sized;

    fn init_storage(buf: &mut MaybeUninit<Self::Storage>);
}

unsafe impl SockaddrFromRaw for Option<SockaddrIn> {
    type Storage = libc::sockaddr_in;

    unsafe fn from_raw(
        addr: *const Self::Storage,
        _: libc::socklen_t,
    ) -> Self {
        // SAFETY: `sa_family` has been initialized by `Self::init_storage` or by the syscall.
        unsafe {
            if addr_of!((*addr).sin_family).read() as libc::c_int != libc::AF_INET {
                return None;
            }
        }

        // Now we can assume that `addr` has been fully initialized.

        Some(SockaddrIn(ptr::read(addr)))
    }

    fn init_storage(buf: &mut MaybeUninit<Self::Storage>) {
        // The family of `Self` is `AF_INET`, so setting the family to `AF_UNSPEC` is sufficient.
        let ptr = buf.as_mut_ptr() as *mut libc::sockaddr;
        unsafe { addr_of_mut!((*ptr).sa_family).write(libc::AF_UNSPEC as _) }
    }
}

@asomers
Copy link
Member

asomers commented Nov 5, 2023

We shouldn't need to zero it first. As far as the compiler is concerned, that FFI call always initializes the return argument. We should just check the sin_family field, the sin_len field (on OSes that have one), and the msg_namelen field of the msghdr.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 6, 2023

I'm still not sure if I agree with you, especially if we consider more complex types like UnixAddr, that require

  • a valid path in sun_path
  • an NUL-byte at the end of the path.

For example, UnixAddr::path and UnixAddr::kind rely on these invariants of the struct for safety and violating these can lead to UB.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 6, 2023

Whether we want to check the lengths is an interesting question, too. I am honestly currently in favor of not checking it at all and blindly trusting the syscalls, because

  • as we already found out, the length of the netmask returned on apple systems is not the size of the struct
  • the length can even exceed the size of the struct, see here for an example
  • AFAIK the len-field on BSD-like systems is not part of the POSIX standard, meaning that the programmer must have the possibility to ignore it without running into UB (not 100% sure though)
  • the function is unsafe, we can trust the caller to pass sensible values

Instead, I would add a # Safety section for the len method for the SockaddrLike trait saying that the returned value must be used carefully in unsafe code.

@asomers
Copy link
Member

asomers commented Nov 6, 2023

I'm still not sure if I agree with you, especially if we consider more complex types like UnixAddr, that require

* a valid path in `sun_path`

* an NUL-byte at the end of the path.

For example, UnixAddr::path and UnixAddr::kind rely on these invariants of the struct for safety and violating these can lead to UB.

Sure, different implementation of SockaddrLike might need to check different invariants.

AFAIK the len-field on BSD-like systems is not part of the POSIX standard, meaning that the programmer must have the possibility to ignore it without running into UB (not 100% sure though)

It's not part of a standard, but it's guaranteed to be present on all of the BSDs. There's no possibility that it will simply vanish at runtime.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 7, 2023

Disregard my last comment about the lengths, these syscalls can return a length larger than our allocated storage, so a boundary check for the length is mandatory here. Sorry for the confusion.

I was already thinking about the cases where the address storage is managed by the syscall, like getifaddrs, but this is out of scope of this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants