Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New socket API #2199

Closed
wants to merge 1 commit into from
Closed

New socket API #2199

wants to merge 1 commit into from

Conversation

Jan561
Copy link
Contributor

@Jan561 Jan561 commented Nov 23, 2023

Still a soft WIP, but ready for a first review and open for suggestions. A changelog will be provided when we agreed on the changes.

This PR changes virtually everything related to socket.

Quick Overview:

  • Turned the enum AddressFamily to a struct with associated constants as enums were the wrong abstraction here
  • Made the naming of the socket address structs consistent (every name now ends with either *Address if sized or *Addr if unsized)
  • SockaddrStorage got renamed to Address, and turned it into a struct.
  • Variable sized addresses like UnixAddress, LinkAddress and Address now have dyn-sized variants named UnixAddr, LinkAddr and Addr
  • Removed the SockaddrLike trait
  • Every address implements AsRef<Addr>, which also replaces the SockaddrLike bound
  • Added RawAddress, which can be used when the system call manages the address memory. getifaddrs uses this new type.
  • Basically every system call signature got changed
  • Improved the documentation, while still not perfect. I tried a couple things before this iteration, so maybe some documentation could now be stale.

Incomplete list of issues and PRs affected by this PR:

Fixes #2177
Fixes #2176
Fixes #2054
Fixes #2030
Fixes #1990
Fixes #1939

Maybe Fixez #2031

Supersedes #2038
Supersedes #1992
Supersedes #1475

Checklist:

  • I have read CONTRIBUTING.md
  • I have written necessary tests and rustdoc comments
  • A change log has been added if this PR modifies nix's API

@Jan561 Jan561 force-pushed the socket_api_2 branch 3 times, most recently from 07316cd to 8835561 Compare November 23, 2023 14:30
@Jan561
Copy link
Contributor Author

Jan561 commented Nov 23, 2023

Okay, so I have added a test that performs equality checks on the netmasks returned by getifaddrs, and as expected, it fails on macos.

I see two ways how to proceed from here:

  • We define a variable sized *Addr variant for all addresses, and make the conversion methods in RawAddr return only *Addr types.
  • We enforce that fixed size addresses always have a length equal to the underlying libc type.
    • Let the conversion methods fail if the length is too small. MacOS users can still use RawAddr::as_ptr to access the netmask
    • Revert the changes with RawAddr, and make getifaddrs return something like Option<Address>, and reapply the workaround / hack

I actually prefer the first option. To me, it seems to be the most robust and future proof choice, and we don't need to make assumptions anymore on how the system calls are behaving exactly.

Edit: If we go with Option 1, then we'll likely need a new function Ipv4Address::new_netmask that truncates the sin_len to the length of the netmask on apple systems, and hope we do it correctly (Apple's documentation is literally garbage).

Copy link
Member

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way too big to review. There are thousands of lines of changes here.
Some of them are purely formatting changes, which should never be mixed with
content changes. And specifically:

  • The sendmsg changes should be broken out into a separate PR from the sockaddr changes.
  • Your name changes don't match with existing precedent. For example, std::net:: always uses "Addr" instead of "Address". And within std::net, anything that combines an IP address with a port has "Sock" in the name. We should maintain that precedent. This PR's Ipv4Address type sounds like it is equivalent to std::net::Ipv4Addr instead of std::net::SocketAddrV4.
  • Why is enum the wrong abstraction for AddressFamily?
  • Functions like getsockname now return a full sized sockaddr_storage unconditionally. That allocates a lot of extra memory that might not be needed. Previously, being generic, getsockname could be used with smaller sized types.
  • It seems to me that the main point of the PR is to replace SockaddrLike with AsRef<Addr>. Could you please explain the motivation for that?
  • The RawAddr type only wraps libc::sockaddr. That doesn't even have enough room for a sockaddr_in6. That means that RawAddr::to_ipv6 reads beyond the end of its reference. Instead, you should perform the cast from sockaddr to sockaddr_in6 while operating on raw pointers, before you turn anything into references. The fact that you had to add #[allow(unused)] to RawAddr::new is a clue that it's doing something wrong.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 23, 2023

The current implementation has many issues. The goal of these changes is to not only fix these issues, but to make it harder to make similar mistakes again.

This is way too big to review. There are thousands of lines of changes here.

I agree, and I expected this response. I thought that for discussions it is easier to have everything in one PR, so you can see the whole picture. We can use this PR just for discussion and eventually create smaller, atomic PRs on the changes we agree on.

Your name changes don't match with existing precedent. For example, std::net:: always uses "Addr" instead of "Address". And within std::net, anything that combines an IP address with a port has "Sock" in the name. We should maintain that precedent. This PR's Ipv4Address type sounds like it is equivalent to std::net::Ipv4Addr instead of std::net::SocketAddrV4.

I just thought that UnixAddress, Ipv4Address and Address is way prettier and easier to memorize than UnixAddr, SockaddrIn and SockadddrStorage.

I agree that the names could be misleading, as ip addresses live in layer 3, while the ports - coming from TCP or UDP - live in layer 4. Something that includes "Sock" would be more precise.

Why is enum the wrong abstraction for AddressFamily?

AddressFamily isn't a closed set, e.g. custom vendor-defined address families are currently impossible to represent. Also our implementation doesn't need (and likely won't ever need) exhaustive pattern matching on all family variants.

Functions like getsockname now return a full sized sockaddr_storage unconditionally. That allocates a lot of extra memory that might not be needed. Previously, being generic, getsockname could be used with smaller sized types.

That's a tradeoff between performance and ergonomics. I also have an implementation that keeps the address type as a generic, but I ultimately opted for this approach as I valued ergonomics higher. In the end, it is your choice to make, I can live with both.

I also took inspiration from socket2 which also just returns a wrapper around sockaddr_storage in all their functions.

It seems to me that the main point of the PR is to replace SockaddrLike with AsRef<Addr>. Could you please explain the motivation for that?

With the Addr type, SockaddrLike wasn't needed anymore, so I removed it.

The big advantage of Addr is that the length information is embedded into the pointer, so types like (the new) UnixAddress can be cheaply converted to this lower-level abstraction, without losing any information.

In the current implementation, we instead have the SockaddrStorage union that also has UnixAddr as a member. UnixAddr can have an additional length field on some platforms, which makes this, this and this necessary to make it work. At least to me, it feels very forced and unnatural to have to special case UnixAddr every time when downcasting. Also, what if one day Linus Torvalds decides that sockaddr_un should have the same size as sockaddr_storage? The length field wouldn't fit anymore.

The RawAddr type only wraps libc::sockaddr. That doesn't even have enough room for a sockaddr_in6. That means that RawAddr::to_ipv6 reads beyond the end of its reference. Instead, you should perform the cast from sockaddr to sockaddr_in6 while operating on raw pointers, before you turn anything into references. The fact that you had to add #[allow(unused)] to RawAddr::new is a clue that it's doing something wrong.

The safety documentation of the unsafe RawAddr::new demands that the provided argument must be castable to the more specific pointers.

The allow_unused is there because the only place where it is currently used is in ifaddrs.rs, which isn't supported by all platforms. I tried to design it with reusability in mind, so that if more system calls emerge that manage the address memory for us and don't provide us with a length, we can just use this type.

And yes, this is the API I'm the most skeptical about.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 23, 2023

My motivation for the unsized address types was initially because of freebsd's sockaddr_dl.

Take a look at this test. getifaddrs returns a link layer address with a length of 56, but the size of the struct is only 54.
Of course, the "true" length is way smaller, that's why I truncate the address in this test to only include header + nlen + alen + slen.

A user unaware of this, could run into the following issue, thinking this is totally safe, but actually accesses memory that is out of bounds of sockaddr_dl:

let link_addr = getifaddrs(...).xzy();
let cloned = link_addr.clone();
let cloned_ptr = cloned.as_ptr();

unsafe {
    println!("{}", *cloned_ptr.cast::<u8>().add(cloned.len() - 1));
}

Of course, we can also just add another workaround for this, next to the macOS workaround.

Hopefully this explains my motivations a bit better.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 24, 2023

Whether we want to use unsized types for addresses or not ultimately boils down to this "philosophical" question:

Do we want it to be a bug, if functions like getifaddrs behave unexpected? Then we need to apply workarounds for all these cases.

Or do we want it to be a missing feature? With my implementation, these things can be fixed by adding new functions to the types, like Ipv4Address::new_netmask in case of the macOS bug, or LinkAddr::true_len alongside with LinkAddr::to_owned_truncate in case of the freebsd bug.

I think both approaches are valid and have their own strengths, and even I don't know which is better, or if my provided arguments are strong enough to justify these breaking changes.

@asomers
Copy link
Member

asomers commented Nov 24, 2023

AddressFamily isn't a closed set, e.g. custom vendor-defined address families are currently impossible to represent. Also our implementation doesn't need (and likely won't ever need) exhaustive pattern matching on all family variants.

That makes sense. If users have a need for a custom value, then we should make it a struct.

That's a tradeoff between performance and ergonomics. I also have an implementation that keeps the address type as a generic, but I ultimately opted for this approach as I valued ergonomics higher. In the end, it is your choice to make, I can live with both.

A primary goal of Nix is zero-cost abstractions. We try to make the C APIs accessible to Rust but without imposing anything unnecessary. So let's keep these functions generic.

what if one day Linus Torvalds decides that sockaddr_un should have the same size as sockaddr_storage? The length field wouldn't fit anymore.

That won't happen, because it would break backwards-compatibility with every binary that uses the old sockaddr_un definition. And if Linux does introduce a sockaddr_un2, then in Nix we'll just have to exapand SockaddrStorage. That's fine.

The allow_unused is there because the only place where it is currently used is in ifaddrs.rs, which isn't supported by all platforms.

Oh, ok. I thought it was because the unsafe was unused.

A user unaware of this, could run into the following issue, thinking this is totally safe, but actually accesses memory that is out of bounds of sockaddr_dl:

Well, it's obviously not "totally safe" since it's unsafe ;) . I think this is a problem on the user's side. Pointer arithmetic is always risky.

Do we want it to be a bug, if functions like getifaddrs behave unexpected?

I think yes. Also, FreeBSD documents sockaddr_dl's sdl_data lenght as "minimum work area, can be larger". So if we have any examples where it actually overflows, maybe we should create a larger container type of our own.

@Jan561
Copy link
Contributor Author

Jan561 commented Nov 24, 2023

Sounds good to me, I can agree on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment