Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bind-multiple is hard to imlement on smoltcp #104

Open
chrysn opened this issue Feb 6, 2024 · 4 comments
Open

bind-multiple is hard to imlement on smoltcp #104

chrysn opened this issue Feb 6, 2024 · 4 comments

Comments

@chrysn
Copy link
Contributor

chrysn commented Feb 6, 2024

I've split this out of #103 as reported by @ivmarkov to keep threads disentangled:

Supporting the bind_multiple contract on top of smoltcp seems currently impossible, or in the best case would require terrible hacks, as smoltcp does seem to assign a source (local) address very late - only when dispatching the UDP packet. The UDP packets in the ring buffer queue however do not have any notion of "source address" before that.

So it seems sending with an arbitrary local address would require us to a) wait for the UDP socket ring buffer to become empty (so that we are sure all other pending packets are sent with the "previous" local addr) b) unbind the UDP socket c) rebind the socket with the local address of the packet we want to send (NOTE: this way however we would be missing packets that we want to receive, as the receive_from part will now work with the local address that we want to use for sending the packet d) unbind the socket once the packet was sent e) rebind the socket again with the local address as specified in bind_multiple

What's also weird is that there is quite a bit of asymmetry wrt UDP sockets between embedded-nal and embedded-nal-async. embedded-nal is pretty simplistic, while embedded-nal-async seems to enforce an API which is too close to the BSD sockets one, and is not necessarily a good fit for smoltcp. (Not that implementing bind_multiple is particularly easy with STD either.)

I'm unsure how to proceed, honestly.

  • One option would be to just not support bind_multiple on top of smoltcp by having the impl always return an error.

  • Another would be to remove it and/or make it optional in embedded-nal-async.

In any case, not having an impl of the UDP socket traits for smoltcp is an odd situation to be in, given that smoltcp is the one stack used in embedded...

@chrysn @lulf @Dirbaio @MabezDev Sorry for pinging. Happy to listen to your feedback / comments / guidance.

Did I miss the elephant in the room?

@chrysn
Copy link
Contributor Author

chrysn commented Feb 6, 2024

tl;dr: Does smoltcp-rs/smoltcp#904 fix this for you?

To start this off with some background on where this is coming from: Both QUIC and CoAP, when run as servers, need to know where the incoming packet was sent to, and reply from that very address. It is a frequent source of CoAP interoperability troubles if servers just send from any address (worst because it works most of the time, but when the device gets an extra IP address, suddenly it stops working). Also, this is highly important for distinguishing unicast from multicast messages. This doesn't really come from BSD sockets -- on the contrary, it is hard to do on BSD sockets because they need to go through recvmsg, set the PKTINFO socket option and hope to get the platform specific format of the ancillary data item right when using IP versions smaller than 6. In contrast, the embedded-nal sockets are more oriented toward basic BSD sockets (without recvmsg), and thus are missing crucial functionality for implementing UDP based protocols that are more complex than an Echo server.

On the long run, I think that the sync NAL should migrate to the async NAL's concepts, especially if and when we've ironed out any warts such as this (which I hope is resolved with the PR) or #103.

On the smoltcp side, there is a PR that just fixes this: smoltcp-rs/smoltcp#904. Could you have a look at it to see whether it resolves the problem? An example of how it is used (to implement the bind_multiple_server, even, although through embassy-nal) is available at embassy-rs/embassy#2519

@ivmarkov
Copy link

ivmarkov commented Feb 6, 2024

tl;dr: Does smoltcp-rs/smoltcp#904 fix this for you?

Yep, seems to address precisely the issue at hand. Sorry for not looking into what is open w.r.t. PRs in smoltcp. :)

Also, this is highly important for distinguishing unicast from multicast messages.

Interesting that you mentioned multicast. Did you really mean multicast or broadcast? Because while I can imagine that (a) having access to the 'local' (= destination) IP address in UnconnectedUdp::receive_into would allow the app code to figure out whether this is a broadcast or a multicast message (the former being important in the context of - say - DHCP servers), lack of full broadcast/multicast support is another feature I'm missing:

  • std-embedded-nal-async can't really receive broadcast messages, as it is missing a set_broadcast(true) call
  • As for multicast messages, we are (a) lacking a trait similar to this and (b) we need to map it of course to both STD and smoltcp where the latter in the meantime does support ipv4 and ipv6 multicast as well.

This doesn't really come from BSD sockets -- on the contrary, it is hard to do on BSD sockets because they need to go through recvmsg, set the PKTINFO socket option and hope to get the platform specific format of the ancillary data item right when using IP versions smaller than 6. In contrast, the embedded-nal sockets are more oriented toward basic BSD sockets (without recvmsg), and thus are missing crucial functionality for implementing UDP based protocols that are more complex than an Echo server.

On the long run, I think that the sync NAL should migrate to the async NAL's concepts, especially if and when we've ironed out any warts such as this (which I hope is resolved with the PR) or #103.

Agreed.

On the smoltcp side, there is a PR that just fixes this: smoltcp-rs/smoltcp#904. Could you have a look at it to see whether it resolves the problem? An example of how it is used (to implement the bind_multiple_server, even, although through embassy-nal) is available at embassy-rs/embassy#2519

Let me look into this deeper, but as per above - seems to be spot on.

@chrysn
Copy link
Contributor Author

chrysn commented Feb 6, 2024

I am talking of multicast -- broadcast only really exists on IPv4 and I barely touch that any more. IPv4 is hard there, because the application needs extra information (from the network stack) to determine whether an address is a broadcast address or not. With IPv6, cases are clear.

Setting up the multicast groups is out of scope for embedded-nal, at least with the current interfaces, but a library could be documented as

To start a CoAP server, run CoAPServer::new(sock) for a socket that has been bound to the addresses you want the server to run on (the default port is 5683). If you want it to process multicast requests, join the ff0x::fd group, and any other groups you want to participate in.

I'm curious about the traits you are proposing; they are a welcome addition as they allow the application to do things more automatically, but they are not essential. (Also they seem like the right pool into which I should merge my embedded-nal-tcpextensions :-) ).

@ivmarkov
Copy link

ivmarkov commented Feb 6, 2024

I am talking of multicast -- broadcast only really exists on IPv4 and I barely touch that any more. IPv4 is hard there, because the application needs extra information (from the network stack) to determine whether an address is a broadcast address or not. With IPv6, cases are clear.

Setting up the multicast groups is out of scope for embedded-nal, at least with the current interfaces, but a library could be documented as

To start a CoAP server, run CoAPServer::new(sock) for a socket that has been bound to the addresses you want the server to run on (the default port is 5683). If you want it to process multicast requests, join the ff0x::fd group, and any other groups you want to participate in.

I'm curious about the traits you are proposing; they are a welcome addition as they allow the application to do things more automatically, but they are not essential. (Also they seem like the right pool into which I should merge my embedded-nal-tcpextensions :-) ).

Perhaps I can agree for Multicast. It would be difficult for me to however swallow that we cannot have

  • TCP and UDP connection splitting (see the other thread for usecases)
  • Some sort of TCP serverside "listen / bind" traits. If we don't have these, you cannot have an HTTP server which is as simple to run as this and you have to offload to the user all of the HTTP server connection management complexity

If - you know - our answer for these use cases is - "you need to do it outside the embedded-nal-async traits", we should probably retire UdpStack and TcpConnect in the first place. But if we do this, I would question the value of what's left in embedded-nal-async as well. I mean, it is really Read + Write for TCP sockets and connected UDP - and these are not even part of embedded-nal-async but live in embedded-io-async. So what's left for embedded-nal-async is some sort of simplistic ReadAndReturnSockAddrs + WriteWithSockAddrs pair of traits that cover the "unconnected" UDP case...

I guess what I'm saying is that - at least for me - the perceived value of embedded-nal-async was always as much as - or if not more - in the socket factories' traits, as it is in the actual socket traits. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants