Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid bit fields in protocol message definitions #768

Open
mnissler-rivos opened this issue Aug 23, 2023 · 7 comments
Open

Avoid bit fields in protocol message definitions #768

mnissler-rivos opened this issue Aug 23, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@mnissler-rivos
Copy link
Contributor

The vfio_user_header.flags field currently uses bit fields to specify various sub-fields:

struct vfio_user_header {
uint16_t msg_id;
uint16_t cmd;
uint32_t msg_size;
struct {
uint32_t type : 4;
#define VFIO_USER_F_TYPE_COMMAND 0
#define VFIO_USER_F_TYPE_REPLY 1
uint32_t no_reply : 1;
uint32_t error : 1;
uint32_t resvd : 26;
} flags;
uint32_t error_no;
} __attribute__((packed));

Bit field layout is up to the compiler, and thus may differ depending on machine byte order, optimization settings, etc.

I observed this as problematic when testing on a big-endian host. All VFIO-user protocol fields are in big-endian byte order then (as expected per #615 IIUC), but the flags subcomponents weren't compiled in reverse order, thus ending up in the wrong bit positions.

@jlevon
Copy link
Collaborator

jlevon commented Aug 23, 2023

it's host-endian, not big-endian: that is, we don't do, or attempt to do, any handling of different endianness at all. We certainly wouldn't be against supporting this, but not something we're planning to work on ourselves.

@mnissler-rivos
Copy link
Contributor Author

To clarify my environment: I ran a qemu vfio-user client against a qemu vfio-user server on a big-endian host. So, both client and server are in big-endian byte order and hence expected to interoperate.

However, I found that server and client disagree on the bit field layout in the header flags field. The qemu client would treat flags as a single uint32_t value in host byte order and so the flags field of a reply message would be 0x00 0x00 0x00 0x01 in memory (which I consider correct per the spec, considering that the flags field is supposed to be in host byte order, which is big endian). However, libvfio-user's bit field places the type bits at other end of the uint32_t, i.e. a flags field of a reply message is encoded as 0x10 0x00 0x00 0x00. Unsurprisingly, I didn't even get past version negotiation ;-)

Then I researched what the net has to say about bit fields and byte order, and found that the standards leave this implementation defined. In practice, compilers are forced to implement whatever widespread conventions there are (e.g. Linux does https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/ip.h?h=03702d4d29be4e2510ec80b248dbbde4e57030d9#n88 for example). At the end of the day, it's IMO prudent to accept that bit field encoding is a mine field and rather rely on shifting and masking to pack/unpack the flags field.

@jlevon
Copy link
Collaborator

jlevon commented Aug 23, 2023

I see, because the client doesn't use bitfields, we get bitten by the different on big endian systems.
Makes sense.

@tmakatos tmakatos added the bug Something isn't working label Aug 28, 2023
@tmakatos
Copy link
Member

@mnissler-rivos are you planning to submit a patch?

@mnissler-rivos
Copy link
Contributor Author

It's not a priority for me given that none of my stuff is running on big-endian host systems (I found this after checking on big-ending when checking qemu behavior in context of a code review there). If you're all OK with replacing the bitfield with #defines, I can put together a quick MR and test it later this week.

@tmakatos
Copy link
Member

I'm not sure there's any other way around this other than using #defines, so I'd take such a PR, @jlevon ?

@jlevon
Copy link
Collaborator

jlevon commented Aug 29, 2023

yup for sure thanks

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Aug 29, 2023
It turns out that the bit field will not yield the desired / specified
bit layout on big-endian systems, see issue nutanix#768 for details. Thus,
replace the bit field with constants for the individual fields and use
bit masking when accessing the flags field.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
tmakatos pushed a commit to mnissler-rivos/libvfio-user that referenced this issue Aug 30, 2023
It turns out that the bit field will not yield the desired / specified
bit layout on big-endian systems, see issue nutanix#768 for details. Thus,
replace the bit field with constants for the individual fields and use
bit masking when accessing the flags field.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
tmakatos pushed a commit that referenced this issue Aug 30, 2023
It turns out that the bit field will not yield the desired / specified
bit layout on big-endian systems, see issue #768 for details. Thus,
replace the bit field with constants for the individual fields and use
bit masking when accessing the flags field.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants