Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design for Rust object by-value passing to C++ and back #1185

Open
schreter opened this issue Feb 26, 2023 · 12 comments
Open

Design for Rust object by-value passing to C++ and back #1185

schreter opened this issue Feb 26, 2023 · 12 comments

Comments

@schreter
Copy link
Contributor

In our project which requires perfect C++/Rust integration (where we actually consume a Rust library in a large C++ project) we extended cxx crate to handle Rust by-value types, so we can safely give an instance of a Rust object to C++ (and back from C++ to Rust). Now we'd like to upstream the change, so everyone can profit.

The major issue is the unknown type layout on the C++ side. The parser in cxx::bridge simply cannot know the size and alignment of the data type and also the traits the type implements. Our solution is pretty simple:

  • Explicitly define the layout of the C++ type by specifying #[repr(layout(<size>, <alignment>))] on the extern "Rust" type, where the size and alignment are checked at compile-time that they are indeed at-least for the type (at-least because the bridge can target multiple platforms with different pointer sizes, one can then pick the maximum size/alignment for now).
  • Allow to #[derive(Copy)] and/or #[derive(Clone)] traits on the type (which are then checked at compile-time to ensure that the original object indeed has them). Deriving Default would be also easily possible.
  • Generate respective constructors, assignment operators and destructors as needed on the C++ side and appropriate bridge functions to call back into Rust.
  • Reserve aligned space in the C++ object for the type.

The data stored in the reserved space in the C++ object is then either T or Option<T> on the Rust side, depending on whether the Rust object implements Copy or not.

  • If Copy is implemented, then there is no issue whatsoever, since Rust's Copy objects can be copied/moved freely, so the data is just memcpy'ed as needed on the C++ side.
  • If Copy is not implemented, then the data type is Option<T> and drop, forget and optionally clone (for types implementing Clone) callbacks are generated.

The reasoning behind Option<T> is the following: C++ doesn't have a good notion of object ownership. If an object is moved to another location via move constructor or move assignment, the original object will still be destroyed by its destructor. Calling drop on this moved-out object would be fatal. Therefore, the object is represented by Option<T> and moving out of the object will call forget callback on it, which writes Option<T>::None pattern into it. Following destruction of the object via drop callback will still call drop, but on None pattern, which is a no-op, thus it's safe.

When passing objects by-value from Rust to C++, they are wrapped in Option<T>::Some. When returning them back to Rust, the Option is unwrapped, so even if someone tries to return back a moved-out object to Rust (which is UB), we'll detect it. Similarly, passing references or pointers to T from C++ to Rust will effectively entail passing references or pointers to Option<T>, which can be also checked for moved-out objects (which is still UB, but better to report it). We didn't implement that yet.

Another limitation in our implementation is also that the T and Option<T> are required to have the same size (i.e., Option<T> must use some niche or some invalid pattern to represent None). The reasoning is that this is typically anyway the case for all practical purposes (since we often want to pass handle types, which contain some Arc or the like) and it's fairly easy to add a member with a niche if needed. On the plus side, the binary representation/layout is then exactly same for T and Option<T>::Some, so passing references to C++ is also well-defined - simply pass the reference as-is.

There is one danger, though. Passing a mutable reference to C++ would allow moving out of the object on the C++ side. Again, this would be UB from our PoV, but C++ doesn't care. We can check this, however, after the C++ call returns. The binary pattern of the object passed by reference must not correspond to None pattern. With that, we can also catch UB for this (i.e., C++ side reinterpreting the mutable reference as an rvalue and moving out of the object). Similarly, if rvalue references would be allowed, then this can be implemented by using &mut Option<T> as a parameter on the Rust side, which would then correspond to rvalue reference on the C++ side. We didn't implement it, but it would be possible. This would also help addressing #561 trivially.

Other issue which could be addressed fairly easily would be #251. Maybe it would help a bit with issue #171 (by providing Option support).

Any comments/ideas on the aforementioned design?

As mentioned, we'd like to upstream the changes, which already exist, but since picking the right subset is not trivial, I'd like to clarify at least the minimal interface and minimal useful feature set.

Thanks.

@noufelez
Copy link

Any news on this ? This seems promising.

@schreter
Copy link
Contributor Author

Any news on this ? This seems promising.

Sorry, still waiting on @dtolnay to review the two PRs I posted regarding C++ exception handling (#1180) and type aliases (#1180). I cannot move forward on this until at least C++ exception handling is merged, since our implementation has dependencies and thus I can't publish further PRs for review.

Also, I'm not willing to put any more work towards this crate upfront, until I get feedback on the direction, both on exception handling on the one side, and on this design on the other side. Similarly, I won't publish a "fork" of this crate, since I don't want to create divergence - I strongly prefer we put everything together upstream.

@dtolnay If you are unable or unwilling to review these changes and designs, please designate someone else, who is able and willing to do so, so we can move forward here. I have several colleagues who are willing to review (and in fact, did the review already on our private fork). Unfortunately, currently, you seem to be the sole maintainer of this crate. If you'd be willing to allow others to contribute reviews and merge changes, it would be very welcome.

@cameron-martin
Copy link

This sounds interesting. Can you expand on what is generated on the C++ side? What do you mean by "Reserve aligned space in the C++ object for the type"?

@schreter
Copy link
Contributor Author

schreter commented Oct 1, 2023

This sounds interesting. Can you expand on what is generated on the C++ side? What do you mean by "Reserve aligned space in the C++ object for the type"?

On the C++ side, an opaque object with character array of the required size and alignment is generated, which can be moved/copied/destroyed as needed based on Copy/Clone/Drop traits in Rust (these are backed by callbacks into Rust, except for Copy objects, which are PODs).

Current requirement for non-Copy objects is sizeof::<Option<T>>() == sizeof::<T>() to differentiate between destroyed objects and live objects on the C++ side. I.e., when calling a method on an already-moved object from C++ (which is replaced by None value), we can detect this and panic in the Rust callback. Similarly, the destructor will see None value and do nothing. This is trivial to achieve - most objects already have some pointer, enum or bool inside, which is used as a niche for the None value on the Rust side, so there is nothing to do. And, if this is not the case, adding a dummy bool member will suffice.

@cameron-martin
Copy link

I'm not sure I understand the need for the sizeof::<Option<T>>() == sizeof::<T>() requirement. If we were to create an Option<T> before sending it to C++, and the size of the opaque type sizeof::<Option<T>>(), wouldn't it still work without this requirement? I assume that the current implementation relies on reinterpreting the bytes of a T as an Option<T>.

@schreter
Copy link
Contributor Author

schreter commented Oct 3, 2023

I assume that the current implementation relies on reinterpreting the bytes of a T as an Option.

Yes. That was much simpler to implement.

You need to take into consideration that we also need to support references to T when passing variables between Rust and C++. If the sizeof::<Option<T>>() == sizeof::<T>(), then you can safely assume that the reference to Option<T> is the same as reference to T, provided the Option is inhabited.

Passing opaque objects as-is is not sufficient due to C++ move semantics. A destructor is namely called on moved objects in C++ as well (i.e., you get two destructor calls). So the only way to get this right was to somehow mark the object as moved. This is done by using Option on the Rust side (well, the APIs don't use Option, just the generated code).

Similarly, it's not sufficient to mark the object by extending it by a flag on the C++ side, since we can also pass a reference to a Rust object to C++, which doesn't have this flag. So the size on the C++ side must be the same. Again, Option with niche optimizations comes to rescue here.

@cameron-martin
Copy link

cameron-martin commented Oct 3, 2023

I understand the need to use an Option<T> to mark the object as moved, but I'm concerned that there's no guarantee that the layout of T and Option<T> are the same, even if the sizes are the same. The guarantees given in the docs seem to be much weaker.

I'm wondering why you can't first move the T into an Option<T> before sending it to C++. It seems like this would remove the need for the size constraint, and also avoid any possible layout differences.

@cameron-martin
Copy link

This doc however does give a slightly stronger guarantee:

Option-like enums where the payload defines at least one niche value are guaranteed to be represented using the same memory layout as their payload.

@schreter
Copy link
Contributor Author

schreter commented Oct 4, 2023

I'm wondering why you can't first move the T into an Option<T> before sending it to C++. It seems like this would remove the need for the size constraint, and also avoid any possible layout differences.

That is possible.

However, then, you'd need to have two distinct types on the C++ side - one for the opaque objects sent to C++ by value (which are wrapped in Option) and one for a reference to the opaque type (which is pure T reference).

So the easiest way was to ensure they have the same size.

Regarding layout guarantees, it is pretty much guaranteed, since an Option<T> can give you the reference to T. So the layout of T must be stored in the Option unchanged, otherwise you couldn't get this reference. And, if the sizes match, then obviously the only possibility is that the reference to T is at exactly the same address as Option<T> (i.e., nothing is shifted).

@cameron-martin
Copy link

This seems reasonable (to both points).

@HKalbasi
Copy link

HKalbasi commented Oct 5, 2023

@schreter I still don't get why you need to do the Option<T> thing. I'm curious since I'm working on a different C++/Rust interop tool named Zngur which faces the same problem and I want to make sure I didn't overlook something. In Zngur, I just added a bool drop_flag field to non Copy classes, and set it to false when it is uninitialized or moved out.

For supporting references you just need to ensure that the backing data field is stored at the offset 0 of the class, which will happen if you put the data as the first field (Zngur doesn't do that, for some reasons it uses a dedicated Ref<T> instead of T&, and that Ref class knows about the internals of each Rust type) and by doing that each C++ reference to that type will become a valid Rust reference.

@schreter
Copy link
Contributor Author

schreter commented Oct 5, 2023

@HKalbasi Interesting idea with forcing all C++ references to use a Ref wrapper. Especially, I like the idea of compile-time checks on C++ side to prevent wrong usage (at least partially). With that, it's probably possible to add a bool somewhere.

However, the Option way I chose is also interesting, since it doesn't cause any memory overhead. Maybe one can do the best of both worlds - use Option-based uninhabited checks, if the sizes match, else use the extra field (OTOH, that boils down to Option as well, but a bit bigger).

I think I'll take a look at your crate. Since we are now completely revamping our C++ API wrapping the Rust code anyway (we expose a Rust component to C++), maybe it's a better option to use an actively developed crate instead of the one where the maintainer doesn't react.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants