Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from rust_protobuf to prost for protobuf generation #668

Closed
tiziano88 opened this issue Mar 4, 2020 · 34 comments · Fixed by #829
Closed

Switch from rust_protobuf to prost for protobuf generation #668

tiziano88 opened this issue Mar 4, 2020 · 34 comments · Fixed by #829
Assignees
Projects

Comments

@tiziano88
Copy link
Collaborator

ref:

We already have a few PRs that switched parts of the source tree to use prost; if we prefer that, we should go ahead and switch to it fully.

pros:

  • no_std compatible (almost)
  • more idiomatic and compact generated code

cons:

@tiziano88 tiziano88 added this to To do in planning via automation Mar 4, 2020
@tiziano88
Copy link
Collaborator Author

@blaxill assigning to you since you are already working on it with #683

@blaxill
Copy link
Contributor

blaxill commented Mar 18, 2020

@tiziano88 rust_protobuf is no longer used for the core components, but is still used in the examples through sdk/rust/oak_utils. Do you want to include changing the examples in this issue?

@tiziano88
Copy link
Collaborator Author

Good question, now that we don't have a hard requirement to be no_std compatible in the short term, perhaps we should reconsider this decision based on other factors, for instance which API and code generator we prefer. Given we already have gRPC code generation for rust_protobuf, perhaps we should keep that and not worry about prost?

@daviddrysdale @conradgrobler @rbehjati WDYT?

@conradgrobler
Copy link
Collaborator

I personally prefer Prost, but that could just be because i am more familiar with it.

Also, Prost is used by Tonic, so if we plan on using that in future as a gRPC server it might be more natural to use Prost. But, if we are just going to proxy the request body bytes to the nodes, then it probably doesn't matter and the nodes can decide how they interpret it.

@rbehjati
Copy link
Contributor

I think it is not quite maintainable to have both. I already ran into some problems with conflicting blanket implementations, when trying to use Prost for LogMessage. So, I think, eventually we should either change the sdk and the examples to use Prost, or change the core components to use rust_protobuf. (I am assuming that the sdk and the core components are not completely disjoint, and may want to share some code in the future.)

Having said that, I don't really have a preference. Although with my very limited experience, I found Prost to be a bit neater (I did not look into code generation for gRPC).

@tiziano88
Copy link
Collaborator Author

@wildarch as part of your experiments, did you figure out whether it is possible to rely on FieldOptions from prost? I thought it would not be possible because they are usually represented as extensions, which prost does not support, but now I looked at the generated files that prost uses, and it seems perhaps they just stash the options into a regular field:

https://github.com/danburkert/prost/blob/6f3c60f136be096194a3c6e0e77e25a9e1356669/prost-types/src/protobuf.rs#L564-L565

Could you try to define a custom option, assign it to a field, and see whether during code generation time (i.e. as part of running prost itself), the option is visible? If so, I think that means that we should be able to rely on extensions to mark fields as handles, if we wanted to, though of course that would require patching / forking prost.

@wildarch
Copy link
Contributor

I am actually in the process of doing exactly this! For now I got stuck at the situation described in https://github.com/danburkert/prost/issues/256, where for some weird reason the uninterpreted_option field remains empty. Still debugging that issue, but I think once that is fixed we can use Prost!

@tiziano88
Copy link
Collaborator Author

@wildarch did you try to dump the incoming message that protoc itself sends to the prost plugin?

@wildarch
Copy link
Contributor

I did, and unfortunately it doesn't contain the custom option.

@tiziano88
Copy link
Collaborator Author

Do you have the code you used to test that? I wonder if we could try something similar using rust-protobuf and see if there is any difference at that level already.

@wildarch
Copy link
Contributor

I have put the code up at https://github.com/wildarch/handle_extract.

It contains a dump of the file descriptor set that prost-build uses to generate rust types: https://github.com/wildarch/handle_extract/blob/master/descriptor_set.textproto. Note line https://github.com/wildarch/handle_extract/blob/master/descriptor_set.textproto#L10763 specifically, which shows an options message that is set, but does not contain any elements.
I have yet to contrast this with how rust-protobuf does this.

@tiziano88
Copy link
Collaborator Author

What if you set a standard option on that field, e.g. deprecated = true?

@wildarch
Copy link
Contributor

It sets deprecated = true inside that options field, but the custom option still isn't visible.

@tiziano88
Copy link
Collaborator Author

Note that UninterpreterOption is only for things that the parser does not recognise:

https://github.com/protocolbuffers/protobuf/blob/dec4939439d9ca2adf2bb14edccf876c2587faf2/src/google/protobuf/descriptor.proto#L697-L702

In your case, I believe the parser does recognise it.

@tiziano88
Copy link
Collaborator Author

I wonder if the issue is just converting the binary proto file to textproto. Have you tried inspecting the binary file directly from python? It may be that extensions are dropped when printing out (which would make sense to me)

@wildarch
Copy link
Contributor

I have tried to check len(fds.file[-1].message_type[0].field[-1].options.Extensions), which returns 0. deprecated is a regular field in the FieldOptions type, which may be why it is treated differently.

@blaxill
Copy link
Contributor

blaxill commented Mar 26, 2020

@tiziano88 this discussion is about the lack of extension support in prost, is that correct? Do you see oak using extensions in the foreseeable future?

@wildarch
Copy link
Contributor

We would need it for #673 to annotate handle types.

@tiziano88
Copy link
Collaborator Author

@wildarch it's still worth trying to dump the proto descriptor from within a plugin, rather than just from protoc. I am pretty sure I used field extensions from other plugins I wrote in different languages, and the extensions were there.

@wildarch
Copy link
Contributor

Will do! My suspicion is that somehow plugins do get these extensions, but the dumped FileDescriptorSet does not include them. I'll see what happens when using a plugin.

@wildarch
Copy link
Contributor

wildarch commented Mar 26, 2020

So it turns out that in order to get the python script to work (both the dumper and the plugin) you have to first compile the proto file that contains the extension for python, import the generated module in the script that parses the proto files, and then the extension magically appears. I'm not quite sure yet how this will work for our setup, but at least now we know it is doable to parse these extensions and use them to generate code.

@tiziano88
Copy link
Collaborator Author

I don't think this experiment is particularly conclusive for us :) if anything it seems bad news to me, because it means that custom options are indeed serialized as extensions, and we know that prost does not support extensions, so there is probably no way to get access to them from the prost plugin implementation. Or did I miss something?

@wildarch
Copy link
Contributor

Yeah it is definitely bad news for moving to prost. I think there is still a way to save this though: We could patch the protobuf compiler descriptor proto and make the handle type a regular field, which is binary compatible with declaring it as an extension. This would require a bit of hacking on the prost-build crate, but everything else could stay nice and straightforward (we would still have a handle type extension that clients use normally). If this works well we could even upstream changes to prost to allow people to supply their own compiler protos, potentially.

@tiziano88
Copy link
Collaborator Author

@wildarch this is the Rust type that we would like to generate for handle fields:

#[derive(Debug, Copy, Clone, PartialEq, Serialize, Deserialize)]
pub struct Sender<T: Encodable> {
pub handle: WriteHandle,
phantom: std::marker::PhantomData<T>,
}

We can assume that T is another protobuf message, which will be specified (as string?) as part of the field option. The direction of the handle (send or receive) will also be part of the field option.

We still need to figure out how to extract handles into a separate slice, and also what to do about repeated fields (or we can just make it work for non-repeated fields to start with).

@wildarch
Copy link
Contributor

wildarch commented Apr 3, 2020

#673 is no longer a blocker for switching to Prost, I have a POC that is fully based on Prost.

@wildarch
Copy link
Contributor

wildarch commented Apr 3, 2020

Do we have a tracker somewhere to see which parts of the codebase still need to be migrated?

@blaxill
Copy link
Contributor

blaxill commented Apr 3, 2020

@wildarch no tracker, everything above the Wasm abi is still rust_protobuf I believe (i.e. sdk and examples).

@daviddrysdale
Copy link
Contributor

daviddrysdale commented Apr 8, 2020

@tiziano88, @wildarch: While this is still underway, would it make sense to shift the codebase to use a single consistent protobuf library1 in the meanwhile?

I ask because this blocks #764 which blocks #801 (since #772 was merged)…


1: Which I guess would have to be rust_protobuf, given the need for the Node-specific generated code.

@tiziano88
Copy link
Collaborator Author

I agree, but let me take a quick look to see how much work it would be to just rewrite the gRPC generation logic for prost first.

@tiziano88
Copy link
Collaborator Author

BTW @daviddrysdale is there a preference for either rust-protobuf or prost from the point of view of Bazel integration?

@daviddrysdale
Copy link
Contributor

No, I don't think the tool choice should matter for Bazel integration.

@wildarch
Copy link
Contributor

wildarch commented Apr 8, 2020

@daviddrysdale I don't know if you were planning on using rules_rust for this, but it seems their proto rules use rust-protobuf. That said, Prost was designed to fit in with Bazel-like build systems so it shouldn't be too hard to make things work.

@conradgrobler
Copy link
Collaborator

@tiziano88 I don't know whether you have seen this, but Prost build provides a standard way for adding service code generation logic to proto compilation: https://docs.rs/prost-build/0.6.1/prost_build/trait.ServiceGenerator.html

@tiziano88 tiziano88 assigned tiziano88 and unassigned blaxill Apr 8, 2020
@tiziano88 tiziano88 moved this from To do to In progress in planning Apr 8, 2020
@tiziano88
Copy link
Collaborator Author

Thanks @conradgrobler , yes that's exactly what I am planning to use.

tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 8, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 8, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 9, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 9, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 9, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
tiziano88 added a commit to tiziano88/oak that referenced this issue Apr 9, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes project-oak#668
planning automation moved this from In progress to Done Apr 9, 2020
tiziano88 added a commit that referenced this issue Apr 9, 2020
Main differences:

- use `::default` instead of `::new` to create instances
- use native field accessors
- use native Option and Vec types for optional and repeated fields
- dispatcher struct has a more qualified name
- gRPC generator code is encapsulated in a prost ServiceGenerator
- use `quote!` macros instead of string concatenation in the generator

Fixes #668
tiziano88 pushed a commit that referenced this issue Apr 14, 2020
The documentation was still mentioning the old rust-protobuf based code
generation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
planning
  
Done
Development

Successfully merging a pull request may close this issue.

6 participants