Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: explore using google.golang.org/protobuf/reflect/protoreflect descriptors instead of custom descriptors #259

Open
lfolger opened this issue Mar 19, 2024 · 4 comments

Comments

@lfolger
Copy link

lfolger commented Mar 19, 2024

In #258 I made a proof of concept for using google.golang.org/protobuf/reflect/protoreflect descriptors instead of the the descriptors defined in linker/descriptors.go.
google.golang.org/protobuf/reflect/protodesc offers an API to turn file descriptor protos into protobuf/protoreflect descriptors. To be able to use this API on the descriptors that are generated by the parser the well-known options need to be resolved.

This means the current execution needs to change from:

  1. parsing
  2. linking (done by code in linker/)
  3. resolving options

to

  1. parsing
  2. resolving well-known options
  3. linking (done by reflect/protodesc)
  4. resolving not well-known options

In my experiments this was all that was needed but I don't know how good the test coverage of my experiments was.

Another issue might be that the reflect/protodesc validation is stricter than the linker/descriptors.go validation and thus the API would become more restrictive (backwards incompatible change).

I'm not saying this is the only (or even best) way forward but might be an option to consider to reduce the maintanence cost because protocompile would not need to maintain its own descriptor implementation.

@jhump
Copy link
Member

jhump commented Mar 19, 2024

@lfolger, thanks for filing this!

I had in fact considered that alternate proposed flow when originally implementing this. There's a little more context in this question that I filed in the protobuf-go repo at the time.

The main issue will be with supporting the user providing a custom/modified/etc version of google/protobuf/descriptor.proto. This mainly complicated by the new features option everywhere, for editions, which is a message. So we'd need a dynamic message for google.protobuf.FeatureSet, to interpret non-custom options that include features, which poses a boot-strapping issue: where do we get the descriptor for that when we are still in the middle of compiling the source for it.

There's also the performance issue of having to create the descriptors 2x: once before we interpret custom options, and then again after custom options are interpreted and source code info is generated. Or we could continue to have a custom implementation of the interfaces in this repo, but have them be much leaner, where they only override the Options() methods (to get the most recent options message, with custom options interpreted) and the SourceLocations() method on FileDescriptor.

More exploration is needed to figure out how to remove these custom implementations w/out loss of functionality and (maybe more importantly) without a regression in performance in terms of both throughput and memory usage.

Another issue might be that the reflect/protodesc validation is stricter than the linker/descriptors.go validation and thus the API would become more restrictive (backwards incompatible change).

This package intends to perform all of the same validation that protoc does. I do happen to know of at least one case where the reflect/protodesc package is a little more strict than protoc, and it's with messages using the message set wire format. But, frankly, no one should be using that, so it's certainly not a problem in practice. In fact, I actually made protocompile enforce the same rules as reflectprotodesc, so protocompile is also a little bit more strict than protoc when it comes to that. (If you're curious, the extra check that protocompile and reflect/protodesc perform is to require that a message with message set wire format has at least one extension range; protoc doesn't care and will allow a completely empty message to have message set wire format.)

@lfolger
Copy link
Author

lfolger commented Mar 20, 2024

The main issue will be with supporting the user providing a custom/modified/etc version of google/protobuf/descriptor.proto.

Does this mean you are allowing users to redefine the descriptor proto includings feature defaults?
This seems to be a problem for the protobuf implementation as many of the features and their defaults are baked into the binaries and are not resolved at compile/runtime.

In general it seems problematic if the .proto version of the descriptor.proto is not in sync with the generated .pb.go file that is baked into the binary (especially if the .proto file is newer) but I might be missing something here.

@jhump
Copy link
Member

jhump commented Mar 20, 2024

Does this mean you are allowing users to redefine the descriptor proto includings feature defaults?

Yes. protoc supports this as well. In particular, it is useful for pinning a prior or future version of descriptor.proto -- by future I mean a version of descriptor.proto from a newer version of protoc than what is "baked" into the compiler.

This seems to be a problem for the protobuf implementation as many of the features and their defaults are baked into the binaries and are not resolved at compile/runtime.

For the generated code, that is always the case. But this is a compiler. It does not necessarily deal only with statically generated code.

In general it seems problematic if the .proto version of the descriptor.proto is not in sync with the generated .pb.go file

FWIW, that happens all the time in practice as protobuf-go and protoc are not atomically released together. But also, this feature of the compiler can even be necessary to achieve the version matching you're talking about: if the compiler binary contains a different embedded version of descriptor.proto than the runtime library and generated code will expect, we have be able to provide the right version.

@lfolger
Copy link
Author

lfolger commented Mar 20, 2024

Thanks for clarifying. I think I misunderstood what you said initially.

It seems this is limited to released version of the descriptor.proto not arbitrary modifications to the descriptor proto. And for the most part it also means (assuming you are using an up-to-date version) that go protobuf understands all of these version because as it regularly updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants