New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
editions: map fields and message_encoding and field_presence features #16549
Comments
Hey Josh, thanks for reporting this! Our intention in edition 2023 was to basically punt on map fields and preserve whatever the behavior in proto2/proto3 was as best as we could. Since group maps weren't possible in proto2, we intended to ban tag-delimited maps. I think it's a an oversight that they can inherit this feature from the file, but maybe we should just open this up and remove that special-case instead of ignoring it. For presence I think I agree that this is a proto2 "bug" in |
Map fields should remain length-prefixed for now, even if DELIMITED is inherited. Field presence will remain unchanged, but unit-tests are added to make sure proto2/proto3 behaviors stay consistent. Closes #16549 PiperOrigin-RevId: 626462967
Map fields should remain length-prefixed for now, even if DELIMITED is inherited. Field presence will remain unchanged, but unit-tests are added to make sure proto2/proto3 behaviors stay consistent. Closes #16549 PiperOrigin-RevId: 626462967
Map fields should remain length-prefixed for now, even if DELIMITED is inherited. Field presence will remain unchanged, but unit-tests are added to make sure proto2/proto3 behaviors stay consistent. Closes #16549 PiperOrigin-RevId: 626462967
@mkruskal-google, FWIW, I think we found another bug related to inheritance of features from file defaults: #16664 |
Map fields should remain length-prefixed for now, even if DELIMITED is inherited. Field presence will remain unchanged, but unit-tests are added to make sure proto2/proto3 behaviors stay consistent. Closes #16549 PiperOrigin-RevId: 626462967
To follow up on this, I think we're going to leave the |
@mkruskal-google, thanks for the update! We'll follow suit in protobuf-es. cc @timostamm |
Map fields should remain length-prefixed for now, even if DELIMITED is inherited. Field presence will remain unchanged, but unit-tests are added to make sure proto2/proto3 behaviors stay consistent. Closes protocolbuffers#16549 PiperOrigin-RevId: 630191163
This is almost more of a question than a bug, but it could indicate a bug.
Currently (on main branch and in latest v26.1 release), a file that uses editions is not allowed to set the
message_encoding
feature on a map field, regardless of the value:Under the hood, the descriptor for this field looks like a repeated message field, but the check that reports that error message has an explicit exception for maps.
Despite this limitation, if I set
DELIMITED
as the file default using a file option, then thetype()
accessor for the map field reportsTYPE_GROUP
instead ofTYPE_MESSAGE
. And if the value of the map is a message type, that synthesized value field in the map entry message also reportsTYPE_GROUP
instead ofTYPE_MESSAGE
.And the actual serialization/deserialization respects the
type()
accessor, using "start group" and "end group" wire type tags when serializing the map data... or at leastprotoc --encode ...
does this (which could be a bug in the dynamic message implementation if it is not supposed to).So the first question is: Is it intended that map fields can be serialized using delimited message encoding? If so, why does the compiler prevent one from setting that feature on a map field? If not, I take it the behavior observed with
protoc --encode ...
is a bug?The next thing to bring up is related, but involves the
has_presence()
method of a field descriptor.The compiler also prevents a user from setting the
field_presence
option on any repeated field, and that includes maps. The actual map fields, because they are repeated, always report false fromhas_presence()
. However, the synthetic fields of the map entry message report whatever the file default is. So they report true of "has_presence()in "proto2" files as well as in editions file, as long as the default presence has not been overridden to be
IMPLICIT`. (The exception being a map value field always reports true, even in proto3 or when the default presence is overridden in editions, simply because it is a map field.)Also looking at
protoc --encode ...
andprotoc --decode ...
, the presence of a map key or value in fact seems to always be false. Regardless of whether you specify a key or value in input, you seem to always get a value. This is easy to see by encoding the following text format and then decoding it: the decode result always shows the keys and values set, even if they were absent in the input.// working with same Foo message defined in example source above m{ key: "abc" value: "xyz" } m{ value: "123" } m{ key: "ABC" }
// decode output adds missing keys and values m { key: "" value: "123" } m { key: "ABC" value: "" } m { key: "abc" value: "xyz" }
This could be a side effect of the dynamic message implementation and how it represents map fields, but I've observed the same behavior in the Go protobuf runtime, not just in the text format but also in the binary format. (I suspect the C++ implementation of the binary format behaves the same way.)
So my second question is: should the key and value fields of a synthetic map entry message always report false for
has_presence()
? Based on observable behavior, I think they should. Even when the map value is a message, the value field of the map entry behaves more like a repeated field in this regard -- failing to serialize a value does not result in a recipient observing an absent value for a map entry, but rather they observe a present-but-empty message.The text was updated successfully, but these errors were encountered: