Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

undefined instead of default for optional fields ? #1572

Open
sloonz opened this issue Mar 27, 2021 · 16 comments
Open

undefined instead of default for optional fields ? #1572

sloonz opened this issue Mar 27, 2021 · 16 comments

Comments

@sloonz
Copy link
Contributor

sloonz commented Mar 27, 2021

protobuf.js version: 6.10.2

Currently, to test if an optional field has been provided, you have to use message.hasOwnProperty("optField") because message.optField will return the default value (0 for numbers, "" for strings) if not present in the message.

While this is in line with idiomatic protobuf, it is completely out of line with idiomatic javascript where unset optional field would just return undefined when accessed.

My suggestion is to add a pbjs option (for example --no-optional-defaults) for people like me that prefer the javascript way, that would set Message.prototype.optField to undefined instead of the default value for the type.

(for the record, I’m willing to work on this feature if you’re fine with integrating it)

@alexander-fenster
Copy link
Contributor

Are you talking about proto2 or proto3 optional fields here? They have completely different semantics, unfortunately. For proto3 optional, I think I got it right in #1584 and #1597 (always assigning them to null). For proto2 optional, I think having an option for pbjs might be a good idea.

@sloonz
Copy link
Contributor Author

sloonz commented Apr 30, 2021

I’m talking about proto2. I made a first shot at an option : sloonz@c6a7638

Why is optional currently typed as null|undefined tho (https://github.com/protobufjs/protobuf.js/blob/master/cli/targets/static.js#L398) ? Since there's always a value defined on the prototype (even if it's null), it can't be undefined, no ? Am I overlooking something ?

@alexander-fenster
Copy link
Contributor

I checked git blame :) The undefined part was added 4 years ago in this commit, fixing #826. I don't think it's a good idea to revert it, since it will likely break TypeScript types for many users.

Your WIP commit looks good to me. I recently added a very minimal test for pbjs, so you could add a small proto file to tests/data/cli and use it to verify the logic in the test.

@sloonz
Copy link
Contributor Author

sloonz commented May 1, 2021

Added the unit tests : sloonz@a9fb6dd

I don't think it's a good idea to revert it, since it will likely break TypeScript types for many users.

Well, that commit didn’t fixed the issue, it was fixed on another one : #826 (comment) (pretty sure it was in fact fixed here : ed7e2e7)

But I think it’s a discussion for another issue, one I’ll open once this one is closed ;)

@castarco
Copy link

castarco commented May 6, 2021

Not sure if I'm talking about the same as you, as my experience is slightly different, but related:

Right now the library does not allow to have null nor undefined as default value, and therefore it uses the "type default value" (for example 0 or "") when the field is missing (while using a getter on a decoded message)... which I would say is not really in line with how protobuf is supposed to work (and if it is, then it's still very far away from the idiomatic way in JS and TS).

When something is missing, a typical JS/TS developer would expect a null or an undefined value rather than the "default type value", because a missing value does not mean at all "default", but "missing/unknown".

What I had to do as a workaround is quite unsatisfactory, as it makes the whole thing less efficient because it requires extra copy operations:

// not present fields are not copied, and the plain object removes the problems introduced by opinionated getters
const messageFollowingJsPractices = { ...protobufjsProblematicMessage }

I would make a PR allowing for null default values, but first I would like to know if it has any chance to be accepted or if it goes against any principle/rule that the library tries to stick to.

@sloonz
Copy link
Contributor Author

sloonz commented May 6, 2021

I just made a pull request with the option and the unit tests :)

@kskalski
Copy link

Actually the way other languages (e.g. C++, C# see examples in https://github.com/protocolbuffers/protobuf/blob/v3.12.0/docs/field_presence.md) approach optional fields (in proto3) is to make getters return 'default type value' while providing separate method for checking whether field was set (e.g. has_foo).
To be honest I never liked this design, especially for languages that have ability to use 'null' value for primitive type values and JS is an example of that (C# is somewhere in between, since they have Nullable<int> a.k.a. int?, but currently C# protobuf API doesn't use/allow them for optional fields).

So it is an important and bearing consequence choice in design to decide what should be protobufjs's value for non-set optional fields. Consistency with other languages would suggest using the 'default type value' for values returned by getter even when it is not set (and actually, this seems to be the behavior up to 6.10.2.

Now 6.11.X changed it to use undefined and this issue I guess advocates / adds option to use null instead. I'm not completely sure which is better, but I lean towards null, since it better reflects the status of un-set field that actually exists (so maybe it shouldn't have an undefined value). It probably have more consequence on copying, e.g. { ... from } or other operations, which I didn't consider in depth yet.

My current issue is however the ability to actually get the 'default type value' for those fields (as mentioned here #1406 (comment)), which was possible in 6.10.2, though maybe as an additional operation instead of having the values returned by getters as such. One more consideration is the ability to use 'prototype' features of Javascript - e.g. 6.10.2 I see that Msg.create() returns an object, that do not have the optional int32 foo = 1 field explicitly defined, but accessing such object's msg.foo will actually return 0, which I suppose happens because Msg.prototype returns an object with all the default values set. That behavior made a lot of sense - though I suppose distinguishing between set and un-set optional field had to be done using hasOwnProperty(?).
6.11.X modified this behavior, but I didn't figure out yet if getting default value is still possible. As of now 6.10.2 paradoxically gives me better support for optional fields than 6.11.X ;-)

@alexander-fenster
Copy link
Contributor

I'm a little bit lost here :) In 6.11.x, the changes were related to two types of fields: proto3 optional (which are not the same as proto2 optional, the latter should not be affected in any way) and oneof members. Settings default values for oneof members was wrong (it was actually a bug). Settings null values to proto3 optional fields was not considered a breaking change because proto3 optional is a very special (and new - just a few months old) concept.

If your proto is proto2 and there was a change in the way how optional fields work, this was likely not intended. Mind sharing the minimal example of the proto and the repro code that changed the behavior?

@alexander-fenster
Copy link
Contributor

(I fully support the idea of methods for checking the presence, like in other languages, but I think at this point it's the same as checking for a null value - I might be wrong here!)

@kskalski
Copy link

I was referring to proto3 optional fields, which indeed are a new feature, but protobufjs did work already with those fields, though it seems it was basically ignoring the attribute. OTOH the semantics for all proto3 fields kind of matched the behavior expected from optional fields (their presence was saved, encoded and decoded)

Let's consider following proto definition:

syntax = "proto3";
message Msg {
  int32 regulars = 1;
  int32 regulard = 2;
  int32 regularu = 3;
  optional int32 opts = 4;
  optional int32 optd = 5;
  optional int32 optu = 6;
}

Test code:

let msg1 = proto.Msg.create();
msg1.regulars = 1;
msg1.regulard = 0;  // set, but to type default value
msg1.opts = 2;
msg1.optd = 0;  // set, but to type default value
console.log('msg1',
            msg1.regulars, msg1.hasOwnProperty('regulars'),
            msg1.regulard, msg1.hasOwnProperty('regulard'),
            msg1.regularu, msg1.hasOwnProperty('regularu'),
            msg1.opts, msg1.hasOwnProperty('opts'),
            msg1.optd, msg1.hasOwnProperty('optd'),
            msg1.optu, msg1.hasOwnProperty('optu'));

let msg2 = proto.Msg.decode(new Uint8Array(proto.Msg.encode(msg1).finish()));
console.log('msg2',
            msg2.regulars, msg2.hasOwnProperty('regulars'),
            msg2.regulard, msg2.hasOwnProperty('regulard'),
            msg2.regularu, msg2.hasOwnProperty('regularu'),
            msg2.opts, msg2.hasOwnProperty('opts'),
            msg2.optd, msg2.hasOwnProperty('optd'),
            msg2.optu, msg2.hasOwnProperty('optu'));

let msg3 = proto.Msg.prototype;
console.log('msg3 prototype', msg3.regulars, msg3.regularu, msg3.opts, msg3.optu);

let msg4 = proto.Msg.toObject(msg1);
console.log('msg4',
            msg4.regulars, msg4.hasOwnProperty('regulars'),
            msg4.regulard, msg4.hasOwnProperty('regulard'),
            msg4.regularu, msg4.hasOwnProperty('regularu'),
            msg4.opts, msg4.hasOwnProperty('opts'),
            msg4.optd, msg4.hasOwnProperty('optd'),
            msg4.optu, msg4.hasOwnProperty('optu'));

let msg5 = proto.Msg.toObject(msg1, { defaults: true });
console.log('msg5',
            msg5.regulars, msg5.hasOwnProperty('regulars'),
            msg5.regulard, msg5.hasOwnProperty('regulard'),
            msg5.regularu, msg5.hasOwnProperty('regularu'),
            msg5.opts, msg5.hasOwnProperty('opts'),
            msg5.optd, msg5.hasOwnProperty('optd'),
            msg5.optu, msg5.hasOwnProperty('optu'));

Prints following with version 6.10.2:

msg1 1 true 0 true 0 false 2 true 0 true 0 false
msg2 1 true 0 true 0 false 2 true 0 true 0 false
msg3 prototype 0 0 0 0
msg4 1 true 0 true undefined false 2 true 0 true undefined false
msg5 1 true 0 true 0 true 2 true 0 true 0 true

and following with version 6.11.2:

msg1 1 true 0 true 0 false 2 true 0 true null false
msg2 1 true 0 true 0 false 2 true 0 true null false
msg3 prototype 0 0 null null
msg4 1 true 0 true undefined false 2 true 0 true undefined false
msg5 1 true 0 true 0 true 2 true 0 true undefined false

In fact the differences are tiny and boil down to the value returned when accessing unset optional field on the message. The value in prototype is I guess the direct cause and the value set in toObject a consequence of such value being returned.

APIs in other languages differ in this regard, i.e. unset optional field will have the 'default type value' when accessed and this was also the behavior in 6.10.2 and what my code already started to utilize. You might notice some other inconsistencies with protobuf specifications, e.g. msg1 should probably skip serializing the non-optional fields with default values (i.e. regulard).

There are at least two questions here:

  • how should user be able to check the presence of the field
  • how should user access the 'default type value' of the unset optional field (seems impossible in 6.11.2)

@kskalski
Copy link

As for 6.10.2 the answers to those questions are clear:

  • use hasOwnProperty API
  • just access the field (returns value even when unset) or use toObject({ defaults: true }), might also play with prototype to get them

I suppose if the current semantics is here to stay it would be useful to have some kind of defaultsPrototype which would contain the actual defaults for unset optional fields... and maybe an option to use that as prorotype instead of the way it works now.

@alexander-fenster
Copy link
Contributor

Unfortunately, the behavior that was inherited for proto3 optional fields was incorrect - the main problem was that the static object prototype had the default value assigned. hasOwnProperty check only works for usage inside protobuf.js, but as long as you serialize the object to use elsewhere, you have a problem because the consumer cannot tell if the value is unset or set to default value – which is the sole purpose of introducing proto3 optional.

According to the official guidance a proto3 optional field is a member of a "synthetic" single member oneof, and setting any kind of default value for an oneof member defeats the purpose.

This all might not make a lot of sense if we look at protobuf.js as a separate thing, but if we look at the protobuf ecosystem as a whole, then, as protoc maintainers add new features, I believe there are more benefits in trying to catch up than split completely. proto3 optional behavior in protobuf.js was inconsistent with official implementation (in protoc, optional fields were completely forbidden before 3.12.0), now we are a little bit closer to the official approach and more compatible in terms of serialization.

Right now, to check the presence of an optional field, you could either make sure it exists has a non-null value, or – if you're using oneofs: true – I believe you can also look at the underscore field (msg1._opts, msg1._optd, msg1._optu), just like you can do with a regular oneof.

I understand your point that there is no solution currently to get a default value for an optional field. Let me think if we could do something that makes sense here (or feel free to send a PR :) )

Now, having all that said :) what's your use case of proto3 optional fields? Am I right that you started using them even before protoc 3.12.0 made it official?

@kskalski
Copy link

I started using proto3 optional a couple months ago while it was still requiring experimental flags in protoc, the purpose is primarily in implementation of 'patching' functionality (a kind of pub-sub schema where frontend connects by websocket to the server, gets the initial state and then receives small update messages - the update messages are trimmed just to the exact changes in state by skipping the fields / sub-messages that were not changed).

The specific use-case that broke is actually a bit unrelated though - since I use the same data model in the whole frontend, I also use it as backing store for my Vue.js 2.0 reactive edit screens. In there I need to get the whole object instantiated with relevant fields initialized in order to make the reactivity work (might get better in Vue.js 3.0) - for this I currently call toObject({ defaults: true }) which previously created an object with all the fields initialized, even proto3 optional. Thinking about that it might be indeed contrary to the intention of proto3 optional feature, where unset fields should remain empty. I should be able to hack the places I rely on it in some other way.

Still, there is a large discrepancy in the semantics protobufjs exposes now (accessing unset field returns null) as opposed to every other language implementation (accessing the field returns default type value and presence info is available through extra getter call). I'm not really keen on pushing for this schema to be employed here, even though consistency suggests it should, since I actually like null as marker of not present fields. :) Interestingly as you can see in my test, while proto objects return null, the fields are not propagated into object at all when calling toObject, the fields are undefined there.

One way would be to follow my other suggestion from #1406 (comment) and add option to get the defaults for optional fields when calling something like toObject({ defaults: true, optionalDefaults: true }).

@kskalski
Copy link

Unfortunately, the behavior that was inherited for proto3 optional fields was incorrect - the main problem was that the static object prototype had the default value assigned. hasOwnProperty check only works for usage inside protobuf.js, but as long as you serialize the object to use elsewhere, you have a problem because the consumer cannot tell if the value is unset or set to default value – which is the sole purpose of introducing proto3 optional.

What kind of serialization do you have in mind? I think as long as the actual protobufjs objects do support distinguishing whether field is explicitly set or not and whether returned value is a default one from static prototype or one set on the object; and as long as serializing and deserializing such object using protobufjs's encode / decode APIs preserves this, then all is fine.

I suppose you are referring to some custom serialization like treating those objects as JS objects, using {...obj}, etc.? Maybe it is ok to let user decide how unset optional fields are treated in such serialization methods? Especially that protobufjs provides dedicated API toObject which has several options allowing to customize the behavior and it already has option whether to provide default values or not.

According to the official guidance a proto3 optional field is a member of a "synthetic" single member oneof, and setting any kind of default value for an oneof member defeats the purpose.

I'm not sure what do you mean exactly by "setting any kind of default value'', this might be a matter of wording, but I don't see having a fallback when returning value as having any value set. Also note that real one-of field accessors in other languages do return default values even when you access them for unset field or field set to a different one-of case, e.g. see this implementation of C# property accessor:

   public string String {
          get { return fieldCase_ == FieldOneofCase.String ? (string) field_ : ""; }

This all might not make a lot of sense if we look at protobuf.js as a separate thing, but if we look at the protobuf ecosystem as a whole, then, as protoc maintainers add new features, I believe there are more benefits in trying to catch up than split completely. proto3 optional behavior in protobuf.js was inconsistent with official implementation (in protoc, optional fields

I agree, though IMHO previous version of protobufjs implemented proto3 optionals closer to the spirit of their specification than the current implementation. They truly allowed one to check their presence, while still using them as regular fields as is possible in other language APIs. They also properly preserved those properties when serialized and deserialized. And now the implementation diverted from that.

@AMRHopkins
Copy link

I read through this whole thread, but feel like I missed something. I'm working on a project where multiple systems are communicating utilizing proto3. I have worked on systems using C, where we make use of structures and encode and decode them, and everything is happy. I am sending data to a system written in JavaScript where I have explicitly set fields to have a value of 0 (all of the fields are set all of the time, just some happen to equal the default values). I would have expected the decoding in JavaScript to return an object analogous to the structures in C, where it has all of the fields, and the ones set to zero would be present with a value of zero. But the decoding in JavaScript is not creating the fields at all (I assume they weren't included in the encoding due to matching the default value - which is to be expected).

This thread gives the impression this is expected behavior? There is a functional difference between the languages? What is the best way to get the default value in this case? Does the receiver essentially have to recreate the protobuffer definition to implement their own setting of the default value in this case? That is, if the field isn't present, use a default value that is stored elsewhere? I feel like I have to be missing something, because this doesn't seem like this is how it should work.

@AMRHopkins
Copy link

Ok, yes, I was misunderstanding something. It looks like a two step process works in JavaScript where the toObject() method can be called on the decoded data with the option set to include the default data in order to create an object which includes the default data. That is what kskalski said, but it took me messing around to understand the two step process (instead of just calling toObject on the received message), so thanks for that post which got me to the right location.

haphut added a commit to HSLdevcom/transitdata-partial-apc-expander-combiner that referenced this issue Dec 13, 2023
The protobuf specification demands type-specific default values such as
0 for numbers when decoding. That is quite weird for TypeScript code
which usually uses null or undefined for missing values. Let's create an
alternative way to decode without defaults.

Maybe the following issue will be resolved at some point so we would not
need our own workaround:
protobufjs/protobuf.js#1572
haphut added a commit to HSLdevcom/transitdata-partial-apc-expander-combiner that referenced this issue Dec 13, 2023
The protobuf specification demands type-specific default values such as
0 for numbers when decoding. That is quite weird for TypeScript code
which usually uses null or undefined for missing values. Let's create an
alternative way to decode without defaults.

Maybe the following issue will be resolved at some point so we would not
need our own workaround:
protobufjs/protobuf.js#1572
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants