Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support writing only (some?) longs as strings (opt-in) for Javascript compatibility #1044

Open
rdesgroppes opened this issue Jun 12, 2023 · 14 comments

Comments

@rdesgroppes
Copy link

rdesgroppes commented Jun 12, 2023

Although Jackson exposes a WRITE_NUMBERS_AS_STRINGS feature that forces all regular number values to be written as JSON Strings rather than JSON Numbers, it doesn't currently provide a way to write only 64-bit integers using their canonical representation, which poses a problem for strict downstream parsers given the limitation of the maximum value to 2^53 - 1 (alt) in JavaScript, whereas this value is for instance 2^63 - 1 in Java.

Related specs:

Related issues:

@pjfanning
Copy link
Member

This would need to be an optional feature to not change behaviour for existing users. See StreamWriteFeature.

@rdesgroppes any interest in submitting a PR yourself (targeted at 2.16 branch)?

@pjfanning
Copy link
Member

pjfanning commented Jun 12, 2023

This would need to be an optional feature to not change behaviour for existing users. See StreamWriteFeature.

@rdesgroppes any interest in submitting a PR yourself (targeted at 2.16 branch)?

In the end of the day, users wanting max interoperability should treat all numbers as JSON strings or use double types.

@rdesgroppes rdesgroppes changed the title Support writing only longs as strings Support writing only longs as strings (opt-in) Jun 12, 2023
@cowtowncoder
Copy link
Member

@pjfanning This would definitely be disabled by default. Whether this should be StreamWriteFeature or JsonWriteFeature is up for debate -- I can see it being useful for some but definitely not all or most formats. Beyond JSON, maybe just YAML.
There may be practical considerations as well, making one choice or another easier to support (in 3.0 JsonWriteFeatures are fully separated from generic StreamWriteFeatures, but in 2.x there's some coupling, for example).

Note, too, that one could consider alternative approach of implementation at jackson-databind level as well. I don't have strong preference; suffice it to say there are various trade-offs wrt this choice as well.

This is something that has been requested before, fwtw, so will mark as highly-requested.

@cowtowncoder cowtowncoder changed the title Support writing only longs as strings (opt-in) Support writing only (some?) longs as strings (opt-in) for Javascript compatibility Jun 12, 2023
@cowtowncoder
Copy link
Member

@rdesgroppes Quick question: just to make sure, would this only affect values that are beyond double range (beyond 2^53 and its negative counterpart) or anything written as long? Jackson does provide separate writeNumber() methods so this is possible to implement either way. Just Stringifying all long/Long/BigInteger values would be simpler but in some ways less consistent; checking for actual range is bit more work but might be more likely what is desired.

I changed title slightly but based on specific request feel free to further improve it.

@pjfanning
Copy link
Member

@cowtowncoder is it possible that this feature should affect writing BigInteger and BigDecimal too? Feels like it should. Those number types could represent numbers that make big longs look puny.

@pjfanning
Copy link
Member

pjfanning commented Jun 12, 2023

I'm not sure about hardcoding Javascript magic numbers into Jackson, to be honest. Would we better off just sticking with WRITE_NUMBERS_AS_STRINGS and leaving it at that?

Also, users can write their own custom serializers in Jackson.

@rdesgroppes
Copy link
Author

Well, I have to admit I'm a little embarrassed because the need I initially expressed (HubSpot/jackson-datatype-protobuf#98) applied to serializing protocol buffer messages (typically to pass them through a JSON proxy) and, according to the JSON Mapping of the proto3 specification, int64, fixed64 and uint64 are serialized as strings.

@cowtowncoder
Copy link
Member

cowtowncoder commented Jun 12, 2023

@pjfanning I think I'll disagree a bit here: given that JSON use is so heterogenous (client/server running on different platforms), we have long since given up on only supporting Java-to-Java case. So while I am not a big fan of language-specific support per se, this particular thing seems to be common enough request.

Still, this is just a request and we can and should carefully consider what to do here.

@cowtowncoder
Copy link
Member

@cowtowncoder is it possible that this feature should affect writing BigInteger and BigDecimal too? Feels like it should. Those number types could represent numbers that make big longs look puny.

BigInteger for sure. BigDecimal I am less sure about as I suspect it would be difficult to implement proper checks (unlike with integral numbers).

@cowtowncoder
Copy link
Member

Thinking about this bit more, I wonder if instead of adding a Feature (on/off) of some kind, it'd make more sense to allow configuring a predicate that determines if given number ought to be stringified? Along with perhaps a standard implementation for this use case.

The idea being there'd be multiple methods called, passing typed number -- first, maybe just one for integrals, so int, long, BigInteger -- and return true if output should use String instead of number.

This would allow supporting wider range of conventions; starting with "all integer numbers" and "JS-range compatible" but allowing others, if any found.

@digulla
Copy link
Contributor

digulla commented Jun 13, 2023

I think a better approach would be to offer a bouquet of number serializers, each dealing with a different quirk (stripping trailing zeros from BigDecimal, quoting numbers that JS can't handle properly).

The API should be simple and just take a number and return either String or Number. This would then allow to write a wrapper which accepts Number and either returns them unmodified or converts them to String.

The code in Jackson should look at the return type and write Number as before and when it's a String, it should delegate to writing a string (which would then add quotes).

That way, users can quickly build a custom module picking and choosing whatever they need without having to litter the code base with if (feature).

@cowtowncoder
Copy link
Member

While availability of alternate serializers can be useful, there are some challengers there: for example, serialization of JsonNode values will not use JsonSerializers. So there are various trade-offs regarding wider applicability (when done at low-level by jackson-core) vs. more flexibility (when done at databind).

For what it is worth, most Number deserializers will (by default) accept JSON String, although this permissibility can be changed with CoercionConfig to not allow.

I agree that perhaps another approach would be a new "standard" value type of JavascriptNumber which would enclose a double and handle (de)serialization aspects appropriately. It would not (need) to be a Number, but wrapper type.
If such a type was to be added, it could form a core of something like speculative jackson-datatype-javascript?
I am guessing there would be other useful (wrapper) types that would help with Javascript interoperability.

@digulla
Copy link
Contributor

digulla commented Jun 13, 2023

  1. core vs. databind: That is one reason why I used a new interface in my PR: Support for canonical JSON output in Jackson 2.16 jackson-databind#3967

    Without the interface, every serialization would have to be implemented twice (once for core and once for databind). I didn't do everything necessary to achieve this; I think a second interface would help. That would just contain the method public void serialize(T value, JsonGenerator gen, SerializerProvider provider) from StdSerializer. Most numeric serializers should work with this information and Jackson can provide wrappers that turn those into StdSerializer and whatever JsonNode will need.

    Check out https://github.com/FasterXML/jackson-databind/pull/3967/files#diff-f503539124b68dd0b2862b7eb439ba0f67fef955e7b09255355afd8a612f8a61 to see my approach.

  2. My concern with the Feature approach is that Jackson must somehow find the option that works for most developers. Do you have enough information for this? That's why I'm cautious; the interface approach is more flexible and gives developers clear points where they can override. With features, this isn't possible because those are embedded into the code in places where you usually can't override them without duplication a lot of code.

  3. An idea for a serializer that is smart when it comes to "should I write this as number or String", you can use BigDecimal.valueOf(value.toDouble()) to get a JavaScript number. On a second path, create a BigDecimal using an optimal strategy (no rounding). Get the absolute difference between the two. If that is above a certain threshold, use String.

    Or a hack: Convert to double and print it. Then use toString() on the original value. Remove the dot. If the first 14 digits are the same (double has about 15 digits of precision), double is fine, else it's string.

    But I've seen code that had to do some other assumptions, which is why I don't see that a feature approach will work for everyone.

  4. Does it make sense to think about deserialization here as well? Feels like a whole new can of worms. How about to take this up in a new issue?

@cowtowncoder
Copy link
Member

  1. JsonNode does not use any other serializers, for better or worse; this will not work. I'll have to see what you mean by interface here, wrt canonical output
  2. Finding features etc. is always a challenge; I don't think this is specific difference here
  3. Due to associated performance hit I would not want textual serialization of numbers twice; for some usage it's probably fine but conversion from floating-point values to Strings is a rather heavy-weight operation
  4. Deserialization would be separate, if relevant; although as I said coercion from JSON String to numbers tends to already work. So yes, that'd be new issue if anyone wants it.

FWTW, I do not have immediate plans on working on this. It's more a placeholder at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants