Faster double-to-string operation when writing out JSON #1692

michaeleisel · 2021-08-09T18:21:27Z

I'm not sure how high a priority the speed of writing out JSON is. But if it is, I have a prototype that runs in about 40% of the time taken by the fastest library I know of, which is https://github.com/ulfjack/ryu (to be fair, ryu tries to cover a larger set of use cases than we need). In many ways it's just an inverse of what we do for string to double, and it uses the same multiplication trick.
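For readers unfamiliar with the idea, here is a minimal sketch of what a multiplication trick in the double-to-string direction might look like. It is not taken from the prototype. It assumes the trick means scaling the binary mantissa by a power of five with a wide multiplication and then shifting. The helper name `leading_17_digits` is hypothetical, the input range is restricted to [1, 10) so that a single exact constant (5^16) suffices, and `unsigned __int128` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (illustration only): returns floor(value * 10^16)
// for 1 <= value < 10, i.e. 17 decimal digits, truncated rather than rounded.
inline uint64_t leading_17_digits(double value) {
    assert(value >= 1.0 && value < 10.0);
    uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    // value = mantissa * 2^exponent, with a 53-bit mantissa.
    const uint64_t mantissa =
        (bits & ((uint64_t(1) << 52) - 1)) | (uint64_t(1) << 52);
    const int exponent = int((bits >> 52) & 0x7FF) - 1075; // -52..-49 in this range
    // 10^16 = 5^16 * 2^16, so value * 10^16 = mantissa * 5^16 * 2^(exponent + 16).
    const uint64_t five_pow_16 = 152587890625ULL;
    // 53-bit mantissa times a 38-bit constant fits comfortably in 128 bits.
    const unsigned __int128 product = (unsigned __int128)mantissa * five_pow_16;
    // exponent + 16 is negative here, so the power of two becomes a right shift.
    return uint64_t(product >> -(exponent + 16));
}
```

In this restricted range the product is exact. A general routine would presumably need a table of powers of five truncated to a fixed width, and the truncation error is one plausible reason a slower fallback path is still required for some inputs.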

lemire · 2021-08-09T18:31:11Z

@michaeleisel Yes. This is a high priority. I started work on a reversed operation, and @nigeltao may also have been doing some work (he announced it), but if you got a bit further, to a benchmarkable prototype, let us move this forward. Your prototypes have a history of being fruitful.

lemire · 2021-08-09T18:39:59Z

The competitors are ryu, Grisu (which we use in simdjson, we use Grisu 2), Schubfach, Dragonbox...

(see https://github.com/jk-jeon/fp)

But we don't actually care that much about speed per se... of course, any reverse algorithm will be fast and efficient, but at least speaking for myself, the motivation is to do both directions with the same code, which helps reduce bloat. This would be the benefit we want for simdjson.

lemire · 2021-08-09T18:40:13Z

Should also be of interest to @JPMag

lemire · 2021-08-09T18:41:13Z

That is, for version 1.0 of simdjson, if we could trim out a large chunk of code, it would be a big plus, even if there is no speed benefit at all.

lemire · 2021-08-09T18:46:01Z

Tentatively marked for simdjson 1.0.

michaeleisel · 2021-08-09T20:28:09Z

I'm not sure how much code sharing we can do exactly, but here's the meat of it: https://gist.github.com/michaeleisel/f7b6ece0587bf982895d1eb0bf2b2aa8

There are 3 cases IIRC:

1. you can go straight from double to int with a shift, if it fits in a 64-bit unsigned int
2. the multiplication trick
3. slow fallback

This code is focused on the middle case. IIRC its only outstanding issue for that case is that it always prints out 17 digits, even when fewer digits would be sufficient to convert back to the same double.
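As an illustration of the first case and of the 17-digit issue, here is a small sketch. It is not from the gist. Both helper names (`integer_fast_path`, `shortest_roundtrip`) are hypothetical, C++17 is assumed for `std::optional`, and the second helper uses a deliberately slow `snprintf`/`strtod` search purely to show what "fewer digits" means, not how a fast serializer would find them.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <optional>
#include <string>

// Case 1 (hypothetical helper): if value is a non-negative integer below 2^64,
// recover it by shifting the mantissa and print it as an integer.
inline std::optional<std::string> integer_fast_path(double value) {
    uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    if (bits >> 63) return std::nullopt;                // negative, including -0.0
    const int biased = int((bits >> 52) & 0x7FF);
    const uint64_t fraction = bits & ((uint64_t(1) << 52) - 1);
    if (biased == 0x7FF) return std::nullopt;           // infinity or NaN
    if (biased == 0) {                                  // zero or subnormal
        if (fraction == 0) return std::string("0");
        return std::nullopt;
    }
    const uint64_t mantissa = fraction | (uint64_t(1) << 52);
    const int shift = biased - 1075;                    // value = mantissa * 2^shift
    if (shift > 11 || shift < -52) return std::nullopt; // at least 2^64, or below 1
    if (shift >= 0) return std::to_string(mantissa << shift);
    if (mantissa & ((uint64_t(1) << -shift) - 1)) return std::nullopt; // fractional bits
    return std::to_string(mantissa >> -shift);
}

// Brute-force reference for the 17-digit issue: try increasing precision
// until the text parses back to exactly the same double.
inline std::string shortest_roundtrip(double value) {
    assert(std::isfinite(value));
    char buffer[32];
    for (int precision = 1; precision <= 17; ++precision) {
        std::snprintf(buffer, sizeof buffer, "%.*g", precision, value);
        if (std::strtod(buffer, nullptr) == value) break; // 17 digits always round-trip
    }
    return buffer;
}
```

With these definitions, `integer_fast_path(2.5)` yields no result and would fall through to the other cases, while `shortest_roundtrip(0.1)` stops at a single significant digit instead of printing all 17.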

lemire · 2021-08-09T20:43:13Z

@michaeleisel Great. I'm otherwise preoccupied right at the moment, but I will start from your code and keep you posted.

lemire · 2021-08-10T20:55:50Z

I think I will be able to build on this later this week.

michaeleisel · 2021-08-11T15:47:54Z

Here's the full project: https://github.com/michaeleisel/floats (can you build with Xcode?)

Make sure to add ryu as a sibling directory

lemire · 2021-08-11T19:49:04Z

Thanks for the pointer. I think that your gist was already great.

I’ll get started soon.

lemire · 2021-08-30T17:45:14Z

I am marking this for 2.0, removing the 1.0.

lemire · 2021-08-30T17:58:29Z

In my comment above, when I was referring to "reversed operation" and "trimming out a large chunk of code", what I had in mind was a tightly integrated serializer that would reuse the data and code from the deserializer. I see that it is not what your prototype does. It seems that you have built a fast code path for the deserializer (from_chars), but it is not directly related to our existing code; it is more of a case where it might compete with our from_chars. So you might have a fast routine for common cases...

michaeleisel · 2021-08-30T18:05:25Z

Agreed

lemire added the performance label Aug 9, 2021

lemire added this to To do in Get simdjson 1.0 out!!! via automation Aug 9, 2021

lemire added this to the 1.0 milestone Aug 9, 2021

lemire removed this from the 1.0 milestone Aug 30, 2021

lemire removed this from To do in Get simdjson 1.0 out!!! Aug 30, 2021

lemire added this to the 2.0 milestone Aug 30, 2021
