
perf(NODE-5934): replace DataView uses with bit math #649

Merged
5 commits merged on Feb 27, 2024

Conversation

@nbbeeken commented Feb 15, 2024

Description

What is changing?

  • Removes all usage of DataView in favor of a Float64Array and bit math

We keep a Float64Array and a Uint8Array that reference the same ArrayBuffer. With eight assignments we can copy the bytes of the input BSON into the Uint8Array and then use the Float64Array to interpret them as a little-endian double.
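
For illustration, a minimal sketch of that idea (the function name is illustrative and not the PR's exact code; it assumes a little-endian platform, since typed arrays use native byte order):

    // Sketch: a Float64Array and a Uint8Array sharing one ArrayBuffer let us
    // reinterpret 8 bytes as a double without constructing a DataView.
    const FLOAT = new Float64Array(1);
    const FLOAT_BYTES = new Uint8Array(FLOAT.buffer, 0, 8);

    function readDoubleLE(bytes: Uint8Array, offset: number): number {
      FLOAT_BYTES[0] = bytes[offset];
      FLOAT_BYTES[1] = bytes[offset + 1];
      FLOAT_BYTES[2] = bytes[offset + 2];
      FLOAT_BYTES[3] = bytes[offset + 3];
      FLOAT_BYTES[4] = bytes[offset + 4];
      FLOAT_BYTES[5] = bytes[offset + 5];
      FLOAT_BYTES[6] = bytes[offset + 6];
      FLOAT_BYTES[7] = bytes[offset + 7];
      // The shared buffer now holds the 8 bytes, read back as a double
      return FLOAT[0];
    }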

For bigints, we can split the value into two 32-bit components and use JavaScript's truncation behavior to pull out each byte as we shift down through each segment of the number.
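
A serialization sketch of that approach (function name illustrative, not the PR's exact code):

    // Sketch: write a 64-bit bigint as little-endian bytes by splitting it
    // into two 32-bit halves. BigInt '&' and '>>' use two's complement, so
    // negative values fall out correctly.
    function writeBigInt64LE(target: Uint8Array, offset: number, value: bigint): void {
      const mask = BigInt(0xffffffff);
      const lo = Number(value & mask);                 // low 32 bits
      const hi = Number((value >> BigInt(32)) & mask); // high 32 bits
      target[offset] = lo & 0xff;
      target[offset + 1] = (lo >>> 8) & 0xff;
      target[offset + 2] = (lo >>> 16) & 0xff;
      target[offset + 3] = (lo >>> 24) & 0xff;
      target[offset + 4] = hi & 0xff;
      target[offset + 5] = (hi >>> 8) & 0xff;
      target[offset + 6] = (hi >>> 16) & 0xff;
      target[offset + 7] = (hi >>> 24) & 0xff;
    }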

Is there new documentation needed for these changes?

No

What is the motivation for this change?

Creating DataViews has a performance cost, and we can easily avoid using them altogether. Generally, we see better performance the more of this logic we keep in plain JavaScript.

Benchmarks: see the comment below.

Release Highlight

Improved the performance of serializing and deserializing doubles and bigints

We now use bit shifting and multiplication in place of DataView getX/setX calls to parse and serialize bigints, and a Float64Array to convert doubles to and from bytes. This change has been shown to improve deserialization performance by ~1.3x and serialization performance by ~1.75x.

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • Changes have been benchmarked
  • New TODOs have a related JIRA ticket

nbbeeken commented Feb 15, 2024

Running on:

Node.js v20.11.0
OS: linux
CPUs: 8
Arch: x64
RAM: 33.105743872 GB

64-bit floats show improvements. Tested with a single value and with 1000 values:

=================== testing with { _id: 120.384 } bytes 18
do_nothing x 852,087,149 ops/sec ±0.04% (396 runs sampled)

bson_deserialize_current_main x 3,509,120 ops/sec ±0.11% (395 runs sampled)
bson_deserialize_bits_pr x 6,246,320 ops/sec ±0.39% (391 runs sampled)

bson_serialize_current_main x 2,481,998 ops/sec ±0.39% (391 runs sampled)
bson_serialize_bits_pr x 2,691,455 ops/sec ±0.14% (393 runs sampled)
=================== testing with an object with 1000 keys all set to 120.384, bytes 12895
do_nothing x 852,260,397 ops/sec ±0.04% (396 runs sampled)

bson_deserialize_current_main x 15,682 ops/sec ±0.08% (395 runs sampled)
bson_deserialize_bits_pr x 15,014 ops/sec ±0.06% (395 runs sampled)

bson_serialize_current_main x 10,893 ops/sec ±0.22% (395 runs sampled)
bson_serialize_bits_pr x 16,187 ops/sec ±0.11% (395 runs sampled)

64-bit integers also improved

=================== testing with { _id: -120n } bytes 18
do_nothing x 852,427,659 ops/sec ±0.04% (396 runs sampled)

bson_deserialize_current_main x 2,602,237 ops/sec ±0.09% (394 runs sampled)
bson_deserialize_bits_pr x 4,762,447 ops/sec ±0.17% (395 runs sampled)

bson_serialize_current_main x 2,382,100 ops/sec ±0.14% (393 runs sampled)
bson_serialize_bits_pr x 2,547,732 ops/sec ±0.12% (394 runs sampled)
=================== testing with an object with 1000 keys all set to -120n, bytes 12895
do_nothing x 851,099,356 ops/sec ±0.06% (396 runs sampled)

bson_deserialize_current_main x 3,764 ops/sec ±0.07% (395 runs sampled)
bson_deserialize_bits_pr x 9,259 ops/sec ±0.12% (394 runs sampled)

bson_serialize_current_main x 10,604 ops/sec ±0.10% (395 runs sampled)
bson_serialize_bits_pr x 16,260 ops/sec ±0.12% (395 runs sampled)

Granular benchmark results here

@nbbeeken marked this pull request as ready for review February 21, 2024 21:20
@billouboq

Looks freaking good!

I have a crazy load of work right now, but I was thinking of another possible improvement: we use ByteUtils.allocate in multiple places where we could use the faster allocateUnsafe function, since we immediately overwrite the bytes. I don't know how much of an improvement it would make, but it is safe to use in cases like these:

    const bytes = ByteUtils.allocate(16);
    // Copy the next 16 bytes into the bytes buffer
    bytes.set(buffer.subarray(index, index + 16), 0);

Or

    } else if (elementType === constants.BSON_DATA_OID) {
      const oid = ByteUtils.allocate(12);
      oid.set(buffer.subarray(index, index + 12));
      value = new ObjectId(oid);
      index = index + 12;
    }
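
As a hedged sketch of what that could look like, self-contained and using Node's Buffer.allocUnsafe (the helper name is hypothetical; this is not the library's code):

    import { Buffer } from 'buffer';

    // Buffer.allocUnsafe skips zero-filling, which is safe here because
    // every one of the 12 bytes is immediately overwritten by .set().
    function copyOidBytes(source: Uint8Array, index: number): Uint8Array {
      const oid = Buffer.allocUnsafe(12);
      oid.set(source.subarray(index, index + 12));
      return oid;
    }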

@nbbeeken

@billouboq Good idea, I'll file a ticket about it. I do wish there were a web equivalent, but we can incorporate it for Node at least.

Resolved (outdated) review threads on: src/objectid.ts, src/parser/deserializer.ts, src/parser/serializer.ts
@durran added the Primary Review label (In Review with primary reviewer, not yet ready for team's eyes) Feb 27, 2024
@aditi-khare-mongoDB added the Team Review label (Needs review from team) and removed the Primary Review label Feb 27, 2024

@aditi-khare-mongoDB left a comment

LGTM

Resolved review thread on: src/objectid.ts