Use rb_str_buf_new() to improve JSON.parse() performance with short string #452

Closed

@Watson1978 Watson1978 commented Nov 29, 2020

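rb_str_buf_new always allocates twice: once in str_alloc for the String object and once in ALLOC_N for the character buffer.
It also sets STR_NOEMBED unconditionally, so the result is never an embedded string, even when the string is short enough to be embedded.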

VALUE
rb_str_buf_new(long capa)
{
    VALUE str = str_alloc(rb_cString);

    if (capa < STR_BUF_MIN_SIZE) {
        capa = STR_BUF_MIN_SIZE;
    }
    FL_SET(str, STR_NOEMBED);
    RSTRING(str)->as.heap.aux.capa = capa;
    RSTRING(str)->as.heap.ptr = ALLOC_N(char, (size_t)capa + 1);
    RSTRING(str)->as.heap.ptr[0] = '\0';

    return str;
}

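Therefore, this PR uses rb_str_new instead, which avoids the separate buffer allocation when the string is short.
In the benchmark below, this improves JSON.parse throughput for short strings by about 16% (106.992k to 124.316k i/s); the long-string case changes by only about 3%.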
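
The embedded-versus-heap distinction can be observed from Ruby itself. The following is a minimal sketch, not part of the patch: `string_memsize` is a hypothetical helper, and it assumes CRuby with the standard `objspace` extension. The exact byte counts depend on the Ruby version and platform, so no specific output is claimed. On 64-bit CRuby 2.7 the embed limit is 23 bytes, which is presumably why the benchmark uses 23 and 50 as the two string lengths.

```ruby
require 'objspace'

# Hypothetical helper: bytes CRuby reports for a String of `len` ASCII characters.
# The value includes the object slot plus any separately allocated buffer,
# and varies between Ruby versions and platforms.
def string_memsize(len)
  ObjectSpace.memsize_of("b" * len)
end

# 23 and 50 are the string lengths used in the benchmark's test code.
[23, 50].each do |len|
  puts "#{len}-byte string: #{string_memsize(len)} bytes"
end
```

If the shorter string is embedded, its reported size is just the object slot, while a heap string also carries the size of its external buffer.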

Environment

  • MacBook Pro (16-inch, 2019)
  • macOS 10.15.5
  • Intel Core i9 2.4 GHz
  • Ruby 2.7.2

Before

Warming up --------------------------------------
   short_string_json    10.660k i/100ms
    long_string_json    10.058k i/100ms
Calculating -------------------------------------
   short_string_json    106.992k (± 1.4%) i/s -    543.660k in   5.082355s
    long_string_json    102.273k (± 1.0%) i/s -    512.958k in   5.016081s

After

Warming up --------------------------------------
   short_string_json    12.413k i/100ms
    long_string_json    10.457k i/100ms
Calculating -------------------------------------
   short_string_json    124.316k (± 0.8%) i/s -    633.063k in   5.092705s
    long_string_json    105.665k (± 0.9%) i/s -    533.307k in   5.047540s
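
The relative gain can be worked out from the i/s figures in the two runs above. This is only an arithmetic sketch; `speedup_percent` is a hypothetical helper name, and the inputs are the values reported by benchmark-ips.

```ruby
# Hypothetical helper: percentage change in iterations per second.
def speedup_percent(before_ips, after_ips)
  ((after_ips - before_ips) / before_ips * 100).round(1)
end

# i/s values copied from the "Before" and "After" runs.
short = speedup_percent(106_992.0, 124_316.0)
long  = speedup_percent(102_273.0, 105_665.0)

puts "short_string_json: +#{short}%"  # about 16%
puts "long_string_json:  +#{long}%"   # about 3%
```

The long-string result sits close to the reported error margins (around ±1%), so the measurable benefit is concentrated in the short-string case.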

Test code

require 'benchmark/ips'
require 'json'

short_string_json = {
  "a" => "b" * 23,
  "a" * 23 => "b"
}.to_json.freeze

long_string_json = {
  "a" => "b" * 50,
  "a" * 50 => "b"
}.to_json.freeze

Benchmark.ips do |x|
  x.report("short_string_json") { JSON.parse(short_string_json) }
  x.report("long_string_json") { JSON.parse(long_string_json) }
end

@Watson1978
Contributor Author

A change with the same effect was made in ruby/ruby@02c32b2

However, this PR is still useful for Ruby 2.7.x or below.

@Watson1978
Contributor Author

It seems this PR is no longer necessary because of #451

@Watson1978 Watson1978 closed this Apr 16, 2021
@Watson1978 Watson1978 deleted the rb_str_new branch April 16, 2021 17:57