Improve JSON.parse() performance #345

Closed
wants to merge 1 commit into from


Watson1978
Contributor

@Watson1978 Watson1978 commented Feb 25, 2018

When a non-frozen string is used as a hash key with rb_hash_aset(), Ruby duplicates the string and freezes the copy internally.
To avoid that extra duplicate-and-freeze step, this patch passes an already-frozen string to rb_hash_aset().

FYI:
When a string is used as a hash key, the key stored in the hash is always frozen.

irb(main):001:0> hash = { "foo" => 42 }
=> {"foo"=>42}
irb(main):002:0> hash.keys[0].frozen?
=> true
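
The same behavior can be observed from plain Ruby. The following is a minimal sketch, assuming plain String keys and `Hash#[]=` (which goes through rb_hash_aset()); the variable names are illustrative only. It shows that an unfrozen key is replaced by a frozen copy, while an already-frozen key is stored as-is, which is the case this patch relies on.

```ruby
# Illustrative only: observe how Hash#[]= treats unfrozen vs. frozen String keys.
unfrozen_key = String.new("foo")          # a mutable String
frozen_key   = String.new("bar").freeze   # frozen before insertion

hash = {}
hash[unfrozen_key] = 1
hash[frozen_key]   = 2

stored_first, stored_second = hash.keys

puts stored_first.frozen?                # true: the stored key is frozen
puts stored_first.equal?(unfrozen_key)   # false: the hash holds a frozen copy
puts stored_second.equal?(frozen_key)    # true: the frozen key was stored without copying
```

The original `unfrozen_key` stays mutable; only the copy held by the hash is frozen.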

With this patch, JSON.parse() is about 14% faster in the benchmark below (144.882 i/s before, 165.608 i/s after).

Environment

  • Machine : MacBook (Retina, 12-inch, 2017)
  • OS : macOS 10.13.3
  • Compiler : Apple LLVM version 9.1.0 (clang-902.0.31)
  • Ruby : ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]

Before

$ ruby bench.rb
Warming up --------------------------------------
                json    14.000  i/100ms
Calculating -------------------------------------
                json    144.882  (± 1.4%) i/s -    728.000  in   5.025682s

After

$ ruby bench.rb
Warming up --------------------------------------
                json    16.000  i/100ms
Calculating -------------------------------------
                json    165.608  (± 1.8%) i/s -    832.000  in   5.025367s
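
For reference, the speedup implied by the two runs above can be computed directly from the reported iterations per second. This is a small sketch; the variable names are illustrative and the figures are copied from the Before and After output.

```ruby
# Iterations per second copied from the benchmark output above.
before_ips = 144.882
after_ips  = 165.608

speedup = after_ips / before_ips      # ratio of after to before
percent = (speedup - 1.0) * 100.0     # relative improvement in percent

puts format("%.2fx (%.1f%% faster)", speedup, percent)
```

The reported error margins (± 1.4% and ± 1.8%) are small compared with the difference between the two runs.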

Test code

require 'json'
require 'objspace'
require 'securerandom'
require 'benchmark/ips'

obj = []

1000.times do |i|
  obj << {
    "id": i,
    "uuid": SecureRandom.uuid,
    "created_at": Time.now
  }
end

json = obj.to_json

Benchmark.ips do |x|
  x.report "json" do |iter|
    count = 0
    while count < iter
      JSON.parse(json)
      count += 1
    end
  end
end

@Watson1978 Watson1978 changed the title from "Improve JSON_parse_string() performance" to "Improve JSON.parse() performance" on Feb 27, 2018
When a non-frozen string is used as a hash key with `rb_hash_aset()`, Ruby duplicates the string and freezes the copy internally.
To avoid that extra duplicate-and-freeze step, this patch passes an already-frozen string to `rb_hash_aset()`.

### Before
```
Warming up --------------------------------------
                json    14.000  i/100ms
Calculating -------------------------------------
                json    148.844  (± 1.3%) i/s -    756.000  in   5.079969s
```

### After
```
Warming up --------------------------------------
                json    16.000  i/100ms
Calculating -------------------------------------
                json    165.608  (± 1.8%) i/s -    832.000  in   5.025367s
```

### Test code
```ruby
require 'json'
require 'securerandom'
require 'benchmark/ips'

obj = []

1000.times do |i|
  obj << {
    "id": i,
    "uuid": SecureRandom.uuid,
    "created_at": Time.now
  }
end

json = obj.to_json

Benchmark.ips do |x|
  x.report "json" do |iter|
    count = 0
    while count < iter
      JSON.parse(json)
      count += 1
    end
  end
end
```
@glebm

glebm commented Apr 21, 2019

@flori Anything preventing this from getting merged? It gives a nice 15% performance increase.

@hsbt
Collaborator

hsbt commented Jun 25, 2020

@Watson1978 Can you rebase this from the current master?

@marcandre
Contributor

Rebased in #420. @hsbt please merge #420 and close this.

@hsbt hsbt closed this Jun 25, 2020
@marcandre
Contributor

This has been merged, thanks @Watson1978 for the PR 👍

@Watson1978
Contributor Author

Thanks!

@Watson1978 Watson1978 deleted the parse_string branch June 26, 2020 17:04
Watson1978 added a commit to Watson1978/oj that referenced this pull request Aug 5, 2021
When a non-frozen string is used as a hash key with rb_hash_aset(), Ruby duplicates the string and freezes the copy internally.

```c
static int
hash_aset_str(st_data_t *key, st_data_t *val, struct update_arg *arg, int existing)
{
    if (!existing && !RB_OBJ_FROZEN(*key)) {
	*key = rb_hash_key_str(*key);
    }
    return hash_aset(key, val, arg, existing);
}
```
Refer: https://github.com/ruby/ruby/blob/bda56a03a625793cb3fd110458c3f7323d73705e/hash.c#L2890-L2897

To avoid that extra duplicate-and-freeze step, this patch passes an already-frozen string to rb_hash_aset().

FYI:
When a string is used as a hash key, the key stored in the hash is always frozen (symbol keys are frozen as well).

```
irb(main):001:0> hash = { "foo" => 42, bar: 55 }
=> {"foo"=>42, :bar=>55}
irb(main):002:0> hash.keys[0].frozen?
=> true
irb(main):003:0> hash.keys[1].frozen?
=> true
```
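
One consequence of the internal copy is that later changes to the original string do not affect the key held by the hash. The following is a minimal sketch of that effect, assuming a plain String key and `Hash#[]=`; the names are illustrative only.

```ruby
# Illustrative only: the hash keeps its own frozen copy of an unfrozen String key.
key = String.new("name")   # a mutable String
hash = {}
hash[key] = 1              # the hash stores a frozen copy of "name"

key << "_changed"          # mutate the original after insertion

stored_key = hash.keys.first
puts stored_key            # name
puts hash.key?("name")     # true
puts hash.key?(key)        # false: key is now "name_changed"
```

Passing an already-frozen string skips this copy, since a frozen key cannot be mutated after insertion.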

This patch takes the same approach as flori/json#345

Benchmark        | before (i/s) | after (i/s) | speedup
--               | --           | --          | --
Oj.load          | 335.122k     | 422.081k    | 1.26x

### Environment
- MacBook Air (M1, 2020)
- macOS 12.0 beta 3
- Apple M1
- Ruby 3.0.2

### Before
```
Warming up --------------------------------------
             Oj.load    33.829k i/100ms
Calculating -------------------------------------
             Oj.load    335.122k (± 0.9%) i/s -      1.691M in   5.047682s
```

### After
```
Warming up --------------------------------------
             Oj.load    42.573k i/100ms
Calculating -------------------------------------
             Oj.load    422.081k (± 0.5%) i/s -      2.129M in   5.043373s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  x.report('Oj.load') { Oj.load(json) }
end
```
ohler55 pushed a commit to ohler55/oj that referenced this pull request Aug 5, 2021