Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use rb_sym2str() to retrieve String object #687

Merged
merged 1 commit into from Aug 9, 2021

Conversation

Watson1978
Copy link
Collaborator

When it retrieves String object using rb_sym_to_s(), it duplicate the internal String of Symbol.

VALUE
rb_sym_to_s(VALUE sym)
{
    return str_new_shared(rb_cString, rb_sym2str(sym));
}

Refer: https://github.com/ruby/ruby/blob/6f6a84f2f3e82f9554618f080f9b66ded52884cc/string.c#L11136-L11140

There will be no need to duplicate it, since we haven't changed the retrieved String.
This patch will improve performance where it handles Symbols.

before after result
Oj.dump 807.090k 1.173M 1.45x
Oj.dump (compat) 760.372k 1.080M 1.42x
Oj.dump (rails) 656.976k 880.951k 1.34x

Environment

  • MacBook Air (M1, 2020)
  • macOS 12.0 beta 4
  • Apple M1
  • Ruby 3.0.2

Before

Warming up --------------------------------------
             Oj.dump    80.910k i/100ms
    Oj.dump (compat)    75.874k i/100ms
     Oj.dump (rails)    65.977k i/100ms
Calculating -------------------------------------
             Oj.dump    807.090k (± 0.9%) i/s -      4.046M in   5.012848s
    Oj.dump (compat)    760.372k (± 0.8%) i/s -      3.870M in   5.089417s
     Oj.dump (rails)    656.976k (± 0.8%) i/s -      3.299M in   5.021613s

After

Warming up --------------------------------------
             Oj.dump   119.895k i/100ms
    Oj.dump (compat)   107.509k i/100ms
     Oj.dump (rails)    87.631k i/100ms
Calculating -------------------------------------
             Oj.dump      1.173M (± 0.5%) i/s -      5.875M in   5.007040s
    Oj.dump (compat)      1.080M (± 0.7%) i/s -      5.483M in   5.075621s
     Oj.dump (rails)    880.951k (± 0.9%) i/s -      4.469M in   5.073530s

Test code

require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  data = Oj.load(json, symbol_keys: true)

  x.report('Oj.dump') { Oj.dump(data) }
  x.report('Oj.dump (compat)') { Oj.dump(data, mode: :compat) }
  x.report('Oj.dump (rails)') { Oj.dump(data, mode: :rails) }
end

When it retrieves String object using `rb_sym_to_s()`, it duplicate the internal String of Symbol.

```c
VALUE
rb_sym_to_s(VALUE sym)
{
    return str_new_shared(rb_cString, rb_sym2str(sym));
}
```
Refer: https://github.com/ruby/ruby/blob/6f6a84f2f3e82f9554618f080f9b66ded52884cc/string.c#L11136-L11140

There will be no need to duplicate it, since we haven't changed the retrieved String.
This patch will improve performance where it handles Symbols.

−               | before   | after    | result
--               | --       | --       | --
Oj.dump          | 807.090k | 1.173M   | 1.45x
Oj.dump (compat) | 760.372k | 1.080M   | 1.42x
Oj.dump (rails)  | 656.976k | 880.951k | 1.34x

### Environment
- MacBook Air (M1, 2020)
- macOS 12.0 beta 4
- Apple M1
- Ruby 3.0.2

### Before
```
Warming up --------------------------------------
             Oj.dump    80.910k i/100ms
    Oj.dump (compat)    75.874k i/100ms
     Oj.dump (rails)    65.977k i/100ms
Calculating -------------------------------------
             Oj.dump    807.090k (± 0.9%) i/s -      4.046M in   5.012848s
    Oj.dump (compat)    760.372k (± 0.8%) i/s -      3.870M in   5.089417s
     Oj.dump (rails)    656.976k (± 0.8%) i/s -      3.299M in   5.021613s
```

### After
```
Warming up --------------------------------------
             Oj.dump   119.895k i/100ms
    Oj.dump (compat)   107.509k i/100ms
     Oj.dump (rails)    87.631k i/100ms
Calculating -------------------------------------
             Oj.dump      1.173M (± 0.5%) i/s -      5.875M in   5.007040s
    Oj.dump (compat)      1.080M (± 0.7%) i/s -      5.483M in   5.075621s
     Oj.dump (rails)    880.951k (± 0.9%) i/s -      4.469M in   5.073530s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  data = Oj.load(json, symbol_keys: true)

  x.report('Oj.dump') { Oj.dump(data) }
  x.report('Oj.dump (compat)') { Oj.dump(data, mode: :compat) }
  x.report('Oj.dump (rails)') { Oj.dump(data, mode: :rails) }
end
```
@ohler55 ohler55 merged commit 1deb721 into ohler55:develop Aug 9, 2021
@ohler55
Copy link
Owner

ohler55 commented Aug 9, 2021

Thank you for helping so much.

@ohler55
Copy link
Owner

ohler55 commented Aug 9, 2021

BTW, posted an article on Hacker News about the new Oj parser.

@Watson1978 Watson1978 deleted the rb_sym2str branch August 9, 2021 13:06
@Watson1978
Copy link
Collaborator Author

BTW, posted an article on Hacker News about the new Oj parser.

Nice. I found the article :)
https://news.ycombinator.com/item?id=28115675

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants