Optimize parsing option args #683
It looked up one by one to make sure that the supported options were included in passed options.
This approach of looking up the passed hash each time has an overhead.

This patch will extract key and value from the passed options using `rb_hash_foreach()` and check if they are supported or not.

| − | before | after | result |
| -- | -- | -- | -- |
| Oj.load | 381.671k | 419.917k | 1.10x |
| Oj.dump | 507.221k | 578.877k | 1.14x |

### Environment
- MacBook Air (M1, 2020)
- macOS 12.0 beta 3
- Apple M1
- Ruby 3.0.2

### Before
```
Warming up --------------------------------------
             Oj.load    38.287k i/100ms
             Oj.dump    51.076k i/100ms
Calculating -------------------------------------
             Oj.load    381.671k (± 0.6%) i/s -      1.914M in   5.015868s
             Oj.dump    507.221k (± 0.5%) i/s -      2.554M in   5.035014s
```

### After
```
Warming up --------------------------------------
             Oj.load    42.352k i/100ms
             Oj.dump    58.022k i/100ms
Calculating -------------------------------------
             Oj.load    419.917k (± 0.5%) i/s -      2.118M in   5.043048s
             Oj.dump    578.877k (± 0.4%) i/s -      2.901M in   5.011700s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  x.report('Oj.load') { Oj.load(json, symbol_keys: true) }

  data = Oj.load(json, symbol_keys: true)
  x.report('Oj.dump') { Oj.dump(json, mode: :compat) }
end
```
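For readers who do not follow the C side, the idea in the description can be sketched in plain Ruby. This is only an illustration, not Oj's actual implementation: the `SUPPORTED` table and both method names are hypothetical, and the real code works on the C level with `rb_hash_foreach()`.

```ruby
# Hypothetical subset of supported option names (assumption for illustration).
SUPPORTED = { symbol_keys: true, mode: true, indent: true, circular: true }.freeze

# Old approach: one lookup in the passed hash per supported option,
# regardless of how few options the caller actually passed.
def parse_by_lookup(opts)
  result = {}
  SUPPORTED.each_key do |key|
    result[key] = opts[key] if opts.key?(key)
  end
  result
end

# New approach: walk the passed hash once and check each key against
# the supported set, so the work scales with the options passed.
def parse_by_iteration(opts)
  result = {}
  opts.each do |key, value|
    result[key] = value if SUPPORTED.key?(key)
  end
  result
end

opts = { symbol_keys: true, unknown: 1 }
puts parse_by_lookup(opts) == parse_by_iteration(opts)
```

Both methods accept the same options and ignore unsupported keys; the difference is only in how many hash operations a call with one or two options has to pay for.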
c3c5627
to
662ede1
Compare
Can we count on the symbols always being the same? In other words, could they change, or are they always interned?
Unlike objects such as strings, a given symbol is always assigned the same unique address.
However, it may be safer to compare the values converted by SYM2ID().
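The same property can be observed from the Ruby side. A minimal sketch, assuming nothing beyond core Ruby: `equal?` compares object identity, which is roughly what a direct `VALUE` comparison does in the C extension.

```ruby
# Two references to the same symbol name, one built at runtime.
a = :symbol_keys
b = "symbol_keys".to_sym
same_symbol = a.equal?(b)

# Two strings with the same content are still distinct objects.
s1 = String.new("symbol_keys")
s2 = String.new("symbol_keys")
same_string = s1.equal?(s2)
equal_content = (s1 == s2)

puts "symbols identical: #{same_symbol}"
puts "strings identical: #{same_string}"
puts "strings equal by content: #{equal_content}"
```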
If IDs are guaranteed to be the same whenever the content is the same, for the life of the program, then that might be better.
Since we use rb_gc_register_address() to register the symbol, the symbol will not be freed. Here is one place where it is used: Line 1881 in 62aea59
As long as the symbols or IDs don't change I'm okay with either one.
Thanks. I'm going to stop updating this to use IDs, because we already compare symbols directly in several places. Lines 1063 to 1076 in 62aea59
ok
Oops, I found the mistake in
I should pass
OK, I retook the measurements.
Before
After
Test code
```ruby
require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  x.report('Oj.load') { Oj.load(json, symbol_keys: true) }

  data = Oj.load(json, symbol_keys: true)
  x.report('Oj.dump') { Oj.dump(data, mode: :compat) }
end
```
The MR was already merged. You will have to create a new branch. I'd like to get the MR I put up merged as well if you can take a look.
No worries. It is just a mistake in the pull request description.
This PR will optimize the parsing options in `JSON.parse`.
It looked up one by one to make sure that the supported options were included in passed options.
This approach of looking up the passed hash each time has an overhead.

This PR will apply the same change as ohler55#683

| − | before | after | result |
| -- | -- | -- | -- |
| JSON.parse | 518.709k | 528.462k | 1.019x |

### Environment
- MacBook Pro (M1 Max, 2021)
- macOS 12.0
- Apple M1 Max
- Ruby 3.0.2

### Before
```
Warming up --------------------------------------
          JSON.parse    51.973k i/100ms
Calculating -------------------------------------
          JSON.parse    518.709k (± 0.3%) i/s -      5.197M in  10.019799s
```

### After
```
Warming up --------------------------------------
          JSON.parse    52.534k i/100ms
Calculating -------------------------------------
          JSON.parse    528.462k (± 0.5%) i/s -      5.306M in  10.040606s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

json =<<-EOF
{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "age": {
      "description": "Age in years which must be equal to or greater than zero.",
      "type": "integer",
      "minimum": 0
    }
  }
}
EOF

Benchmark.ips do |x|
  x.warmup = 10
  x.time = 10

  Oj.mimic_JSON
  x.report('JSON.parse') { JSON.parse(json, symbolize_names: true) }
end
```