Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make JSON::Parser initialization faster #512

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

luke-gru
Copy link

@luke-gru luke-gru commented Jan 29, 2023

Remove calls to rb_funcall, don't go back in to interpreter if not necessary. Also remove unnecessary calls to rb_respond_to(val) when val is nil. Speedup of JSON.parse when parsing small amount of data in a tight loop is up to 40%.

@luke-gru
Copy link
Author

luke-gru commented Jan 29, 2023

Here's the benchmark code:
https://gist.github.com/luke-gru/adf7468fbc0a3807b5f9a875fcfce116

On my machine:

Rehearsal ----------------------------------------------

json parse (old)  4.140547   0.001651   4.142198 (  4.262071)

------------------------------------- total: 4.142198sec



                 user     system      total        real

json parse (old) 4.099210   0.011923   4.111133 (  4.246992)

Rehearsal ----------------------------------------------

json parse (new)   3.163645   0.003476   3.167121 (  3.290742)

------------------------------------- total: 3.167121sec



                 user     system      total        real

json parse (new)  3.121472   0.004022   3.125494 (  3.198764)

@luke-gru luke-gru force-pushed the remove_unnecessary_funcall_calls branch 2 times, most recently from cbd1224 to 32a2f66 Compare January 29, 2023 14:52
@mensfeld
Copy link

Speedup of JSON.parse when parsing small amount of data in a tight loop is up to 40%.

😮

@hsbt
Copy link
Collaborator

hsbt commented Jan 30, 2023

@luke-gru Can you keep the current indent style? We can view only code changes with https://github.com/flori/json/pull/512/files?w=1. But style changes may causes code-conflict when I backport this to ruby/ruby.

@luke-gru
Copy link
Author

luke-gru commented Jan 30, 2023

HI @hsbt, for the JSON init function, it's a bit subtle but I had to remove the check if (!NIL_P(opts)) and move it below where I could assure that opts was a Hash, so I could use RHASH_SIZE() on it. So for that function it's not just a style change. For parser.c, I think I used a different version of ragel and all the whitespace changed. I could fix this, do you want me to look into it?

EDIT: my ragel is Ragel State Machine Compiler version 6.10 March 2017

@hsbt
Copy link
Collaborator

hsbt commented Jan 30, 2023

So for that function it's not just a style change.

I know that. I say parser.rl and parser.h. I hope you make your change is only https://github.com/flori/json/pull/512/files?w=1, not https://github.com/flori/json/pull/512/files.

@luke-gru luke-gru force-pushed the remove_unnecessary_funcall_calls branch 2 times, most recently from 877eed0 to b6185b5 Compare January 30, 2023 22:58
@luke-gru
Copy link
Author

Understood, sorry for the confusion. Just updated it.

Remove calls to rb_funcall, don't go back in to interpreter if not
necessary. Also remove unnecessary calls to rb_respond_to(val) when val
is nil. Speedup of JSON.parse when parsing small amount of data in a tight loop
is up to 40%.
@luke-gru luke-gru force-pushed the remove_unnecessary_funcall_calls branch from b6185b5 to e525bed Compare January 30, 2023 23:06
Comment on lines 358 to 359
if (json->decimal_class != Qnil) {
if (rb_respond_to(json->decimal_class, i_try_convert)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems indents broken.

If you want to keep the indent of the following block,

Suggested change
if (json->decimal_class != Qnil) {
if (rb_respond_to(json->decimal_class, i_try_convert)) {
if (NIL_P(json->decimal_class)) {
/* use the default conversion */
} else if (rb_respond_to(json->decimal_class, i_try_convert)) {

#endif
if (RTEST(opts) && RHASH_SIZE(opts) > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why moved the check for opts?
It seems that the previous #ifndef HAVE_RB_SCAN_ARGS_OPTIONAL_HASH block must also be inside this condition.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

#define option_given_p(opts, key) RTEST(rb_funcall(opts, i_key_p, 1, key))
extern VALUE rb_hash_has_key(VALUE hash, VALUE key);

#define hash_option_given_p(opts, key) (RTEST(rb_hash_has_key(opts, key)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this name need to be changed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a problem with using a non public API?

When I tried before, I created my own function with the same effect. #481

Copy link
Author

@luke-gru luke-gru Feb 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Watson1978 I wasn't aware of your PR, I should have checked the other PRs more thoroughly before coding up my solution. Yours is better IMO with using only public API, and also making changes to JSON.generate. The only thing that my PR has that's different in the initialize function is that it checks if the hash is empty before going through all the option checks, which yours could add if you want. I also think your other PR (#454) could be merged with the changes in #481.

I'm fine with closing my PR now if yours gets a second look. However if you're busy I'll just take some ideas from your PR and add to mine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a problem with using a non public API?

Yes, that is not respecting the API and will likely break. Also it would likely not work e.g. on truffleruby needlessly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fixed

@hsbt
Copy link
Collaborator

hsbt commented Feb 7, 2023

I accept this proposal. But I'm not familiar with C internals. I requested additional review for @nobu

@mensfeld
Copy link

@luke-gru @hsbt @nobu any ETA on this being released?

@luke-gru
Copy link
Author

Yeah, I'll try to get this updated some time today. Thanks for the reminder 😄

@luke-gru
Copy link
Author

luke-gru commented Apr 3, 2023

I finally got around to working on this. I added a 2nd commit so you can see the changes from the first. When approved, I will squash them.

@mensfeld
Copy link

mensfeld commented May 8, 2024

@luke-gru @hsbt @nobu any ETA on this being released? :D (it's more than a year from the last time I asked so hope you don't mind)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants