Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize parsing #358

Merged
merged 9 commits into from Nov 24, 2019
Merged

Optimize parsing #358

merged 9 commits into from Nov 24, 2019

Conversation

michalsrb
Copy link
Contributor

Hi, I am working on a project that parses lots of json with DateTimes in it. The time spent parsing in chrono appeared quite high compared to other things in profiler, so I tried to optimize it.

Commit c12e4be adds benchmarks. Here is the resulting diff:

 name                                                 before ns/iter  after ns/iter  diff ns/iter   diff %  speedup 
 datetime::tests::bench_datetime_from_str             712              337                      -375  -52.67%   x 2.11 
 datetime::tests::bench_datetime_parse_from_rfc2822   252              181                       -71  -28.17%   x 1.39 
 datetime::tests::bench_datetime_parse_from_rfc3339   242              142                      -100  -41.32%   x 1.70

Individual commits have their diffs in them.

@michalsrb
Copy link
Contributor Author

Looks like it is failing with rust 1.13.0, I'll try to fix it...

@quodlibetor
Copy link
Contributor

niiice, thanks!

Copy link
Contributor

@quodlibetor quodlibetor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great! If it's easy enough for you to do I'm interested in seeing how big of a difference the inline changes would make, but otherwise I'm happy to merge this.

#[bench]
fn bench_datetime_parse_from_rfc2822(bh: &mut test::Bencher) {
bh.iter(|| {
let str = test::black_box("Wed, 18 Feb 2015 23:16:09 +0000");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wild, I did not realize that it's possible to use str as a variable name. I always just use s. (this is fine, just surprising to me.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize when I wrote it, but yeah, it is strange as identifier.

@@ -112,6 +112,7 @@ pub struct Parsed {

/// Checks if `old` is either empty or has the same value to `new` (i.e. "consistent"),
/// and if it is empty, set `old` to `new` as well.
#[inline(always)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm could we check what the bench results look like with just inline instead of inline(always)? My understanding is that that gives llvm more flexibility to decide whether to inline, although AIUI it's still just a hint anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remeasured again and got these results:

No attribute:

test datetime::tests::bench_datetime_from_str            ... bench:         347 ns/iter (+/- 14)
test datetime::tests::bench_datetime_parse_from_rfc2822  ... bench:         197 ns/iter (+/- 121)
test datetime::tests::bench_datetime_parse_from_rfc3339  ... bench:         166 ns/iter (+/- 8)

With #[inline]:

test datetime::tests::bench_datetime_from_str            ... bench:         344 ns/iter (+/- 9)
test datetime::tests::bench_datetime_parse_from_rfc2822  ... bench:         192 ns/iter (+/- 11)
test datetime::tests::bench_datetime_parse_from_rfc3339  ... bench:         167 ns/iter (+/- 15)

With #[inline(always)]:

test datetime::tests::bench_datetime_from_str            ... bench:         369 ns/iter (+/- 33)
test datetime::tests::bench_datetime_parse_from_rfc2822  ... bench:         175 ns/iter (+/- 9)
test datetime::tests::bench_datetime_parse_from_rfc3339  ... bench:         138 ns/iter (+/- 5)

Interestingly the bench_datetime_from_str got worse, unlike in my previous measurement. The gains may be too random. I am fine with omitting this commit, it may be too risky to force inlining.

@quodlibetor
Copy link
Contributor

So running it locally it seems like the always has no meaningful effect on runtime, if you'd like to verify and remove it that would be fine, otherwise I'll merge this as-is this weekend.

@michalsrb
Copy link
Contributor Author

Ok, just to clarify, should the #[inline] stay without the always, or should it go away completely?

@quodlibetor
Copy link
Contributor

I'd prefer it to stay without the always unless there's evidence that losing the always makes it slower.

@michalsrb
Copy link
Contributor Author

Rebased on master to fix conflicts and changed the #[inline(always)] to #[inline].

@quodlibetor
Copy link
Contributor

This looks great! The issues are coming from the new "std is behind a feature gate" thing that caused conflicts, I can fix them but if you want to I think everything that you're doing can either switch to core or put things behind a feature gate. Also if you want to deal with it you can feel free to stick an #![allow(deprecated)] in the crate root to make the try! warnings go away, I've got a commit to transition to ? that I didn't want to conflict with your changes, ironically.

The #[cfg(bench)] attribute does not exist and is always false. Lets
define a feature "bench" which can be used to enable benchmarks when
building with nightly.
The parse::parse and format::format functions accepted Iterator of owned
Items. While it is sometimes convenient to pass in the owned values,
neither of the functions really need to own them, so references would
be enough. The Borrow trait allows us to pass in Iterator over values,
references, boxes, etc.

According to RFC 1105 this is a minor change, because it shouldn't break
any existing code. And chrono is in pre-1.0 version anyway.

This allows us to remove multiple cloned() calls which speeds up parsing
and formating:

 name                                                 control ns/iter  remove-cloned ns/iter  diff ns/iter   diff %  speedup
 datetime::tests::bench_datetime_from_str             712              582                            -130  -18.26%   x 1.22
 datetime::tests::bench_datetime_parse_from_rfc2822   252              244                              -8   -3.17%   x 1.03
 datetime::tests::bench_datetime_parse_from_rfc3339   242              239                              -3   -1.24%   x 1.01
The Item::Space calls str::trim_left and Item::Numeric also calls
str::trim_left before doing anything else, so there is no need to
have Item::Space before Item::Numeric.

Speeds up parsing:
 name                                                 remove-cloned ns/iter  simplify-from-str ns/iter  diff ns/iter   diff %  speedup
 datetime::tests::bench_datetime_from_str             582                    448                                -134  -23.02%   x 1.30
 datetime::tests::bench_datetime_parse_from_rfc2822   244                    242                                  -2   -0.82%   x 1.01
 datetime::tests::bench_datetime_parse_from_rfc3339   239                    234                                  -5   -2.09%   x 1.02
The original would first check that there is right amount of numeric
characters and then parsed them using the std::str::parse, which
internally checks the characters again and also checks for -/+ prefix,
which is not necessary in this case.

Since we are already going over the characters, we may as well do the
parsing ourselves. The length of the function is roughly the same and
it is faster:

 name                                                 simplify-from-str ns/iter  reimplement-number ns/iter  diff ns/iter   diff %  speedup
 datetime::tests::bench_datetime_from_str             448                        365                                  -83  -18.53%   x 1.23
 datetime::tests::bench_datetime_parse_from_rfc2822   242                        195                                  -47  -19.42%   x 1.24
 datetime::tests::bench_datetime_parse_from_rfc3339   234                        166                                  -68  -29.06%   x 1.41
Speedups:
 datetime::tests::bench_datetime_from_str             365                         337                       -28   -7.67%   x 1.08
 datetime::tests::bench_datetime_parse_from_rfc2822   195                         181                       -14   -7.18%   x 1.08
 datetime::tests::bench_datetime_parse_from_rfc3339   166                         142                       -24  -14.46%   x 1.17
The question mark will be fixed in a pending commit
Something that wasn't part of this PR: the work to support nested
`Option<[ChronoType]>` was merged without being adjusted for the no-std
support

And now there are some unused import warnings because they need to get
configged out.
@quodlibetor quodlibetor merged commit 46f8267 into chronotope:master Nov 24, 2019
@quodlibetor
Copy link
Contributor

Thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants