This repository has been archived by the owner on Sep 24, 2022. It is now read-only.

Decrease deserialization complexity from quadratic to linear #349

Merged
merged 4 commits into from Oct 28, 2019

Conversation

est31
Contributor

@est31 est31 commented Oct 25, 2019

Fixes #342.

In particular, see my comment #342 (comment)

Values recorded on my machine for running `measure_time(n, |i| format!("[header_no_{}]\n", i))` for varying `n`:

| benchmark | optimizations | before this PR | after this PR |
|-----------|---------------|----------------|---------------|
| 1k        | no            | 170ms          | 48ms          |
| 10k       | no            | 14 191ms       | 483ms         |
| 100k      | no            | n/a            | 5 077ms       |
| 1k        | yes           | 5ms            | 4ms           |
| 10k       | yes           | 311ms          | 39ms          |
| 100k      | yes           | 63 850ms       | 364ms         |

You can nicely see that before this PR a 10x increase in data meant a roughly 100x increase in time spent, while after this PR it only means a roughly 10x increase.
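For reference, a minimal sketch of what such a benchmark driver could look like (the actual `measure_time` helper from #342 may differ; the function body below is an assumption):

```rust
use std::time::Instant;

// Sketch: build a TOML document with `n` entries produced by `gen`,
// then time how long parsing it takes.
fn measure_time(n: u32, gen: impl Fn(u32) -> String) {
    let mut input = String::new();
    for i in 0..n {
        input.push_str(&gen(i));
    }
    let start = Instant::now();
    let _parsed: toml::Value = input.parse().expect("valid TOML");
    println!("n = {}: {:?}", n, start.elapsed());
}

fn main() {
    for &n in &[1_000, 10_000, 100_000] {
        measure_time(n, |i| format!("[header_no_{}]\n", i));
    }
}
```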

Also add regression test
@est31 est31 changed the title Increase deserialization speed from quadratic to linear Increase deserialization speed from quadratic time to linear Oct 25, 2019
@est31 est31 changed the title Increase deserialization speed from quadratic time to linear Decrease deserialization complexity from quadratic to linear Oct 25, 2019
Collaborator

@alexcrichton alexcrichton left a comment

Thanks for this @est31!

Can this also include documentation as to what these two intermediate tables are? Either on the struct fields or on the functions that construct them.

src/de.rs Outdated
.and_then(|entries| {
    let start = entries
        .binary_search(&self.cur)
        .unwrap_or_else(std::convert::identity);
Collaborator

I'd personally prefer if this were `unwrap_or_else(|i| i)`
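For context, `binary_search` returns `Result<usize, usize>`: `Ok` with the index of a match, or `Err` with the insertion point. Either closure collapses both cases into a usable start index; a small standalone example (the values here are made up):

```rust
fn main() {
    let entries = [1, 3, 5, 7];

    // Ok(i): the key was found at index i.
    let found = entries.binary_search(&5).unwrap_or_else(|i| i);
    // Err(i): the key is absent; i is where it would be inserted.
    let missing = entries.binary_search(&4).unwrap_or_else(|i| i);

    assert_eq!(found, 2);
    assert_eq!(missing, 2);
}
```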

@alexcrichton
Collaborator

👍

@alexcrichton
Collaborator

It looks like the tests added here may be causing a spurious failure on CI?

@est31
Contributor Author

est31 commented Oct 28, 2019

Huh, that's weird. It works on my machine, but it had already failed on CI before, so I increased the tolerance value; it seems further increases are needed, though, since the difference is quite large. I could increase the sample size, which should lead to less noise, and maybe increase the multiplier as well to allow for even larger tolerances. Does that sound reasonable?
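For illustration, a hedged sketch of the kind of tolerance-based linearity check being discussed (the actual test in the PR may differ; the sizes and the multiplier value here are assumptions):

```rust
use std::time::Instant;

// Parse a synthetic document with `n` table headers and return the elapsed time.
fn time_parse(n: u32) -> std::time::Duration {
    let input: String = (0..n).map(|i| format!("[header_no_{}]\n", i)).collect();
    let start = Instant::now();
    let _: toml::Value = input.parse().expect("valid TOML");
    start.elapsed()
}

fn main() {
    let small = time_parse(1_000);
    let large = time_parse(10_000);

    // With linear complexity the ratio should stay near 10; a quadratic
    // parser would push it toward 100. The multiplier leaves headroom
    // for noisy machines.
    let multiplier = 30.0;
    assert!(large.as_secs_f64() < small.as_secs_f64() * multiplier);
}
```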

@alexcrichton
Collaborator

CI machines (VMs) tend to be extremely noisy in terms of measurements, so I think it's fine to perhaps remove the test and move it to a benchmark which can be manually tracked over time. This is pretty unlikely to regress.

est31 added a commit to est31/toml-rs that referenced this pull request Oct 28, 2019
CI environments can be noisy and while the test worked great
locally on my machine, it didn't on the CI environment.
This replaces the test with a (manually tracked) benchmark.
As per toml-rs#349 (comment)
alexcrichton pushed a commit that referenced this pull request Oct 29, 2019
CI environments can be noisy and while the test worked great
locally on my machine, it didn't on the CI environment.
This replaces the test with a (manually tracked) benchmark.
As per #349 (comment)
Development

Successfully merging this pull request may close these issues.

Large TOML document performance