This repository has been archived by the owner on Sep 24, 2022. It is now read-only.

Large TOML document performance #342

Closed
yuvadm opened this issue Oct 6, 2019 · 11 comments · Fixed by #349


yuvadm commented Oct 6, 2019

I'm attempting to read a large-ish TOML document like so:

println!("Reading toml size {}", s.len());
let t: toml::Value = toml::from_str(s).unwrap();

where the string s is around 4 MB (this is auto-generated TOML, obviously). The file is essentially a large number of very small tables in the following format:

[table-id-1]
name = "Foo"
url = "http://example.com"
tags = ["tag-name"]

The loading time is unreasonable: the CPU spins up to 100% and parsing takes far too long; I killed the process after more than a minute.

Am I wrong to expect this library to be able to handle files that big?
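
For readers trying to reproduce without downloading the linked file: a document of the same shape can be generated with a few lines of stdlib-only Rust. This is a hypothetical sketch; the table names and field values are invented to match the pattern above, not taken from the real file.

```rust
// Hypothetical reproduction sketch: generate a document of n tiny tables
// in the shape described in the issue.
fn generate_toml(n: usize) -> String {
    let mut s = String::with_capacity(n * 80);
    for i in 0..n {
        s.push_str(&format!(
            "[table-id-{}]\nname = \"Foo\"\nurl = \"http://example.com\"\ntags = [\"tag-name\"]\n\n",
            i
        ));
    }
    s
}

fn main() {
    // ~30k tables gives a document in the low-megabyte range,
    // similar in size to the one reported.
    let doc = generate_toml(30_000);
    println!("generated {} bytes", doc.len());
}
```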

alexcrichton (Collaborator) commented

Thanks for the report! Can you gist the file here that's being parsed so some profiling can be done locally to figure out where the time is being spent? Also, to confirm, are you compiling with --release?

yuvadm (Author) commented Oct 9, 2019

@alexcrichton sure, you can pull the file from here https://github.com/streamlib/library/blob/radiodb/library/radiodb.toml

As for --release, thanks for pointing that out: it improves performance significantly, but it still takes a few seconds (about 5 on my machine) to parse the entire file, which could still be improved.
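
As an aside, wall-clock figures like the ~5 s above are easy to capture with std::time::Instant. A minimal sketch, using a stand-in workload in place of toml::from_str so the snippet stays dependency-free:

```rust
use std::time::Instant;

// Stand-in workload; in a real measurement this would be toml::from_str.
fn parse_stand_in(s: &str) -> usize {
    s.bytes().filter(|&b| b == b'[').count()
}

fn main() {
    let doc = "[t]\nname = \"Foo\"\n".repeat(100_000);
    let start = Instant::now();
    let tables = parse_stand_in(&doc);
    // elapsed() reports the wall-clock time spent in the call above.
    println!("counted {} tables in {:?}", tables, start.elapsed());
}
```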

alexcrichton (Collaborator) commented

Thanks! Looks like there's some low-hanging fruit for us to optimize here, and agreed that we can improve this!

Some local profiling shows:

+   66.12%  toml     toml               [.] toml::de::headers_equal                                                            
+   16.28%  toml     toml               [.] <toml::de::MapVisitor as serde::de::MapAccess>::next_key_seed                      
+    8.52%  toml     libc-2.27.so       [.] __memcmp_avx2_movbe                                                                
+    6.11%  toml     toml               [.] <toml::de::MapVisitor as serde::de::MapAccess>::next_key_seed                      
     0.33%  toml     toml               [.] toml::tokens::Tokenizer::next                                                      
     0.30%  toml     toml               [.] <toml::tokens::CrlfFold as core::iter::traits::iterator::Iterator>::next           

Apparently the headers_equal function is super hot and a good candidate to optimize!

yuvadm (Author) commented Oct 9, 2019

IIUC, this function is part of some attempt to match an existing header name? Is there an easy fix here that I might be able to test?

gsquire commented Oct 14, 2019

I took a peek at the file posted above and it seems like there are a lot of headers, which aligns with the profile. Would something like rayon's all make sense here?

We could feature-flag the crate addition, but even that seems non-optimal for a one-line update.

est31 (Contributor) commented Oct 25, 2019

Apparently the headers_equal function is super hot and a good candidate to optimize!

As the person who added that function: note that it only exists on master and was added by commit 7c9b0a3. @yuvadm most likely used a crates.io release. It's slow both before and after that commit, as far as I can tell. The bad performance is caused by these two snippets:

https://github.com/alexcrichton/toml-rs/blob/9ed2903517fe1e63e70fb2138a22296aa434da9e/src/de.rs#L366-L381

https://github.com/alexcrichton/toml-rs/blob/9ed2903517fe1e63e70fb2138a22296aa434da9e/src/de.rs#L500-L503

As the find function is called for every single table, it creates an O(n^2) buildup when you have n tables (before 7c9b0a3 there was a == instead of a headers_equal invocation, but the behaviour was the same).

A speedup can most likely be attained by creating a lookup structure to get sublinear lookup times.
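
The lookup-structure idea can be sketched in stdlib-only Rust. This is an illustration of the approach, not the actual patch in #349: instead of comparing each new header against every previously seen header, keep a HashSet of seen headers and test membership in expected O(1).

```rust
use std::collections::HashSet;

// O(n^2): compare each new header against all previously seen ones,
// as the find loop described above effectively does.
fn check_duplicates_linear(headers: &[Vec<String>]) -> bool {
    for (i, h) in headers.iter().enumerate() {
        if headers[..i].iter().any(|seen| seen == h) {
            return true;
        }
    }
    false
}

// O(n) expected: a lookup structure gives sublinear per-header checks.
fn check_duplicates_hashed(headers: &[Vec<String>]) -> bool {
    let mut seen = HashSet::new();
    headers.iter().any(|h| !seen.insert(h.clone()))
}

fn main() {
    // Same header count as the linked file.
    let headers: Vec<Vec<String>> =
        (0..31_315).map(|i| vec![format!("table-id-{}", i)]).collect();
    // The hashed version handles the full header count comfortably;
    // the linear version is only fed a slice to keep the run short.
    assert!(!check_duplicates_hashed(&headers));
    assert!(!check_duplicates_linear(&headers[..1_000]));
    println!("no duplicates found");
}
```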

est31 (Contributor) commented Oct 25, 2019

Yeah the file linked has 31315 headers, and squaring that gives a very large number...
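
Concretely, squaring 31315 gives just under a billion worst-case header comparisons, which helps explain the slowdown:

```rust
fn main() {
    let headers: u64 = 31_315;
    // Worst-case pairwise comparisons on the O(n^2) path.
    let worst_case = headers * headers;
    assert_eq!(worst_case, 980_629_225);
    println!("{} headers -> {} comparisons", headers, worst_case);
}
```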

est31 (Contributor) commented Oct 25, 2019

@yuvadm could you try my PR #349? It should fix the speed issue.

yuvadm (Author) commented Oct 26, 2019

@est31 your fix looks awesome on my side! Brings performance back to where it should be.

alexcrichton (Collaborator) commented

Thanks for looking into this and fixing it @est31!

yuvadm (Author) commented Oct 28, 2019

Thanks again @est31 this is amazing and @alexcrichton thanks for the quick version bump!
