Large TOML document performance #342
Comments
Thanks for the report! Can you gist the file here that's being parsed so some profiling can be done locally to figure out where the time is being spent? Also, to confirm, are you compiling with
@alexcrichton sure, you can pull the file from here: https://github.com/streamlib/library/blob/radiodb/library/radiodb.toml As for
Thanks! Looks like there's definitely some low hanging fruit for us to optimize here, and agreed that we can definitely improve this! Some local profiling shows:
Apparently the
IIUC this function is part of some attempt to match an existing header name? Any easy fix here that I might be able to test?
I took a peek at the file posted above and it seems like there are a lot of headers, which aligns with the profile. Would something like rayon's We could feature-flag the crate addition, but even that seems non-optimal for a one-line update.
As the person who added that function: note that it only exists in master and was added by commit 7c9b0a3. @yuvadm most likely used a crates.io release. It's slow both before and after the commit, as far as I can tell. The bad performance is caused by these two snippets: As the A speedup can most likely be attained by creating a lookup structure to get sublinear lookup times.
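To illustrate the lookup-structure idea, here is a minimal sketch. It is not the toml-rs code: `Table`, `find_linear`, and `HeaderIndex` are hypothetical names made up for this example, and it assumes headers can be compared as plain strings. It contrasts a linear scan over previously seen headers with a hash-based index.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed table; not a toml-rs type.
struct Table {
    header: String,
}

/// Linear scan: each lookup is O(n), so checking every new header against
/// all earlier ones costs O(n^2) over the whole document.
fn find_linear(tables: &[Table], header: &str) -> Option<usize> {
    tables.iter().position(|t| t.header == header)
}

/// Hypothetical index from header name to the position of the first table
/// with that header, giving expected O(1) lookups.
struct HeaderIndex {
    positions: HashMap<String, usize>,
}

impl HeaderIndex {
    fn new() -> Self {
        HeaderIndex { positions: HashMap::new() }
    }

    /// Records `header` at `idx` if it has not been seen before.
    /// Returns the index of the earlier table when the header already exists,
    /// leaving the stored index unchanged.
    fn insert(&mut self, header: &str, idx: usize) -> Option<usize> {
        match self.positions.get(header) {
            Some(&prev) => Some(prev),
            None => {
                self.positions.insert(header.to_string(), idx);
                None
            }
        }
    }

    /// Returns the index of the table with this header, if any.
    fn find(&self, header: &str) -> Option<usize> {
        self.positions.get(header).copied()
    }
}
```

With many thousands of headers, as in the linked file, the difference between the two approaches is the difference between quadratic and roughly linear total work.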
Yeah the file linked has
@est31 your fix looks awesome on my side! Brings performance back to where it should be.
Thanks for looking into this and fixing it @est31!
Thanks again @est31, this is amazing, and @alexcrichton, thanks for the quick version bump!
I'm attempting to read a large-ish TOML document like so:
where the str size is around 4 MB (this is an auto-generated TOML, obviously). The file is essentially a lot of very small tables in the following format:
The loading time is unreasonable, spinning the CPU up to 100% and taking way too long, over a minute before I kill the process.
Am I wrong to expect this library to be able to handle files that big?
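For anyone wanting to reproduce this without the original file, a document of the same general shape (many small tables) can be generated synthetically. The sketch below is only an approximation: the function name `synthetic_toml` and the table and key names are made up for illustration and do not match the real file's format.

```rust
use std::fmt::Write;

/// Builds a TOML document made of `n` small tables.
/// The table names and keys are hypothetical, chosen only for illustration.
fn synthetic_toml(n: usize) -> String {
    let mut doc = String::new();
    for i in 0..n {
        // Writing to a String cannot fail, so unwrap is safe here.
        writeln!(doc, "[station_{}]", i).unwrap();
        writeln!(doc, "name = \"Station {}\"", i).unwrap();
        writeln!(doc, "url = \"http://example.com/{}\"", i).unwrap();
        // Blank line between tables for readability.
        doc.push('\n');
    }
    doc
}
```

The resulting string can be fed to whichever parse call is being measured, timed with `std::time::Instant`, and scaled up by increasing `n` to see how parse time grows with the number of tables.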