Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowness #123

Open
dpc opened this issue Jan 4, 2019 · 11 comments
Open

Slowness #123

dpc opened this issue Jan 4, 2019 · 11 comments

Comments

@dpc
Copy link

dpc commented Jan 4, 2019

As soon as you enable RUST_LOG, even for just a small module that logs infrequently, the performance of your program is going to go drastically downhill.

Example: https://i.imgur.com/al4BJUn.png

+   94.04%     0.39%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <serde_json::read::IoRead<R> as serde_json::read::Read<
+   93.61%     0.48%  rust-bitcoin-in  rust-bitcoin-indexer  [.] serde_json::read::next_or_eof                          
+   93.19%     0.60%  rust-bitcoin-in  rust-bitcoin-indexer  [.] std::io::read_one_byte                                 
+   92.55%     0.14%  rust-bitcoin-in  rust-bitcoin-indexer  [.] std::io::impls::<impl std::io::Read for &'a mut R>::rea
+   92.43%     1.50%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <hyper::client::response::Response as std::io::Read>::r
+   91.04%     0.27%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <hyper::http::h1::Http11Message as std::io::Read>::read
+   90.72%     2.96%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <hyper::http::h1::HttpReader<R> as std::io::Read>::read
-   86.85%    82.85%  rust-bitcoin-in  rust-bitcoin-indexer  [.] log::__log                                             
   - 80.92% <serde_json::read::IoRead<R> as serde_json::read::Read<'de>>::parse_str                                     
      + serde_json::read::next_or_eof                                                                                   
   + 4.00% log::__log                                                                                                   
   + 1.91% jsonrpc::client::Client::send_request                                                                        
+    2.44%     0.02%  rust-bitcoin-in  rust-bitcoin-indexer  [.] rust_bitcoin_indexer::db::parse_node_block             
+    2.20%     0.00%  rust-bitcoin-in  rust-bitcoin-indexer  [.] jsonrpc::client::Client::send_request                  
+    2.18%     0.00%  rust-bitcoin-in  rust-bitcoin-indexer  [.] serde::de::impls::<impl serde::de::Deserialize<'de> for
+    2.18%     0.00%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <&'a mut serde_json::de::Deserializer<R> as serde::de::
+    2.17%     0.00%  rust-bitcoin-in  rust-bitcoin-indexer  [.] serde_json::de::from_reader                            
+    2.01%     1.91%  rust-bitcoin-in  rust-bitcoin-indexer  [.] log::logger                                            
+    1.71%     0.43%  rust-bitcoin-in  rust-bitcoin-indexer  [.] <env_logger::Logger as log::Log>::log                  
+    1.59%     0.13%  rust-bitcoin-in  rust-bitcoin-indexer  [.] core::fmt::write                                       
+    1.44%     1.21%  rust-bitcoin-in  rust-bitcoin-indexer  [.] bitcoin::util::misc::he                     

As you can see, just because serde_json::read::Read has some trace! in parse_str somewhere, is completely destroying my program, for which I've enabled debugging just for my main module.

Is there anything that can be done about it?

@KodrAus
Copy link
Collaborator

KodrAus commented Jan 4, 2019

Does adding a filter to exclude logs from serde_json improve the situation at all?

As far as I know, serde_json::read::Read takes 1 byte at a time, so it's a pretty hot path. It probably shouldn't need to be logging at all. I'll have a look at what it's up to.

@KodrAus
Copy link
Collaborator

KodrAus commented Jan 4, 2019

So I looked at serde_json and it doesn't look like it does any logging.

@dpc
Copy link
Author

dpc commented Jan 4, 2019

Not sure what exactly happened there: might have been something to do with optimalization causing this to point to slightly different function, but I don't think it's very relevant. I've hit this issue before, and it looks it's caused by env_logger matching being slow, so any library that has trace! or something in a hot loop, will start to hog the cpu.

I was using a very simple RUST_LOG=rust_bicoin_indexer which is a name of my root crate. My root crate prints maybe one line every couple of seconds. But my CPU usage went through the roof. Now I wonder if it is going to match only the root module, or does it have to go through the full body search or something.

In

https://github.com/sebasmagri/env_logger/blob/master/src/filter/mod.rs#L164

as soon as the target is enabled, every logging statement will be run against a regex, and there's a to_string() that will allocate a string per every filter string (at very least I'd move it out of the loop). To prevent that allocation from happening at all, env_logger could use a per-thread String buffer to reuse the allocation across many calls.

Now I've noticed that I can actually disable regex through feature, so I'll try it later.
But in my case, I was hopping only a match against target/module will be made. And I don't understand why would make such a huge difference.

Maybe there's more of clever things that can be tried there to optimize the performance, or something is wrong somewhere.

@KodrAus
Copy link
Collaborator

KodrAus commented Jan 5, 2019

Ah right, that code hasn't been touched in a couple of years so there's probably plenty of opportunity to optimise it.

We can work on building up a decent set of benchmarks to track changes in performance.

@KodrAus
Copy link
Collaborator

KodrAus commented Jan 6, 2019

If somebody would like to work on env_logger performance I'd be happy to help them get started!

@PvdBerg1998
Copy link

I profiled my (rather large) backend server and to my surprise found that the time spent logging is orders of magnitude more than the actual program. I saw that slog has something where the printing is offloaded to a separate thread.

What would be your advice on this issue? Would switching to slog give any major improvements?

@dpc
Copy link
Author

dpc commented Jan 28, 2019

@PvdBerg1998 slog-envlogger is a fork of env_logger so it's filtering performance is going to be similiar, but other than this slog is supposed to be very fast, especially with slog-async employed to offload the serialization and io to another thread.

@KodrAus
Copy link
Collaborator

KodrAus commented Jan 28, 2019

Yeh, on the printing side env_logger doesn't do any batching so each log statement results in a line printed to the output stream. It's not particularly fast.

@mainrs
Copy link
Contributor

mainrs commented Jul 22, 2020

Is there anything we can/want to do on our side? Not sure if something like offloading the work is in-scope of the crate to be honest. I'm happy about speed improvements however, and I'll probably also take a closer look at the matching algorithm

@dpc
Copy link
Author

dpc commented Jul 22, 2020

IMO. env_logger (and generally the whole log ecosystem) should become more modular and env_logger itself should be split into a logical part (decide if something should get logged or not, based on env variables etc. and pass it to the "inner loger") and the formatting and outputting part. They could be combined together to give a backward compatible API.

I think my main problem was here was that each logging statement has to go through a regex (if regex feature is enabled). Maybe some tiny cache could provide a substantial speedup. Or maybe there's something else that can be done to avoid it.

@KodrAus
Copy link
Collaborator

KodrAus commented Jul 23, 2020

It would be nice to pull the filters out into a separate library, since they are pretty useful on their own, and for other loggers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants