Poor performance with rayon ParallelIterator #170
Comments
By global shared lock do you mean the |
No, stderr is unlikely to be the problem. I already tried reducing the rendering refresh rate to one per second and it didn't help at all. The problem is the fact that the whole
See this in
    pub struct ParProgressBarIter<T> {
        it: T,
        progress: Arc<Mutex<ProgressBar>>,
    }
And also updating the internal state requires locking:
    fn update_and_draw<F: FnOnce(&mut ProgressState)>(&self, f: F) {
        let mut draw = false;
        {
            let mut state = self.state.write().unwrap();
As a quick check I tried using
Here is a fragment of my original code:
    let file_scan_pb = ProgressBar::new_spinner();
    file_scan_pb.set_draw_target(ProgressDrawTarget::stdout_with_hz(1));
    file_scan_pb.set_style(ProgressStyle::default_spinner().template("Scanning files: {pos}"));
    let files = ... // obtain a parallel iterator
    files
        .progress_with(file_scan_pb)
        .for_each(|item| ...); // rayon's method is `for_each`
And after changing to the counter:
    let counter = RelaxedCounter::new(0);
    let files = ... // obtain a parallel iterator
    files
        .inspect(|_| { counter.inc(); })
        .for_each(|item| ...);
    println!("{}", counter.get()); // to keep the compiler from optimizing our counter out
This one is about 3x faster, just as fast as without the
Oh, I see the problem now. Could you try an experimental branch to see if it helps?
    let pb = ...
    let iter = pb.wrap_par_iter(files);
    iter.set_update_delta(1000);
    iter.for_each(|item| ...);
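For readers wondering what an update delta buys: the idea, as far as this thread describes it, is that each worker counts items locally and only takes the shared lock once per batch. Below is a minimal standard-library sketch of that batching idea. `SharedProgress`, `process_batched` and `run_workers` are hypothetical names made up for illustration, not indicatif's API, and the experimental branch may well implement it differently.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for lock-protected progress state (hypothetical, not indicatif's type).
#[derive(Default)]
pub struct SharedProgress {
    pub position: u64,
    /// Counts how often the lock was taken, to make the saving visible.
    pub lock_acquisitions: u64,
}

/// Adds `n` items to the shared position under the lock.
fn flush(progress: &Mutex<SharedProgress>, n: u64) {
    let mut state = progress.lock().unwrap();
    state.position += n;
    state.lock_acquisitions += 1;
}

/// Processes `items` items on the calling thread, taking the lock only once
/// per `update_delta` items instead of once per item.
pub fn process_batched(progress: &Mutex<SharedProgress>, items: u64, update_delta: u64) {
    assert!(update_delta > 0, "update_delta must be positive");
    let mut pending = 0u64;
    for _ in 0..items {
        // ... the real per-item work would happen here ...
        pending += 1;
        if pending == update_delta {
            flush(progress, pending);
            pending = 0;
        }
    }
    // Report the leftover items that did not fill a whole batch.
    if pending > 0 {
        flush(progress, pending);
    }
}

/// Runs `workers` threads and returns (final position, number of lock acquisitions).
pub fn run_workers(workers: usize, items_per_worker: u64, update_delta: u64) -> (u64, u64) {
    let progress = Arc::new(Mutex::new(SharedProgress::default()));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let progress = Arc::clone(&progress);
            thread::spawn(move || process_batched(&progress, items_per_worker, update_delta))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let state = progress.lock().unwrap();
    (state.position, state.lock_acquisitions)
}
```

With 4 workers, 1000 items each and a delta of 100, `run_workers(4, 1000, 100)` ends at position 4000 after only 40 lock acquisitions instead of 4000. The downside is that the displayed position can lag by up to one delta per worker, and a good delta depends on how fast items arrive.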
Oops, please make sure you also enable |
Does not compile :(
It looks like the errors are in indicatif, however I have already fixed them. Running |
Looks like it works, however I had to disable the refresh rate, because otherwise it didn't write anything at all. update_delta = 1:
update_delta = 10:
update_delta = 100:
No progressbar at all:
I'm not entirely happy with this workaround, though, because the "speed" at which my iterator is able to deliver items depends heavily on many circumstances, and hardcoding a magic number doesn't feel right to me. I'd rather the progress bar recorded all updates quickly, like in the example with the
Anyway, it is still better than nothing!
BTW: Is it possible to do a pull-based progress bar? Like - I could provide a |
You could write your own
    let counter = Arc::new(counter); // use this in inspect(counter.inc())
    let counter2 = Arc::clone(&counter);
    let pb2 = pb.clone();
    // `loop` is needed here: `break` is only valid inside a loop
    thread::spawn(move || loop {
        if Arc::strong_count(&counter2) == 1 {
            // intensive counting is over, stop updating in a separate thread
            break;
        }
        thread::sleep(interval);
        // RelaxedCounter::get returns usize, set_position takes u64
        pb2.set_position(counter2.get() as u64);
    });
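The pull-based pattern above can be tried without any crates. The following is a rough, self-contained sketch under a few assumptions: a plain `AtomicU64` stands in for the `RelaxedCounter`, and a `report` closure stands in for `pb.set_position`. `spawn_poller` and `run_with_poller` are hypothetical helper names, not part of indicatif or rayon.

```rust
use std::sync::atomic::{fence, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns a thread that calls `report` with the current counter value every
/// `interval`, and stops once it holds the last remaining handle to the counter.
pub fn spawn_poller<F>(counter: Arc<AtomicU64>, interval: Duration, mut report: F) -> JoinHandle<()>
where
    F: FnMut(u64) + Send + 'static,
{
    thread::spawn(move || loop {
        // If ours is the only handle left, every producer has dropped its clone.
        let finished = Arc::strong_count(&counter) == 1;
        if finished {
            // Pair with the release decrement done when an Arc is dropped, so the
            // final read below observes everything written before those drops.
            fence(Ordering::Acquire);
        }
        report(counter.load(Ordering::Relaxed));
        if finished {
            break;
        }
        thread::sleep(interval);
    })
}

/// Hypothetical driver: `workers` threads bump the counter, the poller reports it.
/// Returns the last value the poller reported.
pub fn run_with_poller(workers: usize, items_per_worker: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let last_seen = Arc::new(AtomicU64::new(0));
    let sink = Arc::clone(&last_seen);
    let poller = spawn_poller(Arc::clone(&counter), Duration::from_millis(5), move |pos| {
        // A real program would call something like `pb.set_position(pos)` here.
        sink.store(pos, Ordering::Relaxed);
    });
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..items_per_worker {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    drop(counter); // the poller now holds the only handle and will stop
    poller.join().unwrap();
    last_seen.load(Ordering::Relaxed)
}
```

The poller checks the handle count before reading the counter, so its last report includes all increments. The hot path is a single relaxed `fetch_add`, and the refresh interval is independent of how fast items arrive.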
Thank you! This is nice:
I believe this hides the progress bar once the counter is dropped? Is it right? |
TLDR: Yes if all instances of the same progress bar are dropped ( When a ProgressBar is dropped it is hidden unless you have called |
Using @mibac138 's fork gives me a lot of
I'm iterating over quite large values. :( |
Is this a use case that I'm also willing to make a PR to fix/enable this with something along the |
FWIW, this problem is also very visible outside of rayon: I'm parsing gzipped log files that have a few million lines, and depending on the option I pick I get vastly different speeds:
And when parsing another, non-gzipped log file:
Overall, I'd think that the best option might be to “just” make ProgressBar's internal counter an AtomicU64, and then instead of precomputing all the metadata on each WDYT? |
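One possible reading of the `AtomicU64` suggestion above, sketched with the standard library only: incrementing touches nothing but an atomic, and derived values such as the completed fraction or the rate are computed when a snapshot is requested, for example at draw time. `LazyProgress` and `Snapshot` are hypothetical names for illustration. This is not how indicatif is implemented.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Progress state whose hot path is a single atomic add (hypothetical type).
pub struct LazyProgress {
    pos: AtomicU64,
    len: u64,
    started: Instant,
}

/// Derived values, computed only when someone asks for them.
pub struct Snapshot {
    pub pos: u64,
    pub fraction: f64,
    pub per_sec: f64,
}

impl LazyProgress {
    pub fn new(len: u64) -> Self {
        LazyProgress {
            pos: AtomicU64::new(0),
            len,
            started: Instant::now(),
        }
    }

    /// Called once per item: no lock, no metadata computation.
    pub fn inc(&self, delta: u64) {
        self.pos.fetch_add(delta, Ordering::Relaxed);
    }

    /// Computes the metadata for "now"; meant to be called at the draw rate.
    pub fn snapshot(&self) -> Snapshot {
        self.snapshot_at(self.started.elapsed())
    }

    /// Same as `snapshot`, with the elapsed time supplied by the caller.
    pub fn snapshot_at(&self, elapsed: Duration) -> Snapshot {
        let pos = self.pos.load(Ordering::Relaxed);
        let fraction = if self.len == 0 {
            0.0
        } else {
            pos as f64 / self.len as f64
        };
        let secs = elapsed.as_secs_f64();
        let per_sec = if secs > 0.0 { pos as f64 / secs } else { 0.0 };
        Snapshot { pos, fraction, per_sec }
    }
}
```

For example, after `inc(50)` twice on a `LazyProgress::new(200)`, `snapshot_at(Duration::from_secs(2))` gives position 100, fraction 0.5 and 50 items per second. All of the arithmetic runs on the drawing side, not once per item.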
If you're reading this thread because the progressbar is slowing down your application and don't yet know exactly why, try using That is not to say that OP's problem isn't with the locks, it depends a lot on what kind of terminal you use AFAICT, but it just really wasn't that for me. |
If someone can retest this against current main and still sees bad performance, I'd like to hear about it. The internal locking is still pretty much the same, but a bunch of other improvements should have improved performance. |
When processing several hundreds of thousands of items per second with a ParallelIterator, ProgressBar becomes a bottleneck because it grabs a global shared lock whenever the state needs to be updated. Instead, it would be better to store internal progress for each rayon thread separately and sum these counters on read.
Having said that, as a Rust beginner I'm not sure how hard such an implementation would be in Rust, or whether Rust has a structure similar to Java's LongAdder (https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/LongAdder.html). RelaxedCounter from https://docs.rs/atomic-counter/1.0.1/atomic_counter/ maybe?
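To make the "one counter per thread, summed on read" idea concrete, here is a small standard-library sketch in the spirit of `LongAdder`. It is an illustration only: `ShardedCounter` and `count_in_parallel` are made-up names, the 64-byte alignment assumes a typical cache line size, and the worker index is passed in explicitly (with rayon one might use the current thread index instead).

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// One cell per shard, aligned to an assumed 64-byte cache line so that
/// threads updating different shards do not fight over the same line.
#[repr(align(64))]
struct Shard(AtomicU64);

/// A striped counter: writers touch only their own shard, readers sum all shards.
pub struct ShardedCounter {
    shards: Vec<Shard>,
}

impl ShardedCounter {
    pub fn new(num_shards: usize) -> Self {
        assert!(num_shards > 0, "at least one shard is required");
        let shards = (0..num_shards).map(|_| Shard(AtomicU64::new(0))).collect();
        ShardedCounter { shards }
    }

    /// Adds `n` to the shard selected by `worker` (for example a thread index).
    pub fn add(&self, worker: usize, n: u64) {
        let shard = &self.shards[worker % self.shards.len()];
        shard.0.fetch_add(n, Ordering::Relaxed);
    }

    /// Sums all shards. While writers are still running the result is only
    /// approximate, which is fine for displaying progress.
    pub fn sum(&self) -> u64 {
        self.shards.iter().map(|s| s.0.load(Ordering::Relaxed)).sum()
    }
}

/// Hypothetical driver: `workers` threads each count `items_per_worker` items.
pub fn count_in_parallel(workers: usize, items_per_worker: u64) -> u64 {
    let counter = Arc::new(ShardedCounter::new(workers));
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..items_per_worker {
                    counter.add(w, 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // All writers have been joined, so the sum is exact at this point.
    counter.sum()
}
```

A single shared atomic is often fast enough already. Sharding mainly helps when many cores increment at a very high rate and the cache line holding one shared counter becomes the contended resource.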