
Make --no-mmap calls still use parallelism when filesizes are large #361

Open
wants to merge 4 commits into master

Conversation

ultrabear

This change uses two buffers of 1 MiB each: while one buffer is being filled from the OS, the other is hashed with update_rayon. On my machine (Ryzen 2600) this is roughly twice as fast as plain update_reader for a 1 GiB file, and about half the speed of mmap.
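The double-buffer scheme described above can be sketched as below. This is a hedged illustration, not the PR's actual code: `fill` stands in for the buffer-filling helper, and the byte-sum "hash" is a toy stand-in for blake3's `Hasher::update_rayon` so the sketch stays self-contained.

```rust
use std::io::{ErrorKind, Read};

const BUF_SIZE: usize = 1 << 20; // 1 MiB, as in the PR description

// Fill `buf` as completely as possible, retrying on Interrupted (EINTR).
// Returns the number of bytes read; a short count means EOF.
fn fill<R: Read>(reader: &mut R, buf: &mut [u8]) -> std::io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

// Double-buffered loop: a worker thread refills the back buffer while the
// main thread "hashes" the front buffer; then the buffers are swapped.
fn hash_with_double_buffer<R: Read + Send>(mut reader: R) -> std::io::Result<u64> {
    let mut front = vec![0u8; BUF_SIZE];
    let mut back = vec![0u8; BUF_SIZE];
    let mut sum: u64 = 0; // toy stand-in for a blake3 Hasher

    let mut n = fill(&mut reader, &mut front)?;
    while n > 0 {
        let next = std::thread::scope(|s| -> std::io::Result<usize> {
            // Refill the back buffer on a worker thread...
            let t = s.spawn(|| fill(&mut reader, &mut back));
            // ...while consuming the front buffer here (the real code
            // would call Hasher::update_rayon on &front[..n]).
            sum += front[..n].iter().map(|&b| b as u64).sum::<u64>();
            t.join().unwrap()
        })?;
        std::mem::swap(&mut front, &mut back);
        n = next;
    }
    Ok(sum)
}
```

The swap makes the freshly filled buffer the next one to hash, so the reader and the hasher each touch a different buffer at all times and no copying is needed.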

The code also accounts for small files: if a file is under 1 MiB it falls back to update_reader. Because that threshold deliberately overshoots the point where update_rayon actually becomes faster, the change is always at least performance-neutral; we never hit a case where it is slower.
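The size-based dispatch is simple enough to state directly. A minimal sketch, with illustrative names that are not the PR's actual identifiers:

```rust
// Files below this size use the sequential update_reader path; larger
// files use the double-buffered update_rayon path. 1 MiB intentionally
// overshoots the real crossover point, so the parallel path is only
// taken where it is known to win.
const PARALLEL_THRESHOLD: u64 = 1 << 20; // 1 MiB

// Hypothetical helper naming which strategy a file of `len` bytes gets.
fn pick_strategy(len: u64) -> &'static str {
    if len < PARALLEL_THRESHOLD {
        "update_reader"
    } else {
        "double_buffer_rayon"
    }
}
```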

Currently the code uses the read_chunks crate, which I wrote to handle EINTR and fill the read buffer as completely as possible. If this is approved to merge, I would want to take the function it calls and inline it somewhere in this project instead of adding an extra dependency.
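The helper being described does roughly the following (a sketch of the behavior, not the read_chunks crate's actual source): keep reading until the buffer is full or EOF, and retry on `ErrorKind::Interrupted`, which is how Rust surfaces EINTR.

```rust
use std::io::{ErrorKind, Read};

// Read until `buf` is full or the reader hits EOF, retrying reads that
// fail with Interrupted (EINTR). Returns the number of bytes placed in
// `buf`; a count shorter than `buf.len()` means EOF was reached.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> std::io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF: return a short count rather than an error
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

Unlike std's `read_exact`, this treats EOF as a short count instead of an error, which is what the chunked hashing loop wants for the final partial buffer.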

Some crude benchmarks below, hashing a gibibyte of random data:
(b3sum 1.5.0 vs 03e0949)

# this PR
[b3sum]$ time ./target/release/b3sum --no-mmap gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.253s
user	0m1.234s
sys	0m0.501s
# unmodified binary
[b3sum]$ time b3sum --no-mmap gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.570s
user	0m0.477s
sys	0m0.091s
# unmodified binary, with mmap enabled
[b3sum]$ time b3sum gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.126s
user	0m1.067s
sys	0m0.103s

This uses a double buffer of 1 MiB each, reading into one buffer while
hashing the other in parallel. This is around 2x as fast as hashing
single-threaded on my machine (Ryzen 2600) with an in-memory benchmark.

This is still about 2x slower than using mmap.