Bad I/O overhead - related to Mutex in File? see #2930 #2994
Comments
I'd be happy to accept PRs with an improved implementation, but it's not my priority at the moment. File I/O has never been async/await's strong suit.
Having another look, this does not call any of the methods that lock the mutex from the linked issue. That mutex can't be the cause, though I don't immediately know what else it could be.
I have integrated the benchmarks into tokio/benches and can provide a PR later this evening.
You say that you tested on 0.1 and 0.3. Did you try 0.2? |
Has there been a performance regression from 0.2 -> 0.3? I don't believe 0.1 had any FS APIs. Incidentally, given how FS ops work, the best strategy for optimal performance right now is to spawn blocking tasks that batch FS work.
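For illustration, a minimal sketch of that batching strategy (the function name and the read-everything batching below are assumptions for the example, not code from this thread): a single spawn_blocking call does a whole batch of std::fs work, so the hand-off to the blocking pool is paid once instead of once per read.

use std::io::Read;
use std::path::PathBuf;

// Sketch only: batch all blocking file work into one spawn_blocking call
// instead of issuing many small tokio::fs reads.
async fn read_in_one_blocking_batch(path: PathBuf) -> std::io::Result<Vec<u8>> {
    tokio::task::spawn_blocking(move || {
        let mut file = std::fs::File::open(path)?;
        let mut data = Vec::new();
        file.read_to_end(&mut data)?;
        Ok(data)
    })
    .await
    .expect("blocking task panicked")
}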
Good point, it had already happened in v0.2; see v0.2.x...stsydow:fs-benchmarks-v0.2
Here are the profiling results as a call graph, with v0.1 for reference:
I now suspect it is a problem with the runtime rather than with file I/O directly. Here is a minimal standalone example for profiling:

use tokio::fs::File;
use tokio::io::AsyncReadExt;

fn rt() -> tokio::runtime::Runtime {
    tokio::runtime::Builder::new_current_thread()
        //tokio::runtime::Builder::new_multi_thread().worker_threads(2)
        .build()
        .unwrap()
}

const BLOCK_COUNT: usize = 100_000;
const BUFFER_SIZE: usize = 4096;
const DEV_ZERO: &'static str = "/dev/zero";

fn main() {
    let rt = rt();
    let task = || async {
        let mut file = File::open(DEV_ZERO).await.unwrap();
        let mut buffer = [0u8; BUFFER_SIZE];
        for _i in 0..BLOCK_COUNT {
            let count = file.read(&mut buffer).await.unwrap();
            if count == 0 {
                break;
            }
        }
    };
    rt.block_on(task());
}

$ perf stat -r3 -e syscalls:sys_enter_futex,syscalls:sys_enter_read,context-switches,cycles:k,cycles:u,instructions,task-clock ./target/release/fs_standalone
From the profile, an important location is tokio/tokio/src/runtime/blocking/pool.rs, line 261 (at commit c8a484b).
I know it is the idle case, but it becomes ready "sooner than expected" for files and probably also for channels.
File I/O overhead has increased by a factor of three since the 0.3 release (on AArch64, Linux kernel v5.7).
I suspect the Mutex protecting the file that was introduced in 12bba6f / #2930.
I noticed that for my application the number of sys_enter_futex syscalls increased from 15 (in v0.1) to 248,792 (in v0.3), and I am still searching for a better solution.
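(A minimal sketch of one possible mitigation along the batching line mentioned above, not something settled in this thread: wrap the file in tokio::io::BufReader with a large capacity, so each trip to the blocking pool moves more data and the small 4 KiB reads are served from the in-process buffer.)

use tokio::fs::File;
use tokio::io::{AsyncReadExt, BufReader};

// Sketch: with a 1 MiB BufReader, one large read is dispatched to the
// blocking pool for roughly every 256 of the caller's 4 KiB reads.
async fn read_buffered(path: &str) -> std::io::Result<u64> {
    let file = File::open(path).await?;
    let mut reader = BufReader::with_capacity(1024 * 1024, file);
    let mut buffer = [0u8; 4096];
    let mut total = 0u64;
    loop {
        let n = reader.read(&mut buffer).await?;
        if n == 0 {
            break;
        }
        total += n as u64;
    }
    Ok(total)
}

Whether that actually recovers the lost factor would need the same perf counters as in the standalone example above to confirm.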
Benchmarks
Results
$ RUSTFLAGS="-C force-frame-pointers -C target-cpu=native" cargo bench
$ RUSTFLAGS="-C` force-frame-pointers -C target-cpu=native" perf stat -e syscalls:sys_enter_futex,syscalls:sys_enter_read,syscalls:sys_enter_sigaltstack,syscalls:sys_enter_sched_yield,syscalls:sys_enter_munmap,syscalls:sys_enter_mprotect,syscalls:sys_enter_mmap cargo bench tests::sync_read
$ RUSTFLAGS="-C force-frame-pointers -C target-cpu=native" perf stat -e syscalls:sys_enter_futex,syscalls:sys_enter_read,syscalls:sys_enter_sigaltstack,syscalls:sys_enter_sched_yield,syscalls:sys_enter_munmap,syscalls:sys_enter_mprotect,syscalls:sys_enter_mmap cargo bench tests::async_read_codec