Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grcov sometimes takes very long to execute #1013

Open
jonhoo opened this issue Apr 11, 2023 · 26 comments
Open

grcov sometimes takes very long to execute #1013

jonhoo opened this issue Apr 11, 2023 · 26 comments

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Apr 11, 2023

This is extracted from the discussion in #326.

In some projects, some of the time, when I execute grcov, it takes a very long time to execute, using 100% of one core for several minutes at the time. I executed it like this:

$ pgrep -a grcov
61819 grcov $X/cb-coverage-target/profraw --source-dir . --binary-path $X/cb-coverage-target/debug/ --branch --ignore-not-existing <a few --excl-* args> --keep-only src/* --keep-only src/* --keep-only src/bin/* -t html -o $X/coverage/
$ ls -la $X/cb-coverage-target/profraw
.rw-r--r-- 3.1M jongje 11 Apr 16:45 61284-8806487336122761277_0.profraw
.rw-r--r-- 2.6M jongje 11 Apr 16:45 61289-8403376454973083594_0.profraw
.rw-r--r--  24k jongje 11 Apr 16:45 61296-16724270415138201609_0.profraw
.rw-r--r-- 5.5M jongje 11 Apr 16:45 61297-4578242808341374556_0.profraw
.rw-r--r--  224 jongje 11 Apr 16:45 61378-18005603889831348665_0.profraw
.rw-r--r-- 4.6M jongje 11 Apr 16:45 61379-12535087358149662424_0.profraw
.rw-r--r-- 5.4M jongje 11 Apr 16:45 61458-13306843052265737977_0.profraw
.rw-r--r-- 5.5M jongje 11 Apr 16:45 61573-2063728991251543943_0.profraw
.rw-r--r-- 5.4M jongje 11 Apr 16:45 61659-2807099239873412591_0.profraw
.rw-r--r-- 336k jongje 11 Apr 16:45 61742-2870874979089868175_0.profraw
.rw-r--r-- 312k jongje 11 Apr 16:45 61742-5935495699248178733_0.profraw
.rw-r--r-- 543k jongje 11 Apr 16:45 61742-6234157094333981707_0.profraw
.rw-r--r-- 206k jongje 11 Apr 16:45 61742-8110670833310045757_0.profraw
.rw-r--r-- 8.3k jongje 11 Apr 16:45 61742-8156326955437846749_0.profraw
.rw-r--r--  27k jongje 11 Apr 16:45 61742-13964213668791941633_0.profraw
.rw-r--r-- 323k jongje 11 Apr 16:45 61742-15978164622266936944_0.profraw
$ ls -la $X/cb-coverage-target/debug/
.rw-r--r--    0 jongje 11 Apr 16:20 .cargo-lock
drwxr-xr-x    - jongje 11 Apr 16:20 .fingerprint
drwxr-xr-x    - jongje 11 Apr 16:20 build
drwxr-xr-x    - jongje 11 Apr 16:21 deps
drwxr-xr-x    - jongje 11 Apr 16:20 examples
drwxr-xr-x    - jongje 11 Apr 16:21 incremental
.rwxr-xr-x 329M jongje 11 Apr 16:21 binary-1
.rwxr-xr-x 379M jongje 11 Apr 16:21 binary-2

I tried running perf on the grcov invocation from shortly after it started until I killed it, and the results were interesting:

-   69.78%     0.00%  llvm-cov    llvm-cov                          [.] (anonymous namespace)::CodeCoverageTool::run
   - (anonymous namespace)::CodeCoverageTool::run
      - 64.30% (anonymous namespace)::CodeCoverageTool::run
         - 63.89% llvm::CoverageExporterLcov::renderRoot
            - 63.50% llvm::CoverageExporterLcov::renderRoot
               - 62.60% llvm::coverage::FunctionRecordIterator::skipOtherFiles
                   16.62% bcmp
      - 5.47% (anonymous namespace)::CodeCoverageTool::load

A run of strace -f -c was also instructive:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 55.50   23.663735        1316     17972        18 futex
 30.72   13.098805      127172       103           wait4
  5.21    2.223218          27     79948     23060 read
  3.42    1.457714          21     68292           write
  2.19    0.934105          40     23164           ppoll
  0.58    0.247434          17     14333           rt_sigaction
  0.51    0.215750          34      6283      5047 openat
  0.42    0.180645          18     10009           brk
  0.31    0.130493          38      3401           mmap
  0.26    0.112478          31      3605      2884 newfstatat
  0.23    0.098308          31      3158           mprotect
  0.12    0.051731          31      1648           close
  0.11    0.044941          31      1442           fstat
  0.10    0.043112          53       803           munmap
  0.06    0.024312          26       930           rt_sigprocmask
  0.05    0.021149          29       721           prlimit64
  0.04    0.019140         184       104           mremap
  0.02    0.009636          46       206           clone
  0.02    0.009387          30       309           ioctl
  0.02    0.008547          27       309           dup3
  0.02    0.008122          39       206           sigaltstack
  0.02    0.006905          33       206           pipe2
  0.01    0.006072          29       206           set_robust_list
  0.01    0.006001          29       206       206 lseek
  0.01    0.005197          50       103           execve
  0.01    0.003782          36       103           readlinkat
  0.01    0.003593          34       103           madvise
  0.01    0.003435          33       103           set_tid_address
  0.01    0.002718          26       103       103 faccessat
  0.00    0.000000           0         1           tgkill
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0         1           gettid
------ ----------- ----------- --------- --------- ----------------
100.00   42.640465                238082     31318 total

It looks like a lot of time is spent just calling llvm-cov export, which then spends a lot of its time skipping/ignoring files. And a lot of time is spend on locking somewhere. Looking at what those llvm-cov exports operate on, it looks like it's every file under $X/cb-coverage-target/debug:

$X/cb-coverage-target/debug/new-crate-yanker
$X/cb-coverage-target/debug/build/serde_derive-998ef328dc8205c7/build_script_build-998ef328dc8205c7
$X/cb-coverage-target/debug/build/serde_derive-998ef328dc8205c7/build-script-build
$X/cb-coverage-target/debug/build/futures-task-a43cf798c741c1e5/build_script_build-a43cf798c741c1e5
$X/cb-coverage-target/debug/build/futures-task-a43cf798c741c1e5/build-script-build
$X/cb-coverage-target/debug/build/semver-587cd36212a42b68/build_script_build-587cd36212a42b68
$X/cb-coverage-target/debug/build/semver-587cd36212a42b68/build-script-build
$X/cb-coverage-target/debug/build/httparse-1b4d98bc49e8be40/build_script_build-1b4d98bc49e8be40
$X/cb-coverage-target/debug/build/httparse-1b4d98bc49e8be40/build-script-build
$X/cb-coverage-target/debug/build/rayon-core-ebfec758998913f3/build_script_build-ebfec758998913f3
$X/cb-coverage-target/debug/build/rayon-core-ebfec758998913f3/build-script-build
$X/cb-coverage-target/debug/build/slab-5b8ce3de708fe0c7/build_script_build-5b8ce3de708fe0c7
$X/cb-coverage-target/debug/build/slab-5b8ce3de708fe0c7/build-script-build
$X/cb-coverage-target/debug/build/proc-macro2-7dff807bfcb44bc2/build-script-build
$X/cb-coverage-target/debug/build/proc-macro2-7dff807bfcb44bc2/build_script_build-7dff807bfcb44bc2
$X/cb-coverage-target/debug/build/curl-6d21905df01ab6dc/build-script-build
$X/cb-coverage-target/debug/build/curl-6d21905df01ab6dc/build_script_build-6d21905df01ab6dc
$X/cb-coverage-target/debug/build/signal-hook-d988d2a242379f7e/build_script_build-d988d2a242379f7e
$X/cb-coverage-target/debug/build/signal-hook-d988d2a242379f7e/build-script-build
$X/cb-coverage-target/debug/build/anyhow-f34f4b00a66b8eb6/build_script_build-f34f4b00a66b8eb6
$X/cb-coverage-target/debug/build/anyhow-f34f4b00a66b8eb6/build-script-build
$X/cb-coverage-target/debug/build/libz-sys-65752cf9edf4eafd/build_script_build-65752cf9edf4eafd
$X/cb-coverage-target/debug/build/libz-sys-65752cf9edf4eafd/build-script-build
$X/cb-coverage-target/debug/build/libgit2-sys-0585917da9ed8876/build_script_build-0585917da9ed8876
$X/cb-coverage-target/debug/build/libgit2-sys-0585917da9ed8876/build-script-build
$X/cb-coverage-target/debug/build/unicase-eae804e298002f99/build_script_build-eae804e298002f99
$X/cb-coverage-target/debug/build/unicase-eae804e298002f99/build-script-build
$X/cb-coverage-target/debug/build/native-tls-84167741b66fd17f/build_script_build-84167741b66fd17f
$X/cb-coverage-target/debug/build/native-tls-84167741b66fd17f/build-script-build
$X/cb-coverage-target/debug/build/openssl-2794832cd06ea98d/build-script-build
$X/cb-coverage-target/debug/build/openssl-2794832cd06ea98d/build_script_build-2794832cd06ea98d
$X/cb-coverage-target/debug/build/libc-d343cd3686b51b54/build-script-build
$X/cb-coverage-target/debug/build/libc-d343cd3686b51b54/build_script_build-d343cd3686b51b54
$X/cb-coverage-target/debug/build/crc32fast-2813e5891a11c7ef/build-script-build
$X/cb-coverage-target/debug/build/crc32fast-2813e5891a11c7ef/build_script_build-2813e5891a11c7ef
$X/cb-coverage-target/debug/build/ahash-16b0f1e509e62802/build_script_build-16b0f1e509e62802
$X/cb-coverage-target/debug/build/ahash-16b0f1e509e62802/build-script-build
$X/cb-coverage-target/debug/build/rustversion-590d0d1f54eb250a/build_script_build-590d0d1f54eb250a
$X/cb-coverage-target/debug/build/rustversion-590d0d1f54eb250a/build-script-build
$X/cb-coverage-target/debug/build/futures-core-077e504f33c63998/build_script_build-077e504f33c63998
$X/cb-coverage-target/debug/build/futures-core-077e504f33c63998/build-script-build
$X/cb-coverage-target/debug/build/crossbeam-epoch-39b61d173fab0739/build_script_build-39b61d173fab0739
$X/cb-coverage-target/debug/build/crossbeam-epoch-39b61d173fab0739/build-script-build
$X/cb-coverage-target/debug/build/crossbeam-queue-0e851ae96f377934/build-script-build
$X/cb-coverage-target/debug/build/crossbeam-queue-0e851ae96f377934/build_script_build-0e851ae96f377934
$X/cb-coverage-target/debug/build/curl-sys-6dd92ae058ac86d5/build_script_build-6dd92ae058ac86d5
$X/cb-coverage-target/debug/build/curl-sys-6dd92ae058ac86d5/build-script-build
$X/cb-coverage-target/debug/build/tokio-db808a7b9dd0a4d7/build_script_build-db808a7b9dd0a4d7
$X/cb-coverage-target/debug/build/tokio-db808a7b9dd0a4d7/build-script-build
$X/cb-coverage-target/debug/build/futures-util-07362946f00908de/build-script-build
$X/cb-coverage-target/debug/build/futures-util-07362946f00908de/build_script_build-07362946f00908de
$X/cb-coverage-target/debug/build/memchr-88b0eb3a56660b40/build-script-build
$X/cb-coverage-target/debug/build/memchr-88b0eb3a56660b40/build_script_build-88b0eb3a56660b40
$X/cb-coverage-target/debug/build/indexmap-b7d01c653a2ccc7e/build_script_build-b7d01c653a2ccc7e
$X/cb-coverage-target/debug/build/indexmap-b7d01c653a2ccc7e/build-script-build
$X/cb-coverage-target/debug/build/parking_lot_core-6fd58c973a766b08/build_script_build-6fd58c973a766b08
$X/cb-coverage-target/debug/build/parking_lot_core-6fd58c973a766b08/build-script-build
$X/cb-coverage-target/debug/build/log-0b30c4bda57051c7/build_script_build-0b30c4bda57051c7
$X/cb-coverage-target/debug/build/log-0b30c4bda57051c7/build-script-build
$X/cb-coverage-target/debug/build/thiserror-17150905cea044c2/build-script-build
$X/cb-coverage-target/debug/build/thiserror-17150905cea044c2/build_script_build-17150905cea044c2
$X/cb-coverage-target/debug/build/num-traits-54ae66f0cf07f718/build_script_build-54ae66f0cf07f718
$X/cb-coverage-target/debug/build/num-traits-54ae66f0cf07f718/build-script-build
$X/cb-coverage-target/debug/build/mime_guess-879a91c975411707/build_script_build-879a91c975411707
$X/cb-coverage-target/debug/build/mime_guess-879a91c975411707/build-script-build
$X/cb-coverage-target/debug/build/smartstring-cc9df01996da7da8/build_script_build-cc9df01996da7da8
$X/cb-coverage-target/debug/build/smartstring-cc9df01996da7da8/build-script-build
$X/cb-coverage-target/debug/build/openssl-sys-da57f73f42886402/build-script-main
$X/cb-coverage-target/debug/build/openssl-sys-da57f73f42886402/build_script_main-da57f73f42886402
$X/cb-coverage-target/debug/build/futures-channel-2f1a56bb895e8af9/build_script_build-2f1a56bb895e8af9
$X/cb-coverage-target/debug/build/futures-channel-2f1a56bb895e8af9/build-script-build
$X/cb-coverage-target/debug/build/io-lifetimes-eacbf0bc3c7ad239/build-script-build
$X/cb-coverage-target/debug/build/io-lifetimes-eacbf0bc3c7ad239/build_script_build-eacbf0bc3c7ad239
$X/cb-coverage-target/debug/build/serde_json-eabd4439168aa9b1/build-script-build
$X/cb-coverage-target/debug/build/serde_json-eabd4439168aa9b1/build_script_build-eabd4439168aa9b1
$X/cb-coverage-target/debug/build/serde-f693450158e32a9e/build_script_build-f693450158e32a9e
$X/cb-coverage-target/debug/build/serde-f693450158e32a9e/build-script-build
$X/cb-coverage-target/debug/build/syn-85b699ae17d6164c/build_script_build-85b699ae17d6164c
$X/cb-coverage-target/debug/build/syn-85b699ae17d6164c/build-script-build
$X/cb-coverage-target/debug/build/rustix-5c4de8728bdbf71b/build_script_build-5c4de8728bdbf71b
$X/cb-coverage-target/debug/build/rustix-5c4de8728bdbf71b/build-script-build
$X/cb-coverage-target/debug/build/memoffset-b33bc06809024b5e/build_script_build-b33bc06809024b5e
$X/cb-coverage-target/debug/build/memoffset-b33bc06809024b5e/build-script-build
$X/cb-coverage-target/debug/build/crossbeam-utils-32f39ee2fcb887c9/build_script_build-32f39ee2fcb887c9
$X/cb-coverage-target/debug/build/crossbeam-utils-32f39ee2fcb887c9/build-script-build
$X/cb-coverage-target/debug/build/lock_api-6e4cdfa674a25fa0/build_script_build-6e4cdfa674a25fa0
$X/cb-coverage-target/debug/build/lock_api-6e4cdfa674a25fa0/build-script-build
$X/cb-coverage-target/debug/build/quote-70374ff0c4ca035a/build_script_build-70374ff0c4ca035a
$X/cb-coverage-target/debug/build/quote-70374ff0c4ca035a/build-script-build
$X/cb-coverage-target/debug/deps/rust_crate_importer-e420b5b99e68eb77
$X/cb-coverage-target/debug/deps/libclap_derive-fca315fae511eb9e.so
$X/cb-coverage-target/debug/deps/librustversion-ec02b5f6658254de.so
$X/cb-coverage-target/debug/deps/new_crate_yanker-f49c8bbfc5d517aa
$X/cb-coverage-target/debug/deps/curation-0f7343d133e87b27
$X/cb-coverage-target/debug/deps/libtracing_attributes-989634260b114420.so
$X/cb-coverage-target/debug/deps/libmaybe_async-a28baa601652d5df.so
$X/cb-coverage-target/debug/deps/libtime_macros-822cef6009cd293e.so
$X/cb-coverage-target/debug/deps/incremental-cf7c52bc74e38f80
$X/cb-coverage-target/debug/deps/rust_crate_importer-551d7f791c701471
$X/cb-coverage-target/debug/deps/test_the_tester-e26c6cb57748a830
$X/cb-coverage-target/debug/deps/full-46e8b31f1f94165f
$X/cb-coverage-target/debug/deps/libmockall_derive-a24c2513b5c405b2.so
$X/cb-coverage-target/debug/deps/errors-b0d705018f3b8ad2
$X/cb-coverage-target/debug/deps/amzn_crate_importer-dd4b9c8c73acb3be
$X/cb-coverage-target/debug/deps/libserde_derive-37214400c3ba75c1.so
$X/cb-coverage-target/debug/deps/libopenssl_macros-c6bede717dd9727c.so
$X/cb-coverage-target/debug/deps/metrics-088bee4901b1d3a9
$X/cb-coverage-target/debug/deps/libthiserror_impl-e92b5bbc64cc236b.so
$X/cb-coverage-target/debug/deps/new_crate_yanker-8f4c215c7ebf15b3
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/3te1m9n4u75hrpc9.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/2uq2flx1zjf0d2q5.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/41krjfy51yrkbkl4.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/3uabpapqrvv81tbe.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/4ue7u6ynict5g6id.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/1r82valv2likdyag.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/4g59v215a3qpy4y4.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/2a1pd6dtqtww7n8x.o
$X/cb-coverage-target/debug/incremental/binary-1-2os3jb1l9jubw/s-gjy2fuhsin-1sm9w5a-2z39ohcat1cfr/4eq4pf27tzjj617h.o

(note that I killed the process at this point — I assume it would have continued going through all the files)

This... feels like a problem.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Looks like the incremental directory is really causing us problems because it includes lots of files for every output target (binaries, the crate itself, and integration tests): https://gist.github.com/jonhoo/ef66c75137b1f9a14c322faa747b8cc1

I also ran perf on grcov after all the llvm-cov export invocations have finished (but grcov is still running), and it shows a significant portion of execution time being spent in calls to realpath via std::fs::canonicalize through grcov::path_rewriting::canonicalize_path in grcov::add_results (called from grcov::consumer).

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

cc @jerel

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

@marco-c I'm happy to try to help in fixing this, but could use some pointers for how you think this should be handled. For example, would it be okay to just always ignore everything that's in /incremental/?

@jerel
Copy link

jerel commented Apr 11, 2023

Thanks for splitting this out and digging into it. What you've documented makes sense with what I've seen in practice. As supporting evidence my workaround solution completes quickly and (I think) does the same work but avoids enumeration of target/debug/**:

TEST_BIN_PATHS=$( \
  cargo test --no-run --message-format=json --tests --lib \
    | grep "{\"" \
    | jq -r "select(.profile.test == true) | .filenames[]" \
    | grep -v dSYM - \
)

rust-profdata merge -sparse ./*.profraw -o combined.profdata

rust-cov export \
  --ignore-filename-regex='/.cargo/registry|.cargo/git|/rustc|target/debug' \
  --instr-profile=combined.profdata \
  --object target/debug/libmy_app.so $TEST_BIN_PATHS \
  --format lcov > lcov-combined.info

grcov lcov-combined.info -s . -o ./html -t html --ignore-not-existing --llvm

If I'm not mistaken all of the binaries that grcov needs to load are at the top level of target/debug/* so maybe restricting the search to top level could work?

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

I think that kind of filtering would end up happening here so we don't even recurse down into that dir:

grcov/src/producer.rs

Lines 171 to 196 in 917494a

ArchiveType::Dir(ref dir) => {
for entry in WalkDir::new(dir) {
let entry = entry.unwrap_or_else(|err| {
panic!(
"Failed to open '{}'.",
err.path().unwrap().to_string_lossy()
)
});
let full_path = entry.path();
if full_path.is_file() {
let mut file = File::open(full_path).ok();
let path = full_path.strip_prefix(dir).unwrap();
self.handle_file(
file.as_mut(),
path,
gcno_stem_archives,
gcda_stem_archives,
profraws,
infos,
xmls,
linked_files_maps,
is_llvm,
);
}
}
}

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

The challenge in doing it in grcov is that we don't know it's being pointed at a Rust build directory. I suppose we could look for /{debug,release}/incremental/ and exclude that, but it feels like we may instead want to add a command-line argument to grcov to specifically exclude directories.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

So something like an --exclude flag that feeds into a filter_entry call on the WalkDir.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Ah, wait, no, that's for walking paths, whereas what we need to filter is the stuff in --binary-path... Hmm..

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

It's over here we'll need to make a change:

grcov/src/llvm_tools.rs

Lines 102 to 130 in 917494a

let binaries = if metadata.is_file() {
vec![binary_path.to_owned()]
} else {
let mut paths = vec![];
for entry in WalkDir::new(binary_path) {
let entry =
entry.unwrap_or_else(|_| panic!("Failed to open directory '{:?}'.", binary_path));
if is_binary(entry.path()) && entry.metadata().unwrap().len() > 0 {
paths.push(entry.into_path());
}
}
paths
};
let mut results = vec![];
let cov_tool_path = get_cov_path()?;
for binary in binaries {
let args = [
"export".as_ref(),
binary.as_ref(),
"--instr-profile".as_ref(),
profdata_path.as_ref(),
"--format".as_ref(),
"lcov".as_ref(),
];

@jerel
Copy link

jerel commented Apr 11, 2023

If it's helpful to your exploration I had some uncommitted work from several weeks ago where I was experimenting with accepting binary-path as a vec of binaries and then passing them in to export as --object args master...jerel:multi-binary-path

IIRC it didn't solve the perf issue by itself (in hindsight probably because it was still walking the target/debug/ somewhere else).

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Oh, that's interesting. You should open that as a PR independently of this; I think it's a good change in isolation.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

btw, the places you have &vec![...] I think you can just do &[...] instead.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

I think we have a couple of paths forward:

  1. Special-case exclude /{debug,release}/incremental/.
  2. Support a glob-based --exclude-binaries flag to "subtract" from the directory set chosen by --binary-path.
  3. Special-case exclude .o files.
  4. Only include files marked executable.
  5. Recommend that users pass target/{debug,release}/deps to --binary-path instead of target/{debug,release}.

I don't know which of these paths is better, and we may need the maintainers to chime in on that.

I also think it'd be nice if grcov -v logged every file it runs through

grcov/src/llvm_tools.rs

Lines 122 to 132 in 917494a

for binary in binaries {
let args = [
"export".as_ref(),
binary.as_ref(),
"--instr-profile".as_ref(),
profdata_path.as_ref(),
"--format".as_ref(),
"lcov".as_ref(),
];
match run(&cov_tool_path, &args) {

as that would highlight problems in this area much faster.

@marco-c
Copy link
Collaborator

marco-c commented Apr 11, 2023

The ideal solution is probably fixing #535, basically kind of reproducing the way llvm-cov ignores files.

1. Special-case exclude `/{debug,release}/incremental/`.

I'm not a fan of special cases, they might silently break things for uncommon use cases.

2. Support a glob-based `--exclude-binaries` flag to "subtract" from the directory set chosen by `--binary-path`.

This would make things more complex for users of grcov. I'd rather we did the right thing by default.

3. Special-case exclude `.o` files.

Same as 1.

4. Only include files marked executable.

With this, we'd risk missing some coverage (e.g. from shared objects).

5. Recommend that users pass `target/{debug,release}/deps` to `--binary-path` instead of `target/{debug,release}`.

Might work, hopefully it doesn't break anything, but I'm not 100% sure.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Since 5 was easy to try out, I went that path (so no more .o files), but it's still very slow (multiple minutes) to execute, so I think there's a deeper problem here. An strace -f -c of the grcov execution gave:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.58 1102.814225       19947     55286      2332 futex
 11.71  164.292082         606    271057           sched_yield
  7.83  109.948569         157    697683       263 newfstatat
  1.08   15.155137      797638        19           wait4
  0.20    2.862066          28    100751     29369 read
  0.18    2.581477          29     86427           mprotect
  0.14    2.001592          21     93083           write

And here's the perf report:

-   59.39%     0.00%  llvm-cov         llvm-cov                          [.] (anonymous namespace)::CodeCoverageTool::run
   - (anonymous namespace)::CodeCoverageTool::run
      - 55.27% (anonymous namespace)::CodeCoverageTool::run
         - 54.95% llvm::CoverageExporterLcov::renderRoot
            - 54.63% llvm::CoverageExporterLcov::renderRoot
               - 53.91% llvm::coverage::FunctionRecordIterator::skipOtherFiles
                    14.34% bcmp
      - 4.12% (anonymous namespace)::CodeCoverageTool::load
         - 4.11% (anonymous namespace)::CodeCoverageTool::load
            - 3.47% llvm::coverage::CoverageMapping::load
               + 1.77% llvm::coverage::BinaryCoverageReader::create
                 1.22% llvm::coverage::CoverageMapping::loadFunctionRecord
            + 0.64% llvm::coverage::CoverageMapping::loadFunctionRecord
-   21.05%     0.01%  Consumer 0       grcov                             [.] grcov::consumer
   - 21.04% grcov::consumer
      - 11.91% grcov::llvm_tools::profraws_to_lcov
         - 11.91% grcov::llvm_tools::run (inlined)
            - 11.91% std::process::Command::output (inlined)
               - 11.90% core::result::Result<T,E>::and_then (inlined)
                    std::process::Command::output::_$u7b$$u7b$closure$u7d$$u7d$::h29c88bdea84f03c7 (inlined)
                  - std::process::Child::wait_with_output
                     + 11.90% std::sys::unix::pipe::read2 (inlined)
      - 5.76% grcov::parser::parse_lcov
         - 4.09% core::iter::traits::iterator::Iterator::collect (inlined)
            - <alloc::string::String as core::iter::traits::collect::FromIterator<char>>::from_iter
               + 4.07% <alloc::string::String as core::iter::traits::collect::Extend<char>>::extend (inlined)
         + 0.54% std::collections::hash::map::HashMap<K,V,S>::insert (inlined)
      - 3.37% grcov::add_results (inlined)
         - 2.48% grcov::path_rewriting::canonicalize_path (inlined)
            + std::fs::canonicalize (inlined)
           0.79% grcov::merge_results

10% of execution time is spent in memset as part of the read in Child::wait_with_output.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

So the pipe here is definitely a bottleneck. But the huge amount of cycles spent on skipOtherFiles in llvm-cov also feels pretty concerning.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

It seems like a fairly straightforward method:
https://github.com/llvm-mirror/llvm/blob/2c4ca6832fa6b306ee6a7010bfb80a3f2596f824/lib/ProfileData/Coverage/CoverageMapping.cpp#L189-L195

But it also appears to do a linear search, which could certainly cause problems for even modestly sized inputs.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

One obvious win here would be to run llvm-cov export in parallel. Doesn't solve the fact that llvm-cov export itself is seemingly fairly slow here (~4s per invocation), but at least it avoids the wait time being 4s times the number of binaries in target/debug/deps (in my case, there are 20).

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Just invoking llvm-cov export on binaries through rayon gave me a huge speedup, even if I don't specifically subset to /deps/. Overall:

Benchmark 1: ./original
  Time (mean _ _):     71.804 s _  0.373 s    [User: 68.919 s, System: 17.522 s]
  Range (min _ max):   71.182 s _ 72.400 s    10 runs

Benchmark 2: ./original-deps
  Time (mean _ _):     44.957 s _  0.336 s    [User: 50.187 s, System: 7.719 s]
  Range (min _ max):   44.540 s _ 45.441 s    10 runs

Benchmark 3: ./parallel
  Time (mean _ _):     20.869 s _  0.107 s    [User: 121.422 s, System: 43.629 s]
  Range (min _ max):   20.749 s _ 21.037 s    10 runs

Benchmark 4: ./parallel-deps
  Time (mean _ _):     14.561 s _  0.054 s    [User: 94.777 s, System: 8.153 s]
  Range (min _ max):   14.474 s _ 14.637 s    10 runs

Summary
  './parallel-deps' ran
    1.43 _ 0.01 times faster than './parallel'
    3.09 _ 0.03 times faster than './original-deps'
    4.93 _ 0.03 times faster than './original'

@jerel
Copy link

jerel commented Apr 11, 2023

The project I'm on has 1,400 items in target/debug/deps/ so that 4s is noteworthy

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

You can ignore all the .d files though — grcov will filter those out eagerly. What matters is how many binary files you have in debug/deps (the number of executable files there is a reasonable proxy number). But still, yes.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

Interestingly, passing the --threads flag to llvm-cov export appears to not have any effect. At least I don't see any speed-up from it. Which suggests that llvm-cov export itself could probably also use some love to avoid this significant sequential cost as part of its execution.

jonhoo pushed a commit to jonhoo/grcov that referenced this issue Apr 11, 2023
This results in a significant speed-up since we can avoid invoking
`llvm-cov export` on all the object files in `target/debug/incremental`.

See mozilla#1013.
jonhoo pushed a commit to jonhoo/grcov that referenced this issue Apr 11, 2023
llvm-cov can be fairly slow per invocation, but can safely and easily be
parallelized for significant speed-ups when there are many binary
artifacts.

See mozilla#1013.
jonhoo pushed a commit to jonhoo/grcov that referenced this issue Apr 11, 2023
llvm-cov can be fairly slow per invocation, but can safely and easily be
parallelized for significant speed-ups when there are many binary
artifacts.

See mozilla#1013.
@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 11, 2023

I opened two PRs, one for the parallelization of invocations (#1015) and one for recommending folks point at /deps/ in the README (#1014).

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 12, 2023

Filed an issue with LLVM about llvm-cov export being slow: llvm/llvm-project#62079

marco-c pushed a commit that referenced this issue Apr 12, 2023
llvm-cov can be fairly slow per invocation, but can safely and easily be
parallelized for significant speed-ups when there are many binary
artifacts.

See #1013.
@TriplEight
Copy link

TriplEight commented Apr 13, 2023

Not sure where to ask, but I certainly can help testing things out.
I'm trying to setup grcov against a repository. Searching for binaries after running cargo test with find -type f -executable -exec file -i '{}' \; | rg 'executable; charset=binary' shows me 1146 results (intermediary and those used for tests binaries ofc). Of which, only 111 reside in ./target/../deps.

Firstly, I'm not really sure (as grcov doesn't output any logs for me) what is going on there.
Then, should I target all these binaries? Or those in ./target/../deps? Or just the main one? It's not explained anywhere. Also, --binary-path does not accept multiple input and it's not mentioned how deep does it look for binaries (this is partly answered in #1015 (comment), I got almost all the files covered).

@cole-abbeduto-particle
Copy link

It might also be worth suggesting that people use a separate profile for coverage on their systems. For example, my debug/deps folder that I was trying to run grcov on has over 40k files in it which made it take ~12m to run!

[profile.coverage]
inherits = "test"

Adding a coverage profile and doing a fresh test run dropped that number down to ~3k deps files which let grcov do what it needed to do in just over a minute!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants