Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use /proc/self/maps when available instead of std::env::current_exe #488

Conversation

pnkfelix
Copy link
Member

rust-lang/rust#101913 illustrates a scenario on Linux where backtraces fail because of their reliance on std::env::current_exe() always yielding the path to the Rust binary that is driving execution.

This PR changes that strategy, by first trying to parse /proc/self/maps and extract from it the path to the Rust binary that was mmap'ed in as part of linking/loading.

It addresses the problem for my local Linux host, as illustrated by the included regression test. I have not yet tested it on other platforms, but I am hoping I got the logic right (in terms of just falling back gracefully to std::env::current_exe() in cases where we either cannot find a /proc/self/maps, or we cannot find the instruction pointer of interest anywhere in the mapping reported by /proc/self/maps).

pathname: OsString,
}

pub(super) fn parse_maps() -> Result<Vec<MapsEntry>, &'static str> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic only applies to linux I think. Netbsd has /proc/curproc/. macOS doesn't have /proc at all. As for other unixes I'm not quite sure.

Copy link
Member Author

@pnkfelix pnkfelix Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to revise this so that only Linux pulls in the "interesting" parse_maps file, and every other target just goes to the no-op variant.

(I was hoping that I had gotten the logic right such that targets that didn't have the /proc/self/maps pseudo-file, or if it failed to parse said file, it would silently just fall back on std::env::current_exe, but I haven't confirmed that yet.)

Copy link
Member Author

@pnkfelix pnkfelix Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked again, and realized that the sole calling module is solely defined here:

} else if #[cfg(all(
any(
target_os = "linux",
target_os = "fuchsia",
target_os = "freebsd",
target_os = "openbsd",
all(target_os = "android", feature = "dl_iterate_phdr"),
),
not(target_env = "uclibc"),
))] {
mod libs_dl_iterate_phdr;

That makes this a lot easier to limit to just these small subset of targets.

I force-pushed an update that fixes that (and backtraces my API to a simpler one now enabled by removing the parse_running_mmaps_noop.rs code).

…e current_exe.

(updated to only define the parse_running_mmaps module in cases where we are
also defining libs_dl_iterate_phdr, since that is the only place it is currently
needed.)
@pnkfelix pnkfelix force-pushed the use-proc-self-maps-when-avail-instead-of-current-exe branch from 79f8187 to 75d6b28 Compare October 21, 2022 19:20
@pnkfelix
Copy link
Member Author

hmm, cargo test --no-default-features failed, something with how I'm using the std library. Will investigate

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable-enough to me, thanks for implementing this! One thing that might be worth checking is the binary size impact of adding this, since I'd expect that BufReader is not trivial in its code size.

If the code size of this is somewhat substantial there's probably a number of ways that this can cut down on code size, for example there's a fair number of allocations which I think could be removed and/or changed to iterators. Not necessarily trivially so I don't think it's worth doing unless motivated though.

@pnkfelix pnkfelix force-pushed the use-proc-self-maps-when-avail-instead-of-current-exe branch from 73062a8 to b7bc9d0 Compare October 24, 2022 16:55
@pnkfelix
Copy link
Member Author

pnkfelix commented Oct 24, 2022

This seems reasonable-enough to me, thanks for implementing this! One thing that might be worth checking is the binary size impact of adding this, since I'd expect that BufReader is not trivial in its code size.

Hmm. I honestly did not know there was a significant cost associated with BufReader itself. My sole motivation for using BufReader was just to get easy line-by-line parsing.

So here's some data:

On master, after doing cargo build --release:

% ls -sk target/release/libbacktrace.rlib
992 target/release/libbacktrace.rlib

On this PR's branch, after doing cargo build --release:

% ls -sk target/release/libbacktrace.rlib
1052 target/release/libbacktrace.rlib

I am inferring adding 60kb to a 992kb rlib is more than you want to pay for this..

If the code size of this is somewhat substantial there's probably a number of ways that this can cut down on code size, for example there's a fair number of allocations which I think could be removed and/or changed to iterators. Not necessarily trivially so I don't think it's worth doing unless motivated though.

I cannot tell from your feedback whether you are suggesting that I leave in the BufReader usage while removing allocations elsewhere, or to remove the BufReader in addition to making other changes.

In any case I'll poke at it and see if I can figure out a way to cut down on the code size increase without too much pain.

(/me does some searching on crates.io.) Let me look into using https://crates.io/crates/bstr; at least, I assume that would be an acceptable dependency to pull in here. (hmm, or I guess I'll need to pull in a local copy of its code, since the whole point is that this needs to work while depending on Rust's libstd alone...)

…save a bit on code size.

ls -sk before:
1052 target/release/libbacktrace.rlib

ls -sk after:
1044 target/release/libbacktrace.rlib
@alexcrichton
Copy link
Member

Thanks for measuring! 60k indeed seems pretty small and probably largely inconsequential given the size of this crate already. In that case while I think it's possible to somewhat easily cut down on code size it's not a priority, so it's fine to leave as-is, sorry if that wasn't clear!

It looks like there's some minor nits on CI but feel free to bump the msrv in the CI configuration file and re-push, there's no need to strictly maintain compatibilty with that.

@pnkfelix
Copy link
Member Author

Thanks for measuring! 60k indeed seems pretty small and probably largely inconsequential given the size of this crate already. In that case while I think it's possible to somewhat easily cut down on code size it's not a priority, so it's fine to leave as-is, sorry if that wasn't clear!

It looks like there's some minor nits on CI but feel free to bump the msrv in the CI configuration file and re-push, there's no need to strictly maintain compatibilty with that.

Well, I've already jumped through some other hoops, let me take a shot at seeing if I can satisfy the existing MSRV

@pnkfelix
Copy link
Member Author

pnkfelix commented Oct 25, 2022

Thanks for measuring! 60k indeed seems pretty small and probably largely inconsequential given the size of this crate already. In that case while I think it's possible to somewhat easily cut down on code size it's not a priority, so it's fine to leave as-is, sorry if that wasn't clear!

So ... in cc0cdbb I did sidestep the use of BufRead by instead doing a (very naive) Read-based implementation: namely, read the whole file into one String, and then split that into lines.

Doing this cut out 8K from the rlib size.

I'm sort of sad about doing this, but then again, even my previous implementation was allocating a Vec of String whose size was proportional to the full contents of /proc/self/maps, so I think its at most a 2x runtime-memory overhead over what I was previously doing.

Anyway, @alexcrichton , let me know if you'd prefer I put a line-by-line parsing version back. (I suspect if anyone invests time in doing that, they should also get rid of the Vec and just have the code directly extract the desired entry as it traverses the lines of /proc/self/maps.)

// by our binary.
//
// if we cannot `readelf` for some reason, or if we fail to parse its output,
// then we will just give up on this test (and not treat it as a test failure).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it might fairly easily lead to the test being skipped without anyone realizing due to some unrelated change. Are there known cases where readelf isn't available but the test is still run? Alternatively, could we exit with an error if this is detected and we know we're on CI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the way things are structured now, this test is run unconditionally regardless of target, so there's lots of cases where readelf is not available and the test is still run. (E.g. its run on my Mac laptop, and ends up in the IgnoreTest("readelf invocation failed".into()) path.)

Would you feel more comfortable if the output of the test still said "test ignored" but also included the text for why it was ignored, like so on Mac:

test result: ignored: "readelf invocation failed"

(I guess your message explicitly said that you want a hard error from CI. I don't know if we have that contextual information available when the test is running, I'll investigate.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: I recommend that we land this now and file an issue to figure out how best to revise its associated test. This patch is resolving a real (if niche) issue, and it seems suboptimal to block landing it based on the concern about some other change causing the test to start being incorrectly ignored.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, definitely not a blocker! I just worry about this all being broken, but the breakage not being detected because some other issue causes readelf to not be available on CI and so the test is always ignored.

@alexcrichton alexcrichton merged commit e1d31f2 into rust-lang:master Oct 25, 2022
@alexcrichton
Copy link
Member

Seems reasonable to me, thanks again for this @pnkfelix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants