Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elf gnu symbol versioning #280

Merged
merged 20 commits into from Sep 14, 2021
Merged

Conversation

johannst
Copy link
Contributor

@johannst johannst commented Aug 13, 2021

GNU symbol version extension for ELF files (#263).

WIP PR to discuss overall design.

Tasks:

  • Parse .gnu.version_r section
  • Parse .gnu.version_d section
  • Parse .gnu.version section
  • Test parsing .gnu.version_r section
  • Test parsing .gnu.version_d section
  • Test parsing .gnu.version section

@johannst johannst marked this pull request as ready for review August 15, 2021 19:41
@johannst
Copy link
Contributor Author

First implementation done, any feedback is welcome :]

Copy link
Collaborator

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking quite good! I've only reviewed for correctness, not style (which is probably ok but I'll leave that to @m4b ).

src/elf/symver.rs Outdated Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
@johannst
Copy link
Contributor Author

Thanks for the quick review, I'll take a look at the details after work.

- Remove & properly handle unwrap in VersymIter::next
- Fix VersymIter::size_hint
- Adapt datatypes of fields in exposed ELF structs
- Remove calls to unwrap and validate offsets in all Iterator implementations
- Adapt datatypes of fields in exposed ELF structs
@johannst johannst changed the title WIP: Elf gnu symbol versioning Elf gnu symbol versioning Aug 16, 2021
Copy link
Collaborator

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more things for better handling of corrupt data. These apply to all four of Verdef/Verdaux/Verneed/Vernaux.

src/elf/symver.rs Outdated Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
- Return 0 as lower bound in size_hint impls (better hint for corrupt
  ELFs)
- Verdef/Verdaux/Verneed/Vernaux Iterator: Start yielding None on the
  next call if there is no valid next index
@johannst
Copy link
Contributor Author

I was thinking, with the primitives available for reading, to also add some utility to lookup version strings.
For example something like Versym -> Option<&str>.
@philipc what do you think about that? Or should we target higher-level utilities in a new issue/PR?

@philipc
Copy link
Collaborator

philipc commented Aug 18, 2021

It's certainly useful to have something like that, but it's a bit harder to be sure what the correct API is, so maybe leave it for a new PR, but doing a prototype (and using it for a real task) wouldn't hurt.

@johannst
Copy link
Contributor Author

It's certainly useful to have something like that, but it's a bit harder to be sure what the correct API is, so maybe leave it for a new PR, but doing a prototype (and using it for a real task) wouldn't hurt.

In that sense the PR is done from my side.

The only question is if there is a better way to handle the test ELF binaries? I optimized them for size but if you have an idea I would prefer to not commit them here?
Building during test execution doesn't seem to be a good idea due to the windows runners.
Maybe we could maintain another repository with all the binaries and include it submodule here? But that would also be a task for another issue/PR.

@johannst
Copy link
Contributor Author

@m4b any further feedback on style or completeness, please let me know.

@m4b
Copy link
Owner

m4b commented Sep 5, 2021

@johannst sorry for delay, and thank you for your patience; going to give this one last review, but it looks good to me; if i don't merge by tomorrow, feel free to ping.

There is not breaking changes yes? (I can release this in a minor release once it's merged if you like.

@johannst
Copy link
Contributor Author

johannst commented Sep 5, 2021

Don't worry, that's fine, you can take your time to review it.

Actually I have one concrete question about the cfg guard.
Other modules wrap the implementation in a macro, but as far as I can tell, rustfmt doesn't format this code.
But to align on style, would you prefer to use a macro in the symver module as well?

There is not breaking changes yes? (I can release this in a minor release once it's merged if you like.

Right, it is compatible on the API level.

@m4b
Copy link
Owner

m4b commented Sep 5, 2021

Yea I believe we've had issues in past with rustfmt and the macro stuff, it's quite annoying.

So I believe the original reason for the macro was because it put the cfg on every item, which too burdensome manually (and error prone). If the module has some entire feature applicable to it, you can just put it on the module; if there is some subset of features though, like alloc then probably macro is necessary.

I'm not sure what your cfg needs are, need to do a closer review to see what you mean (or you could summarize what you're thinking if you like too :) )

@johannst
Copy link
Contributor Author

johannst commented Sep 5, 2021

I'm not sure what your cfg needs are, need to do a closer review to see what you mean (or you could summarize what you're thinking if you like too :) )

As is today, to really use the features of the symver module we require (elf32 || elf64) && alloc.
Due to the rustfmt issues with the macros I implemented the cfg guard with a nested impl module (see below).
But you are right, in this case (where the cfg applies globally to the module) it would have been better to directly put it on the module in src/elf/mod.rs.

#[cfg(all(any(feature = "elf32", feature = "elf64"), feature = "alloc"))]
pub use symver_impl::*;

#[cfg(all(any(feature = "elf32", feature = "elf64"), feature = "alloc"))]
mod symver_impl {
     ...
}

However if we decide that is it valuable to offer the ELF type definitions in non-alloc environments then I need to put the cfg guards on the items in the module itself.

So I believe the original reason for the macro was because it put the cfg on every item, which too burdensome manually (and error prone).

Definitely agree, putting it by hand on every item is quiet error prone.
If I have some time I can check why rustftmt has a problem with the macros as we use them.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks so much for your patience !

I just had some minor nits, but nothing even necessary to change.

Only real question was about the pub strtab field and "future proofing" to best of our ability, but otherwise this looks great, and looks ready to merge to me. Thanks again for your patience and for committing this awesome work!

src/elf/symver.rs Outdated Show resolved Hide resolved
} else {
self.bytes
.gread_with::<ElfVersym>(&mut self.offset, self.ctx.le)
.ok()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will stop iteration on first bad parse, which I assume is intended

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, while the assumption was that it fails if there are not enough bytes left to read an ElfVersym.
Maybe I can comment that intend.

src/elf/symver.rs Show resolved Hide resolved
src/elf/symver.rs Show resolved Hide resolved
src/elf/symver.rs Outdated Show resolved Hide resolved
@@ -53,6 +53,8 @@ pub mod dynamic;
#[macro_use]
pub mod reloc;
pub mod note;
#[cfg(all(any(feature = "elf32", feature = "elf64"), feature = "alloc"))]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a little sad this only needs the alloc feature because (iiuc) it creates a Strtab, which I believe until only recently did not require the alloc feature either (maybe it still did, I can't remember).

Anyway, something we could possibly investigate fixing in the future, but it's not related to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems alloc is also required for a few other items as well, eg Result, Error, Pread derive, SectionHeader.
You can nicely see it if you run make api and remove the alloc from the cfg guard.

#[derive(Debug)]
pub struct VerdefSection<'a> {
/// String table used to resolve version strings.
pub verstr: Strtab<'a>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is a pub struct, and this field is pub; I don't see it used anywhere, so I presume it's for clients to resolve symbols; but I'm wondering if we can expose an api to resolve it for them? Since once this is a pub it's pub forever basically, so just making sure it's something want to commit to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, these Strtabs are used to resolve any symbol version related strings.
Only the rustdoc examples at the top of the module show the usage of those.

but I'm wondering if we can expose an api to resolve it for them

Out of my head I see the following two approaches:

  1. Pass a reference to the corresponding Strtab down to Verneed | Vernaux | Vernaux, and provide APIs on those structs to get the strings.
  2. Offer APIs on the sections to resolve the strings, eg VerneedSection::file_name(_:&Verneed) -> Option<&str>

Not really decided which one I prefer, maybe after a night of sleep :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strtab must be the same as Elf::dynstrtab (because the dynamic loader can't use section headers to find the strings). So it might be better to pass a reference to Elf::dynstrtab when needed instead of having this field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philipc good point, I was mainly following the LSB here.

The sh_link member of the section header shall point to the section that contains the strings referenced by this section.

Out of interest I also look at the common static linkers, how they craft the version related sections (gold, ld and lld).

In that sense we could drop the verstr fields completely for now and only offer the minimal APIs. Users could then resolve the strings against Elf::dynstrtab directly.

As discussed previously, we could then look into providing higher-level APIs in a second PR later (which I started to toy around with).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed that change for now and updated the doc examples.
Let's see your opinions.

- impl From<ElfVersym> for Versym
- remove '_ lifetime annotation on SectionHeader slice
- remove string tables from VerdefSection / VerneedSection
Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this PR has been going for a while, and I don't want you to lose motivation; with the pub field removed, I think this is in a great state to merge, and we could add nicer apis at a later date if necessary (though it seems somewhat ok as it is, at least in example)

src/elf/symver.rs Outdated Show resolved Hide resolved
@@ -0,0 +1,22 @@
# Build shared libraries (32/64) with versioned symbols and applications (32/64).
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unrelated to your patch but I realized goblin is using an include section, and we include tests wholesale; since I believe include is only for crates downloads, we should only include what this crate needs to build it's source as a dependency, not it's tests; so all this stuff is adding a lot of unnecessary downloads to people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honesty I don't like to commit binaries, but I didn't see a good way at the time, and I saw other binaries. I compiled them with -Os but it doesn't change the fact :)

Interesting, I didn't know about the include key, but I agree, for people only building we should remove the tests/ folder.
So dependencies to goblin become slimmer and people wanting to hack on goblin will need to clone it from gh anyway.

@johannst
Copy link
Contributor Author

Thanks for your review.
I agree, I think it's in a good state to merge now.

@m4b
Copy link
Owner

m4b commented Sep 14, 2021

Thank you for your patience and this awesome PR!

@m4b m4b merged commit 73b5e27 into m4b:master Sep 14, 2021
@m4b
Copy link
Owner

m4b commented Sep 14, 2021

So I squashed your commits, and because some of your commit messages were quite good and looks like you took time to write them, I chose the best ones and included that in the main body; thanks again! I'll have a minor release soon, since this is non-breaking, though I may want to roll up other changes if possible, not sure; is this urgently needed for you?

@johannst
Copy link
Contributor Author

Thanks for the effort to extract some of the commit messages.

I'll have a minor release soon, since this is non-breaking, though I may want to roll up other changes if possible, not sure; is this urgently needed for you?

That's fine for me, I mainly used this features in some experiments for now where I can use a version from gh.

@messense
Copy link
Contributor

I'd love to have a new release to land PyO3/maturin#626, thanks!

@m4b
Copy link
Owner

m4b commented Sep 18, 2021

ok 0.4.3 is out; i trimmed the include field too to remove tests, which includes binary files, etc., so the crates.io download should be much slimmer too!

@johannst johannst deleted the elf-gnu-symbol-versioning branch October 3, 2021 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants