Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get proper name for over-8-character sections #100

Merged
merged 5 commits into from
Sep 4, 2018

Conversation

roblabla
Copy link
Contributor

@roblabla roblabla commented Aug 6, 2018

Fixes #99

An early PR to get some reviews, as there are probably better ways to do this. I need to implement the base64 kind of section table names, and fix the error handling in some places.

@roblabla roblabla changed the title Get proper name for over-8-character sections [WIP] Get proper name for over-8-character sections Aug 6, 2018
Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @roblabla , thanks for the PR! This looks great, modulo:

  1. fixing the panic(s)

We technically broke the api of parse for sectiontable, but I'm ok with it; I don't believe anyone uses it.

Unless no one else has any concerns, I'm ok with this, modulo the panics being fixed.

Thanks again for your contribution!

// TODO: Base-64 encoding
panic!("At the disco")
} else {
name[1..].pread::<&str>(0)?.parse().unwrap()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pread::<&str>(1) is preferred; also no unwraps, please.

if name[0] == b'/' {
let idx: usize = if name[1] == b'/' {
// TODO: Base-64 encoding
panic!("At the disco")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this panic (at the disco) will disappear in the final PR?

} else {
name[1..].pread::<&str>(0)?.parse().unwrap()
};
table.real_name = Some(bytes.pread::<&str>(string_table_offset + idx)?.to_string());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you're writing directly into a field whose type is statically known, i believe you can remove the &str annotation and be super cool.

It might also be a good time to make this struct zero copy w.r.t. strings, but maybe who cares? since this probably isn't super common and copying some section names isn't a big deal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type annotation is necessary, rust can't infer the type otherwise.

WRT zero-copy, I could do this pretty easy, just have to tuck a lifetime on the struct. Would you prefer this approach?

For what it's worth, this pattern is pretty common on MinGW binaries. Most of their sections are setup this way.

pub fn name(&self) -> error::Result<&str> {
Ok(self.name.pread(0)?)
match self.real_name.as_ref() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this as_ref() here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I run into borrow errors otherwise. The second as_ref in the Some case isn't necessary though.

src/pe/mod.rs Outdated
@@ -59,8 +59,9 @@ impl<'a> PE<'a> {
let offset = &mut (header.dos_header.pe_pointer as usize + header::SIZEOF_COFF_HEADER + header.coff_header.size_of_optional_header as usize);
let nsections = header.coff_header.number_of_sections as usize;
let mut sections = Vec::with_capacity(nsections);
let string_table_offset = header.coff_header.pointer_to_symbol_table + header.coff_header.number_of_symbol_table * 18;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this magic 18 come from? I believe it should be a const with a (good) name, or at least a comment explaining it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the size_of a single symbol in the table. Do we have a struct representing a Symbol in goblin ? I hardcoded 18, which is the common case scenario. But if we're handling a BigObj COFF (https://github.com/llvm-mirror/llvm/blob/af7b1832a03ab6486c42a40d21695b2c03b2d8a3/lib/Object/COFFObjectFile.cpp#L704), then that'll be wrong.

I believe goblin doesn't support bigobj coffs, though. I guess I'll put a comment about this, and put it in a const.

@m4b
Copy link
Owner

m4b commented Aug 8, 2018

Need to fix CI; i know you want to use latest stuff, but ..= isn't on 1.19; should be an easy fix though.

@roblabla roblabla changed the title [WIP] Get proper name for over-8-character sections Get proper name for over-8-character sections Aug 8, 2018
@roblabla
Copy link
Contributor Author

roblabla commented Aug 8, 2018

CI is still failing, but I think it's unrelated to me? It's complaining about lazy_static using unstable features on 1.19. lazy_static is a dependency of env_logger -> regex -> thread_local.

graph graph

@m4b
Copy link
Owner

m4b commented Aug 8, 2018

Thanks for the graph! That’s it, I’m done with env logger. Second time transitive deps have broken us; I’m tired of this shit.

@m4b
Copy link
Owner

m4b commented Aug 10, 2018

heh; i assumed the update branch button would rebase your commits on top; but it does not... what a pain in the ass

@m4b
Copy link
Owner

m4b commented Aug 20, 2018

So are we good to go for this? @philipc anyone else?

@philipc
Copy link
Collaborator

philipc commented Aug 20, 2018

LGTM. We're not validating that the offsets are within the string table (the string table size is stored at the start of the table), not sure if that matters much.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need clarification on the assert; also @philipc is your comment about offsets a blocker here? Otherwise I want to merge once the assert issue is clarified

// Decodes a string table entry in base 64 (//AAAAAA). Expects string without
// prefixed slashes.
fn base64_decode_string_entry(s: &str) -> Result<usize, ()> {
assert!(s.len() <= 6, "String too long, possible overflow.");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should assert unless it's a fundamental programmer error somehow. Return an Err is better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same, but looking into it I don't think it's possible for this to assert because the section name is at most 8 bytes, and 2 of those are slashes.

Copy link
Contributor Author

@roblabla roblabla Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not return an error. If this assert is triggered, then something went terribly wrong, and we absolutely should abort.

@philipc
Copy link
Collaborator

philipc commented Aug 25, 2018

The offsets aren't a blocker, but maybe put a todo comment?

@roblabla
Copy link
Contributor Author

roblabla commented Aug 28, 2018

Maybe parse should pass a string table slice instead of a string table offset ? That would take care of ensuring we stay within the table. The first 4 bytes of the string table should be its size (though I'm not sure if it's enforced by the PE loader or anything), so getting the slice is pretty easy.

@philipc
Copy link
Collaborator

philipc commented Aug 29, 2018

Yeah I have no idea if the PE loader enforces it, and it's not like we're going to panic, so let's leave it.

@philipc
Copy link
Collaborator

philipc commented Aug 29, 2018

@m4b Is the out-of-date branch check necessary? Or should I be clicking that Update branch button and adding more merge commits...?

@m4b
Copy link
Owner

m4b commented Sep 3, 2018

The out of date button doesn’t rebase as I would expect so I don’t want to use it anymore. Probably should have PR authors rebase on master and force push latest before merging. In this case can just merge I think and it’ll be fine

@m4b
Copy link
Owner

m4b commented Sep 3, 2018

Hah i just got “server error” when trying to merge; dunno

@philipc
Copy link
Collaborator

philipc commented Sep 4, 2018

My "Merge pull request" button is disabled. It seems like you have disabled "Allow merge commits" in the settings. Is that correct? Requiring authors to rebase their PR just because another PR was merged first seems cumbersome to me.

@m4b
Copy link
Owner

m4b commented Sep 4, 2018

Yea agreed, ok i removed that from settings (had no idea it was even enabled)

@m4b m4b merged commit dd3a923 into m4b:master Sep 4, 2018
@m4b
Copy link
Owner

m4b commented Sep 4, 2018

thanks for you PR and patience @roblabla

gsolberg referenced this pull request in gsolberg/object Sep 20, 2018
DWARF sections names in PE files are stored indirectly through a symbol table.
Names of the form "/n" where "n" is a number indicate the section name is
stored in index "n" of a hidden symbol table.  The hidden symbol table immediately
follows the COFF symbol table and is a sequence of null terminated CStrings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pe SectionTable::name should parse MinGW-style names
3 participants