Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exposing the raw COFF section table to the user #239

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

not-wlan
Copy link

@not-wlan not-wlan commented Oct 10, 2020

I'm working on something that requires raw access to the section table and figured I might as well add this to the library so others may make use of this.
I tried building upon the existing code but I didn't want to break the signature of header.coff_header.sections either.

This shouldn't break anything and should also add some validation for broken PEs.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool Thank you for this PR! Just a doc comment would be nice on the new public field; I don't know much about section_tables are they optional? What do they contain? Might be good for minor summary of this in the doc comment?

Also just curious, what are you using the section tables for ? :)

src/pe/mod.rs Show resolved Hide resolved
src/pe/mod.rs Outdated Show resolved Hide resolved
@not-wlan
Copy link
Author

Also just curious, what are you using the section tables for ? :)

Microsofts CodeView debug format (aka PDB) contains a copy of the raw section headers. I'm currently working on generating those for already existing binaries to make reverse engineering them easier.
I've written a Rust module to take care of the actual generation using LLVM as the backend for generating the PDB file. In this scenario I'm currently parsing the same PE twice, once in Rust to get basic information like the image base and well the parsed sections. Ideally I'd like to skip the 2nd parse inside my wrapper though and just pass the data from goblin. I realize that I'm able to calculate that offset in my code as well figured that others might need it as well.

I don't know much about section_tables are they optional? What do they contain? Might be good for minor summary of this in the doc comment?

I figured that wasn't necessary considering that goblin is already parsing them and fails if they're not present or corrupted.

@m4b
Copy link
Owner

m4b commented Oct 14, 2020

I figured that wasn't necessary considering that goblin is already parsing them and fails if they're not present or corrupted.

So I guess what I’m asking is there other structure we need to pull out of the section tables or is &[u8] the best we can do / depends on other information ?

@m4b
Copy link
Owner

m4b commented Oct 14, 2020

I’m asking specifically because I’m wondering if it’s the best public api to expose because if it isn’t and it needs to change then it’s a breaking change. (And it’s very well possible it is the best, in which case, great !)

@not-wlan
Copy link
Author

not-wlan commented Oct 14, 2020

Oh this definitely isn't the best way to achieve what I'm trying to do. The structure behind that is IMAGE_SECTION_HEADER which is already being parsed into a struct

pub sections: Vec<section_table::SectionTable>,

The issue with that being that it copies the raw data into a custom struct

#[repr(C)]
#[derive(Debug, PartialEq, Clone, Default)]
pub struct SectionTable {
pub name: [u8; 8],
pub real_name: Option<String>,
pub virtual_size: u32,
pub virtual_address: u32,
pub size_of_raw_data: u32,
pub pointer_to_raw_data: u32,
pub pointer_to_relocations: u32,
pub pointer_to_linenumbers: u32,
pub number_of_relocations: u16,
pub number_of_linenumbers: u16,
pub characteristics: u32,
}

that isn't 1:1 what's in the binary so I can't pass it to LLVM. I'm definitely duplicating behaviour here. A better (read: cleaner) solution would be to rewrite the section header parsing to produce the correct structure or even better, pass references to the original buffer back. I avoided doing that because that'd definitely be a breaking change.
If you're down for doing that then I'd be willing to implement that as well :)

@m4b
Copy link
Owner

m4b commented Nov 28, 2020

@not-wlan sorry I got busy and didn't notice your follow up; it's too bad, just released a new (breaking 0.3).

So if i understood your proposal correctly, you want the section table we already parse to be 1:1 binary compatible so you can pass that back to llvm? Could you highlight what needs to change right now in the SectionTable struct we parse in order for it to be compatible? I assume it's just the name perhaps? Maybe we could make the SectionTable binary compatible, then have a fn name() accessor ?

@m4b
Copy link
Owner

m4b commented Nov 28, 2020

Alternatively, I think I'm ok with merging a version of this PR which tucks away the raw bytes in the struct as a private field (which e.g., mach-o does), and then we add a function like fn raw_section_table(&self) -> Result<&[u8]> which does the parsing you do in the main?
Alternatively, if we already fail when the section table is invalid, it might be better to do that parsing early (though it's preferred less and less to be less strict when parsing binaries, and allow user to handle errors however they might want).

@m4b
Copy link
Owner

m4b commented Jan 31, 2021

ping @not-wlan re my last comment, and in general, are you still interested in this feature? I'll leave it open for now, let me know whenever you find the time, no rush, thanks and thanks again for the PR and discussion!

@RaitoBezarius
Copy link
Contributor

ping @not-wlan re my last comment, and in general, are you still interested in this feature? I'll leave it open for now, let me know whenever you find the time, no rush, thanks and thanks again for the PR and discussion!

I'm interested into getting this PR back in shape if needed. :)
I agree with your last comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants