Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a feature to disable pe rva resolve for already mapped modules #188

Merged
merged 1 commit into from Jan 31, 2021

Conversation

ko1N
Copy link
Contributor

@ko1N ko1N commented Oct 6, 2019

It would be extremely helpful to use the capabilities of goblin when working with memory/process dumps as well as files on disk. Since memory dumps are already mapped we can just skip over the rva resolve function and all of the other functionality will work still.

I'm not sure however if it would be best to implement this as a compile-time feature (as this PR does it) or as a runtime feature (maybe splitting the parse function into parse_image / parse_mapped or something similiar.

@m4b
Copy link
Owner

m4b commented Oct 8, 2019

Hi @ko1N thanks for the PR! The use of features is a nice quick fix, but from what I understand, cargo features are meant to be "additive"; e.g., the semantics/ behavior of a function shouldn't typically change depending on which feature is added, e.g., someone activating this feature on top of someone else not, can break everyone's build, or make it behavior strangely, so-called action at a distance.

That being said, I'm not sure what the best solution is here; parse_image/parse_mapped sounds like a lot of boilerplate and a lot of work, whereas your current solution is quite clean, though it has the problems above.

Does anyone else have any suggestions here?

@ko1N
Copy link
Contributor Author

ko1N commented Oct 8, 2019

Hello @m4b! I gave this issue a bit more thought and ye, you're right. We shouldn't change functionality through a feature as it would prevent people from using it in both ways in the same program.

I think a clean solution would be to change the parse() function to additionally accept something like a ParseOptions struct. We could also add another function which wraps the new parse function with some default options to not introduce an api change.
Maybe this could also be beneficial for other use cases (e.g. adding an option which will make the parser instantly fail on error, opt-out/opt-in for specific parts of the header being parsed, etc).

@m4b
Copy link
Owner

m4b commented Oct 8, 2019

Parser options depending on how they’re implemented could be interesting; might also be a segue into partial parsing or other effects on how goblin parses ?

Eg #181

@ko1N
Copy link
Contributor Author

ko1N commented Oct 8, 2019

Ye I think this could be really useful in other cases as well. I can change the parse function so it accepts a ParseOption struct and add the Rva option to it if u want.

@ko1N
Copy link
Contributor Author

ko1N commented Oct 8, 2019

I removed the feature and added a runtime checked struct which will be passed through the relevant parse() calls. I also added wrappers (parse / parse_with_opts) to keep all the tests working and the API kind of stable.

Should we also add wrappers to all the other functions in pe::utils to keep its API consistent (try_name, find_offset_or, etc)?

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this is great ! I like the general gist of this PR but I think all but the top parse is a backwards incompatible change? Maybe this is ok since it is somewhat unusual to be parsing with those internal functions. I’m not sure :/

@ko1N
Copy link
Contributor Author

ko1N commented Oct 8, 2019

Yes, you are totally right. I forgot to include the other parse() functions in my earlier commit. I'm not sure if we should write wrappers for the pe::utils though?

Edit:
Thinking about the wrapper() change I think this could introduce unintended code complexity down the line. If you want to change anything inside the PE crate you have to be very careful now to pass the options struct down the calls properly. If someone mixes up the regular parse() in a parse_with_opts() call this could create mean bugs.
I'd propose to leave the function wrappers in there for now but to remove them later down the road when doing a bigger version change so the API changes would be more tollerable.

Edit2:
It would also be great to construct a test case for this. But as I saw you are still going to figure out how to approach the binary management for that. I could create a windows test executable, dump it and add it as a test case if a binary test suite is merged in.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking pretty good, modulo docs and other minor issues.

So to summarize, with your change the user is able to use goblin to parse an in-memory PE binary, e.g., a dll or what have you that's been loaded into a running process, yes?

Quite cool if so! The elf version of this has tons of unsafe (though it is zero-copy-ish to be fair)

src/pe/exception.rs Show resolved Hide resolved
src/pe/options.rs Show resolved Hide resolved
src/pe/options.rs Outdated Show resolved Hide resolved
@m4b
Copy link
Owner

m4b commented Oct 9, 2019

Could this change be used for a windows backtrace functionality, e.g., in backtrace-rs with gimli feature? @philipc or @willglynn maybe you'd know?

@m4b
Copy link
Owner

m4b commented Oct 9, 2019

@ko1N you may have to rebase on master, I just merged another change for PE stuff

@philipc
Copy link
Collaborator

philipc commented Oct 9, 2019

@m4b Are you referring to unwinding or symbolication? The gimli feature in backtrace-rs is only for symbolication, and it already works on windows AFAIK.

@m4b
Copy link
Owner

m4b commented Oct 9, 2019

symbolification; interesting maybe this comment needs to be updated (I understood that it only works on linux atm): https://github.com/rust-lang/backtrace-rs/blob/master/Cargo.toml#L94

could this change help in unwinding? I'm asking random questions, haven't thought about it at all :P

@philipc
Copy link
Collaborator

philipc commented Oct 9, 2019

Yeah I think that comment needs updating, there's window support here. I don't think this has been needed there because we don't need to use RVAs to find sections or symbols.

For unwinding it's probably helpful since my understanding is the unwind info contains RVAs that point to other parts of the unwind info.

@ko1N
Copy link
Contributor Author

ko1N commented Oct 9, 2019

If I see it correctly libunwind-rs uses its own name to parse the file on disk with goblin right now. See here @m4b.

By removing the find_offset function to resolve the RVA we can now simply use a memory copy of a running process and parse the PE header with it. E.g. you can run a tool like pedump to dump the entire process memory on disk and parse the resulting file with goblin. Or you can use it do parse the PE header of a running process. You can even use it to parse a kernel memory dump now (as the kernel is just a PE executable as well).

@ko1N
Copy link
Contributor Author

ko1N commented Oct 9, 2019

When playing around I found that the debug guid wasnt reported with disabled rva resolve. I added a runtime switch where it would select either pointer_to_raw_data or address_of_raw_data in the codeview debugging parser depending on the configuration.

I'm not entirely sure if we need this in the CoffHeader parser as well. Maybe someone could double-check on this.

@m4b
Copy link
Owner

m4b commented Oct 15, 2019

@ko1N did you need help to push this through ?

What’s the status, your last comment is remaining issue before can be merged ?

I like this change. Is there anyway to tell at parse time whether the PE would require rva resolution or no? Would be nice if Eg, bingrep, just “worked” if I passed it a process dump or executable.

@ko1N
Copy link
Contributor Author

ko1N commented Oct 15, 2019

I checked back on it but I don't see any obvious issue holding the merge back right now. I'm already using goblin to parse some of those mapped executables and it works fairly well.
There is also very minimal risk of breaking tools which make use of goblin right now.

However it would still be a good idea to test this functionality. Do you think I should compile and dump a test executable so we can write an automated test for it. Or do you think we should wait on the binary test suite for that matter?

Edit: I'm not sure if we can easily tell apart a dumped from a regular binary. I will try and see if I can find a simple way. Probing sadly won't work as goblin will parse it without an error in both cases. There will just be quite a lot of fields missing.

@ko1N
Copy link
Contributor Author

ko1N commented Sep 4, 2020

Do you have any update on the testing situation?

@m4b
Copy link
Owner

m4b commented Jan 31, 2021

@ko1N major apologies; i'm not sure why this wasn't merged; i don't see any backwards incompatible changes here, and it seems like a reasonable api to have, albeit somewhat niche (but that's fine!)

If you're still interested, i can help rebase this, and push; if not i may close this, let me know whenever you find the time, thanks!

I'll leave this open for now

@ko1N
Copy link
Contributor Author

ko1N commented Jan 31, 2021

I squashed and rebased the commit to the current master.
It's a niche feature but I intend to use exactly this in my memflow project.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Sorry this took so long; so i ran your patch with rdr example on a bunch of windows system libraries, e.g., in System32 and nothing had a non-zero error code, so i'm good with merging this. thanks for being so patient!

@@ -59,14 +76,31 @@ pub const IMAGE_DEBUG_TYPE_FIXUP: u32 = 6;
pub const IMAGE_DEBUG_TYPE_BORLAND: u32 = 9;

impl ImageDebugDirectory {
#[allow(unused)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think it would just be better to delete this and the other unused private function, but not important

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to keep the API compatible here (even if existing code calls parse on something like ImageDebugDirectory manually). Maybe a better idea would be to mark these functions as deprecated and remove them in a next major release.

@m4b m4b merged commit 9359fd0 into m4b:master Jan 31, 2021
@m4b
Copy link
Owner

m4b commented Jan 31, 2021

@ko1N would you like a release? There are a couple changes coming, one of them breaking change, but i can branch and cherry-pick this and release a 0.3.3 so you can get off of git or whatever :) you've waited long enough

@m4b
Copy link
Owner

m4b commented Jan 31, 2021

ok released as 0.3.3 :)

@ko1N
Copy link
Contributor Author

ko1N commented Jan 31, 2021

Thanks so much! 👍

m4b pushed a commit that referenced this pull request Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants