Use u64/i64 instead of usize/isize for r_offset and r_addend #86

Amanieu · 2018-04-15T10:36:36Z

This avoids these values getting truncated when reading a 64-bit ELF file from a 32-bit platform.

[breaking-change]
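A minimal sketch of the truncation in question, not code from this PR. Since `usize` is 32 bits wide on a 32-bit target, the sketch models it with `u32` so the effect is visible on any host. The function names are hypothetical.

```rust
use std::convert::TryFrom;

/// Models `value as usize` on a 32-bit target: an `as` cast to a narrower
/// integer silently drops the high bits.
fn cast_like_32bit_usize(value: u64) -> u32 {
    value as u32
}

/// A checked conversion reports the problem instead of hiding it:
/// it yields `None` when the value does not fit in 32 bits.
fn checked_like_32bit_usize(value: u64) -> Option<u32> {
    u32::try_from(value).ok()
}
```

For an `r_offset` of `0x1_0000_1000`, the cast yields `0x1000` with no error, while the checked variant returns `None`.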

Amanieu · 2018-04-15T10:39:08Z

The same issue probably applies to other places as well; this is just the one that I am actually using.

m4b · 2018-04-15T17:54:57Z

Hi @Amanieu thanks for the PR!

Could you provide a motivating use case for this particular kind of workflow, however?

Specifically, goblin assumes the parser/analyzer (that is, the “unified” structs) is on a 64-bit machine if you want to read 64-bit binaries.

As you’ve encountered, reading a 64-bit binary is not supported very well on 32-bit machines. I considered this but did not find a good motivating reason to want to do this, so I’m hoping you could elaborate on what exactly you’re doing and why parsing 64-bit binaries on a 32-bit machine is necessary.

Notwithstanding the breakage this change would cause, I’m not sure it really solves the problem. Specifically, various r_offset fields will be used as indexes into symbol tables, etc., which are indexed vectors.

Hence the truncation occurs either here or at the time of indexing, I believe.

Amanieu · 2018-04-15T18:08:50Z

I have a tool which works with ELF files of other architectures and it would be nice (though not strictly necessary) for it to work on 32-bit platforms.

I am particularly worried that this truncation could cause a silent error when running on a 32-bit platform. It would be fine if it explicitly failed, but silent data corruption is worrying.

Now, regarding this change in particular: in dynamic relocations (.dynrel), r_offset contains the virtual address of the value being relocated (so that the dynamic linker can patch it) rather than its offset in the ELF file. This means that even if the size of the ELF file itself is < 4G, r_offset can have a value that does not fit in 32 bits.

r_addend is an even more obvious case: this is added to 64-bit values that are being relocated, so it must be a full 64-bit value.

Neither of these two values is used for indexing into any tables, so I think they should be changed to 64-bit values.
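To illustrate why both fields need the full width, here is a hedged sketch of how a loader might resolve a relative-style relocation (for example `R_X86_64_RELATIVE`, which the x86-64 psABI defines as B + A). The `Reloc` struct and `resolve_relative` function are hypothetical stand-ins, not goblin's types.

```rust
/// Hypothetical stand-in for a relocation entry with 64-bit fields.
struct Reloc {
    /// Virtual address of the location to patch (not a file offset).
    r_offset: u64,
    /// Signed constant used when computing the value to store.
    r_addend: i64,
}

/// Returns (address to patch, value to store) for a relative-style
/// relocation: the value is `load_base + r_addend`, written at
/// `load_base + r_offset`. All arithmetic stays in 64 bits.
fn resolve_relative(load_base: u64, reloc: &Reloc) -> (u64, u64) {
    let target = load_base.wrapping_add(reloc.r_offset);
    // Casting a negative i64 to u64 keeps the two's-complement bits,
    // so wrapping_add performs the signed addition correctly.
    let value = load_base.wrapping_add(reloc.r_addend as u64);
    (target, value)
}
```

With `r_offset = 0x5_0000_2000` and `r_addend = -16`, neither the patched address nor the computed value can be represented if either field is first squeezed through a 32-bit `usize`.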

m4b · 2018-04-15T18:23:44Z

> I am particularly worried that this truncation could cause a silent error when running on a 32-bit platform. It would be fine if it explicitly failed, but silent data corruption is worrying.

Yes, this is a legitimate worry. It would be nice to warn on 32-bit platforms that reading a 64-bit file is not well supported, or something? But maybe this is the downstream consumer's responsibility? Dunno.

> Neither of these two values is used for indexing into any tables, so I think they should be changed to 64-bit values.

Ok, you have convinced me. I think this will break faerie here, but this is probably a good thing: https://github.com/m4b/faerie/blob/b25c392a265936dd1e005a76df271e15ec528a7c/src/elf.rs#L274

Thinking about it more, it is somewhat reasonable to want to emit 64-bit object files on a 32-bit system (why not?), and the above is an instance where this capability is broken.

Lastly, do you think we could add a test for some of the issues described here, and if so, what does it look like?

m4b · 2018-04-15T18:33:50Z

src/elf/reloc.rs

@@ -132,7 +132,7 @@ macro_rules! elf_rela_std_impl { ($size:ident, $isize:ty) => {
            impl From<Rel> for Reloc {
                fn from(rel: Rel) -> Self {
                    Reloc {
-                        r_offset: rel.r_offset as usize,
+                        r_offset: rel.r_offset as u64,
                        r_addend: 0,


We may also want this to be an Option, but I didn't do it because I was being lazy ;)

Amanieu · 2018-04-15

Since this is a breaking change anyway, I turned it into an Option as part of this PR.
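A rough sketch of what an optional addend buys, using simplified hypothetical types rather than goblin's actual definitions: a `Rel` entry has no explicit addend field, so it maps to `None`, while a `Rela` entry always carries one, even when it is zero.

```rust
/// Simplified, hypothetical entry without an explicit addend.
struct Rel {
    r_offset: u64,
}

/// Simplified, hypothetical entry with an explicit addend.
struct Rela {
    r_offset: u64,
    r_addend: i64,
}

/// Simplified, hypothetical unified entry.
struct Reloc {
    r_offset: u64,
    r_addend: Option<i64>,
}

impl From<Rel> for Reloc {
    fn from(rel: Rel) -> Self {
        // No addend field exists in the entry, so record its absence.
        Reloc { r_offset: rel.r_offset, r_addend: None }
    }
}

impl From<Rela> for Reloc {
    fn from(rela: Rela) -> Self {
        // The addend is explicit, so keep it even if it is zero.
        Reloc { r_offset: rela.r_offset, r_addend: Some(rela.r_addend) }
    }
}
```

With a plain `0` in the `Rel` case, a consumer could not tell "no explicit addend" apart from "explicit addend of zero"; the `Option` keeps that distinction.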

m4b · 2018-04-16T02:59:40Z

Ok, this looks good to me, will merge; if anyone else has any concerns, feel free to add here.

It might be a good idea to have someone review other uses of vm addresses, etc., in other places where usize is being used.

I'm still not sure about the best policy for the remaining usizes, e.g., r_sym. Any comments or insights welcome.

Amanieu · 2018-04-16T03:06:55Z

I left r_sym as a usize since it is a 32-bit value even for 64-bit ELF and therefore it won't get truncated.
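For reference, a small sketch of how the symbol index is packed into `r_info` according to the ELF specification's `ELF64_R_SYM`/`ELF32_R_SYM` macros. The Rust function names here are hypothetical; they are not goblin's API.

```rust
/// ELF64: the symbol index is the upper 32 bits of `r_info`.
fn elf64_r_sym(r_info: u64) -> u32 {
    (r_info >> 32) as u32
}

/// ELF64: the relocation type is the lower 32 bits of `r_info`.
fn elf64_r_type(r_info: u64) -> u32 {
    (r_info & 0xffff_ffff) as u32
}

/// ELF32: the symbol index is the upper 24 bits of `r_info`.
fn elf32_r_sym(r_info: u32) -> u32 {
    r_info >> 8
}

/// ELF32: the relocation type is the lower 8 bits of `r_info`.
fn elf32_r_type(r_info: u32) -> u32 {
    r_info & 0xff
}
```

Because the index never exceeds 32 bits in either format, storing it in a `usize` loses nothing on a 32-bit host.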

One potential issue is DynamicInfo, which uses usize everywhere despite many of these values being virtual addresses. However, that doesn't affect me personally since I work around it by parsing the Dyn array directly.

Use u64/i64 instead of usize/isize for r_offset and r_addend

58bee4f

m4b mentioned this pull request Apr 15, 2018

32-bit ELF file support m4b/faerie#35

Open

m4b approved these changes Apr 15, 2018

Make r_addend an Option

bf3f7ba

m4b merged commit 971744d into m4b:master Apr 16, 2018
