Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should strtab parsing jump over contigous blocks of \0 #384

Open
wolfv opened this issue Dec 22, 2023 · 3 comments
Open

Should strtab parsing jump over contigous blocks of \0 #384

wolfv opened this issue Dec 22, 2023 · 3 comments

Comments

@wolfv
Copy link

wolfv commented Dec 22, 2023

I am wondering if adding additional \0 in the strtab is valid (we want to override the rpath of a shared library by replacing the string with a shorter, \0 padded one).

I looked at the code for strtab parsing I noticed that it creates empty strings if it finds multiple consecutive \0, e.g. this test fails:

#[cfg(test)]
mod test {
    use goblin::strtab::Strtab;

    #[test]
    fn test_strtab() {
        let data = "hello\0\0\0\0world\0\0".as_bytes();
        let strtab = Strtab::parse(&data, 0, data.len(), 0).unwrap();
        assert_eq!(strtab.to_vec().unwrap(), vec!["hello", "world"]);
    }
}

with

  left: ["hello", "", "", "", "world", ""]
 right: ["hello", "world"]

Wouldn't it be more expected that the parser jumps over blocks of \0? I am expecting the parsing with offsets to work as expected.

@wolfv wolfv changed the title Should strtab parsing jump over additional \0 Should strtab parsing jump over contigous blocks of \0 Dec 22, 2023
@philipc
Copy link
Collaborator

philipc commented Dec 23, 2023

I am wondering if adding additional \0 in the strtab is valid

Yes. Values in the strtab only have meaning if something references them.

we want to override the rpath of a shared library by replacing the string with a shorter, \0 padded one

Be aware that a string and its suffixes can be referenced from multiple places, so by modifying the string for rpath it is possible to be changing another string at the same time. This may be unlikely for the complete rpath string, but it's hard to tell when the suffix will have meaning somewhere else too.

Wouldn't it be more expected that the parser jumps over blocks of \0?

If something actually references the \0, then the parser should return an empty string. If the parser jumped over it, then I think Strtab::get_at would return None, which would be wrong. Compilers won't generate this sort of thing in practice, but it would be best to handle it correctly if it does happen.

As an aside, the way that goblin parses string tables into a list of strings is unusual. Most parsers don't do that.

@wolfv
Copy link
Author

wolfv commented Dec 23, 2023

Hi @philipc thanks for your insights! Am I correct in the assumption that the strings for rpath are in the .dynamic part, and in the dynstrtab table? So if I would check the entire section for suffixes I could rule out that any other string is corrupted by the edits?

@philipc
Copy link
Collaborator

philipc commented Dec 23, 2023

Yes the rpath reference is in .dynamic. The strings in the .dynstr section can be referenced by .dynamic, .dynsym, and the GNU version information sections.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants