Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(gimli::write::ConvertError)InvalidAttributeValue while gimli::write::Dwarf::from() conversion #482

Open
vaibspider opened this issue Mar 31, 2020 · 24 comments

Comments

@vaibspider
Copy link
Contributor

I am using Dwarf::from() to convert a readable Dwarf instance to a writable one using convert_address() function as follows:-

fn convert_address(addr: u64) -> Option<gimli::write::Address> {
    Some(gimli::write::Address::Constant(addr))
}
let mut write_dwarf = match gimli::write::Dwarf::from(&dwarf, &convert_address) {
    Ok(write_dwarf) => write_dwarf,
    Err(error) => panic!("Error while Dwarf::from() : {:?}", error),
};

And I'm getting an InvalidAttributeValue error on the Dwarf::from line.
As mentioned in Dwarf::from()'s documentation,

convert_address is a function to convert read addresses into the Address type. For non-relocatable addresses, this function may simply return Address::Constant(address). For relocatable addresses, it is the caller's responsibility to determine the symbol and addend corresponding to the address and return Address::Symbol { symbol, addend }.

I'm simply returning Address::Constant(addr) in convert_address function.
Could someone please clarify the changes to be done for relocatable addresses in convert_address function? (Is convert_address the cause for this error?)

Thanks

@philipc
Copy link
Collaborator

philipc commented Mar 31, 2020

Unfortunately it's a bit hard to work out the cause of the InvalidAttributeValue without doing some debugging, since there's a few places that can generate it. You could modify gimli to have more specific errors. Or if you can provide the file you are trying to convert then I can look into it.

Your convert_address function is fine. It won't be the cause of the InvalidAttributeValue error. If you do want to handle relocatable addresses (because you're processing an object file instead of an executable file) then see https://github.com/philipc/rewrite/blob/master/src/dwarf.rs for an example. Be warned that it's a bit convoluted though. Implementing #409 would make this simpler.

Edit: I didn't realise you were doing this on an object file. The above comment only applies for executables.

@vaibspider
Copy link
Contributor Author

vaibspider commented Apr 2, 2020

Sure, thanks. This is the file I'm working with : object-file. Can you please have a look at it?

What I intend to do is : read an executable or relocatable ELF object file, modify some debugging information and then create a new executable or relocatable ELF object file with this modified debugging information.

I tried working on an executable (reading it and then re-creating it), but it didn't work (there were many sections/symbols/data missing in the output file and couldn't produce an executable output file). So, I moved on to work on a relocatable object file, re-creating it and then linking it with the compiler again to make an executable file. In this case, I'm getting an error - InvalidAttributeValue.

I'm currently using gimli with object for processing.
Could you please suggest a good way to do this task?(Should I work on executables or relocatables?)

Thanks for pointing me to rewrite/dwarf.rs and the related issue. Will look into it.

@philipc
Copy link
Collaborator

philipc commented Apr 3, 2020

I noticed that object library worked fine with a relocatable object file but had issues with an executable file(couldn't reproduce all the contents and couldn't create an executable file).

Correct, object doesn't support writing executables, only reading them.

On the other hand, gimli seems to be working fine with executable files (I can read DWARF sections and modify them) but has issues when given a relocatable object file as input - (getting InvalidAttributeValue error). Could you please provide your input on this?

I updated rewrite to handle current gimli and object versions, but I wasn't able to reproduce the InvalidAttributeValue error when running it on sumofn-O0.o. So it may just be that you're not handling relocations yet.

gimli won't handle relocations if you're only using its Reader implementations. You need to provide your own. See ReaderRelocate in rewrite. But note that doesn't work for relocations in .eh_frame (I've filed #484).

@vaibspider
Copy link
Contributor Author

Thanks @philipc ! I think I initially uploaded the wrong file. Sorry for that. The correct file with which I'm getting an InvalidAttributeValue error is sumofn-O3.o. I've updated my previous comment. Could you please test for this file?
Thanks.

@philipc
Copy link
Collaborator

philipc commented Apr 3, 2020

Ok okay, yeah I am getting the InvalidAttributeValue for that one. I'll look into it.

@philipc
Copy link
Collaborator

philipc commented Apr 3, 2020

It's the DW_AT_GNU_locviews attribute, which we don't support converting or writing yet.

@vaibspider
Copy link
Contributor Author

Thank you @philipc for looking into it! After disabling GNU location views while compiling, it worked!

@vaibspider
Copy link
Contributor Author

Hi @philipc !
I tested out your dwarf rewrite - rewrite/dwarf.rs for the sumofn-O3.o. As you pointed out, it doesn't handle .eh_frame relocations yet. Also, I found that .debug_aranges section was missing from the output object file. So I removed .debug_aranges from is_rewrite_dwarf_section() and copied it into output_object using object library. But, when I ran gdb on the output object file - output.o, it lacked type information for a global variable sum in sumofn-O3.o(input obj). I checked .debug_info section for the variable DIE and it is correctly pointing to a base_type DIE with name int and size 4 bytes. Still, when I try to (gdb) p sum inside gdb session, it gives an error : 'sum' has unknown type; cast it to its declared type. It couldn't find the size/type of the variable without explicit type casting. Can you please provide some pointers on this?
Thanks!

@philipc
Copy link
Collaborator

philipc commented Apr 22, 2020

Probably the problem is that gimli doesn't support writing DW_OP_addr yet (see #474). This requires a relocation for the address value, but currently rewrite will simply copy the DW_AT_location bytes over without handling the relocation, and gdb will end up with a null address.

@philipc
Copy link
Collaborator

philipc commented Apr 22, 2020

Also, I found that .debug_aranges section was missing from the output object file. So I removed .debug_aranges from is_rewrite_dwarf_section() and copied it into output_object using object library.

That's only working by chance. gimli needs to generate a new .debug_aranges. Copying the old one isn't guaranteed to work.

@vaibspider
Copy link
Contributor Author

Probably the problem is that gimli doesn't support writing DW_OP_addr yet (see #474). This requires a relocation for the address value, but currently rewrite will simply copy the DW_AT_location bytes over without handling the relocation, and gdb will end up with a null address.

Ah, okay. But, then in this case, even after explicit type casting (gdb) p (int)sum it wouldn't have worked?

@vaibspider
Copy link
Contributor Author

That's only working by chance. gimli needs to generate a new .debug_aranges. Copying the old one isn't guaranteed to work.

Oh, I see. There is no field for .debug_aranges in gimli::read::Dwarf struct. So should gimli produce this section automatically? Do you think it's a bug?

@philipc
Copy link
Collaborator

philipc commented Apr 22, 2020

Ah, okay. But, then in this case, even after explicit type casting (gdb) p (int)sum it wouldn't have worked?

At a guess, it works because it can still find the address from the symbol table.

Oh, I see. There is no field for .debug_aranges in gimli::read::Dwarf struct. So should gimli produce this section automatically? Do you think it's a bug?

Yes gimli should produce the section. Write support is still an incomplete feature. See #390.

@vaibspider
Copy link
Contributor Author

At a guess, it works because it can still find the address from the symbol table.

Okay, I'm confused, I wonder why gdb is not able to use the base_type DIE(which is providing the variable type and size) with this address then.

Yes gimli should produce the section. Write support is still an incomplete feature. See #390.

Okay, thanks! I would like to contribute to this. Can you please suggest a starting point?

@philipc
Copy link
Collaborator

philipc commented Apr 22, 2020

Okay, I'm confused, I wonder why gdb is not able to use the base_type DIE(which is providing the variable type and size) with this address then.

I haven't looked at the gdb source, but my expectation is that it has two code paths:

  • one that uses DWARF, and requires both a base type and a location in the DWARF
  • one that uses the symbol table, and requires the user to specify the type

Okay, thanks! I would like to contribute to this. Can you please suggest a starting point?

Great! .debug_aranges should be fairly straightforward to implement. I think you need to add it to Sections and output it when writing .debug_info.

I am currently working on #479.

@vaibspider
Copy link
Contributor Author

I haven't looked at the gdb source, but my expectation is that it has two code paths:

Okay, got it. Thanks!

Great! .debug_aranges should be fairly straightforward to implement. I think you need to add it to Sections and output it when writing .debug_info.

Sounds good, will start working on it!

I am currently working on #479.

👍

@vaibspider
Copy link
Contributor Author

vaibspider commented Apr 23, 2020

Hi @philipc !
I have tested a raw version for generating and writing .debug_aranges into output object file.
I created a src/write/aranges.rs module and called define_section! macro there and added Debug_Aranges into Sections inside src/write/section.rs. As Debug_Aranges had no offset required, so I added a new pattern inside macro_rules! define_section macro definition with only two arguments.
In the src/write/unit.rs module, where each unit is being iterated to write to .debug_info section, I added a call to write .debug_aranges section with the hard-coded byte array buf just to test.
But I will have to define a new Struct to write the Debug_Aranges section.
Any comments? Is this the right approach?
Thanks

@philipc
Copy link
Collaborator

philipc commented Apr 24, 2020

But I will have to define a new Struct to write the Debug_Aranges section.

I had initially thought that we could simply copy the information from the DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges attributes of the compilation unit, but that might get complicated because DW_AT_high_pc may be an address instead of the length we need (and similarly for DW_AT_ranges).

For now, it might be simpler to add a new aranges: Vec<(Address, u64)> data member to Unit. I don't think there is a need to define a new struct. Using a struct instead of the (Address, u64) tuple would be ok though if you want. Also, this is probably small enough to all fit in src/write/unit.rs, but a new file is fine if it is getting large.

By the way, what error were you seeing without .debug_aranges? My understanding was that it is optional, but maybe I have that wrong.

@vaibspider
Copy link
Contributor Author

vaibspider commented Apr 24, 2020

I had initially thought that we could simply copy the information from the DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges attributes of the compilation unit, but that might get complicated because DW_AT_high_pc may be an address instead of the length we need (and similarly for DW_AT_ranges).

Yes, I think we can check whether the low_pc and high_pc or the ranges attribute exists for a compilation unit and accordingly, create the aranges tuple. If it is low_pc and high_pc then it would be straightforward (doing high_pc - low_pc if high_pc is of the form address)(?) And for the ranges attribute, we can use this index to get the RangeList from current_unit.ranges field(?)
To get the offset into .debug_info section, we can use DebugInfoOffsets.unit() method(?).

For now, it might be simpler to add a new aranges: Vec<(Address, u64)> data member to Unit. I don't think there is a need to define a new struct. Using a struct instead of the (Address, u64) tuple would be ok though if you want. Also, this is probably small enough to all fit in src/write/unit.rs, but a new file is fine if it is getting large.

Sure, the tuple seems to be better instead of a struct and will write the code inside src/write/unit.rs itself.

By the way, what error were you seeing without .debug_aranges? My understanding was that it is optional, but maybe I have that wrong.

No, there was no error due to missing .debug_aranges. Actually, I would like to contribute in fixing the typecast error in gdb session - 'sum' has unknown type; cast it to its declared type (where variable 'sum' has a memory address). I just thought maybe working on .debug_aranges would make me familiar with the codebase so that I could later work on this part. Can you please give an estimate of how much effort will be needed to support relocation for Expression? Can I contribute in some way?

@philipc
Copy link
Collaborator

philipc commented Apr 24, 2020

Yes, I think we can check whether the low_pc and high_pc or the ranges attribute exists for a compilation unit and accordingly, create the aranges tuple. If it is low_pc and high_pc then it would be straightforward (doing high_pc - low_pc if high_pc is of the form address)(?) And for the ranges attribute, we can use this index to get the RangeList from current_unit.ranges field(?)

Yes, but the complicated part is that you may need to interpret the RangeList a bit to convert it to address + length pairs, and in theory this may not be possible, e.g. if an address pair uses two different symbols. It may be okay to fail in that case, I'm not sure. One thing I would like to do is look at llvm/gcc to see if they do something better, but I suspect they just have a higher level API so they don't have this problem.

To get the offset into .debug_info section, we can use DebugInfoOffsets.unit() method(?).

Yes.

Can you please give an estimate of how much effort will be needed to support relocation for Expression?

I did some work on this today in #479, and I think rewrite will handle the DW_OP_addr now. Still to do in that PR:

  • finish the DW_OP_call2/DW_OP_call4/base_type support
  • write tests

I think it's close enough to done that if you want to try it out and see if it works for what you want to do then that would be useful. It's not really at a point where having another contributor would help progress it though.

@vaibspider
Copy link
Contributor Author

vaibspider commented Apr 24, 2020

Yes, but the complicated part is that you may need to interpret the RangeList a bit to convert it to address + length pairs, and in theory this may not be possible, e.g. if an address pair uses two different symbols. It may be okay to fail in that case, I'm not sure. One thing I would like to do is look at llvm/gcc to see if they do something better, but I suspect they just have a higher level API so they don't have this problem.

Okay, I see. Maybe I could work on this sometime later, after getting more clarity.

I did some work on this today in #479, and I think rewrite will handle the DW_OP_addr now.
I think it's close enough to done that if you want to try it out and see if it works for what you want to do then that would be useful. It's not really at a point where having another contributor would help progress it though.

Got it, thanks! I would try it out and see if it handles the memory address so that there would be no need for explicit type casting in the gdb session.

@vaibspider
Copy link
Contributor Author

vaibspider commented Apr 25, 2020

I tried out your changes in #479 and it worked - now I can access the global variable in my sample program in a gdb session without the need for explicit type-casting.
Thanks a lot @philipc !

@jjang3
Copy link

jjang3 commented Jan 19, 2024

Thank you @philipc for looking into it! After disabling GNU location views while compiling, it worked!

Hello, I just happened to come across this post, and I was wondering how do you actually disable GNU location views? using -fno-gnu-extensions breaks the compilation process so I was wondering if there is some way to only disable GNU location views. Thank you.

@vaibspider
Copy link
Contributor Author

Hi @jjang3, I think you can use this option to disable it.
"-gno-variable-location-views" (as mentioned on the page below)
https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants