Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Absolute source file paths are required for debuginfod to work #91

Open
wolfpld opened this issue May 1, 2022 · 5 comments
Open

Absolute source file paths are required for debuginfod to work #91

wolfpld opened this issue May 1, 2022 · 5 comments

Comments

@wolfpld
Copy link

wolfpld commented May 1, 2022

This is a different take on #72 and #26.

Excerpt from the DEBUGINFOD(8) manual page about the requirement, with some reasoning about it:

   /buildid/BUILDID/source/SOURCE/FILE
       If  the  given  buildid is known to the server, this request will result in a binary object that
       contains the source file mentioned.  The path should be absolute.  Relative path names  commonly
       appear in the DWARF file's source directory, but these paths are relative to individual compila‐
       tion unit AT_comp_dir paths, and yet an executable is made up of multiple  CUs.   Therefore,  to
       disambiguate, debuginfod expects source queries to prefix relative path names with the CU compi‐
       lation-directory, followed by a mandatory "/".

       Note: the caller may or may not elide ../ or /./ or extraneous /// sorts of path  components  in
       the  directory  names.   debuginfod  accepts both forms.  Specifically, debuginfod canonicalizes
       path names according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any //  to  /
       in the path.

       For example:

       #include <stdio.h>               /buildid/BUILDID/source/usr/include/stdio.h
       /path/to/foo.c                   /buildid/BUILDID/source/path/to/foo.c
       ../bar/foo.c AT_comp_dir=/zoo/   /buildid/BUILDID/source/zoo//../bar/foo.c

The problem is that libbacktrace is currently providing paths such as: /usr/src/debug/curl-7.83.0/lib/transfer.c or ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c. The first path, which is absolute, can be used with debuginfod without any problems. However, the second path, a relative one, does not specify an exact location, due to reasons discussed in the manual page above.

Inspecting the libraries with objdump gives the following results:

objdump -W /usr/lib/libcurl.so.4
(...)
 <0><21a289>: Abbrev Number: 59 (DW_TAG_compile_unit)
    <21a28a>   DW_AT_producer    : (indirect string, offset: 0x4ab3): GNU C17 11.2.0 -march=x86-64 -mtune=generic -g -O2 -fvisibility=hidden -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
    <21a28e>   DW_AT_language    : 29   (C11)
    <21a28f>   DW_AT_name        : (indirect line string, offset: 0x1ad9): /usr/src/debug/curl-7.83.0/lib/transfer.c
    <21a293>   DW_AT_comp_dir    : (indirect line string, offset: 0x383): /usr/src/debug/build-curl/lib
    <21a297>   DW_AT_stmt_list   : 0x5fdff
(...)

In this case, DW_AT_name is absolute, so DW_AT_comp_dir value does not need to be used.

objdump -W /usr/lib/dri/iris_dri.so
(...)
<0><864dd0>: Abbrev Number: 2 (DW_TAG_compile_unit)
    <864dd1>   DW_AT_producer    : (indirect string, offset: 0x17742): GNU C11 11.2.0 -march=x86-64 -mtune=generic -mtls-dialect=gnu -g -g1 -O0 -O2 -std=c11 -fvisibility=hidden -flto -fno-math-errno -fno-trapping-math -fno-common -ffunction-sections -fdata-sections -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
    <864dd5>   DW_AT_language    : 29   (C11)
    <864dd6>   DW_AT_name        : (indirect line string, offset: 0x45fd): ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c
    <864dda>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /usr/src/debug/build
    <864dde>   DW_AT_stmt_list   : 0x8446d9
(...)

However, in this second example, the DW_AT_name is relative, so DW_AT_comp_dir should be prepended to it, producing the following result: /usr/src/debug/build/../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c.

Is this something you are interested in implementing?

@ianlancetaylor
Copy link
Owner

Are you suggesting that libbacktrace should pass an absolute path if possible when calling the callback functions? That seems reasonable. I kind of thought that libbacktrace did that already. It certainly tries to do so. Maybe there are some cases that it misses. Can you show me a test case? Thanks.

@wolfpld
Copy link
Author

wolfpld commented May 2, 2022

Yes, absolute paths should be expected.

I have created a simple test application:

// g++ test.cpp -g ../libbacktrace/.libs/libbacktrace.a

#include <execinfo.h>
#include <stdio.h>
#include "../libbacktrace/backtrace.h"

int SymCb( void* data, uintptr_t pc, const char* fn, int lineno, const char* func )
{
    printf( "%s: %s:%i\n", func, fn, lineno );
    return 0;
}

void ErrCb( void* data, const char* msg, int errnum )
{
}

int main()
{
    uintptr_t addr;
    backtrace( (void**)&addr, 1 );

    auto state = backtrace_create_state( nullptr, 0, nullptr, nullptr );
    backtrace_pcinfo( state, addr, SymCb, ErrCb, nullptr );
}

Unfortunately, it is returning an absolute path:
main: /home/wolf/test/test.cpp:20

This is happening with a relative DW_AT_name:

    <12>   DW_AT_name        : (indirect line string, offset: 0x10): test.cpp
    <16>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /home/wolf/test

There must be some deeper interaction which is preventing proper DW_AT_comp_dir insertion in some currently unknown edge cases.

@wolfpld
Copy link
Author

wolfpld commented May 2, 2022

Ok, I have repro. The code is the same as above, but you have to build it in a slightly different way:

[1:07 wolf@mimir:~/test]% mkdir build
[1:07 wolf@mimir:~/test]% cd build
[1:07 wolf@mimir:~/test/build]% g++ ../test.cpp -g ../../libbacktrace/.libs/libbacktrace.a
[1:07 wolf@mimir:~/test/build]% ./a.out
main: ../test.cpp:22
    <12>   DW_AT_name        : (indirect line string, offset: 0x17): ../test.cpp
    <16>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /home/wolf/test/build

@wolfpld
Copy link
Author

wolfpld commented May 3, 2022

There are four places in dwarf.c, where directories are concatenated with filenames. All can be found by searching for '/'. Program flow reaches the one in read_lnct. Deeper inspection shows that a different kind of records is used here. Looking at the data read in read_line_header into hdr you can see the following values:

For the case returning absolute values:

Process 171530 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555565351 a.out`read_line_header(state=0x00007ffff7ffa000, ddata=0x00007ffff7fb9890, u=0x00007ffff7ffae38, is_dwarf64=0, line_buf=0x00007fffffffe068, hdr=0x00007fffffffe1f0) at dwarf.c:3052:7
   3049         return 0;
   3050     }
   3051
-> 3052   if (hdr_buf.reported_underflow)
   3053     return 0;
   3054
   3055   return 1;
(lldb) p hdr->dirs_count
(size_t) $0 = 3
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb616f "/home/wolf/test"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6188 "/usr/include"
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6195 "../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $4 = 6
(lldb) p hdr->filenames[0]
(const char *) $5 = 0x00007ffff6764668 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $6 = 0x00007ffff6764688 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $7 = 0x00007ffff67646a8 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $8 = 0x00007ffff67646c0 "../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $9 = 0x00007ffff67646e0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $10 = 0x00007ffff67646f8 "/usr/include/stdio.h"

And for the case returning relative values:

(lldb) p hdr->dirs_count
(size_t) $0 = 4
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb617f "/home/wolf/test/build"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6195 ".."
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6198 "/usr/include"
(lldb) p hdr->dirs[3]
(const char *) $4 = 0x00007ffff7fb61a5 "../../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $5 = 6
(lldb) p hdr->filenames[0]
(const char *) $6 = 0x00007ffff6764668 "../test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $7 = 0x00007ffff6764678 "../test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $8 = 0x00007ffff6764688 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $9 = 0x00007ffff67646a0 "../../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $10 = 0x00007ffff67646c0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $11 = 0x00007ffff67646d8 "/usr/include/stdio.h"

@wolfpld
Copy link
Author

wolfpld commented Aug 16, 2022

In dwarf_lookup_pc control reaches filename = ln->filename; line, which sets the actual file name value returned by the function. The ln->filename value is ../test.cpp. This is the only relevant field in this structure. At the same time, there is the entry->u data structure, which contains filename="../test.cpp" and comp_dir="/home/wolf/test". The abs_filename in this structure is nullptr. Information from entry->u is not used to fix the incomplete path data retrieved from ln.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants