Long section names in PE #1043

Open
laanwj opened this issue Apr 17, 2024 · 2 comments

Comments

@laanwj
Contributor

laanwj commented Apr 17, 2024

Long section names are used for MinGW debug information.

x86_64-w64-mingw32-objdump -h output:

Sections:
Idx Name            Size      VMA               LMA               File off  Algn  Flags
  0 .text           00b631b0  0000000140001000  0000000140001000  00000000  2**4  ALLOC, LOAD, READONLY, CODE, DATA
  1 .data           000092e0  0000000140b65000  0000000140b65000  00000000  2**4  ALLOC, LOAD, DATA
  2 .rdata          0021be20  0000000140b6f000  0000000140b6f000  00000000  2**4  ALLOC, LOAD, READONLY, DATA
  3 .pdata          0003efa0  0000000140d8b000  0000000140d8b000  00000000  2**2  ALLOC, LOAD, READONLY, DATA
  4 .xdata          000b1c98  0000000140dca000  0000000140dca000  00000000  2**2  ALLOC, LOAD, READONLY, DATA
  5 .bss            0000d1d0  0000000140e7c000  0000000140e7c000  00000000  2**4  ALLOC
  6 .edata          00000044  0000000140e8a000  0000000140e8a000  00000000  2**2  ALLOC, LOAD, READONLY, DATA
  7 .idata          000035a4  0000000140e8b000  0000000140e8b000  00000000  2**2  ALLOC, LOAD, DATA
  8 .CRT            00000068  0000000140e8f000  0000000140e8f000  00000000  2**2  ALLOC, LOAD, DATA
  9 .tls            00000010  0000000140e90000  0000000140e90000  00000000  2**2  ALLOC, LOAD, DATA
 10 .rsrc           000004a0  0000000140e91000  0000000140e91000  00000000  2**2  ALLOC, LOAD, DATA
 11 .reloc          0000701c  0000000140e92000  0000000140e92000  00000000  2**2  ALLOC, LOAD, READONLY, DATA
 12 .debug_aranges  0002cdb0  0000000140e9a000  0000000140e9a000  000004d0  2**0  CONTENTS, READONLY, DEBUGGING
 13 .debug_info     145449f8  0000000140ec7000  0000000140ec7000  0002d2d0  2**0  CONTENTS, READONLY, DEBUGGING
 14 .debug_abbrev   001e010b  000000015540c000  000000015540c000  14571cd0  2**0  CONTENTS, READONLY, DEBUGGING
 15 .debug_line     011878a4  00000001555ed000  00000001555ed000  14751ed0  2**0  CONTENTS, READONLY, DEBUGGING
 16 .debug_frame    0021b670  0000000156775000  0000000156775000  158d98d0  2**0  CONTENTS, READONLY, DEBUGGING
 17 .debug_str      002496ac  0000000156991000  0000000156991000  15af50d0  2**0  CONTENTS, READONLY, DEBUGGING
 18 .debug_line_str 000a771e  0000000156bdb000  0000000156bdb000  15d3e8d0  2**0  CONTENTS, READONLY, DEBUGGING
 19 .debug_loclists 02ae31fd  0000000156c83000  0000000156c83000  15de60d0  2**0  CONTENTS, READONLY, DEBUGGING
 20 .debug_rnglists 00788056  0000000159767000  0000000159767000  188c92d0  2**0  CONTENTS, READONLY, DEBUGGING

LIEF output:

.text     b631b0    1000      0         0         0         0         CNT_CODE - CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_EXECUTE - MEM_READ
.data     92e0      b65000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.rdata    21be20    b6f000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ
.pdata    3efa0     d8b000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ
.xdata    b1c98     dca000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ
.bss      d1d0      e7c000    0         0         0         0         CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.edata    44        e8a000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ
.idata    35a4      e8b000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.CRT      68        e8f000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.tls      10        e90000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.rsrc     4a0       e91000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_READ - MEM_WRITE
.reloc    701c      e92000    0         0         0         0         CNT_INITIALIZED_DATA - CNT_UNINITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/4        2cdb0     e9a000    2ce00     4d0       0         2.10536   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/19       145449f8  ec7000    14544a00  2d2d0     0         6.33078   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/31       1e010b    1540c000  1e0200    14571cd0  0         5.09492   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/45       11878a4   155ed000  1187a00   14751ed0  0         5.34058   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/57       21b670    16775000  21b800    158d98d0  0         3.79223   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/70       2496ac    16991000  249800    15af50d0  0         4.94782   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/81       a771e     16bdb000  a7800     15d3e8d0  0         4.75507   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/97       2ae31fd   16c83000  2ae3200   15de60d0  0         5.06585   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ
/113      788056    19767000  788200    188c92d0  0         5.05623   CNT_INITIALIZED_DATA - MEM_DISCARDABLE - MEM_READ

From what I read, a /<X> in the regular (8-byte) section name field is interpreted as a decimal string giving an offset into the COFF string table, which starts at PE_Header.PointerToSymbolTable + PE_Header.NumberOfSymbols*18. At that offset there is a zero-terminated string containing the full section name.
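As a quick sanity check of that reading, the offsets in the LIEF output can be reproduced from the names objdump prints. This is only a sketch: it assumes the long names are stored back to back at the start of the string table, in section order, right after the 4-byte size field. The helper name `expected_short_names` is made up for illustration.

```python
def expected_short_names(long_names):
    """Predict the "/<offset>" placeholders for names stored back to back."""
    offset = 4  # the first 4 bytes of the string table hold its size
    placeholders = []
    for name in long_names:
        placeholders.append("/%d" % offset)
        offset += len(name) + 1  # +1 for the terminating NUL byte
    return placeholders


# Long section names from the objdump listing, in section-table order.
debug_sections = [
    ".debug_aranges", ".debug_info", ".debug_abbrev", ".debug_line",
    ".debug_frame", ".debug_str", ".debug_line_str", ".debug_loclists",
    ".debug_rnglists",
]

print(expected_short_names(debug_sections))
```

For this binary the predicted values line up with the /4, /19, /31, ... names that LIEF reports, which is consistent with the offsets being relative to the start of the string table (size field included).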

As I'm trying to parse DWARF information inside a PE binary, it would be nice if the full names were accessible somehow.

It does seem possible to work around this limitation by doing the lookup manually, though.

@laanwj
Contributor Author

laanwj commented Apr 17, 2024

It's straightforward enough--in case anyone else runs into this:

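import struct

import lief


class StringTable:
    """COFF string table, which immediately follows the symbol table."""
    string_data: bytes

    def __init__(self, filename, binary):
        # Each COFF symbol record is 18 bytes long.
        string_table_ofs = binary.header.pointerto_symbol_table + binary.header.numberof_symbols * 18

        with open(filename, 'rb') as f:
            f.seek(string_table_ofs)
            # The first 4 bytes hold the total table size, including the size field itself.
            string_data = f.read(4)
            size = struct.unpack('<I', string_data)[0]
            string_data += f.read(size - 4)

        self.string_data = string_data

    def lookup(self, offset):
        # Offsets are relative to the start of the table (size field included).
        endofs = self.string_data.index(b'\0', offset)
        return self.string_data[offset:endofs].decode()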

...

    dbg_binary = lief.parse(dbg_filename)

    # load COFF string table
    string_table = StringTable(dbg_filename, dbg_binary)

    # augment section names
    extended_names = []
    for section in dbg_binary.sections:
        if section.name.startswith('/'):
            offset = int(section.name[1:])
            extended_names.append(string_table.lookup(offset))
        else:
            extended_names.append(section.name)
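
For anyone who wants the same lookup without LIEF, here is a minimal sketch that reads the COFF header and section table directly with `struct`. The function name `long_section_names` is hypothetical. It assumes the usual PE layout (`e_lfanew` at offset 0x3C, a 20-byte COFF file header, 40-byte section headers, 18-byte symbol records) and does no bounds checking.

```python
import struct


def long_section_names(data: bytes) -> list[str]:
    """Return all section names of a PE image, resolving "/<offset>" entries."""
    # Offset of the "PE\0\0" signature, stored in the DOS header.
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")

    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    coff = e_lfanew + 4
    (_machine, num_sections, _timestamp, symtab_ptr, num_symbols,
     opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", data, coff)

    # The string table directly follows the symbol table (18 bytes per symbol).
    strtab = symtab_ptr + num_symbols * 18
    # The section table directly follows the optional header.
    section_table = coff + 20 + opt_size

    names = []
    for i in range(num_sections):
        start = section_table + i * 40
        name = data[start:start + 8].rstrip(b"\0").decode()
        if name.startswith("/"):
            # Decimal offset into the string table, e.g. "/19" -> 19.
            name_start = strtab + int(name[1:])
            name_end = data.index(b"\0", name_start)
            name = data[name_start:name_end].decode()
        names.append(name)
    return names
```

Called as `long_section_names(open(dbg_filename, 'rb').read())`, it should give the same list as `extended_names` above, at the cost of reading the whole file into memory.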

@romainthomas
Member

Thank you @laanwj for raising this issue. It makes sense to have this support.
