Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FUNC and WIN STACK entries disagree on parameter size #284

Open
Gankra opened this issue Sep 23, 2021 · 6 comments
Open

FUNC and WIN STACK entries disagree on parameter size #284

Gankra opened this issue Sep 23, 2021 · 6 comments

Comments

@Gankra
Copy link
Contributor

Gankra commented Sep 23, 2021

This is evidently old behaviour that breakpad works around. I don't know why the two should ever diverge. This is important because the parameter size of the grand-callee (callee's callee) is necessary information to compute the size of the callee's stack frame (and therefore the offset to the return address).

Now that I've found it and also implemented a workaround it's not a big deal, but it's presumably "wrong". gabriele asked me to file this.

@Gankra
Copy link
Contributor Author

Gankra commented Sep 23, 2021

I ran into this when implementing rust-minidump/rust-minidump#240

You can see this problem in the symbols for this minidump https://crash-stats.mozilla.org/report/index/cd121a28-ca2b-48c2-a0d4-a71a40210915. Specifically it's an issue when unwinding core::panicking::panic_fmt() because its callee,std::panicking::begin_panic_handler(), has two different conflicting parameter sizes:

  • FUNC reports 0 (wrong)
  • WIN STACK reports 4 (correct)

As the linked code notes, the "fix" is just to always prefer WIN STACK.

@mstange
Copy link
Collaborator

mstange commented Jul 4, 2022

I'd like to find out if we could just make this behavior the rule: Always get the stack size from the STACK WIN lines, and completely ignore the values on PUBLIC / FUNC. Then we can simplify dump_syms to always emit zero for the parameter size in PUBLIC / FUNC.

Since we use the parameter size for stack unwinding, it seems more natural to me to get this information from the unwind information anyway, i.e. from the STACK WIN records. The PUBLIC / FUNC records are based on symbol information, so it seems odd to use those for unwinding.

If dump_syms can stop emitting the parameter size for PUBLIC / FUNC records, this might also allow us to stop doing our own pdb parsing in dump_syms. Instead, we could make use of symbolic-debuginfo's pdb parsing code, by sending dump_syms down the existing code path which we use for ELF / mach-O - that code gets its input from symbolic-debuginfo's Functions. At the moment this is tricky to do because symbolic does not preserve parameter size information in its symbol / function list.

@mstange
Copy link
Collaborator

mstange commented Jul 4, 2022

Hmm, looking at the stats of this 32bit xul.sym, it might not be quite that easy:

Of 3158 PUBLIC symbol addresses, 217 have a non-zero parameter size, and of those, 0 also have a STACK WIN record (and 217 do not: _CxxThrowException, AddVectoredExceptionHandler, ImmGetDefaultIMEWnd, MsiRecordGetStringW, RegGetValueW).
Of 221244 FUNC addresses, 14 have a non-zero parameter size, and of those, 14 also have a STACK WIN record (and 0 do not: ).
Of 1168854 STACK_WIN addresses, 221240 have a corresponding PUBLIC or FUNC record (and 947614 do not).

So there are 217 symbols for which we would lose the parameter size if we simply made dump_syms output zero.

And on this 32bit ntdll.sym, we'd lose the parameter size for 19 functions:

Of 3361 PUBLIC symbol addresses, 2677 have a non-zero parameter size, and of those, 2658 also have a STACK WIN record (and 19 do not: RtlXRestore, _seh_longjmp_unwind4, LZNT1DecompressChunk, RtlRaiseException, RtlpUnlinkHandler).
Of 0 FUNC addresses, 0 have a non-zero parameter size, and of those, 0 also have a STACK WIN record (and 0 do not: ).
Of 63079 STACK_WIN addresses, 3183 have a corresponding PUBLIC or FUNC record (and 59896 do not).

Here's the code to compute these stats, can be dropped into parser.rs in rust-minidump
fn yo(filename: &str) {
    let mut public_addresses = HashSet::new();
    let mut public_nonzeroparamsize_addresses = HashSet::new();
    let mut function_addresses = HashSet::new();
    let mut function_nonzeroparamsize_addresses = HashSet::new();
    let mut stack_win_addresses = HashSet::new();

    let mut public_names = HashMap::new();
    let mut function_names = HashMap::new();

    let file = std::fs::File::open(filename).unwrap();
    let reader = std::io::BufReader::new(file);
    use std::io::BufRead;
    for l in reader.lines() {
        let mut l = l.unwrap();
        l += "\n";
        let (_, parsed_line) = match line(l.as_bytes()) {
            Ok(l) => l,
            Err(_) => continue,
        };
        match parsed_line {
            Line::Module => {}
            Line::Info(_) => {}
            Line::File(_, _) => {}
            Line::InlineOrigin(_, _) => {}
            Line::Public(public_symbol) => {
                public_addresses.insert(public_symbol.address);
                public_names.insert(public_symbol.address, public_symbol.name);
                if public_symbol.parameter_size != 0 {
                    public_nonzeroparamsize_addresses.insert(public_symbol.address);
                }
            }
            Line::Function(function, _, _) => {
                function_addresses.insert(function.address);
                function_names.insert(function.address, function.name);
                if function.parameter_size != 0 {
                    function_nonzeroparamsize_addresses.insert(function.address);
                }
            }
            Line::StackWin(WinFrameType::Fpo(stack_win)) => {
                stack_win_addresses.insert(stack_win.address);
            }
            Line::StackWin(WinFrameType::FrameData(stack_win)) => {
                stack_win_addresses.insert(stack_win.address);
            }
            Line::StackWin(WinFrameType::Unhandled) => {}
            Line::StackCfi(_) => {}
        }
    }
    let pubnzps_and_stack = public_nonzeroparamsize_addresses
        .iter()
        .filter(|x| stack_win_addresses.contains(x))
        .count();
    let funcnzps_and_stack = function_nonzeroparamsize_addresses
        .iter()
        .filter(|x| stack_win_addresses.contains(x))
        .count();
    let stack_and_func_or_pub = stack_win_addresses
        .iter()
        .filter(|x| public_addresses.contains(x) || function_addresses.contains(x))
        .count();
    eprintln!(
        "Of {} PUBLIC symbol addresses, {} have a non-zero parameter size, and of those, {} also have a STACK WIN record (and {} do not: {}).",
        public_addresses.len(),
        public_nonzeroparamsize_addresses.len(),
        pubnzps_and_stack,
        public_nonzeroparamsize_addresses.len() - pubnzps_and_stack,
        public_names
            .iter()
            .filter_map(|(addr, name)| if !public_nonzeroparamsize_addresses.contains(addr) || stack_win_addresses.contains(addr) {
                None
            } else {
                Some(name.to_string())
            }).take(5)
            .collect::<Vec<_>>()
            .join(", ")
    );
    eprintln!(
        "Of {} FUNC addresses, {} have a non-zero parameter size, and of those, {} also have a STACK WIN record (and {} do not: {}).",
        function_addresses.len(),
        function_nonzeroparamsize_addresses.len(),
        funcnzps_and_stack,
        function_nonzeroparamsize_addresses.len() - funcnzps_and_stack,
        function_names
            .iter()
            .filter_map(|(addr, name)| if !function_nonzeroparamsize_addresses.contains(addr) || stack_win_addresses.contains(addr) {
                None
            } else {
                Some(name.to_string())
            })
            .collect::<Vec<_>>()
            .join(", ")
    );
    eprintln!(
        "Of {} STACK_WIN addresses, {} have a corresponding PUBLIC or FUNC record (and {} do not).",
        stack_win_addresses.len(),
        stack_and_func_or_pub,
        stack_win_addresses.len() - stack_and_func_or_pub
    );
}

#[test]
fn test_check_if_no_stack_win() {
    // Stuff
    yo("/Users/mstange/Downloads/ntdll-32bit.sym")
}

@mstange
Copy link
Collaborator

mstange commented Jul 4, 2022

Well, for PUBLIC symbols, the parameter size is really easy to compute: It's spelled out in the symbol name after an @ sigil. For example _AddVectoredExceptionHandler@8 or _RtlRaiseException@4.
And that symbol name is preserved by symbolic, so we can access it even if we use symbolic's pdb parsing.

I think we can do this!

@mstange
Copy link
Collaborator

mstange commented Jul 4, 2022

I'm proposing the following:

  • Formally deprecate the parameter size field on FUNC records.
  • Make dump_syms output zero for the parameter size field on FUNC records.
  • Update rust-minidump to ignore the parameter size field on FUNC records and only use the one from PUBLIC and STACK WIN records.

For PUBLIC records we can keep outputting the parameter size as before.

@mstange
Copy link
Collaborator

mstange commented Jul 4, 2022

It looks like the PUBLIC symbols without corresponding STACK WIN records are all imported symbols. For example, AddVectoredExceptionHandler in xul.dll is imported from KERNEL32.dll. The "machine code" in xul.dll for the import stub is 6 bytes long and is just a jump. It seems unlikely that we'd ever need to unwind from this symbol's address. So it's possible that we could stop emitting parameter size for PUBLIC symbols as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants