Problems with natvis support #2836

Closed
kennykerr opened this issue Feb 6, 2024 · 17 comments · Fixed by #2838
Labels
question Further information is requested

Comments

@kennykerr
Collaborator

kennykerr commented Feb 6, 2024

Originally posted by @tim-weis in #2023 (comment)

Thanks for the feedback. This has been rather insightful.

I went ahead and did some more research. The TL;DR is: The HSTRING visualizer works with CDB but not with other debuggers I tried (WinDbg, Visual Studio).

First, though, the windows::core vs. windows_core path discrepancy was down to me copy-pasting outdated expected test output from #2023. This has since been changed, and all debuggers (and visualizers) agree on windows_core as the package-relative path prefix. That turned out to be a red herring.

For reference, here's the full repro (slightly modified from my previous comment).

Cargo.toml

[package]
name = "win_nv"
version = "0.0.0"
edition = "2021"

[dependencies]
windows = { version = "0.52.0", features = [] }

.cargo/config.toml

[build]
# Request that visualizers are embedded into the PDB
rustflags = ["--cfg=windows_debugger_visualizer"]

src/main.rs

use windows::core::HSTRING;

#[inline(never)]
fn __break() {}

fn main() {
    let empty = HSTRING::new();
    println!("{empty}");

    let hstring = HSTRING::from("This is an HSTRING");
    println!("{hstring}");

    __break();
}

This is following a pattern I first discovered in Ridwan's debugger_test crate: It introduces a function (__break()) for the sole purpose of having a symbol to set a breakpoint on, providing a convenient way to insert checkpoints at which execution pauses. This works in combination with the bm command (bm *!*::__break "gu") instructing the debugger to "Go Up" ("gu") whenever the function is hit, taking us back to the scope of interest.

With the crate set up, the following command line launches straight into the CDB debugger:

cargo b && cd target\debug && "%WindowsSdkDir%Debuggers\x64\cdb.exe" -o win_nv.exe

Once in the debugger, we can set the breakpoint, run to the checkpoint and inspect the HSTRINGs:

0:000> bm *!*::__break "gu"
*** WARNING: Unable to verify checksum for win_nv.exe
  1: 00007ff6`e62d1010 @!"win_nv!win_nv::__break"
0:000> g

This is an HSTRING
win_nv!win_nv::main+0x104:
00007ff6`e62d1124 eb00            jmp     win_nv!win_nv::main+0x106 (00007ff6`e62d1126)
0:000> dx empty
empty            : "" [Type: windows_core::strings::hstring::HSTRING]
    [<Raw View>]     [Type: windows_core::strings::hstring::HSTRING]
    [len]            : 0x0 [Type: unsigned int]
0:000> dx hstring
hstring          : "This is an HSTRING" [Type: windows_core::strings::hstring::HSTRING]
    [<Raw View>]     [Type: windows_core::strings::hstring::HSTRING]
    [len]            : 0x12 [Type: unsigned int]
    [ref_count]      : 1 [Type: windows_core::imp::ref_count::RefCount]
    [flags]          : 0x0 [Type: unsigned int]
    [chars]
0:000> q

That explains why the tests are succeeding. Moving to WinDbg had some surprises, though: Setting the breakpoint the same way as in CDB behaved differently. The "gu" command string went up (at least) one stack frame more than expected. I'm not sure what's up with that, but I just replaced the command string with "pt" ("Step to Next Return") which seemingly worked. For completeness: bm *!*::__break "pt".

Once there, the debugger produced unexpected results for the HSTRING variables:

0:000> dx empty
empty                 [Type: windows_core::strings::hstring::HSTRING]
    [<Raw View>]     [Type: windows_core::strings::hstring::HSTRING]
    [len]            : Unexpected failure to dereference object
0:000> dx hstring
hstring                 [Type: windows_core::strings::hstring::HSTRING]
    [<Raw View>]     [Type: windows_core::strings::hstring::HSTRING]
    [len]            : Unexpected failure to dereference object

Can anyone please verify my observations?


Rust: 1.75.0 (stable)
CDB: cdb version 10.0.22621.382
WinDbg: Debugger client version: 1.2308.2002.0; Debugger engine version: 10.0.25921.1001
Host OS: Windows 10 19045.3930


While this is starting to feel like I'm losing my mind, here are a few more things I tried to make sure I'm looking at the same thing the debugger is:

  • Renamed the executable from windows_natvis to win_nv to prevent any sort of name clashes in .natvis lookup.
  • .nvunloadall followed by an explicit .nvload with a copy of windows.natvis from this repo.
  • Verified that the .natvis was loaded using .nvlist.
  • Enabled natvis diagnostics in Visual Studio, which didn't uncover any obvious issues.
  • Dumped all visualizers in the PDB and compared them against the respective sources (4 standard Rust visualizers plus windows.natvis from here).
  • Replaced all reserved XML tokens with XML entities (e.g., > -> &gt;) and reloaded the modified .natvis, just to be sure.
  • Probably a few other things I forget...

None of the above had any observable effect so I'm confident that windows.natvis is actually loaded and evaluated.

@MaulingMonkey

MaulingMonkey commented Feb 6, 2024

> Can anyone please verify my observations?

I cannot. Are you launching both cdb and windbg from C:\Program Files (x86)\Windows Kits\10\Debuggers\x64?

> Moving to WinDbg had some surprises, though: Setting the breakpoint the same way as in CDB behaved differently.

Some variance between debugger versions is (sadly) common though. I'm not masochistic enough to attempt to unit test WinDbg or Visual Studio however. ...yet, at least.

> The "gu" command string went up (at least) one stack frame more than expected.

No repro. The last time I encountered a similar issue, someone had enabled opt-level = "3" for their debug builds. I'm assuming you haven't hidden similar in a global/user-wide .cargo/config.toml however. That said, make sure you don't close win_nv's console window - you'll likely encounter TerminateProcess in an injected thread before your breakpoint.


rustc 1.75.0 (82e1608df 2023-12-21)
cdb version 10.0.22621.1
WinDbg 10.0.22621.1
Microsoft Windows [Version 10.0.19045.3930]


Using File > Open Executable...

image

@MaulingMonkey

MaulingMonkey commented Feb 6, 2024

Slapping #[no_mangle] on __break for easier function breakpoint resolution in Visual Studio, and putting cargo-vs to use with:

pushd C:\local\_archive\2024\win_nv
cargo vs2017
cargo vs2019
cargo vs2022
start "" vs\vs2017.sln
start "" vs\vs2019.sln
start "" vs\vs2022.sln

I was able to verify hstring's visualizer seems to be working fine for Debug|x64 builds in VS as well on my machine:

VS2017

image

VS2019

image

VS2022

image

@tim-weis
Contributor

tim-weis commented Feb 6, 2024

> I cannot. Are you launching both cdb and windbg from C:\Program Files (x86)\Windows Kits\10\Debuggers\x64?

I'm not. I was using the tool formerly called "WinDbg Preview" that now goes by the name "WinDbg" and I'm struggling to disambiguate. With WinDbg from the Debugging Tools for Windows, everything works as expected. It never crossed my mind that "WinDbg" and "WinDbg" would behave differently, so thanks for that insight, @MaulingMonkey!

Just to clarify, I did my testing using the tool that used to be called "WinDbg Preview", and things are failing with that still (and Visual Studio).

The failure cases seem to be down to these two lines:

<Intrinsic Name="header" Expression="*((windows_core::strings::hstring::Header**)this)" ReturnType="windows_core::strings::hstring::Header *" />
<Intrinsic Name="is_empty" Expression="header() == nullptr" />

Avoiding this and nullptr solved the issue in "WinDbg Preview" for me (I don't know how to control .natvis files in Visual Studio, so I haven't verified that). I also couldn't find any reference documentation that explained those tokens.
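To make the two intrinsics concrete, here is a minimal Rust sketch of what they appear to compute. It assumes the handle is a single pointer-sized field that is null for an empty string. Header, HString, header, and is_empty below are hypothetical stand-ins for illustration, not the definitions from windows-core.

```rust
use std::ptr::NonNull;

// Hypothetical stand-in for the real header; only `len` matters here.
#[repr(C)]
struct Header {
    len: u32,
}

// Hypothetical model of the handle: one pointer-sized field, null when empty.
#[repr(transparent)]
struct HString(Option<NonNull<Header>>);

/// Mirrors `*((Header**)this)`: treat the address of the handle as a pointer
/// to a header pointer, then load that header pointer.
fn header(this: &HString) -> *const Header {
    let this = this as *const HString as *const *const Header;
    // SAFETY: HString is repr(transparent) over Option<NonNull<Header>>,
    // which has the same size and alignment as a raw pointer.
    unsafe { *this }
}

/// Mirrors `header() == nullptr`.
fn is_empty(this: &HString) -> bool {
    header(this).is_null()
}

fn main() {
    let empty = HString(None);
    let mut storage = Header { len: 18 };
    let full = HString(Some(NonNull::from(&mut storage)));

    println!("empty handle is empty: {}", is_empty(&empty));
    println!(
        "full handle is empty: {}, len = {}",
        is_empty(&full),
        unsafe { (*header(&full)).len }
    );
}
```

In this model the natvis expression only works if this evaluates to the address of the handle. If a debugger binds this to something else, or does not bind it at all, both intrinsics fail together, which would be consistent with the behavior described above.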

@riverar
Collaborator

riverar commented Feb 6, 2024

image

Not seeing any issues here with WinDbg Preview, client 1.2308.2002.0 / engine 10.0.25921.1001. Be aware WinDbg Preview is moving away from Store distribution and is now updated via AppInstaller https://aka.ms/windbg/download. So if you were relying solely on the Store copy, you may be running an outdated copy.

@riverar
Collaborator

riverar commented Feb 6, 2024

Ah, I can reproduce @tim-weis's reported behavior when I introduce an empty hstring. Something is definitely funky with the evaluation of the header/is_empty intrinsics. After failure, the natvis appears to stop further evaluation until reload. What's got me scratching my head is that I don't have a windows_core::strings::hstring::Header symbol.

@tim-weis
Contributor

tim-weis commented Feb 6, 2024

Thank you, @riverar! At least it isn't just me anymore that's seeing things. The windows_core::strings::hstring::Header type should be available from the PDB. Does dt windows_core::strings::hstring::Header succeed for you?

Removing the empty HSTRING doesn't change things for me, though. It is failing either way in WinDbg Preview and Visual Studio.

@kennykerr
Collaborator Author

Seems like something @wesleywiser would know about.

@riverar
Collaborator

riverar commented Feb 8, 2024

In the case of empty HSTRINGs, the this prvalue is typed as a non-pointer (windows_core::strings::hstring::HSTRING), which results in cast failure. I added an intermediate cast to align the expression's behavior here.

@tim-weis Can you verify this works for you now?

<Intrinsic Name="header" Expression="*((windows_core::strings::hstring::Header**)(uintptr_t)this)" ReturnType="windows_core::strings::hstring::Header *" />

@riverar
Collaborator

riverar commented Feb 8, 2024

Oh no, I'm discovering WinDbg, Visual Studio, and others evaluate slightly differently.

@tim-weis
Contributor

tim-weis commented Feb 8, 2024

@riverar This change has no effect for me in WinDbg Preview. The only way for me to get the visualizer to work in WinDbg Preview is with this header expression:

<Intrinsic Name="header" Expression="(windows_core::strings::hstring::Header*)__0.tag" />

It appears as though this just isn't defined for me (in WinDbg Preview and Visual Studio). A dx this is met with an Error: Unable to bind name 'this' (see footnote 1). I have no idea where this symbol is (intended to be) defined. It's neither listed under the debugger intrinsics nor the pseudo variables. But it appears to be at the core of the issue I'm seeing.
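One plausible reading of why __0.tag can stand in for the header pointer, sketched in Rust: if the handle's only field (__0) is an optional non-null pointer, the "no value" case is stored as a null pointer in the same slot, so the discriminant and the pointer value coincide. That the debug info exposes this slot under the name tag is an assumption drawn from the expression above. Header and raw_bits are hypothetical names used only for illustration.

```rust
use std::mem::{size_of, transmute};
use std::ptr::NonNull;

// Hypothetical stand-in for the real header type.
struct Header {
    _len: u32,
}

/// Returns the raw pointer-sized value stored in the optional pointer.
fn raw_bits(field: Option<NonNull<Header>>) -> usize {
    // SAFETY: Option<NonNull<T>> has the same size and alignment as *mut T,
    // with None represented as the null pointer.
    let ptr = unsafe { transmute::<Option<NonNull<Header>>, *mut Header>(field) };
    ptr as usize
}

fn main() {
    // The optional pointer needs no extra storage for a discriminant.
    assert_eq!(size_of::<Option<NonNull<Header>>>(), size_of::<*mut Header>());

    let mut storage = Header { _len: 18 };
    let some = Some(NonNull::from(&mut storage));

    println!("None -> {:#x}", raw_bits(None));
    println!("Some -> {:#x}", raw_bits(some));
}
```

Under these assumptions, casting the stored value to a header pointer needs neither this nor nullptr, at the price of depending on the field layout of the handle.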

Just to make sure we aren't talking past each other: The issue you (Rafael) are looking at and the issue I am observing are quite possibly distinct. While you are discovering (how, by the way?) that the expression this in a visualizer evaluates to different things across debuggers, I cannot seem to be using this as an expression altogether (in WinDbg Preview and Visual Studio).

Apparently, there's something peculiar about my specific environment.

Footnotes

  1. This is failing in CDB (where the visualizer otherwise works for me) in the same way, so this seems to be a peculiarity of the natvis infrastructure rather than the debugger engine.

@kennykerr
Collaborator Author

For reference, here's what C++/WinRT does:

https://github.com/microsoft/cppwinrt/blob/master/natvis/cppwinrt.natvis#L56-L63

May be worth considering something similar to avoid these complications.

@MaulingMonkey

> I was using the tool formerly called "WinDbg Preview" that now goes by the name "WinDbg" and I'm struggling to disambiguate.

Ack. I've encountered this with PIX previously, which I disambiguated with the terms:

  • DXSDK PIX (e.g. installed as part of the DirectX SDK which last released June 2010)
  • VS PIX (e.g. built into Visual Studio and launched via Debug > Graphics > Start Graphics Debugging)
  • D3D12 PIX ("PIX on Windows", a modern remake/overhaul focused on D3D12+ debugging)
  • XB360 PIX (Xbox 360 PIX)
  • XB1 PIX (Xbox One PIX)

I think for myself I'll start calling these different versions of WinDbg:

> It never crossed my mind that "WinDbg" and "WinDbg" would behave differently, so thanks for that insight, @MaulingMonkey!

It's pretty horrifying! 👍 A previous rabbit hole of CDB version specific debugging: rust-lang/rust#76352 (and that was without an overhaul/rewrite between CDB versions!)

> I also couldn't find any reference documentation that explained those tokens.

Both are C++ keywords (this, nullptr). I'm a little surprised to hear of them not working in Visual Studio, I'm less surprised that they might cause problems in "WinDbg Preview" / UWP WinDbg. That they work for some of us but not all of us... ack. I have vague recollections of similar problems with this in the past, though, so don't let me gaslight you into thinking your install is broken somehow. Or unusually broken, at least 😄.

> __0.tag

While this somewhat undermines a goal of #2077 ("This allows for changes to the HSTRING data structure without breaking the Natvis."), I'd still embrace this change - debug visualizers are sadly brittle no matter what you do IME, so I'd prioritize doing the dumb simple thing that works over cleverness, and rely on CI unit tests to catch the inevitable breakage.

@riverar
Collaborator

riverar commented Feb 8, 2024

> It appears as though this just isn't defined for me (in WinDbg Preview and Visual Studio). [...] It's neither listed under the debugger intrinsics nor the pseudo variables. But it appears to be at the core of the issue I'm seeing.

this and nullptr are C++ keywords and appear to be supported across all the C++ expression evaluators I've tested (VSpre, WinDbg, WinDbg Preview). this is bound when possible, such as in natvis intrinsic expressions or in the debugger when working with C++ class member functions. (As an example of the latter, launch/attach to charmap.exe, bp CharMap!CDropSource::AddRef, g, click around, gu, then dx this should work.)

Here's a simpler type block that should evaluate in VS and WinDbg for you that demonstrates this usage. (If not, that's very strange, we'll have to diagnose that one separately!)

<Type Name="windows_core::strings::hstring::HSTRING">
  <DisplayString>hello {(void*)this}</DisplayString>
</Type>

I believe there are several issues here commingling and making a mess. Please correct me if I'm wrong!

  1. Tim might be seeing abnormal expression behavior in some of his clients; further investigation needed, maybe differences across dbgeng.dll versions?
  2. Expression engines seem to be handling cast/safety differently, which makes it tricky to deal with (observed) this coming in with one of two types:
    • non-pointer type windows_core::strings::hstring::HSTRING (e.g. HSTRING::new();)
    • pointer type windows_core::strings::hstring::HSTRING* (e.g., h!("Hello."))
  3. Symbol-embedded natvis doesn't appear to be working at all; further investigation needed

@kennykerr
Collaborator Author

I'm not familiar with the natvis format, or its various dialects and variants, but I did notice that the one for the Rust standard library types doesn't use "this" at all.

https://github.com/rust-lang/rust/blob/master/src/etc/natvis/libstd.natvis

@riverar
Collaborator

riverar commented Feb 9, 2024

PR submitted w/ tweaked natvis. Works across VSpre, CDB, and WinDbg Preview. I've also verified it loads correctly when embedded in symbols.

@MaulingMonkey

MaulingMonkey commented Feb 11, 2024

A compounding factor: when debugging *.natvis files on an unrelated project, after enabling Natvis diagnostic errors, I noticed that incremental Rust builds accumulate the natvis files referenced by #![debugger_visualizer(natvis_file = "...")] rather than replacing them. This means you may have stale natvis files taking priority over your new natvis files:

Natvis: d3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6): Warning: Conflicting <Type> entries detected for type 'local_path_dependency_not_in_workspace::TYPE' at 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' and 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)'.  The <Type> entry at 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' will have priority.
Natvis: d3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6): Warning: Conflicting <Type> entries detected for type 'local_path_dependency_not_in_workspace::TYPE' at 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' and 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)'.  The <Type> entry at 'd3d9create-0.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' will have priority.
Natvis: d3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6): Warning: Conflicting <Type> entries detected for type 'this_crate::Type1' at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' and 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)'.  The <Type> entry at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' will have priority.
Natvis: d3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6): Warning: Conflicting <Type> entries detected for type 'this_crate::Type2' at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)' and 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)'.  The <Type> entry at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)' will have priority.
Natvis: d3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6): Warning: Conflicting <Type> entries detected for type 'this_crate::Type1' at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' and 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)'.  The <Type> entry at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(4,6)' will have priority.
Natvis: d3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6): Warning: Conflicting <Type> entries detected for type 'this_crate::Type2' at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)' and 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)'.  The <Type> entry at 'd3d9create-1.natvis (from C:\local\...\target\x86_64-pc-windows-msvc\debug\examples\d3d9create.pdb)(9,6)' will have priority.

This does not appear to happen when using natvis-pdbs (which simply passes /NATVIS:... to link.exe when creating a final .exe or .dll), so this is presumably a bug in rustc (and not windows nor link.exe.)

EDIT: reported upstream: rust-lang/rust#120913
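For anyone who wants to check for this kind of conflict without Visual Studio's Natvis diagnostics, here is a rough Rust sketch. It assumes the visualizers embedded in the PDB have already been dumped to text, and it scans them for repeated Type names. The helper names type_names and conflicting_types are hypothetical, and the scan is naive string matching rather than real XML parsing.

```rust
use std::collections::BTreeMap;

/// Collects the `Name` attribute of every `<Type Name="...">` element.
/// Naive text scan: assumes `Name` is the first attribute, as in the logs above.
fn type_names(natvis: &str) -> Vec<String> {
    const OPEN: &str = "<Type Name=\"";
    let mut names = Vec::new();
    let mut rest = natvis;
    while let Some(start) = rest.find(OPEN) {
        rest = &rest[start + OPEN.len()..];
        match rest.find('"') {
            Some(end) => {
                names.push(rest[..end].to_string());
                rest = &rest[end..];
            }
            None => break,
        }
    }
    names
}

/// Returns the type names that occur more than once across all inputs.
fn conflicting_types(files: &[&str]) -> Vec<String> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for file in files {
        for name in type_names(file) {
            *counts.entry(name).or_insert(0) += 1;
        }
    }
    counts
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(name, _)| name)
        .collect()
}

fn main() {
    // Two made-up visualizer dumps: a stale one and its replacement.
    let stale = r#"<Type Name="this_crate::Type1"><DisplayString>old</DisplayString></Type>"#;
    let fresh = r#"<Type Name="this_crate::Type1"><DisplayString>new</DisplayString></Type>
<Type Name="this_crate::Type2"><DisplayString>new</DisplayString></Type>"#;

    for name in conflicting_types(&[stale, fresh]) {
        println!("conflicting <Type> entry: {name}");
    }
}
```

With the two sample inputs only this_crate::Type1 is reported, since it is the only type defined in both dumps.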

@riverar
Collaborator

riverar commented Feb 11, 2024

Nice catch, I feel bad for @tim-weis. He was probably hit by every single bug/quirk we've documented in this thread here so far 😂


4 participants