Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gdbstub_arch: Add support for AArch64 #109

Merged
merged 2 commits into from Aug 17, 2022

Conversation

ptosi
Copy link
Contributor

@ptosi ptosi commented Jul 25, 2022

Description

Implement gdbstub::arch::Arch for the 64-bit mode of the ARMv8-A architecture.

The noticeable difference between AArch64 and other supported architectures (incl. RISC-V, AFAICT) is the very high number of system registers (theoretically up to ~64k, currently ~800) that the architecture provides. When debugging baremetal code, having access to those registers through the debugger is essential.

To keep the amount of compiled code low, we represent all registers through a RegId-compliant enum where system registers contain their internal u16 "opcode" (this is the value used by the ISA to encode the register into an instruction) and use the regnum XML value to teach the GDB client about those u16 opcodes i.e. when the client sends a 'p' or 'P' packet to gdbstub, .from_raw_id() is trivial as it simply needs to wrap the regnum in the corresponding Rust enum. This provides a future-proof system with no risk of clash with core registers (as the architecture guarantees that the opcode's top bit will always be 1). We also provide compile-time const instances of the System variant to greatly improve readability of gdbstub's client; although impressive in terms of LoC, these constant values should be 0-cost once compiled.

The list of registers (code and XML entries) was auto-generated from official ARM register descriptions.

Note that the XML itself represents a large increase in .rodata size; a more dynamic approach would involve the target figuring out, for each register, if it is supported. This would improve the experience of the GDB client as only relevant registers would be reported through info registers but would prevent us from encapsulating the "regnum hack" described above in gdbstub_arch as the Target (or similar) would then be responsible for transmitting the XML (e.g. by implementing TargetDescriptionXmlOverride), would increase start-up time, and client code would need to make sure that .from_raw_id() doesn't break (although providing a .to_raw_id() would probably be enough).

It would also most likely duplicate code between gdbstub clients (the library should be where such code is centralized) and make gdbstub harder to use on AArch64, compared to other architectures. A simple way to reduce the size of the XML as it is here, if wanted, could be to group system registers in logical groups (e.g. those accessible from EL1, EL2 only, EL3 only, debug registers, ...) and include_str! based on a cfg but that would add architecture-specific features to the crate, which we might not want(?)

API Stability

  • This PR does not require a breaking API change

Note that AArch64RegId exposes a .len() method publicly, which can be useful to client code as AArch64 registers can be of different sizes.

Checklist

  • Documentation
    • Ensured any public-facing rustdoc formatting looks good (via cargo doc) -> how for gdbstub_arch?
  • Validation
    • Included output of running examples/armv4t with RUST_LOG=trace + any relevant GDB output under the "Validation" section below -> not relevant
    • Included output of running ./example_no_std/check_size.sh before/after changes under the "Validation" section below -> not relevant
  • If upstreaming an Arch implementation
    • I have tested this code in my project, and to the best of my knowledge, it is working as intended (see crosvm CLs).

Validation

GDB output

On a AArch64 Target:

(gdb) info registers 
x0             0x80000000          2147483648
x1             0x0                 0
x2             0x0                 0
x3             0x0                 0
x4             0x0                 0
x5             0x0                 0
x6             0x0                 0
x7             0x0                 0
x8             0x0                 0
x9             0x0                 0
x10            0x0                 0
x11            0x0                 0
x12            0x0                 0
x13            0x0                 0
x14            0x0                 0
x15            0x0                 0
x16            0x0                 0
x17            0x0                 0
x18            0x0                 0
x19            0x0                 0
x20            0x0                 0
x21            0x0                 0
x22            0x0                 0
x23            0x0                 0
x24            0x0                 0
x25            0x0                 0
x26            0x0                 0
x27            0x0                 0
x28            0x0                 0
x29            0x0                 0
x30            0x0                 0
sp             0x0                 0x0
pc             0x80200000          0x80200000
cpsr           0x3c5               [ SP EL=2 F I A D BTYPE=1 ]
fpsr           0x0                 [ ]
fpcr           0x0                 [ RMode=0 ]
OSDTRRX_EL1    <unavailable>
--Type <RET> for more, q to quit, c to continue without paging--
armv4t output

Necessary?

Before/After `./example_no_std/check_size.sh` output

No difference, as expected:

Before

File  .text    Size             Crate Name
2.9%  72.7% 10.9KiB         [Unknown] main
0.2%   4.6%    706B           gdbstub gdbstub::stub::state_machine::GdbStubStateMachineInner<gdbstub::stub::state_machine::state::Running,T,C>::report_stop
0.1%   2.0%    310B           gdbstub gdbstub::protocol::commands::breakpoint::BasicBreakpoint::from_slice
0.1%   1.7%    268B           gdbstub gdbstub::protocol::common::hex::decode_hex_buf
0.1%   1.7%    259B           gdbstub <gdbstub::protocol::common::thread_id::ThreadId as core::convert::TryFrom<&[u8]>>::try_from
0.1%   1.6%    253B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_specific_thread_id
0.1%   1.6%    249B           gdbstub gdbstub::stub::core_impl::resume::<impl gdbstub::stub::core_impl::GdbStubImpl<T,C>>::write_stop_common
0.1%   1.3%    207B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write
0.0%   0.8%    130B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::inner_write
0.0%   0.8%    127B           gdbstub gdbstub::protocol::common::hex::decode_hex
0.0%   0.7%    114B           gdbstub gdbstub::protocol::common::hex::decode_hex
0.0%   0.7%    111B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::flush
0.0%   0.7%    104B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_num
0.0%   0.7%    103B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_num
0.0%   0.7%    102B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::read_addrs
0.0%   0.6%     96B           gdbstub <gdbstub::protocol::common::thread_id::IdKind as core::convert::TryFrom<&[u8]>>::try_from
0.0%   0.6%     93B         [Unknown] __libc_csu_init
0.0%   0.6%     88B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_hex
0.0%   0.5%     79B           gdbstub <usize as gdbstub::internal::be_bytes::BeBytes>::from_be_bytes
0.0%   0.5%     77B           gdbstub <u32 as gdbstub::internal::be_bytes::BeBytes>::from_be_bytes
0.0%   0.4%     65B      gdbstub_arch <gdbstub_arch::arm::reg::arm_core::ArmCoreRegs as gdbstub::arch::Registers>::gdb_deserialize
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::write_addrs
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::write_registers
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::read_registers
0.0%   0.4%     57B compiler_builtins __rust_probestack
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::set_resume_action_continue
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::clear_resume_actions
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::resume
0.0%   0.3%     43B         [Unknown] _start
0.0%   0.0%      6B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::breakpoints::SwBreakpoint>::remove_sw_breakpoint
0.0%   0.0%      6B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::breakpoints::SwBreakpoint>::add_sw_breakpoint
0.0%   0.0%      1B         [Unknown] __libc_csu_fini
0.0%   0.0%      0B                   And 0 smaller methods. Use -n N to show more.
4.1% 100.0% 15.0KiB                   .text section size, the file size is 369.6KiB
target/release/gdbstub-nostd  :
section               size    addr
.interp                 28     792
.note.gnu.property      32     824
.note.gnu.build-id      36     856
.note.ABI-tag           32     892
.gnu.hash               36     928
.dynsym                360     968
.dynstr                193    1328
.gnu.version            30    1522
.gnu.version_r          48    1552
.rela.dyn              408    1600
.init                   23    4096
.plt                    16    4128
.plt.got                 8    4144
.text                15345    4160
.fini                    9   19508
.rodata                914   20480
.eh_frame_hdr          284   21396
.eh_frame             1472   21680
.init_array              8   28072
.fini_array              8   28080
.dynamic               448   28088
.got                   136   28536
.data                    8   28672
.bss                     8   28680
.comment                30       0
Total                19920

After

File  .text    Size             Crate Name
2.9%  72.7% 10.9KiB         [Unknown] main
0.2%   4.6%    706B           gdbstub gdbstub::stub::state_machine::GdbStubStateMachineInner<gdbstub::stub::state_machine::state::Running,T,C>::report_stop
0.1%   2.0%    310B           gdbstub gdbstub::protocol::commands::breakpoint::BasicBreakpoint::from_slice
0.1%   1.7%    268B           gdbstub gdbstub::protocol::common::hex::decode_hex_buf
0.1%   1.7%    259B           gdbstub <gdbstub::protocol::common::thread_id::ThreadId as core::convert::TryFrom<&[u8]>>::try_from
0.1%   1.6%    253B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_specific_thread_id
0.1%   1.6%    249B           gdbstub gdbstub::stub::core_impl::resume::<impl gdbstub::stub::core_impl::GdbStubImpl<T,C>>::write_stop_common
0.1%   1.3%    207B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write
0.0%   0.8%    130B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::inner_write
0.0%   0.8%    127B           gdbstub gdbstub::protocol::common::hex::decode_hex
0.0%   0.7%    114B           gdbstub gdbstub::protocol::common::hex::decode_hex
0.0%   0.7%    111B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::flush
0.0%   0.7%    104B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_num
0.0%   0.7%    103B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_num
0.0%   0.7%    102B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::read_addrs
0.0%   0.6%     96B           gdbstub <gdbstub::protocol::common::thread_id::IdKind as core::convert::TryFrom<&[u8]>>::try_from
0.0%   0.6%     93B         [Unknown] __libc_csu_init
0.0%   0.6%     88B           gdbstub gdbstub::protocol::response_writer::ResponseWriter<C>::write_hex
0.0%   0.5%     79B           gdbstub <usize as gdbstub::internal::be_bytes::BeBytes>::from_be_bytes
0.0%   0.5%     77B           gdbstub <u32 as gdbstub::internal::be_bytes::BeBytes>::from_be_bytes
0.0%   0.4%     65B      gdbstub_arch <gdbstub_arch::arm::reg::arm_core::ArmCoreRegs as gdbstub::arch::Registers>::gdb_deserialize
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::write_addrs
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::write_registers
0.0%   0.4%     65B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadBase>::read_registers
0.0%   0.4%     57B compiler_builtins __rust_probestack
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::set_resume_action_continue
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::clear_resume_actions
0.0%   0.3%     50B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::base::multithread::MultiThreadResume>::resume
0.0%   0.3%     43B         [Unknown] _start
0.0%   0.0%      6B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::breakpoints::SwBreakpoint>::remove_sw_breakpoint
0.0%   0.0%      6B    gdbstub_nostd? <gdbstub_nostd::gdb::DummyTarget as gdbstub::target::ext::breakpoints::SwBreakpoint>::add_sw_breakpoint
0.0%   0.0%      1B         [Unknown] __libc_csu_fini
0.0%   0.0%      0B                   And 0 smaller methods. Use -n N to show more.
4.1% 100.0% 15.0KiB                   .text section size, the file size is 369.6KiB
target/release/gdbstub-nostd  :
section               size    addr
.interp                 28     792
.note.gnu.property      32     824
.note.gnu.build-id      36     856
.note.ABI-tag           32     892
.gnu.hash               36     928
.dynsym                360     968
.dynstr                193    1328
.gnu.version            30    1522
.gnu.version_r          48    1552
.rela.dyn              408    1600
.init                   23    4096
.plt                    16    4128
.plt.got                 8    4144
.text                15345    4160
.fini                    9   19508
.rodata                914   20480
.eh_frame_hdr          284   21396
.eh_frame             1472   21680
.init_array              8   28072
.fini_array              8   28080
.dynamic               448   28088
.got                   136   28536
.data                    8   28672
.bss                     8   28680
.comment                30       0
Total                19920

@daniel5151 daniel5151 self-requested a review July 25, 2022 19:00
@daniel5151
Copy link
Owner

Thank you for the contribution!

This is certainly one of the more interesting Arch implementations I've seen, and I'm looking forward to merging it in :)


wrt. clippy and rustfmt: I think it's totally fine to slap a #[rustfmt::ignore] on the large constants array, and use #[allow(clippy::unusual_byte_groupings)] when appropriate.

wrt. some of the validation: I will continue to iterate on the specific wording on the PR template, and make it clear that in the case of adding a new Arch implementation, armv4t + check_size validation is not required.


Your analysis regarding target.xml is spot on, and it seems you've got a good handle on the different challenges + solutions to supporting target-specific subsets of system registers. And indeed, this issue falls under the umbrella of #12, which is currently the longest-standing unsolved issue in gdbstub.

One (currently unimplemented) feature that could be particularly useful for AArch64 is <xi:include>, which I describe in #12 (comment). I haven't fully fleshed out how this would work at the Arch/Target level, but the beauty of <xi:include> is that it would enable mixed static + dynamic target.xml reporting, where the "base" AArch64 implementation can embed core target XML as static .rodata, while deferring the dynamic system-register portion to a different handler (one that either generates it on-the-fly, or returns a pre-baked target-specific XML file).

A rough sketch of how this might work at the Arch level (for demonstrative purposes - not necessarily a final design I'd want to see implemented):

trait Sysregs {
    fn xml() -> &'static str {}
}

enum AArch64NoSysregs {}
impl Sysregs for AArch64NoSysregs {
    fn xml() -> &'static str { "" }
}


enum AArch64<S: Sysregs = AArch64NoSysregs> {}

impl<S: Sysregs> Arch for AArch64<S> {
    // support non-`target.xml` annexes
    fn target_description_xml(annex: &[u8]) -> Option<&'static str> {
        match annex {
            b"target.xml" => { /* return static string which includes a <xi:include href="sysregs.xml"/> tag */ }
            b"sysregs.xml" => { S::xml() }    
        }
    }
}

You could imagine extending Sysregs to also tie in to type RegId and/or type Registers, and if you go even further, it might be possible to define a macro (possibly proc macro?) to streamline the definition of this type + ensure the XML matches the type defn.


...of course, as exciting as all this sounds, this would require quite a bit of core gdbstub work, along with a healthy amount of design and implementation back and forth, along with an API breaking 0.7 release. I would be more than happy to embark on this journey with you, but if you're not up for such a hands-on contribution, that's totally understandable and fine with me :)

So, with that being said, assuming we want to land this in 0.6 (which I think we should), I see a few different options here:

  • Merge as-is, and report all system registers as part of the built-in target.xml
  • Don't report any system registers as part of the built-in target.xml, and include some rustdocs that live alongside AArch64 explaining why this is the case + how to support system registers (e.g: requiring users to maintain their own Arch implementation, using TargetXmlOverride, etc...)
  • Use some [proc]macro magic to make it easy for users to define a custom-tailored version of the AArch64 arch that suits their needs

(note that I would not want to introduce arch-specific compile time feature flags to gdbstub_arch)

I'm OK with any of those 3 options, so it really just depends what you're up for.

Note that regardless which option we go for, you'll need to beef-up the docs on the AArch64 type / module to discuss some of the nuance in supporting the large number of system registers. Or, in other words: most of the stuff in the PR description + some stuff from this follow-up comment should become part of the docs.

@ptosi
Copy link
Contributor Author

ptosi commented Jul 28, 2022

One (currently unimplemented) feature that could be particularly useful for AArch64 is xi:include, which I describe in #12 (comment).

That's a great idea: ship all of the XML in the Arch implementation and let the Target tell the compiler which parts to carve out by actually using them (on not) in its <xi:include> request handler! No need for compile time arch-specific Arch features but still pay-for-what-you-use .rodata size.

(note that I would not want to introduce arch-specific compile time feature flags to gdbstub_arch)

+1

I'm OK with any of those 3 options, so it really just depends what you're up for.

Okay, let me suggest a fourth one, then!

I'm not sure I follow your code example so please let me know if I'm missing something but instead of having a per-sysreg granularity (which I assume from your example), what do you think of introducing an Arch-specific GDB feature enum that would obviously cover the Standard Target Features but also our gdbstub-standardized features (debug, EL3, EL2, arch revisions, ...).

We could then have one <xi:include> "document" name per feature and implement an abstraction where gdbstub common code knows the mapping document name (from the GDB client) -> feature enum variant. The Target would then have an XML-producing method receiving that variant and would match on it: if it wants it included, it "returns" the XML that the Arch ships for that feature, otherwise, it returns nothing ("_ => None") and the compiler (hopefully!) figures out that we're not interesting in compiling the XML for that feature into the final binary. The Target then ships its own XML containing only the appropriate <xi:include>.

All of the XML-handling code remains abstracted away in Arch and gdbstub, the runtime overhead of packet exchange is negligible, AArch64 is flexible from the PoV of the Target but still straightforward, the .rodata size is minimized, and we have a neat way to support architecture extensions (because this isn't ARMv4T, it's going to evolve! 😄) while having a design that is coherent with the Target/Arch architecture of gdbstub and the XML terminology of GDB. Note that, regarding per-sysreg control, the Target can very well advertise a feature of which it doesn't support all registers by relying on #107 (yes, it would still ship the XML for unsupported registers but features should be small enough for it not to matter); what do you think?

BTW, would this necessarily need to break the "API" (e.g. by extending the Arch trait to have a Arch::Feature type) or do you see a way to keep the code backward-compatible?

it might be possible to define a macro (possibly proc macro?) to streamline the definition of this type + ensure the XML matches the type defn.

Making rustc generate the XML is indeed the ultimate goal, I think it's doable to get the result I've committed today but a further improvement I had in mind was to also add proper type description to the XML, which might be too challenging and/or verbose to also generate at compile-time (I haven't really thought about it yet, suggestion welcome!).

Ideally, the macro should be architecture-agnostic.

Note that regardless which option we go for, you'll need to beef-up the docs on the AArch64 type / module to discuss some of the nuance in supporting the large number of system registers.

Will do!

@daniel5151
Copy link
Owner

daniel5151 commented Jul 28, 2022

Okay, let me suggest a fourth one, then!

I read through your suggestion, and yeah, I think that'd work! Quite elegantly at that, I might add!

There is one thing you didn't touch on in your proposal that merits some thought: how this would affect type Registers?

After all, that type is supposed to represent all possible (core) register types a Arch exposes. Right now, this is generally modeled as a struct, with all supported feature registers are lumped into one big object which gets filled out in one pass. I suppose in this model, we'd want to tweak the type to be enum based, where the read/write_registers API get invoked multiple times when reporting different supported subsets of registers?

BTW, would this necessarily need to break the "API"

Until we have some kind of implementation on the table, it's hard to judge whether something requires a breaking API change. That said... my gut feeling is that any contortions we might do to avoid a breaking change would be overly complicated vs. just biting the bullet and tweaking APIs directly.

FWIW, releasing 0.7 isn't a huge deal, as solving #12 is as good of an excuse as any for a breaking release! Not to mention there are already a trail of small API papercuts that I wouldn't mind closing out in 0.7 as well :)

Making rustc generate the XML is indeed the ultimate goal

For clarity's sake, what I was proposing was a more moderate approach of rustc slicing/dicing existing XML together, rather than generating it from scratch.

That said, it would be cool to have some kind of comptime and/or runtime XML generator, as this would get us closer to unifying classic LLDB register definitions with target.xml (#103), but this is very much in the realm of long-term "nice to have".


With all that said, you didn't actually give me a straight answer for what you wanted to do in the short term with this PR 😅

Would you like to land things as-is (using one of those options I mentioned above), or hold-off until we have some kind of dynamic target feature selection story?

@ptosi
Copy link
Contributor Author

ptosi commented Aug 2, 2022

how this would affect type Registers?

From my basic understanding of the GDB protocol (you surely know more than I do), I had assumed that Registers ({read,write}_registers) represented the content of GDB's 'g' packets. On AArch64 those registers are mandatory (see org.gnu.gdb.aarch64.core) and any other register is accessed through 'p' packets ({read,write}_register and its regnum-wrapping RegId).

In this context (which might not hold for other architectures), I don't see the point of extending {read,write}_registers to enum variants representing non-aarch64.core GDB features: what GDB transaction would result in those being called and why should those particular registers be returned grouped based on our arbitrary features? If a GDB client wants to read some non-core registers, what's preventing it from successively issuing the corresponding 'p' packets?

From my PoV, I believe that the feature-based split shouldn't transpire beyond the gdbstub-GDB layer and shouldn't reach the (gdb) prompt; why would the end user be interested in knowing what feature contained which register or which registers where grouped under the same feature?

So, to answer your questions: it shouldn't if my assumptions about AArch64 are valid across architectures.

That said... my gut feeling is that any contortions we might do to avoid a breaking change would be overly complicated vs. just biting the bullet and tweaking APIs directly.

Yes, I think that's the right way to look at it.

Not to mention there are already a trail of small API papercuts that I wouldn't mind closing out in 0.7 as well :)

I can tell: https://github.com/daniel5151/gdbstub/milestone/3 :)

That said, it would be cool to have some kind of comptime and/or runtime XML generator

Yes, that's what I had in mind, similar to what thiserror provides. I've never used procedural macros so I might be underevaluating the difficulty of implementing this (and whether it'd be overkill) but we could derive everything from that one definition: register size (and the .len() method), feature-grouping, the XML, and the const values.

Would you like to land things as-is (using one of those options I mentioned above), or hold-off until we have some kind of dynamic target feature selection story?

Right, I'd probably like to land things as-is so that you can release a new (minor) version of the crate and I get unblocked on the crosvm side. Regarding the 3 options, I'd say we ship all the registers given that we currently don't have other users and in particular, have yet to see "embedded" integrations of AArch64 gdbstub that would care about the XML size ... Let's also open an issue to track implementing the GDB-feature-as-an-enum idea and another one for automatic generation of the XML (which should really be architecture-agnostic and specialized for AArch64 to implement the regnum-based mapping).

@daniel5151
Copy link
Owner

daniel5151 commented Aug 3, 2022

In this context (which might not hold for other architectures) [...]

Ahh, yeah, unfortunately, this doesn't hold for other architectures.

The contents of the g/G packet isn't related to "core" vs. non-core registers at all - rather, a target may send back however many registers it wants, and GDB will map said data onto registers based on their number value (as determined in the target.xml + the "regnum" attribute).

In practice, this means that many targets send their core registers along with their FPU registers as part of the g/G packets, leaving high-numbered sysregs to the purview of the p/P packets.

After all, while your use case doesn't benefit from sending FPU state back and forth on each bulk register transfer, that's not strictly true in all cases. Indeed, check out some of the Arch impls for mips, and you'll find that to work around current limitations in gdbstub, there is an entirely separate Arch implementation for MIPS with a DSP, which uses a different Registers type (that includes both the core regs and the DSP regs).

EDIT: I re-read my initial comment, and I can see where the confusion was! When I said "core", I was using it in the sense of "registers that aren't high-regnum sysregs", rather than the formal "core" feature in the target.xml. My bad - I should've been using more specific terminology.

If a GDB client wants to read some non-core registers, what's preventing it from successively issuing the corresponding 'p' packets?

Nothing, aside from perf :)

Way back in the day, the g/G packets were the only way to communicate register state to/from GDB, and are still part of the required protocol. That said, in 2022, my read (as a non-GDB dev) is that the g/G packets are moreso an "optimization", whereby you theoretically could do everything using only p/P packets, but doing so would require a roundtrip for every single packet, and this would make network debugging unbearable.

(aside: it'd be great if gdbstub had an API to tap into the "return register values as part of the Txx response reason", as this is another one of those features that'd make it easier to cut down on single-register-access roundtrips).

And really, this touches on an interesting point: the decision as to which registers should be sent as part of the bulk g/G packets vs. the excess p/P packets really ought to be a Target-specific implementation option. My "use an enum based Register type + support multiple {read,write)_registers invocations + have a target-specific metadata method to report which enum variants should be collected/reported" idea is one way to solve this problem.

Right, I'd probably like to land things as-is so that you can release a new (minor) version of the crate and I get unblocked on the crosvm side.

Sure, sounds good. Seems CI is failing right now, but if you can resolve those, we can go ahead and merge it in + push out a release :)

Let's also open an issue to track implementing the GDB-feature-as-an-enum idea and another one for automatic generation of the XML (which should really be architecture-agnostic and specialized for AArch64 to implement the regnum-based mapping).

Yeah, sounds good!

@michael2012z
Copy link

Right, I'd probably like to land things as-is so that you can release a new (minor) version of the crate and I get unblocked on the crosvm side. Regarding the 3 options, I'd say we ship all the registers given that we currently don't have other users and in particular, have yet to see "embedded" integrations of AArch64 gdbstub that would care about the XML size ... Let's also open an issue to track implementing the GDB-feature-as-an-enum idea and another one for automatic generation of the XML (which should really be architecture-agnostic and specialized for AArch64 to implement the regnum-based mapping).

Besides crosvm, Cloud Hypervisor will also benefit from this work. :)

Hi, I am working on the AArch64 GDB support of CH: cloud-hypervisor/cloud-hypervisor#4355. And I kept monitoring this PR and the knowledgeable discussion.

It's nice to know that the first-stage implementation was agreed.

@ptosi
Copy link
Contributor Author

ptosi commented Aug 11, 2022

rather, a target may send back however many registers it wants, and GDB will map said data onto registers based on their number value (as determined in the target.xml + the "regnum" attribute).

Thanks, I now understand that my misunderstanding came from my assumption that the GDB client somehow knew about "important registers" and was programmed to expect them from g packets but that's absolutely not the case (see arm_register_g_packet_guesses), as you explained: the GDB client receives a blank canvas and GDB simply gobbles up all the bytes it receives back by following the regnum+bitsize it got from the XML (which, in a way, is quite a neat architecture-agnostic implementation) and the "structure" of the exchange is actually only imposed by gdbstub (Arch::Registers)!

After all, while your use case doesn't benefit from sending FPU state back and forth on each bulk register transfer, that's not strictly true in all cases.

Absolutely! It should be the exact opposite of what I implemented, actually: FP&SIMD registers should be part of g packets (note that AArch64 mandates FP support!) but this shows the limits of the "regnum hack" presented above as, unlike the Vx registers, FPSR and FPCR (also part of the standard org.gnu.gdb.aarch64.fpu target feature) are "regular" system registers but using their opcode-regnum would prevent them from being part of g packets ... Having said that, I think it would make sense to consider those as a special case and support both the opcode-based regnum and the "counter" regnum.

And really, this touches on an interesting point: the decision as to which registers should be sent as part of the bulk g/G packets vs. the excess p/P packets really ought to be a Target-specific implementation option.

Agreed, for example in the case of crosvm/CH, we'd rather cut that to the minimum we can get away with as we need to issue an ioctl per register (KVM_SET_ONE_REG/KVM_GET_ONE_REG) so, if the client isn't interested in the FPU registers, they represent an unnecessary increased latency in servicing those packets (although I'm measuring ~100us per register so that's really not too bad).

But I don't see a way to control that at run-time or even in the Target implementation: if the architecture mandates supporting FP, how can we tell beforehand if it's going to be used by the debugged code? This is more complex than the MIPS situation (which AFAIU knows at Target-compile-time whether the FPU and its registers are available).

Seems CI is failing right now, but if you can resolve those, we can go ahead and merge it in + push out a release :)

Right, I've silenced clippy::unusual_byte_groupings and applied the line wrapping from cargo fmt.

I have pushed a few fixup! commits, which I intend to squash or drop depending on whether you want to keep them (I don't have a strong opinion), here's what they do:

  • updates the XML, AArch64CoreRegs, and AArch64RegId::from_raw_id to add FPU registers to g/G packets;
  • reworks .len() to panic when called on a variant that has been wrongly constructed (code should be in place to catch that!), which would allow the signature to become -> usize instead of -> Option<usize> (not implemented yet): WDYT? Note, in its current state, clippy will complain but I'll handle that during the rebase if you're fine with unreachable;
  • silences clippy::len_without_is_empty: are you fine with this or should I implement that method?

@daniel5151
Copy link
Owner

But I don't see a way to control that at run-time or even in the Target implementation: if the architecture mandates supporting FP, how can we tell beforehand if it's going to be used by the debugged code? This is more complex than the MIPS situation (which AFAIU knows at Target-compile-time whether the FPU and its registers are available).

I'm not sure I follow. As discussed earlier, there really isn't any such things as "architecture mandated registers", and it's entirely up to the discretion of the GDB stub to decide what registers it wants to report to the GDB client. i.e: you're well within your right to back a Aarch64 target.xml that's missing FP defns back to the GDB client, and never report FP registers. The only thing that'll happen is... you won't see those registers in your GDB client (even in the target is still using them under-the-hood).

In any case, wrt "how can we tell beforehand if it's going to be used by the debugged code": you're the one deciding the workloads! If you know you'll never need FP registers, feel free to leave them out. If you think you'll need them... include them. Maybe even structure the code to support a runtime switch that toggles them on/off depending on what specific workload you're debugging.

And fwiw, all the same logic applies to MIPS as well. And indeed, more concretely: there's nothing stopping someone from using the base Mips arch without DSP regs on a system with DSP regs. All that means is they'd be missing out on extra debug context.

I have pushed a few fixup! commits, which I intend to squash or drop depending on whether you want to keep them (I don't have a strong opinion)

I typically just squash-merge all PRs, so feel free to leave them as-is.

  • reworks .len() to panic

Hmmmmm...

While there is an argument to be made that the "no panic" guarantee shouldn't necessarily extend to gdbstub_arch, I think that's something I'll want to punt on for now.

Instead, I'd just swap out the unreachable for a None, which while not perfect, seems reasonable.

  • silences clippy::len_without_is_empty: are you fine with this or should I implement that method?

Nah, that's fine.

Though fwiw, I would just inline the .len() implementation, which would invalidate both of those last two bullet points :)


Also, bump from earlier in this PR discussion:

Note that regardless which option we go for, you'll need to beef-up the docs on the AArch64 type / module to discuss some of the nuance in supporting the large number of system registers. Or, in other words: most of the stuff in the PR description + some stuff from this follow-up comment should become part of the docs.

No need to go overboard or anything, but please add some module/type level docs (when appropriate) that direct users to the big lump of constants that are included in the code + draw attention to the fact that the default target.xml is massive, and that users may wish to fork this Arch to only report a smaller subset of registers that they directly support.

(oh, and link back to this PR in the docs - lots of useful context in this thread).

Implement gdbstub::arch::Arch for the 64-bit mode of the ARMv8-A
architecture.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
@ptosi
Copy link
Contributor Author

ptosi commented Aug 15, 2022

Instead, I'd just swap out the unreachable for a None, which while not perfect, seems reasonable.

Sounds good, I dropped 3796559

Though fwiw, I would just inline the .len() implementation, which would invalidate both of those last two bullet points :)

Is that possible for a pub method? Client code might need to know how large a register described by a AArch64RegId is.

Also, bump from earlier in this PR discussion:

I've added some documentation for the sysregs, which is the peculiarity of this architecture.

@daniel5151
Copy link
Owner

Though fwiw, I would just inline the .len() implementation, which would invalidate both of those last two bullet points :)

Is that possible for a pub method? Client code might need to know how large a register described by a AArch64RegId is.

I suppose it's fine to leave things as-is then. For the sake of consistency, I'm not too keen on having individual gdbstub_arch Arch implementations include these sorts of non-standard "helpers", but I suppose this isn't too intrusive...


Ignore that CI failure. The CI will alway uses the latest Rust version, and it seems 1.63 comes with a new clippy lint that I'll need to fix.


I think this should be good to merge!

Before I do that though, it'd be nice if @michael2012z could chip in and validate this implementation as well.

I won't block the PR on it or anything, but giving them a few days to chime in seems reasonable :)

@michael2012z
Copy link

Before I do that though, it'd be nice if @michael2012z could chip in and validate this implementation as well.

I am glad to do that. A little work is needed in my code to adapt to this PR. Will feedback late today or tomorrow.

@michael2012z
Copy link

I integrated the code of the PR to my development branch: https://github.com/michael2012z/cloud-hypervisor/tree/aarch64-gdb

Everything looks good! info registers printed the expected long list of registers. And other (a few) features that had been enabled on the branch worked well too.

Thank you both for this excellent code.

@ptosi
Copy link
Contributor Author

ptosi commented Aug 17, 2022

Thanks, @michael2012z!

info registers printed the expected long list of registers

It's not pretty but at least, it works; I'll try to improve that: #12 (comment)

@daniel5151 daniel5151 merged commit 78134a3 into daniel5151:master Aug 17, 2022
@daniel5151
Copy link
Owner

PR is merged, and gdbstub 0.6.3 + gdbstub_arch 0.2.4 are both published 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants