Dynamic `read_register` + `RegId` support #85

bet4it · 2021-09-26T04:35:13Z

Description

New approach to #58.

Part of #70.

Closes #58. Closes #80.

API Stability

This PR does not require a breaking API change

Checklist

Implementation
- cargo build compiles without errors or warnings
- cargo clippy runs without errors or warnings
- cargo fmt was run
- All tests pass
Documentation
- rustdoc + approprate inline code comments
- Updated CHANGELOG.md
- (if appropriate) Added feature to "Debugging Features" in README.md
If implementing a new protocol extension IDET
- Included a basic sample implementation in examples/armv4t
- Included output of running examples/armv4t with RUST_LOG=trace + any relevant GDB output under the "Validation" section below
- Confirmed that IDET can be optimized away (using ./scripts/test_dead_code_elim.sh and/or ./example_no_std/check_size.sh)
- OR Implementation requires adding non-optional binary bloat (please elaborate under "Description")
If upstreaming an Arch implementation
- I have tested this code in my project, and to the best of my knowledge, it is working as intended.

bet4it · 2021-09-26T04:36:19Z

We may also need to do similar changes to the g command?

daniel5151 · 2021-09-26T07:31:07Z

Hmm, it seems this change is doing a few different things:

Instead of stack-allocating a buffer for the read/write_register command, you're instead proposing we use the packet buffer.
You want to remove the register size from from_raw_id, and have the user specify the amount of data being read/written from/to a register.

That first change seems totally reasonable. That second one... well, I'll need a better explanation for that.

The API you're proposing makes it very easy for a target to inadvertently send too much / too little data to the GDB client. The whole point of encoding the expected register size as part of the RegId trait is so that gdbstub can provide a "guard rail" to implementations, making sure they send back the right amount of data.

Could you elaborate on why you want to remove this guard rail? Am I missing something?

Note that if this is an attempt to revive/complete #58, then I would much prefer the approach taken in that PR, where the "guard-rail" becomes optional vs. removing it entirely. I think there's a lot of merit in making it "hard to do the wrong thing", which has been a guiding principle in my approach to gdbstub's API and design.

(though it should be noted, I would not want to keep the callback approach proposed in #58 - we most certainly would want to use the PacketBuf directly)

In other words: I don't mind the API changes, but I don't think removing register sizes from RegId is the right move.

daniel5151 · 2021-09-26T07:35:39Z

We may also need to do similar changes to the g command?

Could you elaborate on what you mean? The g and G packets use an entirely different approach, whereby the API doesn't expose the raw input/output buffer to the user. Instead, the Register trait performs a translation between the raw byte stream, and a nicely structured struct of registers which the API passes to the implementation.

bet4it · 2021-09-26T16:04:54Z

The whole point of encoding the expected register size as part of the RegId trait is so that gdbstub can provide a "guard rail" to implementations, making sure they send back the right amount of data.

Fine. I thought you use register size because you don't know how long the buffer should be passed. If you think the size guard is necessary, I will add it back.

I just thought that, if the user want to fill dst, he must already know the size needed. You think if he really don't know, he could get a tip from dst.len()? Or we want to rely on the panic of copy_from_slice?

The API you're proposing makes it very easy for a target to inadvertently send too much / too little data to the GDB client.

Is this so important? For example, if I want to report a 64-bit register is 0, can't I just reply "00"? ( Not tested )

Note that if this is an attempt to revive/complete #58, then I would much prefer the approach taken in that PR, where the "guard-rail" becomes optional vs. removing it entirely.

Then how to design the API if we replace usize with Option<NonZeroUsize>?

whereby the API doesn't expose the raw input/output buffer to the user

Oh, you remind me.

What I want to do is replacing the write_byte callback of gdb_serialize to PacketBuf, so we can collect all the registers in this buffer and send them out together. (Remember the discussion of write_all in #69?)

daniel5151 · 2021-09-26T19:19:55Z

I just thought that, if the user want to fill dst, he must already know the size needed. You think if he really don't know, he could get a tip from dst.len()? Or we want to rely on the panic of copy_from_slice?

Truncating the buffer to the required length makes it possible to use try_into() to convert the &[u8] into a &[u8; _] (i.e: between a reference to a variable length slice vs. a reference to a fixed-length array), which can simplify the implementation.

Though I'll admit: this is a somewhat minor detail, and we could very-well pass the full untruncated PacketBuf to the implementation, forcing the implementation to truncate it to the correct length. Then again, why force users to write boilerplate, when we can just do it ourselves ¯_(ツ)_/¯

Is this so important? For example, if I want to report a 64-bit register is 0, can't I just reply "00"? ( Not tested )

As per the spec: "The register number n is in hexadecimal, and r… contains two hex digits for each byte in the register (target byte order)."

So it seems the response needs to be exactly the right size.

Then how to design the API if we replace usize with Option<NonZeroUsize>?

I was thinking it would look something like:

    fn read_register(
        &mut self,
        tid: Id,
        reg_id: <Self::Arch as Arch>::RegId,
        buf: &mut [u8],
    ) -> TargetResult<usize, Self>;

If a size-hint was provided: the PacketBuf is truncated to the expected length prior to being passed to the handler (for the aforementioned try_into convenience), and the implementation simply returns buf.len() from the method (i.e: it used the entire buffer). Within gdbstub, we double-check that the returned length matches the size-hint, in case a faulty target implementation accidently wrote too little / too much data.
If no size-hint is provided (i.e: to enable the "dynamic RegId" use-case) the PacketBuf is not truncated, and gdbstub trusts that the target wrote the expected number of bytes into buf.

To reiterate: the rationale behind including the guard-rails in the API is realizing the following key aspect of implementing a GDB stub: it is very easy to mess up a handler implementation, and then have things Not Work:tm: for very non-obvious reasons. I love it when libraries double-check things for me, and return a useful error when I've messed up my end of the contract, thereby making my life a lot easier.

That's my philosophy with gdbtsub: folks using the library aren't using it because they're keen on reading the GDB RSP, and the details of how the debugging protocol works... they're just trying to get a stub working to debug the code they're actually working on! A such, gdbstub tries really hard to make the integration as painless as possible - including foreseeing the inevitable human mistakes we all make 😄

As a side-note: at some point, I hope to eventually revisit the idea I linked to in #80 (comment), and see if there's some way of tying a fixed-size array / actual native number type with each RegId variant. i.e: having variants like TargetRegId::Some32BitReg(&mut u32), which are passed to the read/write_register handlers, making it easy to copy data into/out of the enum without resorting to manual to/from_le_bytes() shenanigans.

This would require a lot of changes, and I still don't have a clear idea of how to best design / implement this.

Nonetheless, I thought I'd mention this idea to you, for two reasons:

I'd be interested to hear your opinions on this theoretical "radical overhaul" of the API
To make it clear that I don't mind this "iterative improvement" of the API you're proposing in this PR, and that it would most likely stay as-is for the next little while.

What I want to do is replacing the write_byte callback of gdb_serialize to PacketBuf, so we can collect all the registers in this buffer and send them out together. (Remember the discussion of write_all in #69?)

Yep, I'm all for this change. It would certainly make those implementations a lot cleaner, so if you're keen on improving that aspect of gdbstub's internal plumbing, I'd be more than happy to review + merge it in :)

Co-authored-by: Dr. Chat <arkolbed@gmail.com>

bet4it · 2021-09-27T04:31:23Z

I'd be interested to hear your opinions on this theoretical "radical overhaul" of the API

This reminds me that why we need both read_register and read_registers? We only need read_register, and pass a reference of one field of Arch::Registers struct as an argument to store the result, read_registers can be done by calling read_register with every register id.

What I want to do is replacing the write_byte callback of gdb_serialize to PacketBuf, so we can collect all the registers in this buffer and send them out together.

If I do this, I need to replace all the write_byte in gdb_serialize with copy the data to different part of buffer. Is there any easy way to do this? I may need something like this put_u8.

daniel5151 · 2021-09-27T16:57:13Z

why we need both read_register and read_registers

As per the spec: "At a minimum, a stub is required to support the ‘?’ command to tell GDB the reason for halting, ‘g’ and ‘G’ commands for register access, and the ‘m’ and ‘M’ commands for memory access."

https://sourceware.org/gdb/current/onlinedocs/gdb/Overview.html

i.e: single register reads/writes are optional, and shouldn't be a required part of a target implementation.

...though now that I re-read your question, I think you're actually suggesting something else: "why not implement the g/G packets via the read/write_register methods in the user-level API, removing the current "bulk" read/write_registers methods". Is that the correct interpretation of your question?

If so. then that is certainly an interesting point, that might merit a separate discussion. I can think of a few reasons why the current approach is better, though on the flipside, it would be nice to simplify this bit of the API...

To avoid burying this discussion in this PR, could you re-state the question as a Discussion thread? That will make it easier to refer back to the discussion in the future.

If I do this, I need to replace all the write_byte in gdb_serialize with copy the data to different part of buffer. Is there any easy way to do this? I may need something like this put_u8.

I would probably be alright with taking a dependency on something like byteorder, which includes the functionality you're looking for.

That said, I just brushed up on why exactly I opted to use a callback in gdb_serialize, and it was because I wanted to avoid leaking the detail of "unavailable bytes" (i.e: returning "xx") to the Arch implementation. Ideally, I'd like to preserve this property in whatever refactored implementation you come up with.

Also, to avoid losing this context and discussion, can you open a new Discussion thread on the topic of refactoring gdb_serialize? This isn't something that would land as part of this PR, so there's no reason to have the discussion here.

src/gdbstub_impl/ext/single_register_access.rs

src/target/ext/base/single_register_access.rs

bet4it · 2021-09-28T05:18:47Z

though now that I re-read your question, I think you're actually suggesting something else: "why not implement the g/G packets via the read/write_register methods in the user-level API, removing the current "bulk" read/write_registers methods". Is that the correct interpretation of your question?

Yes, that's what I mean!
I will re-state this to Discussion in next days.

This isn't something that would land as part of this PR, so there's no reason to have the discussion here.

Oh, I just thought that pass PacketBuf to gdb_serialize is also something related to "Pass PacketBuf when read registers", so I prepared to make the changes in this PR😅

daniel5151 · 2021-09-28T16:27:14Z

Lets keep this PR focused on changing the read_register API, and the underlying RegId API.

I've also gone ahead and changed the PR's title to reflect this focus.

daniel5151

Tiny nit, but aside from that, I think this is good to merge.

src/arch.rs

Co-authored-by: Daniel Prilik <danielprilik@gmail.com>

daniel5151

Give me a 👍 if this is good to merge :)

from_raw_id doesn't return register size now

014e418

Add register size back

2638ea0

bet4it force-pushed the reg branch from d9192b9 to 199bb18 Compare September 27, 2021 03:24

Dynamic RegId Support

633e9d1

Co-authored-by: Dr. Chat <arkolbed@gmail.com>

bet4it force-pushed the reg branch from 199bb18 to 633e9d1 Compare September 27, 2021 03:47

daniel5151 requested changes Sep 27, 2021

View reviewed changes

src/gdbstub_impl/ext/single_register_access.rs Outdated Show resolved Hide resolved

src/target/ext/base/single_register_access.rs Outdated Show resolved Hide resolved

daniel5151 mentioned this pull request Sep 27, 2021

RIP size on x86-64 #80

Closed

Some fixes

561642f

daniel5151 changed the title ~~Pass PacketBuf when read registers~~ Dynamic read_register + RegId support Sep 28, 2021

daniel5151 self-requested a review September 28, 2021 16:29

daniel5151 marked this pull request as ready for review September 28, 2021 16:29

daniel5151 requested changes Sep 28, 2021

View reviewed changes

src/arch.rs Outdated Show resolved Hide resolved

Fix import

7b0561a

Co-authored-by: Daniel Prilik <danielprilik@gmail.com>

daniel5151 self-requested a review September 28, 2021 18:41

daniel5151 approved these changes Sep 28, 2021

View reviewed changes

daniel5151 merged commit c5489a0 into daniel5151:dev/0.6 Sep 29, 2021

bet4it deleted the reg branch September 29, 2021 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamic `read_register` + `RegId` support #85

Dynamic `read_register` + `RegId` support #85

bet4it commented Sep 26, 2021 •

edited by daniel5151

bet4it commented Sep 26, 2021

daniel5151 commented Sep 26, 2021 •

edited

daniel5151 commented Sep 26, 2021

bet4it commented Sep 26, 2021

daniel5151 commented Sep 26, 2021 •

edited

bet4it commented Sep 27, 2021

daniel5151 commented Sep 27, 2021

bet4it commented Sep 28, 2021

daniel5151 commented Sep 28, 2021

daniel5151 left a comment

daniel5151 left a comment

Dynamic read_register + RegId support #85

Dynamic read_register + RegId support #85

Conversation

bet4it commented Sep 26, 2021 • edited by daniel5151

Description

API Stability

Checklist

bet4it commented Sep 26, 2021

daniel5151 commented Sep 26, 2021 • edited

daniel5151 commented Sep 26, 2021

bet4it commented Sep 26, 2021

daniel5151 commented Sep 26, 2021 • edited

bet4it commented Sep 27, 2021

daniel5151 commented Sep 27, 2021

bet4it commented Sep 28, 2021

daniel5151 commented Sep 28, 2021

daniel5151 left a comment

Choose a reason for hiding this comment

daniel5151 left a comment

Choose a reason for hiding this comment

Dynamic `read_register` + `RegId` support #85

Dynamic `read_register` + `RegId` support #85

bet4it commented Sep 26, 2021 •

edited by daniel5151

daniel5151 commented Sep 26, 2021 •

edited

daniel5151 commented Sep 26, 2021 •

edited