Multiprocess debugging #124

Open
daniel5151 opened this issue Feb 28, 2023 · 8 comments
Labels
API-breaking Breaking API change design-required Getting this right will require some thought help wanted Extra attention is needed new-api Add a new feature to the API (possibly non-breaking)

Comments

@daniel5151
Owner

daniel5151 commented Feb 28, 2023

gdbstub already implements protocol-level multiprocess extensions "under-the-hood", and simply hard-codes a fake PID in single/multi-threaded mode (much the same way it hard-codes a fake TID in single-threaded mode).

Adding true multiprocess debugging will most likely involve adding a new target::ext::base::multiprocess API set, along with doing the requisite plumbing to track and report the current PID.

In addition, we'd want to add a new in-tree armv4t_multiprocess example, which wouldn't actually spin up multiple processes, but would instead spin up multiple multi-core armv4t emulator instances, just so there's an in-tree example of this stuff working.

Why API-breaking? Unfortunately, I neglected to mark the BaseOps enum as non_exhaustive, which means implementing this will be a breaking change. Aside from that, this likely could've been entirely non-breaking...
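
To illustrate why the missing non_exhaustive matters, here is a minimal sketch. The BaseOps below is a simplified, field-less stand-in (the real enum wraps references to the target's trait objects), and the MultiProcess variant mentioned in the comment is hypothetical.

```rust
// Simplified stand-in for gdbstub's `BaseOps`; the real enum carries
// references to target trait objects rather than being field-less.
#[non_exhaustive]
pub enum BaseOps {
    SingleThread,
    MultiThread,
    // A hypothetical `MultiProcess` variant could be added here later.
}

/// What a downstream `match` has to look like once the enum is
/// `#[non_exhaustive]`: other crates are forced to keep a wildcard arm,
/// so a new variant no longer breaks their build.
#[allow(unreachable_patterns)] // inside the defining crate the wildcard is redundant
pub fn describe(ops: &BaseOps) -> &'static str {
    match ops {
        BaseOps::SingleThread => "single-threaded",
        BaseOps::MultiThread => "multi-threaded",
        _ => "some newer base API",
    }
}
```

Without the attribute, downstream code is allowed to match on both variants with no wildcard arm, and that code stops compiling as soon as a third variant appears.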

@daniel5151 daniel5151 added API-breaking Breaking API change new-api Add a new feature to the API (possibly non-breaking) design-required Getting this right will require some thought labels Feb 28, 2023
@daniel5151 daniel5151 added the help wanted Extra attention is needed label Mar 2, 2023
@xobs
Contributor

xobs commented Mar 9, 2023

I think the only thing that needs to be implemented is info os processes:

(gdb) info os processes
[remote] Sending packet: $qXfer:osdata:read:processes:0,1000#fa
[remote] Packet received: l<osdata type="processes">\n<item><column name="pid">1</column><column name="user">root</column><column name="command">/init </column><column name="cores">6,15</column></item><item><column name="pid">4</column><column name="user">root</column><column name="command">plan9 --control-socket 5 --log-level 4 --server-fd 6 --pipe-fd 8 --log-truncate </column><column name="cores">5,19</column></item><item><column name="pid">7</column><column name="user">root</column><column name="command">/init </column><column nam [2029 bytes omitted]
pid        user       command    cores
1          root       /init      6,15
4          root       plan9 --control-socket 5 --log-level 4 --server-fd 6 --pipe-fd 8 --log-truncate 5,19
7          root       /init      16
8          root       /init      2
9          user       -bash      23
10         root       /init      11
11         root       /init      11
12         user       -bash      13
468        user       dbus-launch --autolaunch 857863f4cf286a4142beec6a62a03f1f --binary-syntax --close-stderr 20
469        user       /usr/bin/dbus-daemon --syslog-only --fork --print-pid 5 --print-address 7 --session 23
499        user       gdb        0,1,2,4,5,6,8,9,10,12,13,14,15,16,17,18,20,21,22,23
525        root       /init      17
526        root       /init      19
527        user       -bash      21
544        user       gdbserver --multi localhost:3454 14
(gdb)

This would then be followed by vAttach with a Pid that would be interpreted by the kernel:

(gdb) attach 527
Attaching to program: /usr/bin/sleep, process 527
[remote] Sending packet: $vAttach;20f#ce
[remote] Packet received: T0006:a0faeb5ce87f0000;07:30086b64ff7f0000;10:f418dc5ce87f0000;thread:p20f.20f;core:15;
[remote] packet_ok: Packet vAttach (attach) is supported
[remote] Sending packet: $qXfer:exec-file:read:20f:0,1000#e1
[remote] Packet received: l/usr/bin/bash
[remote] Sending packet: $vFile:setfs:0#bf
[remote] Packet received: F0
[remote] packet_ok: Packet vFile:setfs (hostio-setfs) is supported
[remote] Sending packet: $vFile:open:6a7573742070726f62696e67,0,1c0#ed
[remote] Packet received: F-1,2
[remote] packet_ok: Packet vFile:open (hostio-open) is supported
[remote] Sending packet: $vFile:setfs:20f#57
[remote] Packet received: F0
[remote] Sending packet: $vFile:open:2f7573722f62696e2f62617368,0,0#f4
[remote] Packet received: F5
[remote] remote_hostio_pread: readahead cache miss 1
[remote] Sending packet: $vFile:pread:5,47ff,0#6a
[remote] Packet received: F4756;\177ELF\002\001\00.........
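
As a reading aid for the log above, here is a rough sketch of how the framing and the vAttach PID decode. The helper names (rsp_checksum, unframe, parse_vattach) are made up for illustration and are not gdbstub APIs.

```rust
/// RSP checksum: the modulo-256 sum of every byte between `$` and `#`.
pub fn rsp_checksum(body: &str) -> u8 {
    body.bytes().fold(0u8, |acc, b| acc.wrapping_add(b))
}

/// Splits `$<body>#<xx>` into its body, returning `None` if the framing
/// or the two-digit hex checksum does not match.
pub fn unframe(packet: &str) -> Option<&str> {
    let rest = packet.strip_prefix('$')?;
    let (body, sum) = rest.rsplit_once('#')?;
    if sum.len() != 2 || !sum.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let expected = u8::from_str_radix(sum, 16).ok()?;
    (rsp_checksum(body) == expected).then_some(body)
}

/// Extracts the PID from a `vAttach;<hex pid>` packet body.
pub fn parse_vattach(body: &str) -> Option<u32> {
    let pid_hex = body.strip_prefix("vAttach;")?;
    u32::from_str_radix(pid_hex, 16).ok()
}
```

Feeding it the packet from the log, unframe("$vAttach;20f#ce") yields the body "vAttach;20f", and parse_vattach on that body gives 527, i.e. the PID typed at the (gdb) prompt.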

Since vAttach is already supported, the only thing that would be needed is info os processes, which should be incredibly simple to implement -- just add :osdata to the configuration listing and respond to $qXfer:osdata:read:processes:0,1000#fa with the list.
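
A minimal sketch of what producing that reply could look like, modelled on the gdbserver response in the log above. ProcInfo, osdata_processes and qxfer_reply are hypothetical names, not existing gdbstub APIs; the cores column is left out, and the binary escaping that RSP requires on the wire is not shown.

```rust
/// Hypothetical process record; fields mirror the columns GDB printed above.
pub struct ProcInfo {
    pub pid: u32,
    pub user: String,
    pub command: String,
}

/// Escapes the characters that are special in XML text content.
fn xml_escape(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Builds the `<osdata type="processes">` document, one `<item>` per process.
pub fn osdata_processes(procs: &[ProcInfo]) -> String {
    let mut xml = String::from("<osdata type=\"processes\">\n");
    for p in procs {
        xml.push_str(&format!(
            "<item><column name=\"pid\">{}</column>\
             <column name=\"user\">{}</column>\
             <column name=\"command\">{}</column></item>",
            p.pid,
            xml_escape(&p.user),
            xml_escape(&p.command)
        ));
    }
    xml.push_str("</osdata>");
    xml
}

/// Builds a qXfer read reply for the requested window of `doc`:
/// `m<data>` when more data remains, `l<data>` for the last chunk
/// (including an empty chunk when `offset` is at or past the end).
pub fn qxfer_reply(doc: &[u8], offset: usize, len: usize) -> Vec<u8> {
    let start = offset.min(doc.len());
    let end = start.saturating_add(len).min(doc.len());
    let mut out = vec![if end < doc.len() { b'm' } else { b'l' }];
    out.extend_from_slice(&doc[start..end]);
    out
}
```

Note that the offset and length in the request are hex, so the 0,1000 in the packet above asks for up to 0x1000 (4096) bytes starting at offset 0.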

@daniel5151
Owner Author

Unfortunately, I don't think it's quite that simple.

While implementing info os processes would be a good QOL feature (it'd give you "first class" support for listing procs in GDB vs. you having to hand-roll a custom monitor command), I doubt it'll affect the GDB client's internal bookkeeping when it comes to things like reading/writing memory/registers, switching between procs, etc...

The bigger problem is that gdbstub has lots of code that either ignores or fakes PID values.

A good example is something like this:

res.write_specific_thread_id(SpecificThreadId {
    pid: self
        .features
        .multiprocess()
        .then_some(SpecificIdKind::WithId(FAKE_PID)),
    tid: SpecificIdKind::WithId(tid),
})?;

See how gdbstub uses FAKE_PID here as part of its response?

That means if you tried to vAttach to a process whose PID isn't FAKE_PID (i.e: 1), the GDB client would get very confused when gdbstub began sending back data tagged with an unrelated PID.
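
To make the mismatch concrete, here is a small sketch of how a thread-id is written on the wire. format_thread_id is a hypothetical helper, not the function gdbstub uses internally.

```rust
use std::num::NonZeroUsize;

/// Formats an RSP thread-id. With the multiprocess extension negotiated
/// the syntax is `p<pid>.<tid>` (both hex); otherwise it is just `<tid>`.
pub fn format_thread_id(multiprocess: bool, pid: NonZeroUsize, tid: NonZeroUsize) -> String {
    if multiprocess {
        format!("p{:x}.{:x}", pid.get(), tid.get())
    } else {
        format!("{:x}", tid.get())
    }
}
```

With the real PID from the earlier vAttach log (527) this produces p20f.20f, matching the thread:p20f.20f field in gdbserver's stop reply. With a hard-coded PID of 1 it produces p1.20f, which names a process the client never attached to.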

Similarly, there's code like this as well:

current_mem_tid: SINGLE_THREAD_TID,
current_resume_tid: SpecificIdKind::WithId(SINGLE_THREAD_TID),

Namely: gdbstub should be tracking current_mem_tid_pid, and current_resume_tid_pid, but instead, the code that sets/reads from these variables either discards the pid the GDB client sends, or sends back a FAKE_PID.

...things like that are why the correct solution here would be to plumb through a whole new target::ext::base::multiprocess interface, which - to be clear - would be a straight mirror of the existing target::ext::base::multithread API, except instead of using Tid in all the APIs, it'd use (Pid, Tid) instead.
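
Very roughly, the bookkeeping side of that might look like the sketch below. Everything here is hypothetical (CurrentIds, handle_h and parse_pid_tid are illustrative names, and the Pid/Tid aliases are local stand-ins), and the special thread-id values 0 and -1 are deliberately not handled.

```rust
use std::num::NonZeroUsize;

// Local stand-ins for the ID types; assumed to be non-zero integers.
pub type Pid = NonZeroUsize;
pub type Tid = NonZeroUsize;

/// Parses a concrete `p<pid>.<tid>` thread-id (hex fields).
/// Returns `None` for the special values `0` ("any") and `-1` ("all").
pub fn parse_pid_tid(s: &str) -> Option<(Pid, Tid)> {
    let (pid, tid) = s.strip_prefix('p')?.split_once('.')?;
    let pid = NonZeroUsize::new(usize::from_str_radix(pid, 16).ok()?)?;
    let tid = NonZeroUsize::new(usize::from_str_radix(tid, 16).ok()?)?;
    Some((pid, tid))
}

/// Hypothetical replacement for the TID-only fields: each "current"
/// selection remembers which process the thread belongs to.
pub struct CurrentIds {
    pub current_mem: (Pid, Tid),
    pub current_resume: (Pid, Tid),
}

impl CurrentIds {
    /// Handles an `H` packet body: `Hg<thread-id>` selects the thread for
    /// memory/register operations, `Hc<thread-id>` for step/continue.
    pub fn handle_h(&mut self, body: &str) -> Option<()> {
        let rest = body.strip_prefix('H')?;
        let (op, id) = (rest.get(..1)?, rest.get(1..)?);
        let ids = parse_pid_tid(id)?;
        match op {
            "g" => self.current_mem = ids,
            "c" => self.current_resume = ids,
            _ => return None,
        }
        Some(())
    }
}
```

The point of the sketch is only that the PID the client sends is stored next to the TID instead of being dropped, so later replies can echo it back.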

@xobs
Contributor

xobs commented Mar 11, 2023

I see -- I thought "multiprocess" meant "inferior support", which seems to be mostly managed by the GDB client itself. That is, when debugging multiple concurrent processes, it's the job of the GDB client to launch multiple processes or initiate connections via the bridge, and it's not the job of the bridge to make those connections.

Accordingly, there is no way to get a list of running processes with inferior support -- the only way to do it is info os processes.

I suppose target::ext::base::multiprocess would be distinct from inferior support?

@daniel5151
Owner Author

> I see -- I thought "multiprocess" meant "inferior support", which seems to be mostly managed by the GDB client itself. That is, when debugging multiple concurrent processes, it's the job of the GDB client to launch multiple processes or initiate connections via the bridge, and it's not the job of the bridge to make those connections.

Unless I'm misunderstanding something, I don't believe there's much distinction at the GDB RSP layer between "managed" inferiors vs. "attached" inferiors, aside from some client-side book-keeping regarding whether the process should be terminated on disconnect, vs. automatically resumed, among other similar trivialities. A process is a process, regardless of whether the GDB client has told the remote stub to spin it up itself or has asked the remote host for a list of processes and attached to an existing one.

> I suppose target::ext::base::multiprocess would be distinct from inferior support?

I think a better way to think about it is that gdbstub has always supported a single inferior, but without a target::ext::base::multiprocess API, it is unable to support multiple inferiors (both in the sense of simultaneous debugging, and in the sense of switching between different inferiors on-the-fly).

@xobs
Contributor

xobs commented Mar 13, 2023

With #129 I'm able to switch processes. The hardest part is that FAKE_PID is encoded in lots of places and GDB wants the actual PID.

When attaching to a new process, GDB sends vKill to the previous process which the target can use to stop debugging the previous thread.

@daniel5151
Owner Author

> When attaching to a new process, GDB sends vKill to the previous process which the target can use to stop debugging the previous thread.

Oh, interesting! After doing a bit of reading, it might be the case that unless you specifically spin up a new "inferior" prior to doing a vAttach, GDB assumes you're all done with the last process and detaches from it...

I've never actually played around with the semantics of multi-process debugging in GDB myself, but now you've got me interested to poke around and see what the different behaviors are (i.e: get a little gdb and gdbserver playground going on my machine that I can test some stuff in)


Of course, that's neither here nor there wrt. how gdbstub should handle proper multiprocess debugging.

The changes in #129 will likely unblock your use-case of "debug a single process at a time, with the ability to jump between which process is being debugged", but it's not tenable for the sort of true "multiple attached processes" type debugging that this tracking issue encompasses.

@xobs
Contributor

xobs commented Apr 6, 2023

With the recent merge, I've created betrusted-io/xous-core#360 which implements full support for GDB with Xous, including debugging processes in a live device. Switching processes is now possible, though without info os processes the user must know which process ID corresponds to which process.

As you note, it's not true multiprocess support, but I think it's still pretty neat regardless!

@daniel5151
Owner Author

Yep, it's a huge first step towards getting "true" multiprocess support in gdbstub - thanks again for your help in implementing + testing things!

Of course, it'd be great to land support for info os processes as well (i.e: by responding to qXfer:osdata:read:processes packets), but I totally understand if that's not something you're interested in contributing at this time (given that you've got your custom monitor command that reports active processes instead).

@daniel5151 daniel5151 pinned this issue Apr 8, 2023