This sounds like an interesting use-case for Shadow.
I would say it's easy to access the state of the simulated program, but it wouldn't be easy to restore that state. For example, if you were envisioning a checkpoint/restore type of feature that lets you resume from an earlier point in the simulation, that would probably be tricky to get working. You would need to restore Shadow's internal simulation state, and also the state of the Linux process (memory such as the stack and heap, registers, miscellaneous syscalls that Shadow passes through to Linux, etc.). The checkpoint/restore project might be able to help with the Linux part. So it's probably possible, but it would be a lot of work. Since Shadow is (mostly) deterministic, it might be easier to just restart the simulation from the beginning to get back to your "checkpoint" state. But this could be costly in terms of simulation time if you have a large space to explore.
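To make the "restart from the beginning" idea concrete, here is a minimal sketch of exploring delivery orders by replay instead of checkpointing. It is a toy model, not Shadow code: `replay`, `explore`, and `order_sensitive_handler` are hypothetical names, and the simulated system is assumed to be fully deterministic given a delivery order.

```python
def replay(prefix, initial_messages, handler):
    """Re-run the toy system from scratch, delivering messages in `prefix` order.

    Hypothetical stand-in for restarting a deterministic simulation: no state
    is saved, the same prefix always leads to the same state.
    """
    state = 0
    pending = list(initial_messages)
    for msg in prefix:
        pending.remove(msg)                  # this message is now delivered
        state, new_msgs = handler(state, msg)
        pending.extend(new_msgs)             # messages sent in response
    return state, pending


def explore(initial_messages, handler):
    """Depth-first search over delivery orders; every node is reached by replay."""
    final_states = set()
    replays = 0
    stack = [()]                             # each entry is a delivery-order prefix
    while stack:
        prefix = stack.pop()
        state, pending = replay(prefix, initial_messages, handler)
        replays += 1
        if not pending:
            final_states.add(state)          # nothing left to deliver
        for msg in pending:
            stack.append(prefix + (msg,))    # branch on the next delivery
    return final_states, replays


def order_sensitive_handler(state, msg):
    # Toy handler (assumption): the result depends on delivery order,
    # and no new messages are sent in response.
    return state * 2 + msg, []


final_states, replays = explore((1, 2, 3), order_sensitive_handler)
```

Note that every explored prefix costs a full re-run from the start, which is the simulation-time cost mentioned above; caching or real checkpoints would trade that cost for the implementation work.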
Shadow delivers packets from one host to another by creating an event scheduled at a time based on the configured latency between the two hosts. So if the latency between hosts A and B is 50 ms, a packet sent from A to B will be delivered in exactly 50 ms. How would you plan to modify the message delivery order? Just reorder packets that arrive at the same time? Or add some network jitter so that packets don't arrive at a consistent time?
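As a rough illustration of the two options, here is a toy discrete-event model of latency-based delivery. It is only a sketch and not Shadow's implementation: `ToyNetwork`, `send`, `run`, and the `jitter_ms` parameter are all made up for this example.

```python
import heapq
import itertools


class ToyNetwork:
    """Toy discrete-event model of latency-based delivery (not Shadow's code)."""

    def __init__(self, latency_ms):
        # latency_ms maps (src, dst) -> fixed one-way latency in milliseconds.
        self.latency_ms = latency_ms
        self.now_ms = 0
        self._queue = []               # heap of (deliver_at, seq, dst, payload)
        self._seq = itertools.count()  # tie-breaker for equal delivery times

    def send(self, src, dst, payload, jitter_ms=0):
        # Delivery time = current time + configured latency (+ optional jitter).
        deliver_at = self.now_ms + self.latency_ms[(src, dst)] + jitter_ms
        heapq.heappush(self._queue, (deliver_at, next(self._seq), dst, payload))

    def run(self):
        # Pop events in time order; ties fall back to send order.
        delivered = []
        while self._queue:
            self.now_ms, _, dst, payload = heapq.heappop(self._queue)
            delivered.append((self.now_ms, dst, payload))
        return delivered


latencies = {("A", "B"): 50, ("C", "B"): 50}

# Same latency, no jitter: both packets arrive at t=50, in send order.
net = ToyNetwork(latencies)
net.send("A", "B", "m1")
net.send("C", "B", "m2")
baseline = net.run()

# Adding 5 ms of jitter to m1 flips the arrival order at B.
net = ToyNetwork(latencies)
net.send("A", "B", "m1", jitter_ms=5)
net.send("C", "B", "m2")
jittered = net.run()
```

In this model, reordering only same-time packets would mean changing the tie-breaker, while jitter changes the delivery timestamps themselves and can reorder any pair of packets whose arrival times are close enough.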
-
Hello,
For a research project I'm considering using Shadow to enable state space exploration of distributed systems.
I would simulate said distributed systems with Shadow, but add a replay mechanism to explore the execution with a different message delivery order, then go back and continue the simulation, and so on, exploring the state space that way.
I wanted to ask you if you think using Shadow as a base is a good idea for this, if this sounds feasible to you, and also if you could give me some pointers as to how I would actually implement this without breaking everything, since the codebase is quite dense, and your insight might save me a lot of time and trouble.
Is it easy/feasible to access the state of the simulated program from the simulation controller, to modify the message delivery order, and to modify the simulation flow to introduce this replay mechanism?
I appreciate any help, thanks in advance! (If needed, I'd be happy to hop on a call.)
PS: I have already read the paper and the docs, run the examples with Shadow, and read the codebase (superficially). I'm looking for pointers to go deeper.