Graceful shutdown of simulations #3194

Open
robgjansen opened this issue Sep 28, 2023 · 0 comments
Labels
Type: Enhancement New functionality or improved design

robgjansen commented Sep 28, 2023

It would be nice to have a way to end a Shadow simulation early, effectively changing the simulation stop_time from whatever it was initially configured as to the current sim time (now).

Our current behavior on receiving SIGTERM or SIGINT (Ctrl-C) is to flush the log and exit. That isn't as graceful as it could be: it doesn't let Shadow log the things it normally logs at the end of a simulation (such as syscall counts and object usage), check processes' expected end states, etc.

It might be easy to add this: at the end of each scheduler round, check whether a flag was set by a signal handler and, if so, stop the simulation early.
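A rough sketch of that end-of-round check, written in Python purely for illustration. `StopFlag`, `run_rounds`, and `run_round` are hypothetical names, not Shadow's actual scheduler API, and the integer round length is an assumption.

```python
import signal


class StopFlag:
    """Set from a signal handler, read by the scheduler loop (hypothetical)."""

    def __init__(self):
        self.requested = False

    def handler(self, signum, frame):
        # Keep the handler trivial: only record that a stop was requested.
        self.requested = True


def run_rounds(stop_time, round_length, flag, run_round):
    """Run scheduler rounds until stop_time, or until the flag is set.

    Returns the simulated time at which the loop stopped.
    """
    now = 0
    while now < stop_time:
        run_round(now)
        now = min(now + round_length, stop_time)
        # End-of-round check: equivalent to lowering stop_time to `now`.
        if flag.requested:
            break
    return now
```

The flag would be hooked up with something like `signal.signal(signal.SIGINT, flag.handler)`. Because the loop exits normally instead of the process being killed, the usual end-of-simulation cleanup and logging can run after it returns.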

We're thinking we want something like the following:

  • The first SIGINT (ctrl-c) puts you into the SIGINTx1 state (graceful shutdown mode): the scheduler checks at the end of the current round, ends the simulation early, and then does cleanup and end-of-simulation work as normal.
  • A second SIGINT (ctrl-c) is for the impatient and puts you into the SIGINTx2 state, where we give up on whatever we're doing, flush the log, and exit.
  • Any SIGTERM puts you into the SIGINTx1 state, but will not move you out of the SIGINTx2 state if you're already there.
  • The end of the Shadow log should have a warning stating which termination state triggered the exit.
  • We probably want to return a non-zero exit code: for sure if we exited via SIGINTx2, and maybe also if we exited via SIGINTx1.
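
The transitions listed above could be sketched roughly as follows. This is Python for illustration only, not Shadow code. The names `Shutdown`, `next_state`, and `exit_code` are hypothetical, and the specific exit code values (1 and 2) are assumptions, since the proposal only says "non-zero". Whether SIGINTx1 counts as an error is left as a parameter because the proposal leaves that open.

```python
import enum
import signal


class Shutdown(enum.IntEnum):
    RUNNING = 0
    SIGINT_X1 = 1  # graceful: stop at the end of the round, then clean up
    SIGINT_X2 = 2  # impatient: flush the log and exit


def next_state(state, signum):
    """Return the shutdown state after receiving signal `signum`."""
    if signum == signal.SIGINT:
        # Each SIGINT escalates one step, capped at SIGINT_X2.
        return Shutdown(min(state + 1, Shutdown.SIGINT_X2))
    if signum == signal.SIGTERM:
        # SIGTERM requests a graceful stop but never de-escalates.
        return max(state, Shutdown.SIGINT_X1)
    return state


def exit_code(state, graceful_is_error=True):
    """Map the final state to a process exit code (values are assumptions)."""
    if state == Shutdown.SIGINT_X2:
        return 2
    if state == Shutdown.SIGINT_X1 and graceful_is_error:
        return 1
    return 0
```

For example, `next_state(Shutdown.SIGINT_X2, signal.SIGTERM)` stays at `Shutdown.SIGINT_X2`, matching the rule that SIGTERM never moves you out of that state. The final state's name could also be what goes into the warning at the end of the log.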