Executors: snooze on a lower level? #1576

sk1p · 2024-01-23T18:18:02Z

Another follow-up of #1572:

With #1572, libertem-server can temporarily shut down the whole executor. This solves the issue of long-running processes eating memory and CPU time very well, but a similar issue exists for dangling notebooks. We might want to add the same feature on the executor level, at least for the default dask executor, so that notebooks also free up their resources after a while. This is important for shared systems, especially when a GPU is shared between multiple users.

matbryan52 · 2024-01-23T19:03:17Z

That would definitely be helpful in our use case; we already have this problem with LT executors started from within Digital Micrograph.

I am happy to look at how to do this.

sk1p · 2024-01-23T20:26:20Z

Great to hear! I was thinking that this could be handled partially by the Context, which would spawn the background task/thread that checks for the timer expiration and then asks the executor to snooze, if that is supported. The pipelined executor (edit: or LiveContext), for example, could opt out of this, as it is counterproductive for live processing.
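To make the idea concrete, here is a minimal sketch of such a background check. All names (`SnoozeWatchdog`, `SnoozableExecutor`, `LiveExecutor`, `supports_snooze`, `touch`) are hypothetical and only illustrate the control flow; none of this is LiberTEM's actual API.

```python
import threading
import time


class SnoozableExecutor:
    """Hypothetical stand-in for an executor that supports snoozing."""
    supports_snooze = True

    def __init__(self):
        self.snoozed = False

    def snooze(self):
        # A real executor would release workers, memory and GPUs here.
        self.snoozed = True

    def unsnooze(self):
        self.snoozed = False


class LiveExecutor:
    """Hypothetical executor that opts out, e.g. for live processing."""
    supports_snooze = False


class SnoozeWatchdog:
    """Background thread that snoozes an idle executor after `timeout` seconds."""

    def __init__(self, executor, timeout, poll_interval=0.01):
        self._executor = executor
        self._timeout = timeout
        self._poll_interval = poll_interval
        # The monotonic clock is not affected by system clock updates.
        self._last_activity = time.monotonic()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        """Start the watchdog; returns False if the executor opts out."""
        if not getattr(self._executor, "supports_snooze", False):
            return False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def touch(self):
        """Record activity and wake the executor up if it was snoozed."""
        with self._lock:
            self._last_activity = time.monotonic()
            if getattr(self._executor, "snoozed", False):
                self._executor.unsnooze()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self):
        # Event.wait returns False on timeout, True once stop() was called.
        while not self._stop.wait(self._poll_interval):
            with self._lock:
                idle = time.monotonic() - self._last_activity
                if idle >= self._timeout and not self._executor.snoozed:
                    self._executor.snooze()
```

In this sketch the high-level code would call `touch()` whenever the user does something, and an executor opts out simply by not advertising `supports_snooze`.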

Maybe we can discuss this further tomorrow.

Setting the snooze timeout very low, for example by starting with `libertem-server --snooze-timeout 0.001`, exposes some issues which this commit fixes, namely that the executor can snooze before `get_fs_listing`/`detect` finish running. Setting the snooze timeout this small is not recommended, but this change should increase stability in such cases. It may also be useful when a lot of time passes between getting the executor and the actual function call, for example when the system is put to sleep, or possibly in case of forward jumps in time (?). Refs LiberTEM#1576: a lower-level keep-alive would prevent this kind of issue completely.

sk1p · 2024-04-23T13:43:50Z

> I was thinking that this could be handled partially by the Context, which would spawn the background task/thread which checks for the timer expiration, and then asks the executor to snooze, if it is supported.

There is also the aspect that the executor needs to be kept alive across all those "running code/jobs/..." operations, which is best done in the executor itself (or maybe in a wrapper type). As noted above, high-level decisions about the timeout and the concrete point in time of snoozing might be better handled by high-level code like the Context, or by the executor state as we have it implemented now.
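A rough sketch of that split, under the assumption of a wrapper type: the wrapper counts in-flight operations and refuses to snooze while any are running, while the decision about when to ask stays with the caller. `KeepAliveExecutor`, `keep_alive`, `run` and `maybe_snooze` are made-up names for illustration, not existing LiberTEM interfaces.

```python
import contextlib
import threading
import time


class KeepAliveExecutor:
    """Hypothetical wrapper that tracks in-flight work to block snoozing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._last_activity = time.monotonic()
        self.snoozed = False

    @contextlib.contextmanager
    def keep_alive(self):
        """Mark the executor as busy for the duration of the block."""
        with self._lock:
            self._in_flight += 1
            self.snoozed = False  # wake up if we were snoozed
        try:
            yield self
        finally:
            with self._lock:
                self._in_flight -= 1
                # The idle period only starts once the work has finished.
                self._last_activity = time.monotonic()

    def run(self, fn, *args, **kwargs):
        """Run `fn` while holding the keep-alive."""
        with self.keep_alive():
            return fn(*args, **kwargs)

    def maybe_snooze(self, timeout):
        """Called by high-level code; snoozes only if idle for `timeout` s."""
        with self._lock:
            if self._in_flight > 0:
                return False
            if time.monotonic() - self._last_activity < timeout:
                return False
            self.snoozed = True
            return True
```

With this shape, even a very small timeout cannot snooze the executor in the middle of a call, because the in-flight counter is checked before the idle time.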

sk1p added the enhancement label (New feature or request) Jan 23, 2024

sk1p added this to the 0.14 milestone Jan 23, 2024

sk1p modified the milestones: 0.14, 0.15 Apr 11, 2024

sk1p mentioned this issue Apr 23, 2024

Ensure liveness of executor for some utility endpoints #1629

Merged

7 tasks
