refactor(protocol-engine): implement pause command #8161

mcous · 2021-07-26T18:29:40Z

Overview

This PR builds on the play/pause logic of #8152 by adding a Pause command to the ProtocolEngine, the engine's SyncClient, and the JSON CommandTranslator.

With this PR, a protocol-engine protocol may pause itself and be resumed by a POST .../actions { "actionType": "resume" } HTTP request. Closes #7918.

Changelog

I tried to make a commit at each iteration of my TDD loop, so if you're curious about that, I recommend checking out the diff one commit at a time. Roughly, the flow was:

Add opentrons.protocol_api_experimental.ProtocolContext.pause and tests, shaking out changes to SyncClient.pause
Add SyncClient.pause, shaking out Pause, PauseData, and PauseResult command value objects
Implement PauseImplementation, shaking out opentrons.protocol_engine.execution.RunControlHandler
Implement CommandTranslator logic for translating Pause commands
- I should've done this first or second, because this is an entry point of the feature, but I forgot!
- It ended up being fine, though, because I still hadn't gotten to any actual pause logic
- I also took this opportunity to move the CommandTranslator into opentrons.file_runner
Implement RunControlHandler.pause, shaking out protocol_engine.state.CommandView.get_is_running
Implement CommandView.get_is_running, completing the feature
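To make the layering above easier to picture, here is a minimal sketch of how the pieces could fit together. It is not the code in this PR: the class names are taken from the changelog, but the bodies, the `asyncio.Event`, and the `play` and `get_is_running` methods on the handler are simplifying assumptions (in the PR, the running flag lives in `CommandView`).

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PauseData:
    """Input for a Pause command (the field set here is an assumption)."""

    message: Optional[str] = None


@dataclass(frozen=True)
class PauseResult:
    """Result of a Pause command; carries no data in this sketch."""


class RunControlHandler:
    """Simplified stand-in: tracks whether the run is playing."""

    def __init__(self) -> None:
        self._running = asyncio.Event()
        self._running.set()

    def get_is_running(self) -> bool:
        return self._running.is_set()

    def play(self) -> None:
        # Called when a "resume" action arrives from outside.
        self._running.set()

    async def pause(self) -> None:
        # Mark the run as paused, then block until something calls play().
        self._running.clear()
        await self._running.wait()


class PauseImplementation:
    """Executes a Pause command by delegating to the run control handler."""

    def __init__(self, run_control: RunControlHandler) -> None:
        self._run_control = run_control

    async def execute(self, data: PauseData) -> PauseResult:
        await self._run_control.pause()
        return PauseResult()
```

In this sketch the command stays in flight for as long as the handler is paused, which matches the smoke-test expectation that the pause shows a status of running while the robot waits.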

Review requests

As usual, the enableProtocolEngine feature flag has to be on for this one. I've updated the Postman collection with a new JSON protocol: Simple Test Protocol With Pause.json.

Smoke test plan

This is the test procedure I ran on my robot. ~~While #8151 is outstanding, it remains easier to test this with JSON protocols than Python protocols, but the behavior should be exactly the same.~~

POST /protocols with files: Simple Test Protocol With Pause.json
- Or: modify testosaur_v3.py to have a ctx.pause() in there
POST /sessions
POST /sessions/:id/actions { actionType: "start" }
Protocol will pause after one transfer
- Take this opportunity to inspect state
- The pause action should have a status of running while the robot is waiting
POST /sessions/:id/actions { actionType: "resume" }
- Robot should continue moving
Continue issuing { actionType: "pause" } and { actionType: "resume" } requests if you want
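The action requests in the steps above could be scripted roughly like this. It is a minimal standard-library sketch: the `build_action_request` helper and the base URL in the usage note are assumptions for illustration, and the body shape simply mirrors the payloads written in the steps.

```python
import json
from urllib.request import Request


def build_action_request(base_url: str, session_id: str, action_type: str) -> Request:
    """Build (but do not send) a POST /sessions/:id/actions request."""
    # Body shape copied from the smoke test steps: { actionType: "<type>" }
    body = json.dumps({"actionType": action_type}).encode("utf-8")
    return Request(
        url=f"{base_url}/sessions/{session_id}/actions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Usage would look something like `urllib.request.urlopen(build_action_request("http://localhost:31950", session_id, "resume"))`, where the host and port are placeholders for wherever your robot-server is listening.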

Risk assessment

Low, but keep in mind this is foundational work for important future functionality.

codecov · 2021-07-26T18:29:51Z

Codecov Report

Merging #8161 (9c16a35) into edge (881a67a) will increase coverage by 0.87%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             edge    #8161      +/-   ##
==========================================
+ Coverage   87.31%   88.19%   +0.87%     
==========================================
  Files         429      430       +1     
  Lines       22572    24633    +2061     
==========================================
+ Hits        19708    21724    +2016     
- Misses       2864     2909      +45

Impacted Files	Coverage Δ
...service/session/command_execution/base_executor.py	`66.66% <0.00%> (-13.34%)`	⬇️
robot-server/robot_server/system/time_utils.py	`79.36% <0.00%> (-4.36%)`	⬇️
...bot_server/robot/calibration/deck/state_machine.py	`90.47% <0.00%> (-1.84%)`	⬇️
...rver/robot_server/service/pipette_offset/router.py	`91.83% <0.00%> (-1.50%)`	⬇️
...t-server/robot_server/service/tip_length/router.py	`95.65% <0.00%> (-1.13%)`	⬇️
...rver/service/session/models/command_definitions.py	`96.07% <0.00%> (-1.11%)`	⬇️
...ot-server/robot_server/service/protocol/analyze.py	`95.23% <0.00%> (-0.54%)`	⬇️
..._server/service/notifications/handle_subscriber.py	`92.10% <0.00%> (-0.49%)`	⬇️
.../robot_server/service/legacy/routers/networking.py	`98.14% <0.00%> (-0.39%)`	⬇️
robot-server/robot_server/settings.py	`100.00% <0.00%> (ø)`
... and 93 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

SyntaxColoring

Code looks good to me! I'll ✅ after I smoke-test on a robot.

With this PR, a protocol-engine protocol may pause itself and be resumed by a POST .../actions { "actionType": "resume" } HTTP request.

Edit: Moved this comment onto #8041.

Not for this PR:
I don't have a strong use case for this yet, but I've been thinking we should have three separate actions:

A "pause from the outside" action. Issued by HTTP clients, and by pressing the OT-2's front button.

A resume action that undoes (1).

A resume action that undoes pauses that are built into the protocol.

Because I think an HTTP client should be able to naively undo its own pause and be confident that the robot will be left exactly how it was before, without knowing anything deep about the protocol.

For example, say you're writing a client to issue HTTP commands to the OT-2 to coordinate it with other equipment. You want to implement a "stop everything" button. You also want to implement a button that resumes from the "stop everything."

If the OT-2 was in a protocol-issued pause state when the user clicked "stop everything," you probably wouldn't want the OT-2 to start moving on its own when they click resume. You'd expect the OT-2 to go back to doing exactly what it was before—and what it was doing was waiting.

Currently, to work around this, you'd have to know that the robot is currently on a protocol-issued pause, and, as a special case, avoid resuming it.

~~Edit: And I haven't thought about how this interacts with the door opening and closing.~~

SyntaxColoring · 2021-07-30T15:57:16Z

api/src/opentrons/file_runner/json_command_translator.py

-    def __init__(self) -> None:
-        """Construct a command translator"""
-        pass
-


Nice catch.

Not necessarily for this PR, but would Decoy still work if we made all these methods @staticmethod, too?

Yup, Decoy doesn't do a whole lot with the spec object other than try to figure out if a given spy should be sync or async. So in practice @staticmethod doesn't really make a difference. I think SessionView in robot-server is all static, and we mock it out in the router tests.

Thinking about it, I don't have a test case in Decoy covering @staticmethod and inspect.signature support, so there's a chance that's not quite right. That said, inspect.signature support is only really needed in specific cases, like not blowing up FastAPI's DI system.

Edit: Yeah so inspect.signature support for @staticmethod was definitely broken: mcous/decoy#51
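For anyone curious why @staticmethod needs special care when matching signatures, here is a small standard-library sketch. It is not Decoy's implementation: `SessionViewLike` and `call_signature` are hypothetical names, and the sketch ignores @classmethod. The point is that a helper which always drops the first parameter (to hide `self`) would wrongly drop a real argument from a static method.

```python
import inspect


class SessionViewLike:
    """Hypothetical spec class mixing static and instance methods."""

    @staticmethod
    def get_status(session_id: str) -> str:
        return f"status of {session_id}"

    def get_name(self, session_id: str) -> str:
        return f"name of {session_id}"

    @staticmethod
    async def fetch(session_id: str) -> str:
        return session_id


def call_signature(spec: type, name: str) -> inspect.Signature:
    """Return the signature a caller holding an instance would see."""
    # getattr_static returns the raw descriptor, so staticmethod is detectable.
    raw = inspect.getattr_static(spec, name)
    sig = inspect.signature(getattr(spec, name))

    if isinstance(raw, staticmethod):
        # Static methods have no implicit first argument to strip.
        return sig

    # Plain methods looked up on the class still list `self`; drop it.
    params = list(sig.parameters.values())[1:]
    return sig.replace(parameters=params)
```

`inspect.iscoroutinefunction(SessionViewLike.fetch)` is still true for the static coroutine, so the sync-versus-async decision is unaffected by @staticmethod.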

sanni-t

LGTM!
Tested on a robot with a Python protocol and it works as expected! The only non-intuitive thing was that both a start and a resume action resume a paused protocol. But that's a UX question and easy to change later, so I'm good with this for now.
▶️ ⏸

sanni-t · 2021-08-02T12:03:40Z

api/tests/opentrons/protocol_engine/commands/test_pause.py

+    )
+
+    data = PauseData(message="hello world")
+


I get confused whether/when to decoy.verify that a method wasn't called before our test call (before subject.execute in this case). Is it that in this case we know all dependencies are decoys so we are sure that a pause would not have been called before, and so we don't need to verify that it was not called before?

I think in this case, the key is that we haven't interacted with our test subject yet. Usually, if I'm putting a times=0 check (or assertNotCalled or whatever), I'm doing it because something in the test has interacted with the subject, so it's conceivable that something could go wrong.

Maybe, for example, you need to call a setup method of the subject before you call the specific method you're testing. But that also might be a sign that the interaction is too complicated

Cool. Makes sense.

But that also might be a sign that the interaction is too complicated

Good point
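As a concrete illustration of the case described above, where a "not called yet" check earns its place, here is a sketch using the standard library's `unittest.mock` rather than Decoy. The `Subject` class and its `setup` method are hypothetical; the check sits after `setup()` because that is the first point at which the subject has had a chance to misbehave.

```python
import asyncio
from unittest.mock import AsyncMock


class Subject:
    """Hypothetical subject that needs a setup call before execute."""

    def __init__(self, run_control: AsyncMock) -> None:
        self._run_control = run_control

    def setup(self) -> None:
        # Setup must not pause the run as a side effect.
        pass

    async def execute(self) -> None:
        await self._run_control.pause()


async def check_pause_only_on_execute() -> bool:
    run_control = AsyncMock()
    subject = Subject(run_control)

    # We have now interacted with the subject, so a negative check is meaningful.
    subject.setup()
    run_control.pause.assert_not_called()

    await subject.execute()
    run_control.pause.assert_awaited_once_with()
    return True
```

Without the `setup()` call there is nothing the subject could have done yet, so the negative check would add no information.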

sanni-t · 2021-08-02T13:42:15Z

api/src/opentrons/protocols/models/json_protocol.py

-class Params4(BaseModel):
-    wait: Union[float, Literal[True]] = Field(
+class DelayCommandParams(BaseModel):
+    wait: Union[Literal[True], float] = Field(


Would this try to cast a wait value to the first type in the Union? What will happen if we specify a delay of 1 second, or actually, any non-zero value?
I know it doesn't apply to this PR but something to keep an eye out for when we implement delay.

Oh good call. I changed this because { "wait": true } in JSON was getting cast to wait=1 in Python, but I didn't think to check that it worked the other way once I reordered it.

If it breaks the way you theorize (it makes a lot of sense to me that it would), I think switching the order back and using a StrictFloat instead might get us the behavior we need.
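The root of the ambiguity discussed in this thread is plain Python: `bool` is a subclass of `int`, so `True == 1` and `float(True) == 1.0`, and any lenient coercion can blur the two. The sketch below is a hand-written, hypothetical `parse_wait` validator (not pydantic, and not code from this PR) showing the distinction the model needs to preserve.

```python
from typing import Union


def parse_wait(value: object) -> Union[bool, float]:
    """Keep `true` and numeric delays distinct (hypothetical helper)."""
    # Check for the literal True first, because isinstance(True, int) is True.
    if value is True:
        return True
    if isinstance(value, bool):
        # False is neither "wait for resume" nor a delay in seconds.
        raise ValueError("wait must be true or a number of seconds")
    if isinstance(value, (int, float)):
        return float(value)
    raise ValueError("wait must be true or a number of seconds")
```

Whichever Union ordering or strict type ends up in the model, the cases worth covering in tests are `true`, `1`, and a fractional delay such as `0.5`.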

mcous · 2021-08-02T14:54:17Z

The only non-intuitive thing was that both a start and a resume action resume a paused protocol

Oops, good callout. That wasn't intentional on my part, but makes sense why it's happening. That behavior feels like continuing pressure to change the runner / engine relationship, to me.

POST { "actionType": "start" } calls runner.run, which also calls engine.play
POST { "actionType": "resume" } calls engine.play
Neither HTTP request is validated as something that is allowed to happen at that time
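A stripped-down sketch of the call paths listed above, to show why the two actions are currently interchangeable. The `Engine`, `Runner`, and `handle_action` names here are illustrative stand-ins, not the real robot-server or ProtocolEngine classes.

```python
class Engine:
    """Stand-in engine that only counts play() calls."""

    def __init__(self) -> None:
        self.play_calls = 0

    def play(self) -> None:
        self.play_calls += 1


class Runner:
    """Stand-in runner whose run() also plays the engine."""

    def __init__(self, engine: Engine) -> None:
        self._engine = engine

    def run(self) -> None:
        self._engine.play()


def handle_action(action_type: str, runner: Runner, engine: Engine) -> None:
    """Dispatch an HTTP action with no check of the current state."""
    if action_type == "start":
        runner.run()
    elif action_type == "resume":
        engine.play()
    else:
        raise ValueError(f"unknown action type: {action_type}")
```

Both branches end in `engine.play()`, and nothing checks the current state, so either action can start a loaded protocol or resume a paused one.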

If it's cool with y'all, would like to punt that HTTP layer stuff to another ticket. Ideally, I would like:

Validation of actions at the router level
More alignment between HTTP action names and ProtocolEngine method names
Potentially, a single HTTP action to both start and pause

sanni-t · 2021-08-02T15:22:11Z

If it's cool with y'all, would like to punt that HTTP layer stuff to another ticket. Ideally, I would like:

Validation of actions at the router level

More alignment between HTTP action names and ProtocolEngine method names

Potentially, a single HTTP action to both start and pause

Ya, we haven't really had the design discussion about the endpoints and action rules. Definitely worth a separate ticket.

SyntaxColoring · 2021-08-02T16:30:29Z

Tested a modified testosaur_v3.py and it works as expected, with the caveat that @sanni-t pointed out.

The only non-intuitive thing was that both a start and a resume action resume a paused protocol

And both start and resume actions can start loaded protocols.

Oops, good callout. That wasn't intentional on my part, but makes sense why it's happening.
[...]
If it's cool with y'all, would like to punt that HTTP layer stuff to another ticket....

Yep, broadly makes sense to me.

Ideally, I would like:

Validation of actions at the router level

Yep.

Well, I think Protocol Engine should be the thing determining if an action is valid. Since Protocol Engine is the single source of truth of protocol state, it should also be the single source of truth for protocol state transitions. But maybe that's already what you have in mind.
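To sketch what "single source of truth for state transitions" could look like, here is a hypothetical transition table. The `RunState` values, the `ALLOWED_TRANSITIONS` table, and `ActionNotAllowedError` are all made up for illustration; none of them exist in Protocol Engine today.

```python
from enum import Enum
from typing import Dict, Tuple


class RunState(Enum):
    READY = "ready"
    RUNNING = "running"
    PAUSED = "paused"


class ActionNotAllowedError(ValueError):
    """Raised when an action is not valid in the current state."""


# (current state, action type) -> next state; anything missing is rejected.
ALLOWED_TRANSITIONS: Dict[Tuple[RunState, str], RunState] = {
    (RunState.READY, "start"): RunState.RUNNING,
    (RunState.RUNNING, "pause"): RunState.PAUSED,
    (RunState.PAUSED, "resume"): RunState.RUNNING,
}


def apply_action(state: RunState, action_type: str) -> RunState:
    """Return the next state, or raise if the action is invalid right now."""
    try:
        return ALLOWED_TRANSITIONS[(state, action_type)]
    except KeyError:
        raise ActionNotAllowedError(
            f"cannot {action_type!r} while {state.value}"
        ) from None
```

With something like this owned by the engine, the router would only need to translate `ActionNotAllowedError` into an HTTP error response instead of duplicating the rules.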

More alignment between HTTP action names and ProtocolEngine method names

Yep.

Potentially, a single HTTP action to both start and pause

I find myself wondering if we need separate start and resume actions, and separate "ready" and "paused" states.

Like, maybe creating a protocol session automatically sets it up paused at step 0. And the "Start run" button just unpauses it from that state.

Instead of clients doing this:

if protocol.paused:
    show_resume_button()
elif protocol.loaded:
    show_start_button()
else:
    show_pause_button()

They'd do this:

if protocol.paused:
    if protocol.next_step == protocol.steps[0]:
        show_start_button()  # Never before started.
    else:
        show_resume_button()  # Started and then paused mid-run.
else:
    show_pause_button()

And the start button and resume button would be implemented through the same underlying resume HTTP action.

mcous added the robot-svcs and protocol-engine labels Jul 26, 2021

Base automatically changed from engine_action-reaction to edge July 28, 2021 22:28

mcous added 6 commits July 28, 2021 18:31

wip: start implementing experimental ProtocolContext.pause

299d9c2

wip: add pause to SyncClient

4bf6369

wip: wire up PauseImplementation

62f5cc0

wip: move CommandTranslator into opentrons.file_runner module

752fc13

wip: add pause translation to CommandTranslator

96c77cb

refactor(protocol-engine): implement pause command

64c749d

mcous force-pushed the engine_pause-command branch from 460c00d to 64c749d Compare July 28, 2021 22:35

mcous marked this pull request as ready for review July 28, 2021 22:38

mcous requested a review from a team as a code owner July 28, 2021 22:38

mcous requested review from sanni-t and SyntaxColoring July 29, 2021 21:36

SyntaxColoring reviewed Jul 30, 2021

Update api/src/opentrons/file_runner/json_command_translator.py

9c16a35

Co-authored-by: Max Marrone <max@opentrons.com>

SyntaxColoring mentioned this pull request Jul 30, 2021

UX validation and recommendations surrounding pause/cancel effects #8041

Closed

mcous mentioned this pull request Jul 30, 2021

fix(spy): match inspect.signature for staticmethods mcous/decoy#51

Merged

sanni-t approved these changes Aug 2, 2021

SyntaxColoring self-requested a review August 2, 2021 16:30

SyntaxColoring approved these changes Aug 2, 2021

mcous merged commit 65d64ae into edge Aug 2, 2021

mcous deleted the engine_pause-command branch August 2, 2021 16:43
