refactor(engine): prevent instant rescheduling #9237

pihme · 2022-04-27T09:35:38Z

Description

Before this change, the delay calculated to reschedule a task could be negative or close to 0.
This led to the checker being rescheduled immediately. This is bad because it leaves no room
for other tasks to run.

With this change, a lower bound is applied to the delay when the task is rescheduled.
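
A minimal sketch of the idea, not the actual code in `DueDateChecker`: the raw delay until the next due date is clamped to a minimum before it is used for rescheduling. The class name `RescheduleDelay`, the method `delayUntil`, and the 10 ms value of `MIN_DELAY` are assumptions made for illustration; the real floor used in this PR is not stated here.

```java
import java.time.Duration;

public final class RescheduleDelay {

  // Hypothetical floor; the value chosen in the PR is not given in this conversation.
  static final Duration MIN_DELAY = Duration.ofMillis(10);

  // Returns the delay until dueDateMillis, but never less than MIN_DELAY.
  static Duration delayUntil(final long dueDateMillis, final long nowMillis) {
    final Duration raw = Duration.ofMillis(dueDateMillis - nowMillis);
    // An overdue or almost-due task yields a negative or tiny raw delay; clamp it.
    return raw.compareTo(MIN_DELAY) < 0 ? MIN_DELAY : raw;
  }

  public static void main(final String[] args) {
    System.out.println(delayUntil(1_000, 1_500).toMillis()); // overdue task
    System.out.println(delayUntil(1_000, 999).toMillis());   // due in 1 ms
    System.out.println(delayUntil(1_500, 1_000).toMillis()); // due in 500 ms
  }
}
```

With a floor like this, an overdue task is still rescheduled, but only after the minimum delay has passed, so other tasks get a chance to run in between.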

Related issues

closes #9236
preparation for #9238

Definition of Done

Code changes:

The changes are backwards compatible with previous versions
If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR; if that fails, you need to create backports manually.

Testing:

There are unit/integration tests that verify all acceptance criteria of the issue
New tests are written to ensure backwards compatibility with further versions
The behavior is tested manually
The change has been verified by a QA run
The impact of the changes is verified by a benchmark

Documentation:

The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
New content is added to the release announcement
If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

saig0

@pihme good idea 👍

The changes look good but I've one concern. Please have a look at my comment.

engine/src/main/java/io/camunda/zeebe/engine/processing/scheduled/DueDateChecker.java

saig0

👍

pihme · 2022-04-29T11:53:42Z

bors merge

zeebe-bors-camunda · 2022-04-29T12:24:01Z

Build succeeded:

github-actions · 2022-04-29T12:24:57Z

Successfully created backport PR #9255 for stable/1.3.

github-actions · 2022-04-29T12:25:02Z

Successfully created backport PR #9256 for stable/8.0.

9255: [Backport stable/1.3] refactor(engine): prevent instant rescheduling r=pihme a=github-actions[bot]

# Description

Backport of #9237 to `stable/1.3`.

relates to #9236 #9238

Co-authored-by: pihme <pihme@users.noreply.github.com>

9256: [Backport stable/8.0] refactor(engine): prevent instant rescheduling r=pihme a=github-actions[bot]

# Description

Backport of #9237 to `stable/8.0`.

relates to #9236 #9238

Co-authored-by: pihme <pihme@users.noreply.github.com>

9249: Yield control if too many timers due r=pihme a=pihme

## Description

Adds a mechanism for the `DueDateTimeChecker` to yield control after some time. This is to stop it from iterating over an unknown number of due timer events and blocking execution while doing so.

Overall, this change should work well in cases where there is a huge backlog of timers. This backlog would then be reduced bit by bit.

The change is potentially bad for cases in which there is a constant and high load with many timers being created all the time. In this case, the change of this PR can lead to the number of due timers growing continuously, and the timers triggered will fall more and more behind real time.

Overall, this tradeoff was deemed advantageous. At least it removes the danger that the iteration blocks the execution for so long that the node is marked as unhealthy. When this situation is reached there is currently no practical recovery possible. Even before this point is reached, execution will be blocked for long stretches of time, and no progress can be made on that partition. So one faulty process can block all others from executing.

Both issues are addressed by this PR. With this PR it should always be possible to make some progress, albeit small. This would allow users to cancel or change any faulty process, or to reduce the load if needed.

Further work will be needed to figure out a way to trigger timers without potentially falling further and further behind real time.

## Review Hints

This PR has duplicate commits from #9237

## Related issues

closes #9238

Co-authored-by: pihme <pihme@users.noreply.github.com>
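
The yielding behavior described for #9249 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from that PR: the class `YieldingTimerLoop`, the method `triggerDueTimers`, the queue of due dates, and the `budgetMillis` parameter are all hypothetical names, and removing an entry from the queue stands in for triggering a timer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.LongSupplier;

public final class YieldingTimerLoop {

  // Triggers due timers in order until none are due or the time budget is spent.
  // Returns true if it stopped early (yielded) while due timers were still pending.
  static boolean triggerDueTimers(
      final Deque<Long> dueDates, final LongSupplier clock, final long budgetMillis) {
    final long deadline = clock.getAsLong() + budgetMillis;
    while (!dueDates.isEmpty()) {
      final long now = clock.getAsLong();
      if (dueDates.peekFirst() > now) {
        return false; // next timer is not due yet, nothing left to do
      }
      if (now >= deadline) {
        return true; // budget used up: yield so other work can run
      }
      dueDates.pollFirst(); // stand-in for actually triggering the timer
    }
    return false;
  }

  public static void main(final String[] args) {
    final Deque<Long> dueDates = new ArrayDeque<>(List.of(0L, 0L, 0L, 0L, 0L));
    // Keep calling until the backlog is drained; each call works for at most 1 ms.
    while (triggerDueTimers(dueDates, System::currentTimeMillis, 1)) {
      System.out.println("yielded with " + dueDates.size() + " timers still due");
    }
    System.out.println("remaining: " + dueDates.size());
  }
}
```

A caller that receives `true` would reschedule the check, which is where the minimum delay from #9237 matters: it keeps the rescheduled run from starting immediately. The sketch also shows the tradeoff named above: if timers become due faster than one budget can process them, the backlog keeps growing.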

pihme requested a review from saig0 April 27, 2022 09:35

pihme added backport stable/1.3 labels Apr 28, 2022

pihme mentioned this pull request Apr 28, 2022

Yield control if too many timers due #9249

Merged

10 tasks

saig0 requested changes Apr 29, 2022

engine/src/main/java/io/camunda/zeebe/engine/processing/scheduled/DueDateChecker.java (outdated)

pihme requested a review from saig0 April 29, 2022 10:55

saig0 approved these changes Apr 29, 2022

pihme force-pushed the 9236-time-checker-positive-delay branch from 5f0896f to 0f4a6ab Compare April 29, 2022 11:53

zeebe-bors-camunda bot merged commit 890a33e into main Apr 29, 2022

zeebe-bors-camunda bot deleted the 9236-time-checker-positive-delay branch April 29, 2022 12:24

github-actions bot mentioned this pull request Apr 29, 2022

[Backport stable/1.3] refactor(engine): prevent instant rescheduling #9255

Merged

github-actions bot mentioned this pull request Apr 29, 2022

[Backport stable/8.0] refactor(engine): prevent instant rescheduling #9256

Merged

npepinpe added release/8.0.2 version:1.3.8 labels May 3, 2022
