
Engine abstraction - rewrite Writers class to use ResultProcessorBuilder provided by platform #9853

Conversation


@pihme pihme commented Jul 21, 2022

Description

This is the last PR in the "shape legacy code into new interfaces" series.

Previously, the new interfaces ProcessingResultBuilder and ProcessingResult were introduced; they are how the engine is supposed to return the processing result to the stream processor. These classes were also already passed between the stream processor and the engine at the interface level, and the division of labor between stream processor and engine was established.

What had not yet been achieved was that the engine actually uses these new classes. Instead, the engine used existing legacy code to circumvent the new interfaces.

This PR transforms the engine to use the new classes provided by the stream processor:

  • The central point of the transformation is the Writers class, which at the outset depended on stream writers, state writers, response writers, and so on
  • In the end, the Writers class depends only on ProcessingResultBuilder, which is swapped out for each record processed, or each error to be handled
  • To make this transition possible, we a) need to refine the Typed(Response/Rejection/Command)Writer classes and remove methods like reset and flush, which will no longer be offered by the new interfaces, and b) need to implement the refined interfaces with an implementation that forwards calls to ProcessingResultBuilder
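The forwarding idea in the last bullet can be sketched roughly like this. This is a minimal sketch; the real Zeebe interfaces have different signatures, and `RecordingResultBuilder`, `swapResultBuilder`, and the string-based record are illustrative stand-ins only:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the platform-provided result builder interface.
interface ProcessingResultBuilder {
  void appendRecord(long key, String intent, Object value);
}

// Minimal in-memory implementation so the sketch is self-contained.
class RecordingResultBuilder implements ProcessingResultBuilder {
  final List<String> records = new ArrayList<>();

  @Override
  public void appendRecord(final long key, final String intent, final Object value) {
    records.add(key + ":" + intent);
  }
}

// Engine-facing writer: it exposes no reset() or flush(); every call
// is forwarded to whichever builder the platform has swapped in.
class ResultBuilderBackedCommandWriter {
  private ProcessingResultBuilder resultBuilder;

  // The platform calls this before each record is processed.
  void swapResultBuilder(final ProcessingResultBuilder builder) {
    this.resultBuilder = builder;
  }

  void appendFollowUpCommand(final long key, final String intent, final Object value) {
    resultBuilder.appendRecord(key, intent, value);
  }
}
```

The point of the design is that the engine can only accumulate output through the builder; committing or discarding the result stays entirely with the platform.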

Review hints

  • @npepinpe 95% of the changes are in the engine; @remcowesterhoud will look into those
  • These changes were done under a "better done than perfect" regime. Some naming might not be exactly right, and some cleanup may happen after the fact. The focus is on establishing enough of the engine abstraction concept that both teams can afterwards work without interfering with each other.

Related issues

related to #9725

Definition of Done

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with future versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

pihme added 13 commits July 21, 2022 11:27
This commit is a bit larger than I would like. It could only have been split into one small commit and one still-large commit, so I just left it as one large commit.

It started because I wanted to implement a TypedCommandWriter that uses ProcessingResultBuilder as backend
instead of accessing the writers directly.

Turns out, TypedCommandWriter contains reset() and flush() methods which will no longer be allowed.
(There will be a reset() method for the full result, but not for individual parts of the result).

So I created a smaller interface, RestrictedTypedCommandWriter, which does not contain these two methods (it will be renamed in the next commit).

Then I went through the current uses of TypedCommandWriter and replaced all with RestrictedTypedCommandWriter.
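The split can be sketched like this. The method names and signatures below are illustrative assumptions, not the actual Zeebe interfaces; `StubCommandWriter` exists only to make the sketch self-contained:

```java
// The narrowed interface the engine sees: it can only append.
interface RestrictedTypedCommandWriter {
  void appendFollowUpCommand(long key, String intent, Object value);
}

// The legacy interface keeps reset() and flush() on top of the
// restricted one, for code that is still controlled by the platform.
interface TypedCommandWriter extends RestrictedTypedCommandWriter {
  void reset();  // discards anything staged so far
  long flush();  // writes staged records; negative return signals failure
}

// Tiny stub showing that any TypedCommandWriter is still usable
// wherever only the restricted interface is required.
class StubCommandWriter implements TypedCommandWriter {
  int appended;

  public void appendFollowUpCommand(final long key, final String intent, final Object value) {
    appended++;
  }

  public void reset() {
    appended = 0;
  }

  public long flush() {
    return appended;
  }
}
```

Because the legacy interface extends the restricted one, replacing a `TypedCommandWriter` parameter with `RestrictedTypedCommandWriter` is a source-compatible narrowing for all call sites that never touched reset or flush.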

In one place this was not possible: DeploymentDistributionBehavior, which schedules a background task that
uses reset and flush.

For this use case, a new class LegacyTask was introduced. We need to migrate all background tasks anyway.
The idea of LegacyTask is to be a bag to capture all unwanted dependencies.

Then there was a need to schedule such a legacy task and provide the dependencies when the task is run.

The effects of both of these are:
- Only the background task depends on a writer with flush and reset, and this task is fully under the control of the platform
- All other places in the engine only write, but don't use reset or flush
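The LegacyTask idea can be sketched like this. All names and signatures here are assumptions for illustration; the real Zeebe scheduling API differs:

```java
import java.util.ArrayList;
import java.util.List;

// The "bag of unwanted dependencies": a writer that still has reset()
// and flush(), which the engine itself is no longer allowed to hold.
class LegacyCommandWriter {
  final List<String> staged = new ArrayList<>();

  void reset() {
    staged.clear();
  }

  void append(final String record) {
    staged.add(record);
  }

  long flush() {
    return staged.size();
  }
}

// A legacy task receives the full writer only when it is run,
// so the dependency never leaks into the engine's own code.
interface LegacyTask {
  void run(LegacyCommandWriter commandWriter);
}

class TaskScheduler {
  // The platform provides the dependencies at run time, not at scheduling time.
  void runLegacyTask(final LegacyTask task) {
    task.run(new LegacyCommandWriter());
  }
}
```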
…ResultBuilder

Refactor classes to reuse similar code
Make sure that engine uses only the limited (no reset, flush) interface.

Change logic in ProcessingStateMachine regarding how the writers are reset.
…tBuilder

Also add a TODO to BufferedProcessingResultBuilder because there is certainly more work to do.
…ngResultBuilder

This also required extending the interface. The philosophy here is that the result builder interface offers a low level of abstraction (e.g. all the fields you can set).

In the Typed...Writer interfaces you have a higher level of abstraction and more comfort. So TypedResponseWriter has methods like writeRejectionOnCommand(...) and writeEventOnCommand(...), but these are all mapped onto the generic withResponse(...) method in ProcessingResultBuilder.
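A sketch of that mapping, with deliberately simplified signatures (the real methods carry typed records and intents, not plain strings, and the class names below are hypothetical):

```java
// Low level of abstraction: one generic method, all fields explicit.
class ResponseResultBuilder {
  String lastResponse;

  void withResponse(final long key, final String type, final String reason) {
    lastResponse = key + ":" + type + ":" + reason;
  }
}

// Higher level of abstraction: convenience methods that are thin
// adapters over the single generic withResponse(...) call.
class TypedResponseWriterSketch {
  private final ResponseResultBuilder builder;

  TypedResponseWriterSketch(final ResponseResultBuilder builder) {
    this.builder = builder;
  }

  void writeRejectionOnCommand(final long commandKey, final String reason) {
    builder.withResponse(commandKey, "REJECTION", reason);
  }

  void writeEventOnCommand(final long commandKey, final String eventType) {
    builder.withResponse(commandKey, "EVENT", eventType);
  }
}
```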
For some unknown reason we need to set the hasResponse flag back to true. This should not be necessary, but it is: setting it to true by default eliminates ~168 failed tests. Since the direct access will be replaced with a buffered one anyway, I think there is no point in investigating this further.

Finally, the error induced in StreamProcessorHealthTest must be moved to
a different method to reflect the way the code flow works now.
@pihme pihme requested a review from npepinpe July 21, 2022 10:18
deploymentDistributionRecord.setPartition(partitionId);
commandWriter.reset();
commandWriter.appendFollowUpCommand(
    key, DeploymentDistributionIntent.COMPLETE, deploymentDistributionRecord);

final long pos = commandWriter.flush();
if (pos < 0) {
  scheduleService.runDelayed(Duration.ofMillis(100), this);

Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation

Invoking ProcessingScheduleService.runDelayed should be avoided because it has been deprecated.
@github-actions

github-actions bot commented Jul 21, 2022

Unit Test Results

793 files (+1), 793 suites (+1), 1h 47m 50s ⏱️ (+3m 42s)
6 321 tests (+120): 6 312 ✔️ (+120), 9 💤 (±0), 0 failed (±0)
6 490 runs (+120): 6 481 ✔️ (+120), 9 💤 (±0), 0 failed (±0)

Results for commit 5d26a75. ± Comparison against base commit 358dc73.



@remcowesterhoud remcowesterhoud left a comment


Thanks for the small commits, that made this a lot easier to review 🙇

I'll be honest: I didn't understand everything that's been changed, probably because I'm not that deep into the abstraction topic. I have added some questions and some minor suggestions.

Overall I didn't see any strange stuff and I like the look of the abstraction so far, so great job 🚀


@npepinpe npepinpe left a comment


👍 It's great to see everything coming together/into shape!

There was only one issue, which is very minor, about a potentially unused field. But nothing that requires a second review I think.

  • 💭 As I wrote, not sure about the whole mutex and if it does anything useful, but happy to see how it helps us in practice.
  • 💭 It's a bit out of scope, but now that we have an expandable buffer in the result builder, could a large synchronous multi-instance collection cause us to run out of memory by writing tons of events? I guess it's not really all that different from before, as we were staging them in memory anyway, right? So just for my peace of mind, the situation is more or less the same as before, correct?


private final ErrorRecord errorRecord = new ErrorRecord();

private final ProcessingResultBuilderMutex resultBuilderMutex =


❓ What is the goal of the mutex? To prevent callers from caching and using a result builder, or to ensure that callers always get the latest request-scoped builder? If it's the latter, then all good. If it's the former, I'm not sure it's the best mechanism, as it doesn't really prevent any caching from happening, and I'm not sure what would happen if a processor were to cache it. I don't have much better ideas to be honest, but it feels like unnecessary complexity, as it's not really preventing anything. I guess what's missing for me is that we aren't preventing the builder from being used out of scope, and I'm not sure it'll be obvious when we use it out of scope, or what the behavior will be (retry endlessly? blacklist?).

Contributor Author


The mutex is mainly to ensure that callers get the latest request-scoped builder. You are right that it doesn't prevent the result from being cached.

If you want to make sure the builder is not used out of scope, the platform could pass a proxy to the engine and invalidate the proxy after it goes out of scope. Then, even if the engine were to cache the result, you could detect it.
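Such an invalidating proxy could look roughly like this. All names here are hypothetical and not part of the actual codebase; the sketch only shows the mechanism of failing loudly on out-of-scope use:

```java
// Simplified stand-in for the result builder interface.
interface ResultBuilder {
  void appendRecord(String record);
}

// The platform hands the engine this proxy and invalidates it once the
// record's scope ends; any later use throws instead of silently writing
// into the wrong result.
class InvalidatingResultBuilderProxy implements ResultBuilder {
  private ResultBuilder delegate;

  InvalidatingResultBuilderProxy(final ResultBuilder delegate) {
    this.delegate = delegate;
  }

  // Called by the platform when the current record's processing ends.
  void invalidate() {
    delegate = null;
  }

  @Override
  public void appendRecord(final String record) {
    if (delegate == null) {
      throw new IllegalStateException("result builder used out of scope");
    }
    delegate.appendRecord(record);
  }
}
```

Even if the engine caches the proxy, the first out-of-scope call surfaces immediately as an exception, which makes the misuse detectable.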

@pihme

pihme commented Jul 22, 2022

  • It's a bit out of scope, but now that we have an expandable buffer in the result builder, could a large synchronous multi-instance collection cause us to run out of memory by writing tons of events? I guess it's not really all that different from before, as we were staging them in memory anyway, right? So just for my peace of mind, the situation is more or less the same as before, correct?

The expandable buffer is not in use yet; we are still using the direct access, so it is exactly as before.

@pihme

pihme commented Jul 22, 2022

bors merge

@zeebe-bors-camunda

Build succeeded:

@zeebe-bors-camunda zeebe-bors-camunda bot merged commit 6efb06b into main Jul 22, 2022
@zeebe-bors-camunda zeebe-bors-camunda bot deleted the 9725-shape-legacy-code-into-new-interfaces-good-bits-part4 branch July 22, 2022 09:18
zeebe-bors-camunda bot added a commit that referenced this pull request Jul 22, 2022
9852: Move stream processor classes into the stream processor package r=pihme a=pihme

## Description

- Move classes that are only used by stream processor into the corresponding package
- There were no other changes apart from moving the classes

## Related issues


related to #9725 

**Merge only after** #9853  (this PR can easily be regenerated; the other one cannot)



Co-authored-by: pihme <pihme@users.noreply.github.com>