[EPIC]: Engine Abstraction #9600

Closed
69 of 71 tasks
Zelldon opened this issue Jun 24, 2022 · 0 comments · Fixed by #10979
Zelldon commented Jun 24, 2022

Why?

Our team has grown to a size that makes it hard to manage and to stay focused (e.g. in discussions). We will split our development team into two separate teams, so that each can focus on certain topics/issues and keep the team and the discussions around important topics manageable.

The idea is the following:

The Zeebe Distributed Platform (ZDP) team provides an abstraction layer around how records are stored/persisted, read, and written. The Zeebe Platform Automation (ZPA) team is in charge of processing the records and updating their state. The idea is that both teams can then fully concentrate on their own topics, e.g. ZPA on the actual business logic.

In order to support this split, we need better abstractions between certain layers/modules/parts of the system. One of them is the engine.

Todo:

Design / Finding phase

Implementation:

(Attached diagram: issue order)

I marked in blue the path that is necessary to satisfy #9420. This means we need to have the error handling in the engine implementation in order to achieve this.

  • Remove LogStream writers from Engine #9724

    • In order to work on this in parallel, we copy the TypedStreamWriterImpl to a new class and make certain changes
      • Copy the content from TypedStreamWriterImpl to a new class
      • Remove the LogStreamBatch usage from that class; write directly into a pre-claimed buffer
      • Add a new method to the LogStreamBatch
      • Write tests for both
    • Find a good new name for the Writers interface, including Command-, Rejection-, and StateWriter. In the POC [POC]: Engine Abstraction #9602 we called them Builders, because in the end we wanted to build a list of Records. But maybe we can also use a different name here.
    • Create a new Result interface/class which can be returned by the Engine
      • The Result can, for simplicity, return a BufferWriter like we did in the POC [POC]: Engine Abstraction #9602 (we can improve that later), but we can also just return a list of Records if we find a good way to do it
      • Result should be used by Processing and ScheduledTasks to return Results
    • Engine should return the result - ⚠️ here we need the Stream Processing changes first
    • Use the result in the platform; write the records to the LogStreamBatchWriter
    • Bonus: The result contains only the list of the records (instead of the general Buffer or BufferWriter)
    • Bonus 2: The serialization is done in the StreamPlatform when writing to the LogStreamBatchWriter
    • We might need to adjust some tests
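The Result idea described above could look roughly like the following sketch. All names here (`Result`, `RecordEntry`, `BufferedResult`) are hypothetical illustrations, not the actual Zeebe API; the real Result would carry richer record metadata and a RecordValue instead of a plain String.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a single record to be written to the log.
final class RecordEntry {
  final long key;
  final String intent;
  final String value;

  RecordEntry(final long key, final String intent, final String value) {
    this.key = key;
    this.intent = intent;
    this.value = value;
  }
}

// Returned by the Engine to the platform after processing (or a scheduled
// task); the platform then writes the contained records to the
// LogStreamBatchWriter instead of the Engine writing them itself.
interface Result {
  List<RecordEntry> records();
}

final class BufferedResult implements Result {
  private final List<RecordEntry> entries = new ArrayList<>();

  void appendRecord(final long key, final String intent, final String value) {
    entries.add(new RecordEntry(key, intent, value));
  }

  @Override
  public List<RecordEntry> records() {
    return Collections.unmodifiableList(entries);
  }
}
```

With such a shape, the "Bonus" items above amount to returning the list directly and deferring serialization to the StreamPlatform.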
  • Refactor StreamProcessor / Engine #9725

    • Mark the processing explicitly as currently ongoing, via a flag. The currentProcessor field shouldn't be used anymore.
    • Find a good name for the StreamProcessor
    • Rename the current StreamProcessor, e.g. StreamingPlatform (see [POC]: Engine Abstraction #9602)
    • Introduce a new StreamProcessor interface, which covers the reality (this needs to be adjusted during progressing with [EPIC]: Engine Abstraction #9600 )
      • init method
      • replay method
      • process method (initially without returning a Result; it does so only after parts of Remove LogStream writers from Engine #9724 are done)
      • onError method - might be empty at the beginning
      • Implement LifecycleAware?
    • Create a new StreamProcessor implementation called Engine
    • Move some code out from the StreamingPlatform, ReplayStateMachine and ProcessingStateMachine to the Engine, see POC [POC]: Engine Abstraction #9602 and related branch
    • ⚠️ OnError implementation, can be completed after the writer change is done and we can return a Result see Remove LogStream writers from Engine #9724
      • Transaction is rolled back by StreamPlatform, new Transaction is created and Engine is called (onError)
      • Engine should be able to receive the Throwable, Command and current position (?)
      • Based on the input it can return a Result, which will be written to the log, and the transaction is committed again. Here we can use the same path from the ProcessingStateMachine as we do from the normal processing.
    • Bonus: If we create the Engine outside of the StreamingPlatform and give it to the builder, we can inject the dependencies outside and can simplify the StreamProcessor tests
    • Cleanup
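The narrowed StreamProcessor interface sketched in the bullets above might look like this. The method shapes and names are illustrative only; the real Zeebe interfaces differ in their parameter and result types.

```java
// Hypothetical sketch of the new StreamProcessor interface; R stands in for
// the Result type from #9724. Not the actual Zeebe API.
interface RecordProcessor<R> {
  void init(Object processingContext);        // wire dependencies on startup
  void replay(Object typedRecord);            // rebuild state from the log
  R process(Object typedCommand);             // returns a Result once #9724 lands
  R onError(Throwable error, Object command); // may be a no-op at the beginning
}

// The StreamingPlatform drives an implementation (the Engine) through this
// lifecycle: init once, replay until caught up, then process commands.
final class EngineSketch implements RecordProcessor<String> {
  private boolean initialized;

  @Override
  public void init(final Object ctx) {
    initialized = true;
  }

  @Override
  public void replay(final Object record) {
    // apply state changes only; no new records are produced during replay
  }

  @Override
  public String process(final Object command) {
    if (!initialized) {
      throw new IllegalStateException("init must be called first");
    }
    return "result-for-" + command;
  }

  @Override
  public String onError(final Throwable error, final Object command) {
    // the platform has rolled back the transaction and opened a new one;
    // the engine can now produce rejection/error records as a Result
    return "error-result";
  }
}
```

Building the Engine outside the StreamingPlatform (the "Bonus" above) would let tests construct an `EngineSketch` with injected dependencies and call these methods directly.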
  • Introduce ProcessingScheduleService #9730

    • Create a new interface for the ProcessingScheduleService (with narrowed scope)
      • Possibly only two methods, runDelayed and runComplete; take a look at the POC [POC]: Engine Abstraction #9602
      • Ideally we use real futures on runComplete, but this might be something we refactor later
      • The interface contains a way to return some records, which will be written later on
      • Support runAfterCommit Iterate over Side effects #9723 (comment) to replace side effects
    • Implement the interface, and provide certain guarantees; take a look at the POC for more details
      • Guarantee 1: We are executed after processing and after committing the transaction (e.g. if scheduling happened during processing)
      • Guarantee 2: We are not executed in parallel with processing (debatable; depends on the Writers and Result implementation). We removed this again, see Run ScheduledTasks during processing #10355
    • Bonus: We might want to execute multiple delayed tasks in parallel, maybe even in parallel with processing; for that we need to make sure that resources are not shared
      • The above was not easily possible. With Remove the extra submit from ProcessingScheduleServiceImpl #10428 we removed the shared writer and the additional submit, but other than that it was not possible to create a new actor for it.
      • The consumers of the SchedulingService have only read-only access to the database - this was not easily possible
    • Add unit tests for the implementation
    • Before migrating to the new abstraction, migrate the ActorControl#runAtFixedRate consumers to #runDelayed usage; this means after each run the job needs to be scheduled again
    • Integrate the ProcessingScheduler in the ProcessingStateMachine, meaning if a task is executed we can write the result to the log; see ProcessingScheduleService allows to schedule a task which returns a Result #9999
    • Migrate the ActorControl usage step by step
    • Remove the actor control from the ProcessingContext
    • Remove LogStreamWriter from the ProcessingContext
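The narrowed scheduling interface and the runAtFixedRate-to-runDelayed migration described above can be sketched as follows. All names are illustrative assumptions, not the actual Zeebe API; the toy scheduler ignores delays so the re-scheduling pattern can be shown deterministically.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical narrowed ProcessingScheduleService: only delayed scheduling.
interface ScheduleService {
  void runDelayed(Duration delay, Runnable task);
}

// Toy deterministic implementation: ignores the delay and drains queued tasks
// in order. Enough to demonstrate the re-scheduling pattern below.
final class ImmediateScheduleService implements ScheduleService {
  private final Deque<Runnable> queue = new ArrayDeque<>();
  private boolean draining;

  @Override
  public void runDelayed(final Duration delay, final Runnable task) {
    queue.add(task);
    if (!draining) {
      draining = true;
      while (!queue.isEmpty()) {
        queue.poll().run();
      }
      draining = false;
    }
  }
}

// Instead of ActorControl#runAtFixedRate, the task schedules itself again
// after each run - exactly the migration step described above.
final class RecurringTask {
  private final ScheduleService scheduler;
  private final int maxRuns;
  private int runs;

  RecurringTask(final ScheduleService scheduler, final int maxRuns) {
    this.scheduler = scheduler;
    this.maxRuns = maxRuns;
  }

  void start() {
    scheduler.runDelayed(Duration.ofSeconds(1), this::run);
  }

  private void run() {
    runs++;
    if (runs < maxRuns) {
      // re-schedule after each run instead of relying on a fixed-rate timer
      scheduler.runDelayed(Duration.ofSeconds(1), this::run);
    }
  }

  int runs() {
    return runs;
  }
}
```

Guarantee 1 above would then be provided by the real implementation deferring the drain until after the processing transaction is committed, rather than draining immediately as the toy version does.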
  • Iterate over Side effects #9723

    • Clarify the use case, i.e. the question of whether we still want to support executing side effects only after successful processing; see Engine abstraction - Investigate side effects #9686
    • If yes, we might want to improve the current side effects to make them less fragile than they are right now (if a side effect is added, it is likely the previous one is overwritten). We may want to support a queue
    • If no, remove the side effects and send requests/responses directly
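The "queue" option from the bullet above could be sketched like this. `SideEffectQueue` is a hypothetical name for illustration; the point is that adding an effect appends rather than overwrites, and effects only run after a successful commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: instead of a single side-effect slot that later
// registrations overwrite, effects accumulate in order and are flushed only
// after the processing transaction commits. Not the actual Zeebe API.
final class SideEffectQueue {
  private final List<Runnable> sideEffects = new ArrayList<>();

  // Adding never overwrites previously registered effects.
  void add(final Runnable sideEffect) {
    sideEffects.add(sideEffect);
  }

  // Called by the platform after a successful commit; runs effects in order.
  void flush() {
    for (final Runnable effect : sideEffects) {
      effect.run();
    }
    sideEffects.clear();
  }

  // Called when processing fails and the transaction is rolled back.
  void discard() {
    sideEffects.clear();
  }
}
```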
  • Clean up StreamingPlatform / Engine #9727

  • Update downstream repos

    • Zeebe process test
    • Eze
@Zelldon Zelldon added the kind/epic Categorizes an issue as an umbrella issue (e.g. OKR) which references other, smaller issues label Jun 24, 2022
@Zelldon Zelldon self-assigned this Jun 24, 2022
@Zelldon Zelldon added this to the 8.1 milestone Jul 7, 2022
@pihme pihme removed their assignment Aug 16, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Aug 23, 2022
10118: Add new RecordBatch classes and interfaces r=Zelldon a=Zelldon

## Description

With #9600, and especially with #9724, we wanted to create a way to return results (records) from processing and from scheduled tasks, without the need to use the LogStreamWriters. This should allow us to reduce the dependency on the log stream and, in general, make the test setup easier.

We made a first attempt with the [BufferedProcessingResultBuilder](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/streamprocessor/BufferedProcessingResultBuilder.java), which only contains a buffer into which the needed record details are written. This was also described as a solution in issue #9724.

During the discussion of the proposal, the idea was already raised that it would be nicer to have a list of records returned by the Processing- and TaskResult. This was added as a bonus in #9724.

Today I thought longer about the `BufferedProcessingResultBuilder`, how we could test it, and how to make returning the processing result nicer in general. Checking the code of the result builder and [LogStreamBatchWriterImpl.java](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java), I realized that we do not really need to copy the record details into a big buffer (as we do [here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L163-L195)). We could create a "struct" (some would call it a Java record, but to avoid confusion I call it a struct here) of the record details which we want to write. Most parts of the struct are primitives (like the enums intent, valueType, etc.), and the RecordValue would be the only thing we need to copy into a buffer. This would allow better debuggability (e.g. we can see during processing what is already part of the result) and removes the need to mess around with offsets in the implementation. We can keep references to these "structs" in a list and iterate over it in the StreamProcessor to write the record details, similar to what we [do in the LogStreamBatchWriter here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L240).
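The "struct" idea above could be sketched as a Java record. The field names and types here are illustrative assumptions (a String stands in for the buffered RecordValue), not the actual Zeebe RecordBatchEntry:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: mostly primitive metadata plus the record value,
// collected in a list instead of copied into one big buffer with offsets.
record RecordBatchEntry(
    long key, int sourceIndex, String intent, String valueType, String recordValue) {}

final class RecordBatch {
  private final List<RecordBatchEntry> entries = new ArrayList<>();

  void appendEntry(final RecordBatchEntry entry) {
    entries.add(entry);
  }

  // The StreamProcessor can iterate over the entries to write each one to the
  // log, similar to what the LogStreamBatchWriter does today.
  List<RecordBatchEntry> entries() {
    return List.copyOf(entries);
  }
}
```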

One interesting part: after writing the classes and tests, I also realized that the batch entry interface could return the `UnifiedRecordValue`, in order not to have to mess again with buffers (BufferWriters or BufferReaders) and to make the engine tests easier to implement. For example, if we split the modules we could write tests without the stream processor, take the records from the processing result, and feed them directly back into the engine 🤯

So this PR adds new interfaces and a first implementation of the so-called RecordBatch and RecordBatchEntry, which should later be used in the Processing- and TaskResult. Since the CommandWriter needs similar information as the LogStreamWriter, we can reuse the RecordBatch both for the records to write and for the response, which is part of the ProcessingResult. No other code was touched/modified.

Right now the interfaces are kept to a minimum, and so is the documentation. I guess we might need to iterate over them later, but I think it is a good step forward.

Feel free to propose other names for the classes and interfaces; nothing is set in stone. #namingishard


## Related issues


related to #10001



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>