Remove LogStream writers from Engine #9724

Zelldon · 2022-07-07T06:43:34Z

Description

Part of #9600

In the current state the TypedProcessors (which might become the Engine later) and other entities, like JobTimeoutTrigger, DeploymentDistributor etc. have knowledge about how to write Records to the LogStream abstraction. Ideally, they shouldn't care about that detail. In the end, it would be great if the Engine can get something in and produce something out. For that we have to do some pre-work like removing the actual LogStreamBatchWriters usage and reduce related interfaces.

In the POC #9602 we split up the implementation of the TypedStreamWriterImpl so that it just writes into a wrapped buffer. This allowed to pre-claim that buffer in the start, and initialize the Engine with that Writer. This means we can reduce the dependency to the LogStream (helps when we split the Engine and the StreamPlatform). Later we renamed the Writers to something like Builder, but this is discussable.

See Writers in #9602 (comment)

Todo:

The text was updated successfully, but these errors were encountered:

Zelldon · 2022-07-08T11:26:02Z

There was a comment regarding the naming by @saig0 maybe we consider this here #9602 (comment)

Zelldon · 2022-07-08T11:27:55Z

There was a comment by @npepinpe regarding to the serialization and the Result class:

#9602 (comment)
Should the engine implementation actually serialize the outputs?
Pros: frictionless, no need to interact with the distributed, easier to extend later if we add different processors.
Cons: more complicated for testing, embedding, possibly need to implement own integrity checks, schema evolution, not reusing the expertise of the distributed team regarding persistence.
I personally lean with Peter on having "record in -> record(s) out" instead of dealing with (de)serialization.

10057: refactor(engine): merge two legacy interfaces together r=pihme a=pihme ## Description * `LegacyTypedStreamWriter` already extended `LegacyTypedCommandWriter` * This PR replaces all references to `LegacyTypedCommandWriter` with `LegacyTypedStreamWriter` * It then merges the two interfaces and deleted `LegacyTypedCommandWriter` * The net effect of this, is that we only need to replace one interface going forward ## Related issues relates to #9724 Co-authored-by: pihme <pihme@users.noreply.github.com>

10076: Engine abstraction remove legacy writer r=pihme a=pihme ## Description - Introduces a new task that can be scheduled by the engine and that can return records to be written - Replaces all references to `LegacyTypedStreamWriter`outside of stream processor with scheduling of the new tasks - The overall result is that the engine no longer depends on any writer that writes directly to the stream. Stream processor is now in full control of when and how records shall be written (Heureka!) ## Related issues related to #9724, #9730, #9725 Co-authored-by: pihme <pihme@users.noreply.github.com>

10076: Engine abstraction remove legacy writer r=Zelldon a=pihme ## Description - Introduces a new task that can be scheduled by the engine and that can return records to be written - Replaces all references to `LegacyTypedStreamWriter`outside of stream processor with scheduling of the new tasks - The overall result is that the engine no longer depends on any writer that writes directly to the stream. Stream processor is now in full control of when and how records shall be written (Heureka!) ## Related issues related to #9724, #9730, #9725 Co-authored-by: pihme <pihme@users.noreply.github.com> Co-authored-by: Christopher Zell <zelldon91@googlemail.com>

10118: Add new RecordBatch classes and interfaces r=Zelldon a=Zelldon ## Description With the #9600 and especially with #9724 we wanted to create a way to return results (records) on processing and on scheduled Tasks, without the need to use the LogStreamWriters. This should allow us to reduce the dependency on the log stream and in general, make the test setup more easier. We did the first attempt with the [BufferedProcessingResultBuilder](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/streamprocessor/BufferedProcessingResultBuilder.java), which only contains a buffer to write into the needed record details. This was also described as a solution in the issue #9724. During the discussion of the proposal, there was already the idea raised that it would be nicer to have a List of records returned by the Processing- and TaskResult. This was added as a Bonus in #9724. Today I thought longer about the `BufferedProcessingResultBuilder` and how we could test it and in general make it nicer to return the processing result. Checking the code of the result builder and [LogStreamBatchWriterImpl.java](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java), I realized that we do not really need to copy the record details into a big buffer (as we do it [here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L163-L195)). We could create a "struct" (some would call it a java record but to avoid confusion I call it struct now) of the record details which we want to write. Most parts of the struct are primitives (like enums intent, valueType etc.) and the RecordValue would be only thing we would need to copy into a buffer. This would allow better debuggability (e.g. we can see during processing what is already part of the result) and no need to mess around with offsets in the implementation. We can keep references to these "structs" in a list and iterate over it in the StreamProcessor to write the record details, similar to what we [do in the LogStreamBatchWriter here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L240). One interesting part after writing the classes and tests, I also realized that the batch entry interface could return the `UnifiedRecordValue` in order to not have to mess again with buffers (BufferWriters or BufferReaders) and to make the engine tests easier to implement. For example, if we would like to split the modules we could write tests without the stream processor and take the records from the processing result and give it directly again to the engine 🤯 So this PR adds new interfaces and first implementation for so called RecordBatch and RecordBatchEntry, which should be later used in the Processing- and TaskResult. Since the CommandWriter needs similar information as the LogStreamWriter, we can reuse the RecordBatch for both the records to write and for the response, which is part of the Processing Result. No other code was touched/modified. Right now the interfaces are kept to a minimum and documentation as well. I guess we might need to iterate over them later again, but I think it is a good step forward. Feel free to propose another names for the classes and interfaces, nothing is set into stone. #namingishard  ## Related issues  related to #10001 Co-authored-by: Christopher Zell <zelldon91@googlemail.com>

10163: Use RecordBatch to write follow up records r=Zelldon a=Zelldon ## Description In order to determine the maximum record batch size, we use a new method on the batch writer. This method allows us to check whether we would be able to write a specific event count and batch size to the writer. During processing a RecordBatch is built up in the ProcessingResultBuilder and given as ImmutableRecordBatch inside the ProcessingResult to the PrcoessingStateMachine. Here we tried to not change the interfaces, which causes us to cast on certain places, this is similar to how it currently is done in the LegacyWriter. The RecordBatch is consumed by the ProcessingStateMachine in order to write the records. A follow-up PR will clean up other parts, like the unused writer in the ProcessingResult, etc. Some StreamProcessor tests relied on the writer usage, which has been disabled/removed for now. We will soon rewrite them.  ## Related issues  related #9724 related #10001 Co-authored-by: Christopher Zell <zelldon91@googlemail.com>

Zelldon added kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. team/distributed area/maintainability Marks an issue as improving the maintainability of the project labels Jul 7, 2022

This was referenced Jul 7, 2022

[EPIC]: Engine Abstraction #9600

Closed

Refactor StreamProcessor / Engine #9725

Closed

Clean up StreamingPlatform / Engine #9727

Closed

Zelldon mentioned this issue Jul 8, 2022

Introduce ProcessingScheduleService #9730

Closed

17 tasks

menski removed team/distributed labels Jul 11, 2022

pihme self-assigned this Jul 11, 2022

menski unassigned pihme Jul 25, 2022

Zelldon self-assigned this Aug 4, 2022

Zelldon mentioned this issue Aug 4, 2022

Iterate over BufferedProcessingResultBuilder #10001

Closed

6 tasks

pihme mentioned this issue Aug 11, 2022

refactor(engine): merge two legacy interfaces together #10057

Merged

10 tasks

pihme mentioned this issue Aug 15, 2022

Engine abstraction remove legacy writer #10076

Merged

10 tasks

Zelldon mentioned this issue Aug 18, 2022

Add new RecordBatch classes and interfaces #10118

Merged

10 tasks

Zelldon mentioned this issue Aug 24, 2022

Use RecordBatch to write follow up records #10163

Merged

10 tasks

Zelldon mentioned this issue Aug 26, 2022

Remove legacy writers #10203

Merged

10 tasks

zeebe-bors-camunda bot closed this as completed in bbd0e5d Aug 29, 2022

deepthidevaki added the release/8.1.0-alpha5 label Sep 6, 2022

Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove LogStream writers from Engine #9724

Remove LogStream writers from Engine #9724

Zelldon commented Jul 7, 2022 •

edited

Zelldon commented Jul 8, 2022

Zelldon commented Jul 8, 2022

Remove LogStream writers from Engine #9724

Remove LogStream writers from Engine #9724

Comments

Zelldon commented Jul 7, 2022 • edited

Zelldon commented Jul 8, 2022

Zelldon commented Jul 8, 2022

Zelldon commented Jul 7, 2022 •

edited