Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterate over BufferedProcessingResultBuilder #10001

Closed
6 tasks
Tracked by #9724
Zelldon opened this issue Aug 4, 2022 · 1 comment · Fixed by #10196
Closed
6 tasks
Tracked by #9724

Iterate over BufferedProcessingResultBuilder #10001

Zelldon opened this issue Aug 4, 2022 · 1 comment · Fixed by #10196
Assignees
Labels
kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@Zelldon
Copy link
Member

Zelldon commented Aug 4, 2022

Description

We introduced the BufferedProcessingResultBuilder to have a standalone ProcessungResultBuilder, without many dependencies. This should be ideally used in the Engine and Processors or SchedulingService consumers which need to create a Result.

What is missing:

  • supporting command responses
  • creating the real result
  • tests for the BufferedStreamWriter
  • tests for the new builder
  • Use in engine
  • use in ScheduleService consumers

See #10001 (comment)

related to #9724

@Zelldon Zelldon added the kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. label Aug 4, 2022
@Zelldon Zelldon self-assigned this Aug 18, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Aug 23, 2022
10118: Add new RecordBatch classes and interfaces r=Zelldon a=Zelldon

## Description

With the #9600 and especially with #9724 we wanted to create a way to return results (records) on processing and on scheduled Tasks, without the need to use the LogStreamWriters. This should allow us to reduce the dependency on the log stream and in general, make the test setup more easier. 

We did the first attempt with the [BufferedProcessingResultBuilder](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/streamprocessor/BufferedProcessingResultBuilder.java), which only contains a buffer to write into the needed record details. This was also described as a solution in the issue #9724. 

During the discussion of the proposal, there was already the idea raised that it would be nicer to have a List of records returned by the Processing- and TaskResult. This was added as a Bonus in #9724.

Today I thought longer about the `BufferedProcessingResultBuilder` and how we could test it and in general make it nicer to return the processing result. Checking the code of the result builder and [LogStreamBatchWriterImpl.java](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java), I realized that we do not really need to copy the record details into a big buffer (as we do it [here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L163-L195)). We could create a "struct" (some would call it a java record but to avoid confusion I call it struct now) of the record details which we want to write. Most parts of the struct are primitives (like enums intent, valueType etc.) and the RecordValue would be only thing we would need to copy into a buffer. This would allow better debuggability (e.g. we can see during processing what is already part of the result) and no need to mess around with offsets in the implementation. We can keep references to these "structs" in a list and iterate over it in the StreamProcessor to write the record details, similar to what we [do in the LogStreamBatchWriter here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L240). 

One interesting part after writing the classes and tests, I also realized that the batch entry interface could return the `UnifiedRecordValue` in order to not have to mess again with buffers (BufferWriters or BufferReaders) and to make the engine tests easier to implement. For example, if we would like to split the modules we could write tests without the stream processor and take the records from the processing result and give it directly again to the engine 🤯

So this PR adds new interfaces and first implementation for so called RecordBatch and RecordBatchEntry, which should be later used in the Processing- and TaskResult. Since the CommandWriter needs similar information as the LogStreamWriter, we can reuse the RecordBatch for both the records to write and for the response, which is part of the Processing Result. No other code was touched/modified.

Right now the interfaces are kept to a minimum and documentation as well. I guess we might need to iterate over them later again, but I think it is a good step forward. 

Feel free to propose another names for the classes and interfaces, nothing is set into stone. #namingishard

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related to #10001



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 23, 2022
10118: Add new RecordBatch classes and interfaces r=Zelldon a=Zelldon

## Description

With the #9600 and especially with #9724 we wanted to create a way to return results (records) on processing and on scheduled Tasks, without the need to use the LogStreamWriters. This should allow us to reduce the dependency on the log stream and in general, make the test setup more easier. 

We did the first attempt with the [BufferedProcessingResultBuilder](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/streamprocessor/BufferedProcessingResultBuilder.java), which only contains a buffer to write into the needed record details. This was also described as a solution in the issue #9724. 

During the discussion of the proposal, there was already the idea raised that it would be nicer to have a List of records returned by the Processing- and TaskResult. This was added as a Bonus in #9724.

Today I thought longer about the `BufferedProcessingResultBuilder` and how we could test it and in general make it nicer to return the processing result. Checking the code of the result builder and [LogStreamBatchWriterImpl.java](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java), I realized that we do not really need to copy the record details into a big buffer (as we do it [here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L163-L195)). We could create a "struct" (some would call it a java record but to avoid confusion I call it struct now) of the record details which we want to write. Most parts of the struct are primitives (like enums intent, valueType etc.) and the RecordValue would be only thing we would need to copy into a buffer. This would allow better debuggability (e.g. we can see during processing what is already part of the result) and no need to mess around with offsets in the implementation. We can keep references to these "structs" in a list and iterate over it in the StreamProcessor to write the record details, similar to what we [do in the LogStreamBatchWriter here](https://github.com/camunda/zeebe/blob/main/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStreamBatchWriterImpl.java#L240). 

One interesting part after writing the classes and tests, I also realized that the batch entry interface could return the `UnifiedRecordValue` in order to not have to mess again with buffers (BufferWriters or BufferReaders) and to make the engine tests easier to implement. For example, if we would like to split the modules we could write tests without the stream processor and take the records from the processing result and give it directly again to the engine 🤯

So this PR adds new interfaces and first implementation for so called RecordBatch and RecordBatchEntry, which should be later used in the Processing- and TaskResult. Since the CommandWriter needs similar information as the LogStreamWriter, we can reuse the RecordBatch for both the records to write and for the response, which is part of the Processing Result. No other code was touched/modified.

Right now the interfaces are kept to a minimum and documentation as well. I guess we might need to iterate over them later again, but I think it is a good step forward. 

Feel free to propose another names for the classes and interfaces, nothing is set into stone. #namingishard

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related to #10001



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
@Zelldon
Copy link
Member Author

Zelldon commented Aug 24, 2022

We have a replacement for the BufferedProcessingResultBuilder approach, with the new RecordBatch #10118

Usage of RecordBatch for the ProcessingResult https://github.com/camunda/zeebe/pull/10163/checks?check_run_id=7994456256

I will close this issue after we have done the following:

  • Use the RecordBatch in the TaskResults
  • Use RecordBatch for the CommandResponses
  • Delete the old BufferedProcessingResultBuilder

@Zelldon Zelldon mentioned this issue Aug 24, 2022
10 tasks
zeebe-bors-camunda bot added a commit that referenced this issue Aug 25, 2022
10171: Small Clean up r=Zelldon a=Zelldon

## Description

* Remove unused BufferedProcessingResultBuilder, see #10001
* Remove some unused fields

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related to #10001



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 25, 2022
10163: Use RecordBatch to write follow up records r=Zelldon a=Zelldon

## Description

In order to determine the maximum record batch size, we use a new method on the batch writer. This method allows us to check whether we would be able to write a specific event count and batch size to the writer. 

During processing a RecordBatch is built up in the ProcessingResultBuilder and given as ImmutableRecordBatch inside the ProcessingResult to the PrcoessingStateMachine. Here we tried to not change the interfaces, which causes us to cast on certain places, this is similar to how it currently is done in the LegacyWriter.

The RecordBatch is consumed by the ProcessingStateMachine in order to write the records. A follow-up PR will clean up other parts, like the unused writer in the ProcessingResult, etc.

Some StreamProcessor tests relied on the writer usage, which has been disabled/removed for now. We will soon rewrite them.


<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related #9724
related #10001



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 26, 2022
10188: Use Recordbatch with TaskResult r=Zelldon a=Zelldon

## Description


Prework was to create an inverse static EventRegistry -> TypeRegistry to reuse this in the different result builders. This allows getting the value type for a certain record class.

The RecordBatch is now part of the TaskResult. The RecordBatch is filled with the TaskResultBuilder and returned with
the TaskResult. The RecordBatch is written to the Dispatcher in the ProcessingScheduleService.

After migrating to the RecordBatch we were able to delete `writeRecordsToStream`, which further reduce the dependencies. For example Backup module no longer depends on logstream.

Next step: Migrate the response writing to record batch, this allows to remove further dependencies.

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related #10001 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 26, 2022
10181: Add canWriteRecord method to RecordBatch r=Zelldon a=Zelldon

## Description

This method can be used to verify whether a record length could potentially be appended to the batch, this is useful if you don't want to append the record right now, like if you add more and more data and always check whether it still fits. This is used in the JobBatchCollector for example.

Removed the maxMessageLength, since this was only used in one place and I felt is somehow breaks the abstraction, especially it would block us to really remove the write from the implementation.
<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related #10001 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 26, 2022
10191: Handle append error differently based on ResultBuilder r=Zelldon a=Zelldon

## Description

:x: Blocked by #10188 

This PR adds Either to the RecordBatch as return value. This allows to handle the error case differently. For example in the Processing right now we want to throw an exception, but later this can also handled differently which is why we added here a separate Method on the ProcessingResultBuilder. This is discussable whether we already want that.

What we need and want is to being able to handle failed appends in the consumers of the ProcessingScheduleService. Means at the TaskResultBuilder we want to now whether it was successful or not and just stop to append more records. This is slightly similar to how we did it before. 

This fixes a critical bug #10147

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #10147
related #10001 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants