Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ProcessingScheduleService #9730

Closed
17 tasks done
Zelldon opened this issue Jul 7, 2022 · 5 comments
Closed
17 tasks done

Introduce ProcessingScheduleService #9730

Zelldon opened this issue Jul 7, 2022 · 5 comments
Assignees
Labels
kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha4 version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@Zelldon
Copy link
Member

Zelldon commented Jul 7, 2022

Description

Part of #9600

Create an abstraction around the processing actor, since we want to remove the dependency and knowledge about the current execution model from the engine. This will simplify the tests as we can use a simple implementation for the execution and it should also help us to make further changes to the actor scheduler etc.

See SchedulingService in #9602 (comment)

Todo:

  • Create a new interface for the ProcessingScheduleService (with narrowed scope)
    • Possibily only two methods, runDelayed and runComplete take a look at the POC [POC]: Engine Abstraction #9602
    • Ideally we use real futures on the runComplete, but might be something we refactor later
    • The interface contains a way to return some records, which will be written later on
    • Support runAfterCommit Iterate over Side effects #9723 (comment) to replace side effects
  • Implement the interface, and provide certain guarantees, take a look at POC for more details
    • Guarantee 1: We are executed after processing and committing transaction (e.g. if scheduling happened during processing)
    • Guarantee 2: We are not executed in parallel with processing (discussable, depends on the Writers and Result implementation) We removed this again Run ScheduledTasks during processing #10355
  • Bonus: We might want to execute multiple delayed tasks on parallel maybe even in parallel during processing, for that we need to make sure that resources are not shared
    • The above was not easily possible with Remove the extra submit from ProcessingScheduleServiceImpl #10428 we removed the shared writer, and the addition submit, but other than that it was not possible to create a new actor for it.
    • The consumer of the SchedulingService have only ReadOnly access to the database This was not easily possible
  • Add unit tests for the implementation
  • Before migrating to the new abstraction, migrate the ActorCOntrol#runAtFixedRate consumers to the #runDelayed usage, this means after each run the job needs to be scheduled again
  • Integrate the ProcessingScheduler in the ProcessingStateMachine, means if task is executed we can write the result to the log ProcessingScheduleService allows to schedule a task which returns a Result #9999
  • Migrate step by step the actorControl usage
  • Remove the actor control from the ProcessingContext
  • Remove LogStreamWriter from the ProcessingContext
@Zelldon Zelldon added kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. team/distributed labels Jul 7, 2022
@Zelldon Zelldon mentioned this issue Jul 7, 2022
71 tasks
@Zelldon
Copy link
Member Author

Zelldon commented Jul 8, 2022

Question raised from @npepinpe after the POC:

#9602 (comment)
Abstracting actor scheduler sounds good. What is the minimal interface? Can we get rid of runUntilDone and the likes, and just go for two capabilities: defer execution (e.g. run or submit) and schedule (e.g. runDelayed)? That would greatly simplify things. The caller can take care of runUntilDone by implementing their own state machine which ensures no other tasks can be completed in between. However if we want to keep the queuing mechanism, then this is not possible. Do we need it, though?

I think this is a bit unrelated to the topic and can be discussed with the PDT regarding the actor scheduler changes. The ProcessingScheduleService will have a really limited interface, which you can see above.

@Zelldon Zelldon mentioned this issue Jul 8, 2022
3 tasks
@pihme pihme self-assigned this Jul 8, 2022
@npepinpe
Copy link
Member

npepinpe commented Jul 8, 2022

Do you mean this one?

interface SchedulingService {
  Collection<Records> schedule(Duration d, Runnable r);
}

@Zelldon
Copy link
Member Author

Zelldon commented Jul 8, 2022

No sorry. For example in the POC it looked like this: 176cfab#diff-f27601ec63a23e86b4a992a9cd4ac54192d81fbea9da126a8edeaa3b20123427R483-R487

public interface ProcessingSchedulingService {

   void runWithDelay(Duration duration, Supplier<ProcessingResult> scheduledProcessing);
 }

We need to pass in an Supplier otherwise it doesn't work, the method itself can't return this.

It depends now on #9724 how the result looks like, but this will be then the return of the Supplier or function or what ever is passed in.

@pihme
Copy link
Contributor

pihme commented Jul 13, 2022

This comment came up during a discussion: #9756 (review)

Regarding timers, we do return a cancellable ScheduledTimer, which will prevent the execution of the timer. I didn't notice it being used during cancellation. I know we need to do both (since if the timer was added to the actor task queue before it's cancelled, it will be executed regardless), but it may avoid unnecessary runs. If we don't want to expand the API with another interface just yet, we could return something like a CloseableSilently for now. It does mean keeping track of the current timer however

For now, I didn't follow that suggestion and want to see whether we can do without it.

zeebe-bors-camunda bot added a commit that referenced this issue Jul 13, 2022
9756: Hide scheduling behind interface r=pihme a=pihme

## Description

In terms of the scope defined in #9730 this implements the following:

- [x] Create a new interface for the ProcessingScheduleService (with narrowed scope)
  - [x] Possibily only two methods, runDelayed and runComplete take a look at the POC #9602 
- [x] Implement the interface
- [x] Before migrating to the new abstraction, migrate the ActorCOntrol#runAtFixedRate consumers to the #runDelayed usage, this means after each run the job needs to be scheduled again
- [x] Migrate step by step the actorControl usage
- [x] Remove the actor control from the ProcessingContext

## Related issues

related to #9730

## Review Hints
* This is not the final form of the scheduling interface, instead the focus of this PR is to hide all the dependencies behind an interface first
* The change is not a pure refactoring. So there is a residual risk that the behavior is different in subtle ways (which is why I would like to have a review by both of you)
* The new code sometimes (indirectly) calls different methods on the `ActorControl`. Therefore there might be differences in the way tasks are scheduled (top of queue/bottom of queue; fast lane or not). The intention of the change was to simplify the interface that is available to the engine. In this regard some subtle changes are unavoidable
* Part of the simplification is also that the engine does not have access to something like a `ScheduledTimer`. Therefore, the engine is unable to cancel tasks which have been scheduled
* `RunAtFixedRate` has been replaced by tasks that reschedule themselves after they are called
* There is a difference in the way exceptions are propagated. See commit message a406d3f for one such example
* Other than that, the tests pass and I just started a QA run, so let's see 🤞 



Co-authored-by: pihme <pihme@users.noreply.github.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 16, 2022
10076: Engine abstraction remove legacy writer r=pihme a=pihme

## Description

- Introduces a new task that can be scheduled by the engine and that can return records to be written
- Replaces all references to `LegacyTypedStreamWriter`outside of stream processor with scheduling of the new tasks
- The overall result is that the engine no longer depends on any writer that writes directly to the stream. Stream processor is now in full control of when and how records shall be written (Heureka!)

## Related issues

related to #9724, #9730, #9725



Co-authored-by: pihme <pihme@users.noreply.github.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 16, 2022
10076: Engine abstraction remove legacy writer r=pihme a=pihme

## Description

- Introduces a new task that can be scheduled by the engine and that can return records to be written
- Replaces all references to `LegacyTypedStreamWriter`outside of stream processor with scheduling of the new tasks
- The overall result is that the engine no longer depends on any writer that writes directly to the stream. Stream processor is now in full control of when and how records shall be written (Heureka!)

## Related issues

related to #9724, #9730, #9725



Co-authored-by: pihme <pihme@users.noreply.github.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 17, 2022
10076: Engine abstraction remove legacy writer r=Zelldon a=pihme

## Description

- Introduces a new task that can be scheduled by the engine and that can return records to be written
- Replaces all references to `LegacyTypedStreamWriter`outside of stream processor with scheduling of the new tasks
- The overall result is that the engine no longer depends on any writer that writes directly to the stream. Stream processor is now in full control of when and how records shall be written (Heureka!)

## Related issues

related to #9724, #9730, #9725



Co-authored-by: pihme <pihme@users.noreply.github.com>
Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
@Zelldon Zelldon self-assigned this Aug 26, 2022
@deepthidevaki deepthidevaki added release/8.1.0-alpha5 version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 and removed release/8.1.0-alpha5 labels Sep 6, 2022
@Zelldon
Copy link
Member Author

Zelldon commented Sep 22, 2022

With #10428 we completed this part of #9600

@Zelldon Zelldon closed this as completed Sep 22, 2022
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha4 version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0
Projects
None yet
Development

No branches or pull requests

5 participants