Actuators are cumbersome to maintain and extend #9996

oleschoenburg · 2022-08-04T14:11:06Z

Description

We have multiple open issues to extend or improve the actuators:

Implementing actuators can be very cumbersome due to the need for SpringGatewayBridge and SpringBrokerBridge which leads to code duplication (see RebalancingService).

There are other issues, for example the /rebalance actuator only works on a standalone gateway and on brokers with an embedded gateway. There's no technical reason it shouldn't work on standalone brokers without the embedded gateway, the only reason it doesn't work currently is that no one builds a BrokerClient.

I think we should spend some time to improve this. I've tried out a couple of approaches but they require a decent amount of refactoring.

My approach would be this:

Let the dist module be in charge of creating Broker , Gateway , BrokerClient , AtomixCluster and ActorScheduler. For StandaloneGateway we are doing most of that already but not for StandaloneBroker.
Actuators in dist have direct access to a BrokerClient because it can be injected by Spring, which removes the need for the spring bridges and code duplication.

related to #8845

The text was updated successfully, but these errors were encountered:

9989: Create engine outside r=Zelldon a=Zelldon ## Description Create the engine outside of the StreamProcessor, which should allow to also inject other RecordProcessors. Furthermore we can fill the Engine with the needed TypedRecordProcessorFactory, which means we no longer need that in our builders and contexts. I hope this helps you in make further progress `@deepthidevaki` Sidenote: This also simplifies the StreamProcessor/RecordProcessor tests, since we do not necessarily need to set up so much anymore.  ## Related issues  related to #9725 10012: refactor: let `dist` build the actor scheduler for gateway r=oleschoenburg a=oleschoenburg ## Description This moves the responsibility of creating an actor scheduler for the gateway to the `dist` module. As an added benefit, this allows us to remove some redundant constructors, simplify BrokerClient to never start the actor scheduler and a deprecated method from `BrokerStartupContext`. ## Related issues  relates to #9996 Co-authored-by: Christopher Zell <zelldon91@googlemail.com> Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>

10007: Add OCI & OpenShift labels to Docker image r=npepinpe a=npepinpe ## Description This PR adds labels to the Docker image following the OCI and OpenShift specs. This includes modifications to the build process to inject the few values which are dynamic, namely: - the created at ISO 8601 timestamp - the commit SHA (or revision) of the artifact - the semantic version of the artifact You can find the specs here: - https://github.com/opencontainers/image-spec/blob/main/annotations.md - https://docs.openshift.com/container-platform/4.10/openshift_images/create-images.html#images-create-metadata_create-images On top of adding labels, this modifies the `Dockerfile` a bit and pins the production image to specific sha of the base image, ensuring reproducible builds. In the future we should update this sha when need be, and update the golden file (`docker/test/docker-labels.golden.json`). This also adds `hadolint` to lint our Dockerfile and applies some of the recommendations to it. A new code quality job is added, `Docker checks`, which runs hadolint and verifies that the labels are as expected. The verification is done via a bash script which grabs the labels from a `docker inspect`, and compares it with an interpolated golden file (since we have a few dynamic values). The comparison is done using `diff`, so the output should be familiar to most. ## Related issues related to #9940 blocks #10013 10012: refactor: let `dist` build the actor scheduler for gateway r=oleschoenburg a=oleschoenburg ## Description This moves the responsibility of creating an actor scheduler for the gateway to the `dist` module. As an added benefit, this allows us to remove some redundant constructors, simplify BrokerClient to never start the actor scheduler and a deprecated method from `BrokerStartupContext`. ## Related issues  relates to #9996 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com> Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>

10032: refactor: let `dist` build gateways broker client r=oleschoenburg a=oleschoenburg ## Description Builds the broker client in dist, making it available for injection. Because `BrokerClient` needs to schedule actors, the actor scheduler must be started before. To achieve this, `BrokerClient` gets a `start` method that does the scheduling. ## Related issues  relates to #9996 Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>

npepinpe · 2022-08-17T13:43:15Z

Semi related: #10101

npepinpe · 2022-08-31T13:23:55Z

Should I take over here?

npepinpe · 2022-09-07T17:06:26Z

So looking into it, what I see is the following needs to be done to enable the broker client to be used in the broker actuator:

the actor scheduler is created in the dist module at the StandaloneBroker level, and injected into the broker as is - see Create and configure scheduler in distribution module #10290
the Atomix cluster is created in the dist module at the StandaloneBroker level, and injected into the broker as is - Allow AtomixCluster to be autowired everywhere #10354
the broker client must be changed to accept a request timeout instead of the complete GatewayCfg; this will avoid having to pass around the embedded gateway config which may or may not be initialized, and decouple the client a little from the gateway in general

I'm a bit reticent to do more changes, as the final destination isn't clear to me. Where do we want to end up with this?

I have some ideas while I'm working on this which fit with what it's in our team vision document, and I'll expand on them after this is done (or during), but won't include much of this in this issue as we should focus on the hot backups first and foremost.

10290: Create and configure scheduler in distribution module r=npepinpe a=npepinpe ## Description As we want to use certain parts of the system in the actuators, we need to pull out of the broker and up into the distribution the creation of such instances. For this PR, we're moving the `ActorScheduler`' configuration and creation into dist, and instead injecting it directly in the broker at construction time. There is some duplication as of now with the gateway's `ActorSchedulerComponent`, but I'd rather not deal with this now until we have a clearer picture of where we want to end up. ## Related issues related to #9996 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>

npepinpe · 2022-09-14T08:05:20Z

Working on the broker side, it's clear that doing this will introduce confusion between the broker and the standalone broker. Since we would now create some components outside of the Broker (and broker module), you would expect them to also not be configured by the broker, yet they are still configured via the BrokerCfg. Why is this an issue? Our test rules expect to configure everything in the broker via BrokerCfg, but now that you inject the cluster into the Broker, you have to either:

Have some factory which can take a BrokerCfg and produce a cluster, living in the broker module which is used in production only in the dist module, and in tests can be used by the rules we have
Duplicate the cluster creation code, such that it lives both in the dist module and in the broker tests
Move the configuration out of the broker module (which is what I would expect) and have the cluster services 100% injectable, with controlled implementations for the broker tests

The latter is the correct way, I think, but much more effort, and I'm not super keen on doing this right now due to time constraints 🤷

npepinpe · 2022-09-14T14:55:59Z

God this second PR was painful...

10354: Allow AtomixCluster to be autowired everywhere r=npepinpe a=npepinpe ## Description This PR extracts building the `AtomixCluster` from the broker into the `dist` module. Unfortunately, because many of our testing utilities break all kinds of abstraction, and our tests still heavily rely on modifying the cluster via the `BrokerCfg`, mapping the `BrokerCfg` to the `AtomixCluster` is still done in the `broker` module. However, everything is prepared to eventually move this out to the dist module itself, such that our configuration would simply instantiate the Atomix `ClusterConfig` type, and later inject a `AtomixCluster` in the broker. See the `StandaloneGateway` as an example; there is plenty of opportunity to DRY it up once both are simply configuring `ClusterConfig` directly, and it will remove intermediate layers of builders/config. In order to achieve that at the moment, since the network configuration relies on being initialized with the right broker base path, I also had to pull up the working directory into its own configuration, such that it can be autowired as well. This is a little more abstraction than I'd like, but it's due to how the `BrokerCfg` is still the central point for all the components, even those not specific to the broker. The broker configuration is thus not initialized in the system context or broker anymore, but is given to the broker already initialized. This is done via the Spring configuration `BrokerConfiguration`. It uses two beans to do this, which I dislike, but I'm not sure what's the best way to do this. I generally dislike that the configuration must be initialized anyway, it's very brittle and easy to forget, so I'd rather figure a way to avoid this in the future. I would propose a follow up which removes this, and instead enforces that we create the broker config either passing the base already, or using a factory method which does this. I can also do it here, but it will increase how much changes are here. I think in the future the cluster services should be started before the broker itself, but we can figure this out later. For now, it means that if the actuator wants to make use of them before the broker finishes starting, it will fail - this is something we can handle semi-gracefully, so I would leave it as is. ## Related issues related to #9996 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>

oleschoenburg mentioned this issue Aug 4, 2022

Add Pause exporting endpoint in gateway #9630

Closed

menski mentioned this issue Aug 5, 2022

Pause individual partitions separately #9961

Open

oleschoenburg self-assigned this Aug 5, 2022

oleschoenburg mentioned this issue Aug 5, 2022

refactor: let dist build the actor scheduler for gateway #10012

Merged

10 tasks

oleschoenburg mentioned this issue Aug 9, 2022

refactor: let dist build gateways broker client #10032

Merged

10 tasks

oleschoenburg assigned npepinpe and unassigned oleschoenburg Aug 31, 2022

deepthidevaki added the version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 label Sep 6, 2022

npepinpe mentioned this issue Sep 7, 2022

Create and configure scheduler in distribution module #10290

Merged

10 tasks

This was referenced Sep 14, 2022

Allow AtomixCluster to be autowired everywhere #10354

Merged

Minimize usage of uninitialized BrokerCfg objects #10356

Closed

Allow reuse of shared actuators depending on BrokerClient #10357

Merged

zeebe-bors-camunda bot closed this as completed in 9dbf225 Sep 15, 2022

Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Actuators are cumbersome to maintain and extend #9996

Actuators are cumbersome to maintain and extend #9996

oleschoenburg commented Aug 4, 2022 •

edited

npepinpe commented Aug 17, 2022

npepinpe commented Aug 31, 2022

npepinpe commented Sep 7, 2022 •

edited

npepinpe commented Sep 14, 2022

npepinpe commented Sep 14, 2022

Actuators are cumbersome to maintain and extend #9996

Actuators are cumbersome to maintain and extend #9996

Comments

oleschoenburg commented Aug 4, 2022 • edited

npepinpe commented Aug 17, 2022

npepinpe commented Aug 31, 2022

npepinpe commented Sep 7, 2022 • edited

npepinpe commented Sep 14, 2022

npepinpe commented Sep 14, 2022

oleschoenburg commented Aug 4, 2022 •

edited

npepinpe commented Sep 7, 2022 •

edited