Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Actuators are cumbersome to maintain and extend #9996

Closed
2 tasks
oleschoenburg opened this issue Aug 4, 2022 · 5 comments · Fixed by #10357
Closed
2 tasks

Actuators are cumbersome to maintain and extend #9996

oleschoenburg opened this issue Aug 4, 2022 · 5 comments · Fixed by #10357
Assignees
Labels
area/maintainability Marks an issue as improving the maintainability of the project area/observability Marks an issue as observability related area/ux Marks an issue as related to improving the user experience kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@oleschoenburg
Copy link
Member

oleschoenburg commented Aug 4, 2022

Description

We have multiple open issues to extend or improve the actuators:

Implementing actuators can be very cumbersome due to the need for SpringGatewayBridge and SpringBrokerBridge which leads to code duplication (see RebalancingService).

There are other issues, for example the /rebalance actuator only works on a standalone gateway and on brokers with an embedded gateway. There's no technical reason it shouldn't work on standalone brokers without the embedded gateway, the only reason it doesn't work currently is that no one builds a BrokerClient.

I think we should spend some time to improve this. I've tried out a couple of approaches but they require a decent amount of refactoring.

My approach would be this:

  • Let the dist module be in charge of creating Broker , Gateway , BrokerClient , AtomixCluster and ActorScheduler. For StandaloneGateway we are doing most of that already but not for StandaloneBroker.
  • Actuators in dist have direct access to a BrokerClient because it can be injected by Spring, which removes the need for the spring bridges and code duplication.

related to #8845

@oleschoenburg oleschoenburg added kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. area/ux Marks an issue as related to improving the user experience area/observability Marks an issue as observability related area/maintainability Marks an issue as improving the maintainability of the project labels Aug 4, 2022
@oleschoenburg oleschoenburg self-assigned this Aug 5, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Aug 5, 2022
9989: Create engine outside r=Zelldon a=Zelldon

## Description

Create the engine outside of the StreamProcessor, which should allow to also inject other RecordProcessors.
Furthermore we can fill the Engine with the needed TypedRecordProcessorFactory, which means we no longer need that in our builders and contexts.

I hope this helps you in make further progress `@deepthidevaki` 


Sidenote:

This also simplifies the StreamProcessor/RecordProcessor tests, since we do not necessarily need to set up so much anymore.
<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related to #9725



10012: refactor: let `dist` build the actor scheduler for gateway r=oleschoenburg a=oleschoenburg

## Description

This moves the responsibility of creating an actor scheduler for the gateway to the `dist` module. As an added benefit, this allows us to remove some redundant constructors, simplify BrokerClient to never start the actor scheduler and a deprecated method from `BrokerStartupContext`.

## Related issues

<!-- Which issues are closed by this PR or are related -->

relates to #9996 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 5, 2022
10007: Add OCI & OpenShift labels to Docker image r=npepinpe a=npepinpe

## Description

This PR adds labels to the Docker image following the OCI and OpenShift specs. This includes modifications to the build process to inject the few values which are dynamic, namely:
  - the created at ISO 8601 timestamp
  - the commit SHA (or revision) of the artifact
  - the semantic version of the artifact

You can find the specs here:

- https://github.com/opencontainers/image-spec/blob/main/annotations.md
- https://docs.openshift.com/container-platform/4.10/openshift_images/create-images.html#images-create-metadata_create-images

On top of adding labels, this modifies the `Dockerfile` a bit and pins the production image to specific sha of the base image, ensuring reproducible builds. In the future we should update this sha when need be, and update the golden file (`docker/test/docker-labels.golden.json`).

This also adds `hadolint` to lint our Dockerfile and applies some of the recommendations to it. A new code quality job is added, `Docker checks`, which runs hadolint and verifies that the labels are as expected. The verification is done via a bash script which grabs the labels from a `docker inspect`, and compares it with an interpolated golden file (since we have a few dynamic values). The comparison is done using `diff`, so the output should be familiar to most.

## Related issues

related to #9940 
blocks #10013 



10012: refactor: let `dist` build the actor scheduler for gateway r=oleschoenburg a=oleschoenburg

## Description

This moves the responsibility of creating an actor scheduler for the gateway to the `dist` module. As an added benefit, this allows us to remove some redundant constructors, simplify BrokerClient to never start the actor scheduler and a deprecated method from `BrokerStartupContext`.

## Related issues

<!-- Which issues are closed by this PR or are related -->

relates to #9996 



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 15, 2022
10032: refactor: let `dist` build gateways broker client r=oleschoenburg a=oleschoenburg

## Description

Builds the broker client in dist, making it available for injection. Because `BrokerClient` needs to schedule actors, the actor scheduler must be started before. To achieve this, `BrokerClient` gets a `start` method that does the scheduling.

## Related issues

<!-- Which issues are closed by this PR or are related -->

relates to #9996 



Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@npepinpe
Copy link
Member

Semi related: #10101

@npepinpe
Copy link
Member

Should I take over here?

@deepthidevaki deepthidevaki added the version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 label Sep 6, 2022
@npepinpe
Copy link
Member

npepinpe commented Sep 7, 2022

So looking into it, what I see is the following needs to be done to enable the broker client to be used in the broker actuator:

  • the actor scheduler is created in the dist module at the StandaloneBroker level, and injected into the broker as is - see Create and configure scheduler in distribution module #10290
  • the Atomix cluster is created in the dist module at the StandaloneBroker level, and injected into the broker as is - Allow AtomixCluster to be autowired everywhere #10354
  • the broker client must be changed to accept a request timeout instead of the complete GatewayCfg; this will avoid having to pass around the embedded gateway config which may or may not be initialized, and decouple the client a little from the gateway in general

I'm a bit reticent to do more changes, as the final destination isn't clear to me. Where do we want to end up with this?

I have some ideas while I'm working on this which fit with what it's in our team vision document, and I'll expand on them after this is done (or during), but won't include much of this in this issue as we should focus on the hot backups first and foremost.

zeebe-bors-camunda bot added a commit that referenced this issue Sep 12, 2022
10290: Create and configure scheduler in distribution module r=npepinpe a=npepinpe

## Description

As we want to use certain parts of the system in the actuators, we need to pull out of the broker and up into the distribution the creation of such instances. For this PR, we're moving the `ActorScheduler`' configuration and creation into dist, and instead injecting it directly in the broker at construction time.

There is some duplication as of now with the gateway's `ActorSchedulerComponent`, but I'd rather not deal with this now until we have a clearer picture of where we want to end up.

## Related issues

related to #9996 



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@npepinpe
Copy link
Member

Working on the broker side, it's clear that doing this will introduce confusion between the broker and the standalone broker. Since we would now create some components outside of the Broker (and broker module), you would expect them to also not be configured by the broker, yet they are still configured via the BrokerCfg. Why is this an issue? Our test rules expect to configure everything in the broker via BrokerCfg, but now that you inject the cluster into the Broker, you have to either:

  • Have some factory which can take a BrokerCfg and produce a cluster, living in the broker module which is used in production only in the dist module, and in tests can be used by the rules we have
  • Duplicate the cluster creation code, such that it lives both in the dist module and in the broker tests
  • Move the configuration out of the broker module (which is what I would expect) and have the cluster services 100% injectable, with controlled implementations for the broker tests

The latter is the correct way, I think, but much more effort, and I'm not super keen on doing this right now due to time constraints 🤷

@npepinpe
Copy link
Member

God this second PR was painful...

zeebe-bors-camunda bot added a commit that referenced this issue Sep 15, 2022
10354: Allow AtomixCluster to be autowired everywhere r=npepinpe a=npepinpe

## Description

This PR extracts building the `AtomixCluster` from the broker into the `dist` module. Unfortunately, because many of our testing utilities break all kinds of abstraction, and our tests still heavily rely on modifying the cluster via the `BrokerCfg`, mapping the `BrokerCfg` to the `AtomixCluster` is still done in the `broker` module. However, everything is prepared to eventually move this out to the dist module itself, such that our configuration would simply instantiate the Atomix `ClusterConfig` type, and later inject a `AtomixCluster` in the broker. See the `StandaloneGateway` as an example; there is plenty of opportunity to DRY it up once both are simply configuring `ClusterConfig` directly, and it will remove intermediate layers of builders/config.

In order to achieve that at the moment, since the network configuration relies on being initialized with the right broker base path, I also had to pull up the working directory into its own configuration, such that it can be autowired as well. This is a little more abstraction than I'd like, but it's due to how the `BrokerCfg` is still the central point for all the components, even those not specific to the broker.

The broker configuration is thus not initialized in the system context or broker anymore, but is given to the broker already initialized. This is done via the Spring configuration `BrokerConfiguration`. It uses two beans to do this, which I dislike, but I'm not sure what's the best way to do this. I generally dislike that the configuration must be initialized anyway, it's very brittle and easy to forget, so I'd rather figure a way to avoid this in the future. I would propose a follow up which removes this, and instead enforces that we create the broker config either passing the base already, or using a factory method which does this. I can also do it here, but it will increase how much changes are here.

I think in the future the cluster services should be started before the broker itself, but we can figure this out later. For now, it means that if the actuator wants to make use of them before the broker finishes starting, it will fail - this is something we can handle semi-gracefully, so I would leave it as is.

## Related issues

related to #9996



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/maintainability Marks an issue as improving the maintainability of the project area/observability Marks an issue as observability related area/ux Marks an issue as related to improving the user experience kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. version:8.1.0-alpha5 Marks an issue as being completely or in parts released in 8.1.0-alpha5 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants