Delete existing PENDING_DEPLOYMENT causes ZeebeDbInconsistentException #10064

Closed
korthout opened this issue Aug 15, 2022 · 1 comment · Fixed by #10074
Assignees
korthout

Labels
area/reliability - Marks an issue as related to improving the reliability of our software (i.e. it behaves as expected)
kind/bug - Categorizes an issue or PR as a bug
scope/broker - Marks an issue or PR to appear in the broker section of the changelog
severity/critical - Marks a stop-the-world bug, with a high impact and no existing workaround
version:8.1.0-alpha5 - Marks an issue as being completely or in parts released in 8.1.0-alpha5
version:8.1.0 - Marks an issue as being completely or in parts released in 8.1.0

Comments


korthout commented Aug 15, 2022

Describe the bug

The deployment partition (i.e. partition 1) on benchmark cluster medic-cw-32-be18e23b78-benchmark is DEAD, due to a ZeebeDbInconsistentException.

This cluster was already facing other problems, but those appear unrelated.

The ZeebeDbInconsistentException is thrown when a partition tries to make a state change that is unexpected and would otherwise lead to inconsistent state. The partition is then marked as DEAD to preserve a consistent state.
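
For intuition, here is a minimal stand-in sketch of why a "delete existing" operation fails loudly instead of silently ignoring a missing key. This is not Zeebe's actual TransactionalColumnFamily; the class and exception below are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a column family that refuses unexpected state changes.
final class StrictStore<K, V> {

  private final Map<K, V> entries = new HashMap<>();

  void put(K key, V value) {
    entries.put(key, value);
  }

  // Deletes the entry and asserts that it was actually there. If the key is missing,
  // the state change was unexpected and the store refuses it, which is the same idea
  // as the ZeebeDbInconsistentException in the stack trace below.
  void deleteExisting(K key) {
    if (!entries.containsKey(key)) {
      throw new IllegalStateException(
          "Key " + key + " does not exist; refusing an inconsistent state change");
    }
    entries.remove(key);
  }
}
```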


Looking at the stack trace (below), it's clear that an attempt was made to delete a non-existent pending deployment from the state. This attempt was a direct result of writing a DeploymentDistribution:Completed event.

We check whether a pending deployment exists for the deployment key before writing the COMPLETED event, but we don't check for the specific partition id.
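
For illustration, a rough sketch of the two different checks, assuming a hypothetical pending-deployment state keyed like the DbCompositeKey{DbLong, DbInt} in the stack trace. The class and method names below are illustrative, not Zeebe's actual DbDeploymentState API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical pending-deployment state keyed by (deploymentKey, partitionId).
final class PendingDeployments {

  record PendingKey(long deploymentKey, int partitionId) {}

  private final Set<PendingKey> pending = new HashSet<>();

  void add(long deploymentKey, int partitionId) {
    pending.add(new PendingKey(deploymentKey, partitionId));
  }

  // The check we effectively make today: is anything still pending for this deployment?
  boolean hasAnyPending(long deploymentKey) {
    return pending.stream().anyMatch(k -> k.deploymentKey() == deploymentKey);
  }

  // The check we should make before writing COMPLETED: is THIS partition still pending?
  boolean isPending(long deploymentKey, int partitionId) {
    return pending.contains(new PendingKey(deploymentKey, partitionId));
  }
}
```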

To Reproduce

This may be hard to reproduce; it is potentially caused by a retried deployment distribution.

Expected behavior

We should reject the DeploymentDistribution:Complete command when the specific pending deployment (for the deployment key and the partition id) doesn't exist, instead of writing the DeploymentDistribution:Completed event.

Log/Stacktrace

Full Stacktrace

 io.camunda.zeebe.db.ZeebeDbInconsistentException: Key DbCompositeKey{first=DbLong{2251800016784062}, second=DbInt{2}} in ColumnFamily PENDING_DEPLOYMENT does not exist
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.assertKeyExists(TransactionalColumnFamily.java:289) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.lambda$deleteExisting$12(TransactionalColumnFamily.java:199) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.lambda$ensureInOpenTransaction$17(TransactionalColumnFamily.java:301) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.DefaultTransactionContext.runInTransaction(DefaultTransactionContext.java:33) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.ensureInOpenTransaction(TransactionalColumnFamily.java:300) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.deleteExisting(TransactionalColumnFamily.java:196) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.deployment.DbDeploymentState.removePendingDeploymentDistribution(DbDeploymentState.java:70) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.DeploymentDistributionCompletedApplier.applyState(DeploymentDistributionCompletedApplier.java:27) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.DeploymentDistributionCompletedApplier.applyState(DeploymentDistributionCompletedApplier.java:15) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.EventAppliers.applyState(EventAppliers.java:250) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.streamprocessor.writers.ResultBuilderBackedEventApplyingStateWriter.appendFollowUpEvent(ResultBuilderBackedEventApplyingStateWriter.java:41) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.deployment.distribute.CompleteDeploymentDistributionProcessor.processRecord(CompleteDeploymentDistributionProcessor.java:52) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.streamprocessor.TypedRecordProcessor.processRecord(TypedRecordProcessor.java:24) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.Engine.process(Engine.java:115) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.lambda$processCommand$2(ProcessingStateMachine.java:264) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:84) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.processCommand(ProcessingStateMachine.java:260) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.tryToReadNextRecord(ProcessingStateMachine.java:215) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.readNextRecord(ProcessingStateMachine.java:191) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:92) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:45) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:119) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:198) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT] 

Environment:

Reported errors

korthout added the kind/bug, scope/broker, severity/critical, and area/reliability labels on Aug 15, 2022
korthout self-assigned this on Aug 15, 2022
korthout (Member, Author) commented:

I've marked this as critical because it leaves the deployment partition DEAD, and we should resolve it ASAP.

zeebe-bors-camunda bot added a commit that referenced this issue Aug 15, 2022
10073: Abbreviate Deployment Distribution records in compact logger r=pihme a=korthout

## Description


Many record types are abbreviated in the compact logger, but some of the more common (and also longer) ones aren't. 

See, for example, this output after a failed test:

```
2 C DEPLOYMENT              DISTRIBUTE   - #1->-1 P1K2 - process.xml
3 C DEPLOYMENT              DISTRIBUTE   - #1->-1 P1K2 - process.xml
1 C DEPLOYMENT              CREATE       - #1->-1   -1 - 
1 E PROC                    CREATED      - #2->#1 P1K1 - process.xml -> "shouldR..ecovery" (version:1)
1 E DEPLOYMENT              CREATED      - #3->#1 P1K2 - process.xml
1 E DEPLOYMENT_DISTRIBUTION DISTRIBUTING - #4->#1 P1K2 - DeploymentDistributionRecord {"partitionId":2}
1 E DEPLOYMENT_DISTRIBUTION DISTRIBUTING - #5->#1 P1K2 - DeploymentDistributionRecord {"partitionId":3}
2 C DEPLOYMENT              DISTRIBUTE   - #2->-1 P1K2 - process.xml
3 C DEPLOYMENT              DISTRIBUTE   - #2->-1 P1K2 - process.xml
1 C DEPLOYMENT_DISTRIBUTION COMPLETE     - #6->-1 P1K2 - DeploymentDistributionRecord {"partitionId":2}
1 E DEPLOYMENT_DISTRIBUTION COMPLETED    - #7->#6 P1K2 - DeploymentDistributionRecord {"partitionId":2}
2 E DEPLOYMENT              DISTR        - #3->#1 P1K2 - process.xml
2 E DEPLOYMENT              DISTR        - #4->#2 P1K2 - process.xml
```

After this PR, the same log is output as:
```
2 C DPLY DISTRIBUTE   - #1->-1 P1K2 - process.xml
3 C DPLY DISTRIBUTE   - #1->-1 P1K2 - process.xml
1 C DPLY CREATE       - #1->-1   -1 - 
1 E PROC CREATED      - #2->#1 P1K1 - process.xml -> "shouldR..ecovery" (version:1)
1 E DPLY CREATED      - #3->#1 P1K2 - process.xml
1 E DSTR DISTRIBUTING - #4->#1 P1K2 - on partition 2
1 E DSTR DISTRIBUTING - #5->#1 P1K2 - on partition 3
2 C DPLY DISTRIBUTE   - #2->-1 P1K2 - process.xml
3 C DPLY DISTRIBUTE   - #2->-1 P1K2 - process.xml
1 C DSTR COMPLETE     - #6->-1 P1K2 - on partition 2
1 E DSTR COMPLETED    - #7->#6 P1K2 - on partition 2
2 E DPLY DISTRIBUTED  - #3->#1 P1K2 - process.xml
2 E DPLY DISTRIBUTED  - #4->#2 P1K2 - process.xml
```
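
Conceptually, the abbreviation is just a lookup from a record's value type to a short name. A hedged sketch of the idea follows; the table is illustrative and not the compact logger's actual code.

```java
import java.util.Map;

// Illustrative abbreviation table in the spirit of this change; the real table
// lives in the compact logger and covers many more value types.
final class RecordAbbreviations {

  static final Map<String, String> ABBREVIATIONS =
      Map.of(
          "DEPLOYMENT", "DPLY",
          "DEPLOYMENT_DISTRIBUTION", "DSTR",
          "PROCESS", "PROC");

  // Falls back to the full name when no abbreviation is defined.
  static String shorten(String valueType) {
    return ABBREVIATIONS.getOrDefault(valueType, valueType);
  }
}
```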

## Related issues


relates to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 15, 2022
10072: Document ProcessingScheduleService.runAtFixedRate r=pihme a=korthout

## Description

The ProcessingScheduleService has a runAtFixedRate method that is implemented in a simple way: run the task after a delay, and then the task reschedules itself. But there are some quirks around this that should be clear to the caller.

Specifically, that there is an initial delay before the task is run, **and** that due to the way it reschedules the task, time traveling only affects the currently scheduled delay.
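
A minimal sketch of this "run after a delay, then reschedule yourself" pattern, using a plain ScheduledExecutorService as a hypothetical stand-in for the actual ProcessingScheduleService:

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of runAtFixedRate as delayed self-rescheduling.
final class FixedRateSketch {

  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();

  void runAtFixedRate(Duration delay, Runnable task) {
    // Note the initial delay: the task does NOT run immediately.
    executor.schedule(
        () -> {
          task.run();
          // The next run is only scheduled after the current one finishes, so
          // advancing the clock ("time traveling") can only affect the delay
          // that is currently scheduled, not all future iterations at once.
          runAtFixedRate(delay, task);
        },
        delay.toMillis(),
        TimeUnit.MILLISECONDS);
  }
}
```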

## Related issues


related to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 16, 2022
10075: [Backport stable/8.0] Abbreviate Deployment Distribution records in compact logger r=Zelldon a=backport-action

# Description
Backport of #10073 to `stable/8.0`.

relates to #10064

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 17, 2022
10074: Reject duplicate DeploymentDistribution Complete command r=korthout a=korthout

## Description

There existed a special case that could lead to a ZeebeDbInconsistentException:
- a pending deployment is distributed multiple times to another partition by the [DeploymentRedistributor](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/engine/processing/deployment/distribute/DeploymentRedistributor.java).
- the other partition processes the distribution twice and both times sends a `DeploymentDistribution:Complete` command to the deployment partition (i.e. `partitionId: 1`).
- the deployment partition processes the first complete command, and writes `DeploymentDistribution:Completed` event, which is applied to the state.
- applying the completed event results in deleting the Pending Deployment for that partition.
- when the deployment partition processes the second complete command, there may still be a pending deployment open for another partition of the same deployment.
- in that case, the second command is not rejected, because a pending deployment still exists for the deployment key, so another completed event is written and applied.
- applying fails this time, because the pending deployment for that specific partition no longer exists, and the error happens.

This PR changes that behavior: it makes sure the second command is rejected because the specific pending deployment no longer exists, so the completed event is not written a second time.
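
Roughly, the processing logic moves from "always write COMPLETED" to "reject when the specific pending deployment is gone". Here is a hedged sketch with hypothetical state and writer interfaces; only CompleteDeploymentDistributionProcessor appears in the stack trace, everything else below is illustrative and not the engine's exact API.

```java
// Hypothetical sketch of the rejection path for a duplicate Complete command.
final class CompleteDistributionSketch {

  interface DeploymentState {
    boolean hasPendingDistribution(long deploymentKey, int partitionId);
  }

  interface ResultWriter {
    void writeCompletedEvent(long deploymentKey, int partitionId);
    void writeRejection(long deploymentKey, int partitionId, String reason);
  }

  void process(
      DeploymentState state, ResultWriter writer, long deploymentKey, int partitionId) {
    // Only write COMPLETED if this partition's distribution is still pending
    // for this deployment key; otherwise reject the duplicate command.
    if (!state.hasPendingDistribution(deploymentKey, partitionId)) {
      writer.writeRejection(
          deploymentKey, partitionId, "no pending deployment distribution for this partition");
      return;
    }
    writer.writeCompletedEvent(deploymentKey, partitionId);
  }
}
```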

## Related issues


closes #10064



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 17, 2022
10079: [Backport stable/1.3] Abbreviate Deployment Distribution records in compact logger r=korthout a=korthout

# Description
Manual backport of #10073 to `stable/1.3`.

relates to #10064

I've resolved conflicts in b9beaad

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 18, 2022
10083: Document how to do inter-partition communication r=korthout a=korthout

## Description

During our team meeting, we discussed that we wanted to document how we do (and want to do) inter-partition communication.

This is an attempt to document how we want to do inter-partition communication, and why it should be done this way.

## Related issues


relates to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
Co-authored-by: Nico Korthout <korthout@users.noreply.github.com>
Zelldon added the version:8.1.0 label on Oct 4, 2022