Delete existing PENDING_DEPLOYMENT causes ZeebeDbInconsistentException #10064

Closed
korthout opened this issue Aug 15, 2022 · 1 comment · Fixed by #10074
Assignees
korthout

Labels
area/reliability - Marks an issue as related to improving the reliability of our software (i.e. it behaves as expected)
kind/bug - Categorizes an issue or PR as a bug
scope/broker - Marks an issue or PR to appear in the broker section of the changelog
severity/critical - Marks a stop-the-world bug, with a high impact and no existing workaround
version:8.1.0-alpha5 - Marks an issue as being completely or in parts released in 8.1.0-alpha5
version:8.1.0 - Marks an issue as being completely or in parts released in 8.1.0

Comments


korthout commented Aug 15, 2022

Describe the bug

The deployment partition (i.e. partition 1) on benchmark cluster medic-cw-32-be18e23b78-benchmark is DEAD, due to a ZeebeDbInconsistentException.

This cluster was already facing other problems, but those appear unrelated.

The ZeebeDbInconsistentException is thrown when a partition tries to make a state change that is unexpected and would otherwise lead to inconsistent state. The partition is then marked as DEAD to preserve a consistent state.
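
For intuition, here is a minimal stand-in sketch of why a "delete existing" operation fails loudly instead of silently ignoring a missing key. This is not Zeebe's actual TransactionalColumnFamily; the class and exception below are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a column family that refuses unexpected state changes.
final class StrictStore<K, V> {

  private final Map<K, V> entries = new HashMap<>();

  void put(K key, V value) {
    entries.put(key, value);
  }

  // Deletes the entry and asserts that it was actually there. If the key is missing,
  // the state change was unexpected and the store refuses it, which is the same idea
  // as the ZeebeDbInconsistentException in the stack trace below.
  void deleteExisting(K key) {
    if (!entries.containsKey(key)) {
      throw new IllegalStateException(
          "Key " + key + " does not exist; refusing an inconsistent state change");
    }
    entries.remove(key);
  }
}
```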


Looking at the stack trace (below), it's clear that an attempt was made to delete a non-existent pending deployment from the state. This attempt was a direct result of writing a DeploymentDistribution:Completed event.

We check whether a pending deployment exists for the deployment key before writing the COMPLETED event, but we don't check for the specific partition id.
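
For illustration, a rough sketch of the two different checks, assuming a hypothetical pending-deployment state keyed like the DbCompositeKey{DbLong, DbInt} in the stack trace. The class and method names below are illustrative, not Zeebe's actual DbDeploymentState API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical pending-deployment state keyed by (deploymentKey, partitionId).
final class PendingDeployments {

  record PendingKey(long deploymentKey, int partitionId) {}

  private final Set<PendingKey> pending = new HashSet<>();

  void add(long deploymentKey, int partitionId) {
    pending.add(new PendingKey(deploymentKey, partitionId));
  }

  // The check we effectively make today: is anything still pending for this deployment?
  boolean hasAnyPending(long deploymentKey) {
    return pending.stream().anyMatch(k -> k.deploymentKey() == deploymentKey);
  }

  // The check we should make before writing COMPLETED: is THIS partition still pending?
  boolean isPending(long deploymentKey, int partitionId) {
    return pending.contains(new PendingKey(deploymentKey, partitionId));
  }
}
```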

To Reproduce

This may be hard to reproduce; it is potentially caused by a retried deployment distribution.

Expected behavior

We should reject the DeploymentDistribution:Complete command when the specific pending deployment (for the deployment key and the partition id) doesn't exist, instead of writing the DeploymentDistribution:Completed event.

Log/Stacktrace

Full Stacktrace

 io.camunda.zeebe.db.ZeebeDbInconsistentException: Key DbCompositeKey{first=DbLong{2251800016784062}, second=DbInt{2}} in ColumnFamily PENDING_DEPLOYMENT does not exist
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.assertKeyExists(TransactionalColumnFamily.java:289) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.lambda$deleteExisting$12(TransactionalColumnFamily.java:199) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.lambda$ensureInOpenTransaction$17(TransactionalColumnFamily.java:301) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.DefaultTransactionContext.runInTransaction(DefaultTransactionContext.java:33) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.ensureInOpenTransaction(TransactionalColumnFamily.java:300) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.TransactionalColumnFamily.deleteExisting(TransactionalColumnFamily.java:196) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.deployment.DbDeploymentState.removePendingDeploymentDistribution(DbDeploymentState.java:70) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.DeploymentDistributionCompletedApplier.applyState(DeploymentDistributionCompletedApplier.java:27) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.DeploymentDistributionCompletedApplier.applyState(DeploymentDistributionCompletedApplier.java:15) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.state.appliers.EventAppliers.applyState(EventAppliers.java:250) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.streamprocessor.writers.ResultBuilderBackedEventApplyingStateWriter.appendFollowUpEvent(ResultBuilderBackedEventApplyingStateWriter.java:41) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.deployment.distribute.CompleteDeploymentDistributionProcessor.processRecord(CompleteDeploymentDistributionProcessor.java:52) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.streamprocessor.TypedRecordProcessor.processRecord(TypedRecordProcessor.java:24) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.Engine.process(Engine.java:115) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.lambda$processCommand$2(ProcessingStateMachine.java:264) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:84) ~[zeebe-db-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.processCommand(ProcessingStateMachine.java:260) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.tryToReadNextRecord(ProcessingStateMachine.java:215) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.readNextRecord(ProcessingStateMachine.java:191) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:92) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:45) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:119) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:198) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT] 

Environment:

Reported errors

korthout added the kind/bug, scope/broker, severity/critical, and area/reliability labels on Aug 15, 2022
korthout self-assigned this on Aug 15, 2022
korthout (Member, Author) commented:

I've marked this as critical because it leaves the deployment partition DEAD, and we should resolve it ASAP.

zeebe-bors-camunda bot added a commit that referenced this issue Aug 15, 2022
10073: Abbreviate Deployment Distribution records in compact logger r=pihme a=korthout

## Description


Many record types are abbreviated in the compact logger, but some of the more common (and also longer) ones aren't. 

See, for example, this output after a failed test:

```
2 C DEPLOYMENT              DISTRIBUTE   - #1->-1 P1K2 - process.xml
3 C DEPLOYMENT              DISTRIBUTE   - #1->-1 P1K2 - process.xml
1 C DEPLOYMENT              CREATE       - #1->-1   -1 - 
1 E PROC                    CREATED      - #2->#1 P1K1 - process.xml -> "shouldR..ecovery" (version:1)
1 E DEPLOYMENT              CREATED      - #3->#1 P1K2 - process.xml
1 E DEPLOYMENT_DISTRIBUTION DISTRIBUTING - #4->#1 P1K2 - DeploymentDistributionRecord {"partitionId":2}
1 E DEPLOYMENT_DISTRIBUTION DISTRIBUTING - #5->#1 P1K2 - DeploymentDistributionRecord {"partitionId":3}
2 C DEPLOYMENT              DISTRIBUTE   - #2->-1 P1K2 - process.xml
3 C DEPLOYMENT              DISTRIBUTE   - #2->-1 P1K2 - process.xml
1 C DEPLOYMENT_DISTRIBUTION COMPLETE     - #6->-1 P1K2 - DeploymentDistributionRecord {"partitionId":2}
1 E DEPLOYMENT_DISTRIBUTION COMPLETED    - #7->#6 P1K2 - DeploymentDistributionRecord {"partitionId":2}
2 E DEPLOYMENT              DISTR        - #3->#1 P1K2 - process.xml
2 E DEPLOYMENT              DISTR        - #4->#2 P1K2 - process.xml
```

After this PR, the same log is output as:
```
2 C DPLY DISTRIBUTE   - #1->-1 P1K2 - process.xml
3 C DPLY DISTRIBUTE   - #1->-1 P1K2 - process.xml
1 C DPLY CREATE       - #1->-1   -1 - 
1 E PROC CREATED      - #2->#1 P1K1 - process.xml -> "shouldR..ecovery" (version:1)
1 E DPLY CREATED      - #3->#1 P1K2 - process.xml
1 E DSTR DISTRIBUTING - #4->#1 P1K2 - on partition 2
1 E DSTR DISTRIBUTING - #5->#1 P1K2 - on partition 3
2 C DPLY DISTRIBUTE   - #2->-1 P1K2 - process.xml
3 C DPLY DISTRIBUTE   - #2->-1 P1K2 - process.xml
1 C DSTR COMPLETE     - #6->-1 P1K2 - on partition 2
1 E DSTR COMPLETED    - #7->#6 P1K2 - on partition 2
2 E DPLY DISTRIBUTED  - #3->#1 P1K2 - process.xml
2 E DPLY DISTRIBUTED  - #4->#2 P1K2 - process.xml
```
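
Conceptually, the abbreviation is just a lookup from a record's value type to a short name. A hedged sketch of the idea follows; the table is illustrative and not the compact logger's actual code.

```java
import java.util.Map;

// Illustrative abbreviation table in the spirit of this change; the real table
// lives in the compact logger and covers many more value types.
final class RecordAbbreviations {

  static final Map<String, String> ABBREVIATIONS =
      Map.of(
          "DEPLOYMENT", "DPLY",
          "DEPLOYMENT_DISTRIBUTION", "DSTR",
          "PROCESS", "PROC");

  // Falls back to the full name when no abbreviation is defined.
  static String shorten(String valueType) {
    return ABBREVIATIONS.getOrDefault(valueType, valueType);
  }
}
```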

## Related issues


relates to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 15, 2022
10072: Document ProcessingScheduleService.runAtFixedRate r=pihme a=korthout

## Description

The ProcessingScheduleService has a runAtFixedRate method that is implemented in a simple way: run the task after a delay, and then the task reschedules itself. But there are some quirks around this that should be clear to the caller.

Specifically, that there is an initial delay before the task is run, **and** that due to the way it reschedules the task, time traveling only affects the currently scheduled delay.
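
A minimal sketch of this "run after a delay, then reschedule yourself" pattern, using a plain ScheduledExecutorService as a hypothetical stand-in for the actual ProcessingScheduleService:

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of runAtFixedRate as delayed self-rescheduling.
final class FixedRateSketch {

  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();

  void runAtFixedRate(Duration delay, Runnable task) {
    // Note the initial delay: the task does NOT run immediately.
    executor.schedule(
        () -> {
          task.run();
          // The next run is only scheduled after the current one finishes, so
          // advancing the clock ("time traveling") can only affect the delay
          // that is currently scheduled, not all future iterations at once.
          runAtFixedRate(delay, task);
        },
        delay.toMillis(),
        TimeUnit.MILLISECONDS);
  }
}
```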

## Related issues


related to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 16, 2022
10075: [Backport stable/8.0] Abbreviate Deployment Distribution records in compact logger r=Zelldon a=backport-action

# Description
Backport of #10073 to `stable/8.0`.

relates to #10064

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 17, 2022
10074: Reject duplicate DeploymentDistribution Complete command r=korthout a=korthout

## Description

There existed a special case that could lead to a ZeebeDbInconsistentException:
- a pending deployment is distributed multiple times to another partition by the [DeploymentRedistributor](https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/engine/processing/deployment/distribute/DeploymentRedistributor.java).
- the other partition processes the distribution twice and both times sends a `DeploymentDistribution:Complete` command to the deployment partition (i.e. `partitionId: 1`).
- the deployment partition processes the first complete command, and writes `DeploymentDistribution:Completed` event, which is applied to the state.
- applying the completed event results in deleting the Pending Deployment for that partition.
- when the deployment partition processes the second complete command, there may still be a pending deployment open for another partition of the same deployment.
- in that case, the second command is not rejected, because a pending deployment still exists for the deployment key, so another completed event is written and applied.
- applying fails this time, because the pending deployment for that specific partition no longer exists, and the error happens.

This PR changes that behavior: it makes sure the second command is rejected because the specific pending deployment no longer exists, so the completed event is not written a second time.
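
Roughly, the processing logic moves from "always write COMPLETED" to "reject when the specific pending deployment is gone". Here is a hedged sketch with hypothetical state and writer interfaces; only CompleteDeploymentDistributionProcessor appears in the stack trace, everything else below is illustrative and not the engine's exact API.

```java
// Hypothetical sketch of the rejection path for a duplicate Complete command.
final class CompleteDistributionSketch {

  interface DeploymentState {
    boolean hasPendingDistribution(long deploymentKey, int partitionId);
  }

  interface ResultWriter {
    void writeCompletedEvent(long deploymentKey, int partitionId);
    void writeRejection(long deploymentKey, int partitionId, String reason);
  }

  void process(
      DeploymentState state, ResultWriter writer, long deploymentKey, int partitionId) {
    // Only write COMPLETED if this partition's distribution is still pending
    // for this deployment key; otherwise reject the duplicate command.
    if (!state.hasPendingDistribution(deploymentKey, partitionId)) {
      writer.writeRejection(
          deploymentKey, partitionId, "no pending deployment distribution for this partition");
      return;
    }
    writer.writeCompletedEvent(deploymentKey, partitionId);
  }
}
```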

## Related issues


closes #10064



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 17, 2022
10079: [Backport stable/1.3] Abbreviate Deployment Distribution records in compact logger r=korthout a=korthout

# Description
Manual backport of #10073 to `stable/1.3`.

relates to #10064

I've resolved conflicts in b9beaad

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Aug 18, 2022
10083: Document how to do inter-partition communication r=korthout a=korthout

## Description

During our team meeting, we discussed that we wanted to document how we do (and want to do) inter-partition communication.

This is an attempt to document how we want to do inter-partition communication, and why it should be done this way.

## Related issues


relates to #10064 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
Co-authored-by: Nico Korthout <korthout@users.noreply.github.com>
Zelldon added the version:8.1.0 label on Oct 4, 2022