NPE as MessagePublishProcessor tried to sendCorrelateCommand #9860

Closed
korthout opened this issue Jul 21, 2022 · 2 comments · Fixed by #9966
Labels: kind/bug, version:8.1.0-alpha5, version:8.1.0

korthout (Member) commented Jul 21, 2022

Describe the bug

A NullPointerException (NPE) was thrown while the MessagePublishProcessor tried to send the CORRELATE command. Specifically the flush failed.

To Reproduce

Unknown.

Expected behavior

Correlate command should be sent without an NPE.

Log/Stacktrace

Full logs (internal document)

Full Stacktrace

java.lang.RuntimeException: java.lang.NullPointerException
	at io.camunda.zeebe.streamprocessor.DirectProcessingResult.executePostCommitTasks(DirectProcessingResult.java:69) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.ProcessingStateMachine.lambda$executeSideEffects$11(ProcessingStateMachine.java:442) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.retry.ActorRetryMechanism.run(ActorRetryMechanism.java:36) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.retry.AbortableRetryStrategy.run(AbortableRetryStrategy.java:44) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:99) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:47) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:120) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:198) ~[zeebe-scheduler-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
Caused by: java.lang.NullPointerException
	at java.util.Objects.requireNonNull(Unknown Source) ~[?:?]
	at io.camunda.zeebe.broker.transport.commandapi.CommandResponseWriterImpl.tryWriteResponse(CommandResponseWriterImpl.java:99) ~[zeebe-broker-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.streamprocessor.writers.TypedResponseWriterImpl.flush(TypedResponseWriterImpl.java:114) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.engine.processing.message.MessagePublishProcessor.sendCorrelateCommand(MessagePublishProcessor.java:193) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	at io.camunda.zeebe.streamprocessor.DirectProcessingResult.executePostCommitTasks(DirectProcessingResult.java:67) ~[zeebe-workflow-engine-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT]
	... 9 more 

Environment:

oleschoenburg (Member) commented

Had a short look into this. The NPE originates here:

The responseWriter.flush() looks out of place: no other method in the same class uses the responseWriter. At the same time, the flush can only result in an NPE if something was written to the writer since the last reset:

@Override
public boolean flush() {
  if (isResponseStaged) {
    writer.tryWriteResponse(requestStreamId, requestId);
  }
  return true;
}

remcowesterhoud self-assigned this Aug 2, 2022

remcowesterhoud (Contributor) commented

The cause of this is a change in the ProcessingStateMachine made as part of the engine abstraction topic. The execution of the side effects has been changed to this:

        sideEffectsRetryStrategy.runWithRetry(
            () -> {
              // TODO refactor this into two parallel tasks, which are then combined, and on the
              // completion of which the process continues
              final boolean responseSent =
                  currentProcessingResult.writeResponse(context.getCommandResponseWriter());

              if (!responseSent) {
                return false;
              } else {
                return currentProcessingResult.executePostCommitTasks();
              }
            },
            abortCondition);

The call currentProcessingResult.writeResponse(context.getCommandResponseWriter()); flushes the response writer that is passed to it. This is the same response writer as the one flushed in the MessagePublishProcessor.

When the side effects of the MessagePublishProcessor are executed by the currentProcessingResult.executePostCommitTasks(); call, the processor tries to flush the response writer a second time. Since the first flush has already reset the CommandResponseWriterImpl, the second flush results in this NPE.
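
To illustrate the double flush, here is a minimal, self-contained sketch. It is a simplified model and not the actual Zeebe code: ResettingWriter and TypedWriter are hypothetical stand-ins for CommandResponseWriterImpl and TypedResponseWriterImpl, and the assumption that the underlying writer clears its state after a write while the wrapper keeps its staged flag is taken from the description above.

```java
import java.util.Objects;

class DoubleFlushSketch {

  /** Hypothetical stand-in for CommandResponseWriterImpl: it resets itself after each write. */
  static final class ResettingWriter {
    private String stagedValue; // null means nothing is staged

    void stage(final String value) {
      stagedValue = value;
    }

    boolean tryWriteResponse() {
      // Mirrors the requireNonNull at the top of the stack trace.
      Objects.requireNonNull(stagedValue, "no response staged");
      stagedValue = null; // reset after a successful write
      return true;
    }
  }

  /** Hypothetical stand-in for TypedResponseWriterImpl: it keeps its own staged flag. */
  static final class TypedWriter {
    private final ResettingWriter writer;
    private boolean isResponseStaged;

    TypedWriter(final ResettingWriter writer) {
      this.writer = writer;
    }

    void writeResponse(final String value) {
      writer.stage(value);
      isResponseStaged = true;
    }

    boolean flush() {
      if (isResponseStaged) {
        writer.tryWriteResponse();
      }
      return true;
    }
  }

  /** Returns true if flushing through the wrapper after the shared writer was already flushed throws an NPE. */
  static boolean secondFlushThrows() {
    final ResettingWriter shared = new ResettingWriter();
    final TypedWriter typed = new TypedWriter(shared);
    typed.writeResponse("accepted");

    shared.tryWriteResponse(); // first flush: the state machine writes the response and the writer resets

    try {
      typed.flush(); // second flush: the processor's post-commit task
      return false;
    } catch (final NullPointerException e) {
      return true;
    }
  }

  /** Returns true if flushing a wrapper with nothing staged completes without touching the writer. */
  static boolean flushWithoutStagedResponseIsNoOp() {
    return new TypedWriter(new ResettingWriter()).flush();
  }

  public static void main(final String[] args) {
    System.out.println("second flush throws NPE: " + secondFlushThrows());
    System.out.println("unstaged flush is a no-op: " + flushWithoutStagedResponseIsNoOp());
  }
}
```

In this model, dropping the second flush() call avoids the exception, which corresponds to removing the flush from the processor.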

For now I will resolve this by removing the flush from the MessagePublishProcessor. As the engine abstraction topic makes progress this will be resolved in a different way, by using the ProcessingResultBuilder#withResponse method.

Zelldon added the version:8.1.0 label Oct 4, 2022