Clients can set an optional backoff time to the failed tasks. #5629

aivinog1 · 2020-10-19T13:51:44Z

Is your feature request related to a problem? Please describe.
This is the client's part of #164. The idea is that a client (Java/Go/Zbctl) can specify an optional backoff time when sending a JobFailed command. If a backoff time is set then the job will be available after the backoff time. Otherwise, the job will be available immediately (current behavior).
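The availability rule described above can be sketched in plain Java. This is only an illustration of the requested semantics, not Zeebe's broker code: the class `BackoffSemantics` and the method `availableAt` are hypothetical names, and treating a zero or negative backoff as "no backoff" is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper, not part of Zeebe: models when a failed job
// may be activated again.
final class BackoffSemantics {

  private BackoffSemantics() {}

  // Returns the instant at which the failed job becomes available again.
  // Without a (positive) backoff the job is available immediately,
  // which matches the current behavior described in the issue.
  static Instant availableAt(Instant failedAt, Optional<Duration> retryBackoff) {
    return retryBackoff
        .filter(backoff -> !backoff.isNegative() && !backoff.isZero())
        .map(failedAt::plus)
        .orElse(failedAt);
  }
}
```

For example, a job failed at 05:00:00 with a 30-second backoff would become available at 05:00:30, while the same job failed without a backoff would be available at 05:00:00.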

Describe the solution you'd like
Clients can set an optional backoff time.

Describe alternatives you've considered
We could split this task into three for different clients.

tfnick · 2021-04-30T08:06:18Z

this feature is useful

aivinog1 · 2021-12-19T12:36:58Z

Hey @saig0, @npepinpe! I have a question about the implementation. When I use:

CLIENT_RULE
        .getClient()
        .newFailCommand(jobKey)
        .retries(1)
        .retryBackoff(backoffTimeout)
        .requestTimeout(Duration.ofDays(1))
        .send()
        .join();

in this test, the synchronous call blocks for the whole backoffTimeout duration. I think this happens because we expect a response (a JobRecord for the failure) and wait for it. I don't think this is the right behavior, but maybe I'm wrong 🤔

npepinpe · 2022-02-18T19:04:22Z

👀 I wasn't really involved so far, so I'd have to look into what this means on Monday.

B-hamza · 2022-04-25T15:32:12Z

Oh, it would be cool to set a backoff from the client when failing jobs. Is there any update on this issue? Many thanks.

saig0 · 2022-05-06T03:57:04Z

@aivinog1 do you want to continue working on the topic?
I would like to see the feature in version 8.1 (Q3 2022). If you don't want to or can't work on it, I would assign it to someone else.

aivinog1 · 2022-05-06T10:09:20Z

Hey @saig0!
Sure, I'm happy to work on it soon :) But can you help me with this question? :)

saig0 · 2022-05-09T04:49:44Z

@aivinog1 I'm sorry for overlooking your question 🙈

Do you have your changes on a branch? Please share a link. I can't reproduce the behavior otherwise.

In general, the request waits until the response is received (when using .send().join()). But the processor should also send the response if a backoff is set.
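The point about `.send().join()` can be illustrated without the Zeebe client. The sketch below is an assumption-laden stand-in: `JoinBlockingSketch`, `send`, and `blockedMillis` are hypothetical names, and the "broker" is simulated with a delayed `CompletableFuture`. It only shows that `join()` blocks the caller for exactly as long as the response is withheld, so a response that is held back until the backoff elapses would explain the blocking observed in the test.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a client command; not the Zeebe client API.
final class JoinBlockingSketch {

  private JoinBlockingSketch() {}

  // Simulates a broker that sends its response after responseDelayMillis.
  static CompletableFuture<String> send(long responseDelayMillis) {
    return CompletableFuture.supplyAsync(
        () -> "FAILED",
        CompletableFuture.delayedExecutor(responseDelayMillis, TimeUnit.MILLISECONDS));
  }

  // Returns how long join() blocked the calling thread, in milliseconds.
  static long blockedMillis(long responseDelayMillis) {
    long start = System.nanoTime();
    send(responseDelayMillis).join();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }
}
```

If the response is sent right away, `blockedMillis` stays close to zero regardless of the backoff; if the response is only sent once the backoff is over, the caller waits for the full duration.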

aivinog1 · 2022-05-14T06:45:13Z

@saig0 Here is the test that is failing: aivinog1@78f67ef#diff-ef4fccf13ed0e32bc54cf53556fcbeb151f2c6636988f33bc7b254e0c81a4aceR78

aivinog1 · 2022-05-14T10:16:13Z

@saig0 I think I found the problem. I need to flush the responseWriter somehow. I tried adding responseWriter.flush(); like this, and it works but breaks other cases 🤔:

17:05:17.324 [Broker-0-StreamProcessor-1] [Broker-0-zb-actors-1] ERROR io.camunda.zeebe.processor - Expected to execute side effects for record 'LoggedEvent [type=0, version=0, streamId=1, position=5, key=-1, timestamp=1652522717321, sourceEventPosition=-1] RecordMetadata{recordType=COMMAND, intentValue=255, intent=CREATE, requestStreamId=1, requestId=1, protocolVersion=3, valueType=PROCESS_INSTANCE_CREATION, rejectionType=NULL_VAL, rejectionReason=, brokerVersion=8.1.0}' successfully, but exception was thrown.
java.lang.NullPointerException: null
	at java.util.Objects.requireNonNull(Objects.java:208) ~[?:?]
	at io.camunda.zeebe.broker.transport.commandapi.CommandResponseWriterImpl.tryWriteResponse(CommandResponseWriterImpl.java:99) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.writers.TypedResponseWriterImpl.flush(TypedResponseWriterImpl.java:114) ~[classes/:?]
	at io.camunda.zeebe.util.retry.ActorRetryMechanism.run(ActorRetryMechanism.java:36) ~[classes/:?]
	at io.camunda.zeebe.util.retry.AbortableRetryStrategy.run(AbortableRetryStrategy.java:44) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:79) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:44) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:103) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:83) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:195) ~[classes/:?]

Also, I think it works without retryBackoff because the retries happen so fast that we quickly activate the jobs again and flush them.
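One plausible reading of the stack trace is that flush was called while no response had been staged, so a null check inside the writer failed. The sketch below only illustrates that failure mode and one way to guard against it. It is not the fix from the thread and does not mirror Zeebe's classes: `BufferedResponseWriterSketch` and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical buffered writer; names do not correspond to Zeebe classes.
final class BufferedResponseWriterSketch {
  private final List<String> sent = new ArrayList<>();
  private String staged; // null until a response has been staged

  // Stages a response that the next flush will send.
  void stage(String response) {
    staged = Objects.requireNonNull(response, "response");
  }

  // Unconditional flush: throws NullPointerException if nothing is staged.
  void flushUnchecked() {
    sent.add(Objects.requireNonNull(staged));
    staged = null;
  }

  // Guarded flush: does nothing when no response is staged.
  // Returns true only if a response was actually sent.
  boolean flush() {
    if (staged == null) {
      return false;
    }
    sent.add(staged);
    staged = null;
    return true;
  }

  // Returns an immutable copy of the responses sent so far.
  List<String> sent() {
    return List.copyOf(sent);
  }
}
```

With a guard like this, commands that never stage a response would pass through flush untouched, while a failed job with a backoff could still have its response sent immediately.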

saig0 · 2022-05-17T05:21:38Z

@aivinog1 I see. Thank you for sharing 👍

I was able to fix the behavior. See my commit here: a7195f7

9389: Implement a job backoff timeout on Zeebe Java client side r=saig0 a=aivinog1

## Description

I implemented the usage of the backoff parameter in the Zeebe Java Client and `@saig0` fixed the problem with flushing responses.

## Related issues

#5629

Co-authored-by: Alexey Vinogradov <vinogradov.a.i.93@gmail.com>
Co-authored-by: Philipp Ossler <philipp.ossler@gmail.com>

9417: feat(client-go): add support for backoff timeout for failed jobs in the Go client and zbctl r=pihme a=aivinog1

## Description

I've added mapping in the Go client and an additional flag that matches job backoff.

## Related issues

closes #5629

Co-authored-by: Alexey Vinogradov <vinogradov.a.i.93@gmail.com>

aivinog1 added the kind/feature label Oct 19, 2020

github-actions bot added the Status: Needs Triage label Oct 19, 2020

saig0 added scope/clients-java Scope: clients/go and removed Status: Needs Triage labels Oct 23, 2020

npepinpe added Status: Backlog and removed Status: Planned labels Jan 12, 2021

npepinpe removed the Status: Backlog label May 6, 2021

jwulf mentioned this issue Dec 14, 2021

Support retry backoff in Zeebe 1.3 camunda-community-hub/zeebe-client-node-js#248

Closed

saig0 mentioned this issue Feb 4, 2022

Implement the circuit breaker pattern #8735

Open

npepinpe mentioned this issue Feb 18, 2022

Exponential backoff strategy for job retries #5171

Closed

npepinpe added the team/process-automation label Apr 25, 2022

saig0 mentioned this issue May 6, 2022

I can set a retry timeout for failed tasks #164

Closed

This was referenced May 17, 2022

Implement a job backoff timeout on Zeebe Java client side #9389

Merged

feat(client-go): add support for backoff timeout for failed jobs in the Go client and zbctl #9417

Merged

ghost closed this as completed in 3ff54bf Jun 2, 2022

remcowesterhoud added the version:8.1.0-alpha2 label Jun 7, 2022

ChrisKujawa added the version:8.1.0 label Oct 4, 2022

pierre-yves-monnet mentioned this issue Oct 21, 2022

Adminstration: set up the back off time worker per worker camunda-community-hub/zeebe-cherry-runtime#54

Open

aanodin mentioned this issue Jan 9, 2023

Allow to set RetryBackOff parameter for FailJobCommand camunda-community-hub/zeebe-client-csharp#481

Closed

This issue was closed.
