Unlock operations get timeout exception when member dies #15328

frapana · 2019-07-18T08:22:10Z

Like #13551, when a client tries to release a lock that is held by an unresponsive member, it gets an OperationTimeoutException and the lock is not released.

Example (hazelcast.operation.call.timeout.millis was set to 300000, Hazelcast 3.11):

com.hazelcast.core.OperationTimeoutException: UnlockOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2019-07-11 19:02:48.049. Start time: 2019-07-11 18:52:48.046. Total elapsed time: 600004 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2019-07-11 18:52:01.505. Invocation{op=com.hazelcast.concurrent.lock.operations.UnlockOperation{serviceName='hz:impl:lockService', identityHash=2128122754, partitionId=149, replicaIndex=0, callId=-390858, invocationTime=1562871168046 (2019-07-11 18:52:48.046), waitTimeout=-1, callTimeout=300000, namespace=InternalLockNamespace{service='hz:impl:lockService', objectName=triggerAwakeJobEvent}, threadId=1903}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=300000, firstInvocationTimeMs=1562871168046, firstInvocationTime='2019-07-11 18:52:48.046', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 00:00:00.000', target=[10.232.4.135]:56935, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=Connection[id=9, /10.232.4.134:56935->/10.232.4.135:54348, endpoint=[10.232.4.135]:56935, alive=true, type=MEMBER]}
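To make the timings in the trace easier to read, here is a small sketch that recomputes the intervals from the timestamps printed above. It uses only the JDK; the class and method names are made up for illustration and are not part of Hazelcast. The computed wait can differ by a millisecond or so from the "Total elapsed time" the exception reports.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not a Hazelcast class: recomputes intervals from the trace timestamps.
public class TraceTimings {
    // Timestamp layout used in the exception message.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Milliseconds elapsed between two timestamps in the trace's format. */
    public static long millisBetween(String from, String to) {
        return Duration.between(LocalDateTime.parse(from, FMT),
                                LocalDateTime.parse(to, FMT)).toMillis();
    }

    public static void main(String[] args) {
        // Values copied from the exception message.
        String lastMemberHeartbeat = "2019-07-11 18:52:01.505";
        String start = "2019-07-11 18:52:48.046";
        String current = "2019-07-11 19:02:48.049";
        long callTimeoutMillis = 300_000L;

        long waited = millisBetween(start, current);
        long silentBeforeStart = millisBetween(lastMemberHeartbeat, start);

        System.out.println("waited (ms): " + waited);
        System.out.println("waited / callTimeout: " + (double) waited / callTimeoutMillis);
        System.out.println("member silent before the call started (ms): " + silentBeforeStart);
    }
}
```

Read this way, the unlock call waited for roughly twice the configured call timeout, and the target member had already stopped sending heartbeats about 46 seconds before the call was made.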


mmedenjak · 2019-11-06T15:00:16Z

Hi @frapana !

True, if the member running the operation isn't responsive, we can only log that the operation could not be completed. You might want to profile and monitor the member to find out why it is unresponsive. Or, if you expect such pauses, you might want to increase the heartbeat timeout by raising com.hazelcast.spi.properties.GroupProperty#OPERATION_CALL_TIMEOUT_MILLIS (the hazelcast.operation.call.timeout.millis property).
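A minimal sketch of raising that timeout, assuming the property is supplied as a JVM system property before the Hazelcast instance is created (it can equally be passed as -Dhazelcast.operation.call.timeout.millis=... on the command line). The class name, the helper method, and the 10-minute value are illustrative only, not a recommendation.

```java
import java.time.Duration;

// Hypothetical helper, not a Hazelcast class: stores the call timeout as a system property.
public class CallTimeoutProperty {
    // Property name quoted in the issue description.
    public static final String CALL_TIMEOUT_KEY = "hazelcast.operation.call.timeout.millis";

    /** Sets the call timeout system property and returns the stored value in milliseconds. */
    public static String apply(Duration timeout) {
        if (timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must be positive: " + timeout);
        }
        String millis = Long.toString(timeout.toMillis());
        System.setProperty(CALL_TIMEOUT_KEY, millis);
        return millis;
    }

    public static void main(String[] args) {
        // Assumption: this runs before any Hazelcast member or client is started in this JVM,
        // so the value is already in place when the instance reads its properties.
        String stored = apply(Duration.ofMinutes(10));
        System.out.println(CALL_TIMEOUT_KEY + "=" + stored);
    }
}
```

A longer timeout only delays the exception; if the member never recovers, the unlock still fails, so this suits expected pauses rather than dead members.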

On this note, with Hazelcast 4.0, we have replaced the entire implementation of ILock with the unsafe mode of CP subsystem (https://docs.hazelcast.org/docs/latest-dev/manual/html-single/#removal-of-deprecated-concurrency-api-implementations). If you don't require strong consistency guarantees, that mode might fit your use case and solve the issue. If you do require strong consistency guarantees, you might want to try the CP subsystem with unsafe mode turned off and the appropriate number of members running. Can you try it out?

mmedenjak · 2020-02-04T15:05:25Z

Closing as this issue is related to the discontinued lock implementation. Please try out Hazelcast 4.0 and the new lock implementation as it may solve your issue. In case it doesn't, please reopen this or open a new issue. Happy Hazelcasting!

mmedenjak added the Source: Community, Team: Client, Team: Core, Module: Invocation System, and Module: Lock labels Nov 6, 2019

AyberkSorgun added the Type: Defect label Nov 6, 2019

mmedenjak mentioned this issue Nov 6, 2019

com.hazelcast.concurrent.lock.operations.LockOperation tryLock callTimeout and waitTimeout #14535

Closed

mmedenjak closed this as completed Feb 4, 2020
