Skip to content
This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-13788: Broker keep logging Mark deleting an already mark-deleted position #3575

Open
sijie opened this issue Jan 17, 2022 · 1 comment

Comments

@sijie
Copy link
Member

sijie commented Jan 17, 2022

Original Issue: apache#13788


Describe the bug
There is a subscription keep logging Mark deleting an already mark-deleted position in our production cluster.

[persistent://op/growth/sms_produce_b_api_0-partition-2][growth] Failed to delete message at 71317494:-1: org.apache.bookkeeper.mledger.ManagedLedgerException: java.lang.IllegalArgumentException: Mark deleting an already mark-deleted position

Topic persistent://op/growth/sms_produce_b_api_0 has 3 partitions, but only partition-2 keep logging this issue.

The internal-stats shows cursor's markDeletePosition is after the firstRange of individuallyDeletedMessages:
image

So every time handle message ack it will select the upper endpoint of the firstRange of individuallyDeletedMessages which is ahead of markDeletePosition, so a IllegalArgumentException will be throw at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl#setAcknowledgedPosition.
https://github.com/apache/pulsar/blob/3acdbfe4eac79040be140b423342c28a07dc0327/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L2001-L2021
https://github.com/apache/pulsar/blob/3acdbfe4eac79040be140b423342c28a07dc0327/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L1580-L1583

Arthas watch output

[arthas@261046]$ watch -n 5 -e org.apache.bookkeeper.mledger.impl.ManagedCursorImpl setAcknowledgedPosition {target,params[0],target.markDeletePosition,target.name,target.ledger.name}
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 1) cost in 233 ms, listenerId: 18
method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.setAcknowledgedPosition location=AtExceptionExit
ts=2022-01-17 16:54:37; [cost=0.365992ms] result=@ArrayList[
    @ManagedCursorImpl[ManagedCursorImpl{ledger=op/growth/persistent/sms_produce_a_api_0-partition-2, name=growth, ackPos=71317769:-1, readPos=86445249:1465}],
    @PositionImpl[71257569:341],
    @PositionImpl[71317769:-1],
    @String[growth],
    @String[op/growth/persistent/sms_produce_a_api_0-partition-2],
]
method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.setAcknowledgedPosition location=AtExceptionExit
ts=2022-01-17 16:54:44; [cost=0.094147ms] result=@ArrayList[
    @ManagedCursorImpl[ManagedCursorImpl{ledger=op/growth/persistent/sms_produce_a_api_0-partition-2, name=growth, ackPos=71317769:-1, readPos=86445249:1466}],
    @PositionImpl[71257569:341],
    @PositionImpl[71317769:-1],
    @String[growth],
    @String[op/growth/persistent/sms_produce_a_api_0-partition-2],
]
method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.setAcknowledgedPosition location=AtExceptionExit
ts=2022-01-17 16:54:47; [cost=0.09929ms] result=@ArrayList[
    @ManagedCursorImpl[ManagedCursorImpl{ledger=op/growth/persistent/sms_produce_a_api_0-partition-2, name=growth, ackPos=71317769:-1, readPos=86445249:1467}],
    @PositionImpl[71257569:341],
    @PositionImpl[71317769:-1],
    @String[growth],
    @String[op/growth/persistent/sms_produce_a_api_0-partition-2],
]
method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.setAcknowledgedPosition location=AtExceptionExit
ts=2022-01-17 16:54:48; [cost=0.056561ms] result=@ArrayList[
    @ManagedCursorImpl[ManagedCursorImpl{ledger=op/growth/persistent/sms_produce_a_api_0-partition-2, name=growth, ackPos=71317769:-1, readPos=86445249:1468}],
    @PositionImpl[71257569:341],
    @PositionImpl[71317769:-1],
    @String[growth],
    @String[op/growth/persistent/sms_produce_a_api_0-partition-2],
]

I don't know how to reproduce this situation yet, please let me know if need more details.

Additional context

  • pulsar version: 2.8.1
  • OS: CentOS Linux release 7.9.2009 (Core)
  • kernel: 3.10.0-1160.45.1.el7.x86_64
@sijie sijie added the type/bug label Jan 17, 2022
@github-actions
Copy link

The issue had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Feb 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant