Skip to content

[ML] Fix race condition in updating lastMarkDeleteEntry field #15031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Apr 5, 2022

Motivation

  • missed updates to lastMarkDeleteEntry can lead to the subscription and consuming getting stuck. This seems to impact both Key_shared and Shared subscriptions.

Modifications

  • when updating lastMarkDeleteEntry field, make sure that the new value is later than the previous value.

- missed updates can lead to the subscription and consuming getting stuck
@lhotari lhotari added type/bug The PR fixed a bug or issue reported a bug area/broker labels Apr 5, 2022
@lhotari lhotari self-assigned this Apr 5, 2022
@lhotari lhotari added the doc-not-needed Your PR changes do not impact docs label Apr 5, 2022
@apache apache deleted a comment from github-actions bot Apr 5, 2022
@apache apache deleted a comment from github-actions bot Apr 5, 2022
final Map<String, Long> properties) {
LAST_MARK_DELETE_ENTRY_UPDATER.updateAndGet(this, last -> {
if (last != null && last.newPosition.compareTo(newPosition) > 0) {
// keep current value, don't update
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about adding a DEBUG statement here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's address logging in the previous case and use a similar resolution here

lastMarkDeleteEntry = mdEntry;
LAST_MARK_DELETE_ENTRY_UPDATER.updateAndGet(this, last -> {
if (last != null && last.newPosition.compareTo(mdEntry.newPosition) > 0) {
// keep the current value since it's later then the mdEntry.newPosition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about adding a DEBUG statement here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there would be a lot of logging since it's very common that it would happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we can add DEBUG in the other branch, the "new" code that you are adding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean something like this?

                if (log.isDebugEnabled()) {
                    log.debug("Keeping {} since it's newer than {}", last, mdEntry);
                }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or is this better

                if (log.isDebugEnabled()) {
                    log.debug("Ignoring update {} to lastMarkDeleteEntry {} since current value is later", mdEntry, last);
                }

@lhotari
Copy link
Member Author

lhotari commented Apr 5, 2022

btw. The problem scenario looked like very similar as in #14261. However there was a difference.

Copy link
Member

@dave2wave dave2wave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@eolivelli eolivelli merged commit ad2f397 into apache:master Apr 5, 2022
@lhotari lhotari added cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 labels Apr 6, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Apr 6, 2022

Verified

This commit was signed with the committer’s verified signature.
EndBug Federico Grandi
…#15031)

- missed updates can lead to the subscription and consuming getting stuck

(cherry picked from commit ad2f397)
(cherry picked from commit f657bf8)
lhotari added a commit to datastax/pulsar that referenced this pull request Apr 6, 2022
…#15031)

- missed updates can lead to the subscription and consuming getting stuck

(cherry picked from commit ad2f397)
(cherry picked from commit 6f74686)
Lannnnh pushed a commit to Lannnnh/pulsar that referenced this pull request Apr 6, 2022
…#15031)

- missed updates can lead to the subscription and consuming getting stuck
lhotari added a commit to lhotari/pulsar that referenced this pull request Apr 7, 2022
@lhotari
Copy link
Member Author

lhotari commented Apr 7, 2022

I made a follow-up PR #15067 to improve the solution. Please review.

lhotari added a commit that referenced this pull request Apr 14, 2022
…15067)

- follow up on #15031 
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method
lhotari added a commit that referenced this pull request Apr 14, 2022
…15067)

- follow up on #15031
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method

(cherry picked from commit a19a30a)
lhotari added a commit that referenced this pull request Apr 14, 2022
…15067)

- follow up on #15031
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method

(cherry picked from commit a19a30a)
lhotari added a commit that referenced this pull request Apr 14, 2022
…15067)

- follow up on #15031
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method

(cherry picked from commit a19a30a)
lhotari added a commit that referenced this pull request Apr 14, 2022
…15067)

- follow up on #15031
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method

(cherry picked from commit a19a30a)
lhotari added a commit to datastax/pulsar that referenced this pull request Apr 14, 2022
 (apache#15067)

- follow up on apache#15031
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method

(cherry picked from commit a19a30a)
aparajita89 pushed a commit to aparajita89/pulsar that referenced this pull request Apr 18, 2022
 (apache#15067)

- follow up on apache#15031 
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
…#15031)

- missed updates can lead to the subscription and consuming getting stuck
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
 (apache#15067)

- follow up on apache#15031 
* [ML] Fix race in persisting mark delete position
* [ML] Resetting should reset lastMarkDeleteEntry
* [ML] Reset fields in initializeCursorPosition method
@Technoboy-
Copy link
Contributor

Why does this patch not add a test ?
@eolivelli @lhotari

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 doc-not-needed Your PR changes do not impact docs release/2.7.5 release/2.8.4 release/2.9.3 release/2.10.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants