Could not take snapshot on followers because the position doesn't exist #7911

Closed
deepthidevaki opened this issue Sep 27, 2021 · 2 comments · Fixed by #9624
Assignees
Labels
  • kind/bug (Categorizes an issue or PR as a bug)
  • scope/broker (Marks an issue or PR to appear in the broker section of the changelog)
  • severity/low (Marks a bug as having little to no noticeable impact for the user)
  • version:1.3.12
  • version:8.1.0-alpha3 (Marks an issue as being completely or in parts released in 8.1.0-alpha3)
  • version:8.1.0 (Marks an issue as being completely or in parts released in 8.1.0)
Milestone

Comments

@deepthidevaki
Contributor

Describe the bug

Found the following error in a cluster

java.lang.IllegalStateException: Failed to take snapshot. Expected to find an indexed entry for determined snapshot position 243718629 (processedPosition = 243718629, exportedPosition=243718630), but found no matching indexed entry which contains this position.

On further investigation, this happened because the follower received a new snapshot from the leader and reset its log. After that, it did not receive any new events because of network issues. When the follower later tried to take a snapshot, the log was empty, so it could not find the index of the position. In this case it is not actually an error, because nothing has been processed or exported yet. Ideally, this case should be handled differently on the follower and on the leader.

This has no impact. The follower will eventually take the snapshot once it starts receiving new events.

Expected behavior

  • If possible, do not log this as an error on the follower. On the leader, however, it most likely is an error (inconsistent state).
  • It would be good if the follower did not attempt to take a snapshot while its log is not up-to-date with the processed and exported positions in the latest snapshot.
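
The two points above could be sketched roughly as follows. This is only an illustration, not the Zeebe implementation: the class, enum, and method names are hypothetical, and the rule that the snapshot position is the lower of the processed and exported positions is an assumption (it matches the values in the error message above).

```java
import java.util.function.LongPredicate;

// Hypothetical sketch: these names are illustrative and not taken from the Zeebe code base.
final class SnapshotAttemptSketch {

  enum Role { LEADER, FOLLOWER }

  enum Outcome { TAKE_SNAPSHOT, SKIP_QUIETLY, REPORT_ERROR }

  /** Assumed rule: the snapshot position is the lower of the processed and exported positions. */
  static long snapshotPosition(final long processedPosition, final long exportedPosition) {
    return Math.min(processedPosition, exportedPosition);
  }

  /**
   * Decides how to react when a snapshot is due.
   *
   * @param role the role of this node for the partition
   * @param processedPosition the last processed position
   * @param exportedPosition the last exported position
   * @param logContainsPosition hypothetical lookup: true if the local log has an indexed entry
   *     containing the given position
   */
  static Outcome decide(
      final Role role,
      final long processedPosition,
      final long exportedPosition,
      final LongPredicate logContainsPosition) {
    final long position = snapshotPosition(processedPosition, exportedPosition);
    if (logContainsPosition.test(position)) {
      return Outcome.TAKE_SNAPSHOT;
    }
    // On a follower the log may simply not have caught up after a reset, so this is expected.
    // On a leader a missing entry points to inconsistent state and should stay an error.
    return role == Role.FOLLOWER ? Outcome.SKIP_QUIETLY : Outcome.REPORT_ERROR;
  }
}
```

With an empty log (a lookup that always returns false) and the positions from the error message, `decide` returns `SKIP_QUIETLY` for a follower and `REPORT_ERROR` for a leader.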

Log/Stacktrace

https://console.cloud.google.com/errors/CJu-z5WFm_fKVw?service=zeebe-broker&version=release-1-2-0&time=P7D&project=zeebe-io

Environment:

  • Zeebe Version: release candidate 1.2.0
@deepthidevaki deepthidevaki added kind/bug Categorizes an issue or PR as a bug scope/broker Marks an issue or PR to appear in the broker section of the changelog severity/low Marks a bug as having little to no noticeable impact for the user labels Sep 27, 2021
@npepinpe
Member

npepinpe commented Oct 4, 2021

The assumption is that it's noise: an expected case on the follower, which we should handle since it's not an error. It should be harmless and will recover by itself, although it might be difficult to differentiate between the cases where it's expected and where it isn't.

EDIT: one option could be to log it as a warning on the follower, since it will most likely recover, and treat it as an error only if it occurs many times.
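
A rough sketch of that escalation idea, assuming a simple counter of consecutive failed attempts. The class name, the `Level` enum, and the threshold are all hypothetical; nothing here is taken from the Zeebe code base or from the eventual fix.

```java
// Hypothetical sketch: warn on the first failures, escalate once they repeat too often.
final class SnapshotFailureEscalation {

  enum Level { WARN, ERROR }

  private final int errorThreshold;
  private int consecutiveFailures;

  /** @param errorThreshold number of consecutive failures at which to escalate (assumed value) */
  SnapshotFailureEscalation(final int errorThreshold) {
    if (errorThreshold < 1) {
      throw new IllegalArgumentException("errorThreshold must be at least 1");
    }
    this.errorThreshold = errorThreshold;
  }

  /** Records a failed snapshot attempt and returns the level it should be logged at. */
  Level onFailure() {
    consecutiveFailures++;
    return consecutiveFailures >= errorThreshold ? Level.ERROR : Level.WARN;
  }

  /** Resets the counter once a snapshot succeeds, so a later failure is a warning again. */
  void onSuccess() {
    consecutiveFailures = 0;
  }
}
```

With a threshold of 3, the first two failures would be logged as warnings and the third as an error; a successful snapshot in between resets the count.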

@npepinpe npepinpe added this to the 8.1 milestone Jun 24, 2022
@npepinpe
Member

Still happening as of today (last occurred on 22nd of June). To avoid worrying users for a non-issue, let's fix it.

@deepthidevaki deepthidevaki self-assigned this Jun 27, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Jun 28, 2022
9635: [Backport stable/1.3] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/1.3`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Jun 28, 2022
9636: [Backport stable/8.0] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/8.0`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Jun 29, 2022
9636: [Backport stable/8.0] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/8.0`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Jun 29, 2022
9635: [Backport stable/1.3] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/1.3`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
@oleschoenburg oleschoenburg added the version:8.1.0-alpha3 Marks an issue as being completely or in parts released in 8.1.0-alpha3 label Jul 5, 2022
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
4 participants