fix(broker): do not log error if follower fails to take snapshot when log is not uptodate #9624

Merged
merged 1 commit into from
Jun 28, 2022

Conversation

deepthidevaki
Contributor

@deepthidevaki deepthidevaki commented Jun 28, 2022

Description

Previously, this case was logged as an error. Now AsyncSnapshotDirector treats it as a special case and logs it at debug level.

The alternative options were to prevent taking a snapshot in such cases, but this case is very difficult to identify. Every option we tried also prevented snapshots in scenarios where we do want to take one.

  1. One option was to return -1 as the processed position in ReplayStateMachine when no events after the snapshot position are replayed. But this also prevents taking a snapshot when there are no new records but the exporter position increases.
  2. Another option was to skip the snapshot in StateController when the determined snapshot position <= the position in the latest snapshot. This would not fix the bug: if the exporter position has changed, it will still try to take a snapshot and end up in the same situation where it cannot find the indexed entry.

Other options were attempted, but all of them failed in one edge case or another.

So, the easiest solution is to treat it as a special case in AsyncSnapshotDirector.
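
For readers who want a concrete picture, the special-casing could look roughly like the sketch below. It is not the code from this PR: the class `SnapshotFailureHandling`, the exception `LogNotUpToDateException`, and both methods are hypothetical names made up for illustration, and `java.util.logging` stands in for the broker's real logger so the snippet stays self-contained. The only idea taken from the description is that the "follower log is not up to date" failure is logged at debug level while every other failure is still logged as an error.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SnapshotFailureHandling {
  private static final Logger LOG = Logger.getLogger("snapshot-director-sketch");

  /** Hypothetical marker: the follower's log has no entry for the snapshot position yet. */
  public static final class LogNotUpToDateException extends RuntimeException {
    public LogNotUpToDateException(final String message) {
      super(message);
    }
  }

  /** Picks debug level (FINE) for the expected follower case, error level (SEVERE) otherwise. */
  public static Level levelFor(final Throwable failure) {
    return failure instanceof LogNotUpToDateException ? Level.FINE : Level.SEVERE;
  }

  /** Logs a failed snapshot attempt at the level chosen by levelFor. */
  public static void onSnapshotFailure(final Throwable failure) {
    LOG.log(levelFor(failure), "Could not take a snapshot", failure);
  }

  public static void main(final String[] args) {
    // Expected on a lagging follower: logged at debug level only.
    onSnapshotFailure(new LogNotUpToDateException("no indexed entry for snapshot position"));
    // Anything else is still reported as an error.
    onSnapshotFailure(new IllegalStateException("unexpected failure"));
  }
}
```

Keeping the decision in one small method means the snapshot logic itself stays untouched, which matches the reasoning above that every attempt to prevent the snapshot up front broke a legitimate scenario.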

Related issues

closes #7911

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

@github-actions
Contributor

github-actions bot commented Jun 28, 2022

Unit Test Results

   783 files  ±  0     783 suites  ±0   1h 33m 26s ⏱️ + 4m 44s
5 609 tests  - 54  5 602 ✔️  - 54  7 💤 ±0  0 ±0 
5 781 runs   - 54  5 774 ✔️  - 54  7 💤 ±0  0 ±0 

Results for commit 372367d. ± Comparison against base commit 5828b99.

♻️ This comment has been updated with latest results.

Member

@lenaschoenburg lenaschoenburg left a comment


Thanks @deepthidevaki, nice to have fewer error logs!

Just a couple of optional suggestions, looks good otherwise.

@deepthidevaki
Contributor Author

bors merge

@zeebe-bors-camunda
Contributor

Build succeeded:

@backport-action
Collaborator

Successfully created backport PR #9635 for stable/1.3.

@backport-action
Collaborator

Successfully created backport PR #9636 for stable/8.0.

zeebe-bors-camunda bot added a commit that referenced this pull request Jun 28, 2022
9635: [Backport stable/1.3] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/1.3`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Jun 28, 2022
9636: [Backport stable/8.0] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/8.0`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Jun 29, 2022
9636: [Backport stable/8.0] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/8.0`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Jun 29, 2022
9635: [Backport stable/1.3] fix(broker): do not log error if follower fails to take snapshot when log is not uptodate r=deepthidevaki a=backport-action

# Description
Backport of #9624 to `stable/1.3`.

relates to #7911

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Development

Successfully merging this pull request may close these issues.

Could not take snapshot on followers because the position doesn't exist
4 participants