Fix partitions initial offsets in Kafka connector #25769

Open
wants to merge 1 commit into
base: master

anestoruk
Contributor

Hello,

I've discovered a bug in the Kafka initial offsets functionality that was introduced some time ago. Steps to reproduce:

  1. Start a Jet job with a Kafka source whose initial offsets point to the last offset of each partition of the given topic.*
  2. Suspend the job before any new records are added to the Kafka topic and consumed by the Jet job. (Consuming at least one record from each partition results in correct behavior, because the internally stored offsets are then updated to correct values.)
  3. Resume the job. The Kafka processor then sets the consumer positions to the previously stored offsets incremented by 1. These positions exceed the size of each partition, so the consumer reads records from the beginning of each partition.

*To be precise, the initial offsets do not have to point to the end of each partition. In theory any initial offset triggers the wrong behavior (after the job is resumed, the processor adds 1 to the initial offsets). In practice, with any value other than the "last offset" of a partition, the Jet job will (in most cases) consume some records before we have a chance to suspend it.
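
The arithmetic behind the reproduction steps can be sketched in plain Java. This is a simplified model, not the actual StreamKafkaP code: the class and method names are hypothetical, and the fallback to offset 0 assumes the consumer resets to the earliest offset when the requested position is out of range (for example with `auto.offset.reset=earliest`).

```java
// Simplified model of the reported behavior; names are hypothetical.
final class InitialOffsetBugSketch {

    // Assumption: on resume, the processor seeks to (stored offset + 1),
    // because the stored value is treated as "last record already consumed".
    static long positionAfterResume(long storedOffset) {
        return storedOffset + 1;
    }

    // Assumption: a position beyond the end offset is out of range and the
    // consumer falls back to the earliest offset (modeled here as 0).
    static long resolvePosition(long requested, long endOffset) {
        return requested > endOffset ? 0L : requested;
    }

    public static void main(String[] args) {
        long endOffset = 100L;       // partition holds offsets 0..99
        long initialOffset = 100L;   // step 1: job starts at the end of the partition
        long stored = initialOffset; // buggy bookkeeping: initial offset stored unchanged

        // Steps 2 and 3: suspend before any record is consumed, then resume.
        long requested = positionAfterResume(stored);
        long effective = resolvePosition(requested, endOffset);

        System.out.println("requested=" + requested + ", effective=" + effective);
    }
}
```

With these assumptions the requested position is 101, one past the valid end position of 100, so the modeled consumer restarts from offset 0.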

I fixed this by decrementing the value that is put into StreamKafkaP's internal offsets map when seekToInitialOffsets() is executed.
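
A minimal sketch of the bookkeeping this fix implies, assuming the internal map holds the offset of the last consumed record per partition and that resuming seeks to that value plus 1. The class and its methods are hypothetical stand-ins for illustration, not the real StreamKafkaP API, and the actual consumer seek call is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the processor's offset tracking.
final class InitialOffsetFixSketch {

    // partition -> offset of the last consumed record
    private final Map<Integer, Long> offsets = new HashMap<>();

    // Called when the initial offsets are applied (the consumer seek is omitted).
    void seekToInitialOffset(int partition, long initialOffset) {
        // Store initialOffset - 1 so that the "+ 1" applied on resume
        // lands exactly on the requested initial offset.
        offsets.put(partition, initialOffset - 1);
    }

    // Called for every record read from the partition.
    void onRecordConsumed(int partition, long recordOffset) {
        offsets.put(partition, recordOffset);
    }

    // Position the consumer is moved to when the job is resumed.
    long resumePosition(int partition) {
        return offsets.get(partition) + 1;
    }

    public static void main(String[] args) {
        InitialOffsetFixSketch p = new InitialOffsetFixSketch();
        p.seekToInitialOffset(0, 100L);
        System.out.println(p.resumePosition(0)); // no record consumed yet
        p.onRecordConsumed(0, 100L);
        System.out.println(p.resumePosition(0)); // one record consumed
    }
}
```

Storing the decremented value keeps a single invariant for the map (the stored value is always one below the next position to read), so the resume path needs no special case for partitions from which nothing has been consumed yet.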

Checklist:

  • Labels (Team:, Type:, Source:, Module:) and Milestone set
  • Label Add to Release Notes or Not Release Notes content set
  • Request reviewers if possible
  • Send backports/forwardports if fix needs to be applied to past/future releases
  • New public APIs have @Nonnull/@Nullable annotations
  • New public APIs have @since tags in Javadoc

…om incorrect offsets after suspending and resuming Jet job
@hz-devops-test added the "Source: Community" label (PR or issue was opened by a community user) on Oct 19, 2023
@devOpsHazelcast
Collaborator

Can one of the admins verify this patch?


@TomaszGaweda
Contributor

run-lab-run

@frant-hartm added this to the 5.4.0 milestone on Oct 19, 2023
@frant-hartm changed the title from "Kafka - Partitions initial offsets bug fix" to "Fix partitions initial offsets in Kafka connector" on Oct 22, 2023
@AyberkSorgun modified the milestones: 5.4.0, 5.5.0 on Apr 8, 2024
Labels
Module: Jet (Issues/PRs for Jet), Source: Community (PR or issue was opened by a community user), Team: Integration, Type: Defect

6 participants