[Enhancement]: Stop generate compaction plan when channel checkpoint lag is too big #30996

Open
1 task done
congqixia opened this issue Mar 4, 2024 · 6 comments
Labels: kind/enhancement (issues or changes related to enhancement), stale (indicates no updates for 30 days)

Comments

@congqixia
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

Stop generating compaction plans when the channel checkpoint lag is too large.

Why is this needed?

In 2.3, sync and compaction are mutually exclusive. When a datanode is trying to catch up on a DML channel from a position far behind the current time, compaction execution blocks this procedure and makes it harder to catch up.
In 2.4 (master), the segment view is not complete when the checkpoint lag is large, since there may still be some segments that need to be flushed afterwards.

It's always a better choice to wait until the channel checkpoint is up to date before generating a compaction plan.

Anything else?

No response

@congqixia congqixia added the kind/enhancement Issues or changes related to enhancement label Mar 4, 2024
@congqixia congqixia self-assigned this Mar 4, 2024
congqixia added a commit to congqixia/milvus that referenced this issue Mar 4, 2024
See also milvus-io#30996

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
@xiaofan-luan
Contributor

What will happen in Milvus 3.0?

  1. Flush is handled by lognode, so is there any reason for compaction to know about checkpoints?
  2. Flush and compaction will not depend on each other.
  3. Whether compaction should be triggered could depend purely on whether the compaction service is busy and on data stats.

On 2.3, an easier approach could be for the datanode to directly reject compaction if it is still catching up on checkpoints? @congqixia

@congqixia
Contributor Author

@xiaofan-luan
IMHO, for the 3.0 implementation, compaction should be scheduled by the meta holder (lognode/datacoord),
and the executor should be a dumb worker with no knowledge of the channel checkpoint.
One takeaway from this enhancement for the new architecture is that it may be better not to trigger compaction for each newly flushed segment, since the data view might change soon.

Since the datanode plays the dumb worker role in this case, maybe it's better to let datacoord decide whether the compaction should be executed?

sre-ci-robot pushed a commit that referenced this issue Mar 5, 2024
See also #30996

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

stale bot commented Apr 14, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no updates for 30 days label Apr 14, 2024
@xiaofan-luan
Contributor

Anything we need to improve on this issue?

@stale stale bot removed the stale indicates no updates for 30 days label Apr 14, 2024
@xiaofan-luan
Contributor

Maybe we should simply avoid picking segments to compact if their checkpoint falls behind the channel checkpoint?
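
Read literally, this suggestion is a per-segment filter rather than a per-channel gate. A rough sketch under that reading follows. The `segmentInfo` type, its `CheckpointTs` field, and `pickCompactionCandidates` are hypothetical names, and comparing plain `uint64` timestamps is an assumption about how positions would be ordered.

```go
package main

// segmentInfo is a simplified, assumed view of a segment for this sketch.
type segmentInfo struct {
	ID           int64
	CheckpointTs uint64 // timestamp of the segment's own checkpoint
}

// pickCompactionCandidates keeps only segments whose checkpoint has not
// fallen behind the channel checkpoint; the rest are skipped for this round.
func pickCompactionCandidates(segments []segmentInfo, channelCheckpointTs uint64) []segmentInfo {
	candidates := make([]segmentInfo, 0, len(segments))
	for _, seg := range segments {
		if seg.CheckpointTs < channelCheckpointTs {
			continue // segment checkpoint is behind the channel checkpoint
		}
		candidates = append(candidates, seg)
	}
	return candidates
}
```

Skipped segments are not lost; they would simply become eligible again on a later round once their checkpoint is no longer behind.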


stale bot commented May 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no updates for 30 days label May 18, 2024