[Bug]: datacoord meta collections is not a realtime complete snapshot of collections in cluster #32325

Open
1 task done
wayblink opened this issue Apr 16, 2024 · 4 comments
Labels
kind/bug: Issues or changes related to a bug
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@wayblink
Contributor

wayblink commented Apr 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

To me, it is odd and misleading that the collections in datacoord.meta are not a complete snapshot of all collections in the cluster.

There is an AddCollection interface but no RemoveCollection interface, and rootCoord sends no signal to datacoord when a collection is dropped. This can cause misunderstandings during development. For example:

Datacoord keeps logging the warning "failed to describe index" after a collection is dropped, because the collection is never removed from meta.collection and is still scanned.

[2024/04/16 18:14:38.520 +08:00] [INFO] [datacoord/compaction_trigger.go:305] [triggerClusteringCompaction] [collectionID=449120529723687285]
[2024/04/16 18:14:40.540 +08:00] [WARN] [datacoord/index_service.go:739] ["DescribeIndex fail"] [traceID=45e22e05015c5011b2e8e4af7d7a27c0] [collectionID=449120529723687285] [indexName=] [error="index not found[indexName=]"]
[2024/04/16 18:14:40.540 +08:00] [WARN] [datacoord/metrics_info.go:66] ["failed to describe index, ignore to report index metrics"] [traceID=45e22e05015c5011b2e8e4af7d7a27c0] [collection=449120529723687285] [error="index not found[indexName=]"]
[2024/04/16 18:14:43.539 +08:00] [WARN] [datacoord/index_service.go:739] ["DescribeIndex fail"] [traceID=8f7b283894fc63b6b542db6e73c9dfc6] [collectionID=449120529723687285] [indexName=] [error="index not found[indexName=]"]
[2024/04/16 18:14:43.539 +08:00] [WARN] [datacoord/metrics_info.go:66] ["failed to describe index, ignore to report index metrics"] [traceID=8f7b283894fc63b6b542db6e73c9dfc6] [collection=449120529723687285] [error="index not found[indexName=]"]
[2024/04/16 18:14:46.540 +08:00] [WARN] [datacoord/index_service.go:739] ["DescribeIndex fail"] [traceID=529e95a735c076dce6d37f1c4dc10de5] [collectionID=449120529723687285] [indexName=] [error="index not found[indexName=]"]
[2024/04/16 18:14:46.540 +08:00] [WARN] [datacoord/metrics_info.go:66] ["failed to describe index, ignore to report index metrics"] [traceID=529e95a735c076dce6d37f1c4dc10de5] [collection=449120529723687285] [error="index not found[indexName=]"]

I'd like to refactor this so that it works in a better way.
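
As a rough illustration only, the missing counterpart to AddCollection might look something like the sketch below. Everything here is hypothetical: `collectionCache`, `collectionInfo`, and `CollectionIDs` are stand-in names rather than the actual datacoord types, and how datacoord would learn about a drop (for example, a notification from rootCoord) is assumed, not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// collectionInfo is a hypothetical, trimmed-down stand-in for the
// per-collection data that datacoord caches.
type collectionInfo struct {
	ID int64
}

// collectionCache is an illustrative cache, not the real datacoord meta type.
type collectionCache struct {
	mu          sync.RWMutex
	collections map[int64]*collectionInfo
}

func newCollectionCache() *collectionCache {
	return &collectionCache{collections: make(map[int64]*collectionInfo)}
}

// AddCollection mirrors the interface named in the issue.
func (c *collectionCache) AddCollection(info *collectionInfo) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.collections[info.ID] = info
}

// RemoveCollection is the missing counterpart (hypothetical). It would be
// called once datacoord is told that a collection has been dropped.
func (c *collectionCache) RemoveCollection(id int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.collections, id) // deleting an absent key is a no-op
}

// CollectionIDs returns the IDs a periodic scan would visit.
func (c *collectionCache) CollectionIDs() []int64 {
	c.mu.RLock()
	defer c.mu.RUnlock()
	ids := make([]int64, 0, len(c.collections))
	for id := range c.collections {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	cache := newCollectionCache()
	cache.AddCollection(&collectionInfo{ID: 449120529723687285})
	fmt.Println("before drop:", len(cache.CollectionIDs()))

	// After the drop is signalled, the periodic scan no longer sees the collection.
	cache.RemoveCollection(449120529723687285)
	fmt.Println("after drop:", len(cache.CollectionIDs()))
}
```

With an explicit removal path, a dropped collection would no longer be visited by periodic scans such as the one producing the DescribeIndex warnings above.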

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@wayblink wayblink added the kind/bug and needs-triage labels Apr 16, 2024
@xiaofan-luan
Contributor

Agreed.
Datacoord meta works like a cache, but we should think of a way to improve it.

@yanliang567
Contributor

/assign @wayblink
/unassign

@sre-ci-robot sre-ci-robot assigned wayblink and unassigned yanliang567 Apr 17, 2024
@yanliang567 yanliang567 added the triage/accepted label and removed the needs-triage label Apr 17, 2024

stale bot commented May 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale label May 18, 2024
@yanliang567
Contributor

keep it

@stale stale bot removed the stale label May 20, 2024