[Bug]: failed to load collection #30979

Open

yunurs opened this issue Mar 1, 2024 · 6 comments
Assignees
yunurs

Labels
kind/bug Issues or changes related to a bug
stale indicates no updates for 30 days
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments


yunurs commented Mar 1, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.3.3
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): pulsar
- SDK version (e.g. pymilvus v2.0.0rc2):
- OS (Ubuntu or CentOS): CentOS
- CPU/Memory:
- GPU:
- Others:

Current Behavior

I bulk-loaded 4,000 million (4 billion) vectors into Milvus and created a SCANN index successfully, but loading the collection failed.
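For reference, a minimal pymilvus sketch of the flow described above (bulk insert, SCANN index, then load). The endpoint, collection name, field name, and import file are placeholders, not values taken from this report, and the snippet assumes a pymilvus client compatible with Milvus 2.3.x:

```python
from pymilvus import connections, utility, Collection

# Placeholder endpoint; adjust for the actual cluster.
connections.connect(host="127.0.0.1", port="19530")

# Assumes the collection and its schema already exist on the server.
coll = Collection("my_collection")

# 1. Bulk-insert prepared data files (file name and format are placeholders).
task_id = utility.do_bulk_insert(collection_name="my_collection", files=["rows.json"])
print(utility.get_bulk_insert_state(task_id))

# 2. Build a SCANN index on the vector field (field name is a placeholder).
coll.create_index(
    field_name="embedding",
    index_params={"index_type": "SCANN", "metric_type": "L2", "params": {"nlist": 1024}},
)

# 3. Load the collection onto the query nodes -- the step that fails here.
coll.load()
utility.wait_for_loading_complete("my_collection")
```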

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

[2024/03/01 10:15:13.529 +08:00] [INFO] [task/scheduler.go:428] ["failed to promote task"] [taskID=1708284636676] [collectionID=446170141023103363] [replicaID=447805184585498632] [source=segment_checker] [error="failed to get shard delegator: channel=by-dev-rootcoord-dml_6_446170141023103363v1: channel not found"] [errorVerbose="failed to get shard delegator: channel=by-dev-rootcoord-dml_6_446170141023103363v1: channel not found\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrChannelNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:490\n | [...repeated from below...]\nWraps: (2) failed to get shard delegator\nWraps: (3) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.wrapWithField\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:760\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrChannelNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:488\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).checkSegmentTaskStale\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:802\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).checkStale\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:741\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).check\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:664\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).promote\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:427\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).tryPromoteAll.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:390\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskQueue).Range\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:121\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).tryPromoteAll\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:389\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).schedule\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:529\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).Dispatch\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:445\n | github.com/milvus-io/milvus/internal/querycoordv2/dist.(*distHandler).handleDistResp\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/dist/dist_handler.go:110\n | github.com/milvus-io/milvus/internal/querycoordv2/dist.(*distHandler).start\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/dist/dist_handler.go:86\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (4) channel=by-dev-rootcoord-dml_6_446170141023103363v1\nWraps: (5) channel not found\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) merr.milvusError"]

[2024/03/01 10:15:13.530 +08:00] [WARN] [task/scheduler.go:394] ["failed to promote task"] [taskID=1708284636676] [error="failed to get shard delegator: channel=by-dev-rootcoord-dml_6_446170141023103363v1: channel not found"] [errorVerbose="failed to get shard delegator: channel=by-dev-rootcoord-dml_6_446170141023103363v1: channel not found\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrChannelNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:490\n | [...repeated from below...]\nWraps: (2) failed to get shard delegator\nWraps: (3) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.wrapWithField\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:760\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrChannelNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:488\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).checkSegmentTaskStale\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:802\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).checkStale\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:741\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).check\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:664\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).promote\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:427\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).tryPromoteAll.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:390\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskQueue).Range\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:121\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).tryPromoteAll\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:389\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).schedule\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:529\n | github.com/milvus-io/milvus/internal/querycoordv2/task.(*taskScheduler).Dispatch\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/scheduler.go:445\n | github.com/milvus-io/milvus/internal/querycoordv2/dist.(*distHandler).handleDistResp\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/dist/dist_handler.go:110\n | github.com/milvus-io/milvus/internal/querycoordv2/dist.(*distHandler).start\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/dist/dist_handler.go:86\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (4) channel=by-dev-rootcoord-dml_6_446170141023103363v1\nWraps: (5) channel not found\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) merr.milvusError"]

Anything else?

No response

yunurs added the kind/bug (Issues or changes related to a bug) and needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one) labels Mar 1, 2024

bfurtaw commented Mar 1, 2024

I am also experiencing this issue, but with Milvus 2.3.2. There is no rhyme or reason to which machine will throw the error; Milvus might even be running out of memory. A stack trace from standalone.log is attached.
message (1).txt

@yanliang567
Contributor

@bfurtaw @yunurs we need the complete Milvus pod logs for investigation. Please refer to this doc to export the whole Milvus logs. For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs.

/assign @yunurs
/unassign

sre-ci-robot assigned yunurs and unassigned yanliang567 Mar 2, 2024
yanliang567 added the triage/needs-information (Indicates an issue needs more information in order to work on it) label and removed the needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one) label Mar 2, 2024

flyBirdBoy commented Mar 13, 2024

I also encountered this problem today. Milvus is installed with Docker; after the server was powered off and restarted, the collection load could not be recovered when the Milvus Docker service was brought back up. The errors in the logs match the ones reported in this issue.
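As an aside, one way to surface the failure from the client side after such a restart is to release and re-load the collection and capture the server's error message. This is only a minimal diagnostic sketch (assuming a pymilvus client compatible with Milvus 2.3.x; endpoint and collection name are placeholders), not a fix:

```python
from pymilvus import connections, utility, Collection, MilvusException

# Placeholder connection details and collection name.
connections.connect(host="127.0.0.1", port="19530")
coll = Collection("my_collection")

try:
    # Clear any half-loaded state left over from the restart; harmless if already released.
    coll.release()
except MilvusException:
    pass

try:
    coll.load(timeout=600)
    utility.wait_for_loading_complete("my_collection", timeout=600)
    print("collection loaded")
except MilvusException as e:
    # The server-side reason (e.g. "channel not found") surfaces here, but the full
    # QueryCoord/QueryNode logs are still needed to find the root cause.
    print(f"load failed: {e}")
```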

stale bot commented Apr 13, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale bot added the stale (indicates no updates for 30 days) label Apr 13, 2024
@yanliang567
Contributor

Please provide the full Milvus pod logs for investigation, thanks.

stale bot removed the stale (indicates no updates for 30 days) label Apr 15, 2024

stale bot commented May 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale bot added the stale (indicates no updates for 30 days) label May 18, 2024