[Bug]: newly inserted data does not display the correct segment row numbers #32161

Open
1 task done
douglarek opened this issue Apr 11, 2024 · 10 comments
Labels
kind/bug (issues or changes related to a bug), stale (no updates for 30 days), triage/needs-information (needs more information before it can be worked on)

Comments

@douglarek

douglarek commented Apr 11, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.3.13
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):    kafka
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: enough
- GPU: 
- Others:

Current Behavior

After executing a flush in Attu, the segments show no data even though the data has been written to disk.

image

But it also shows that the actual number of inserted entries is 445,000.

image

And what's even more absurd is that although it clearly states 1-22 data entries, only 16 are actually displayed.

image

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

I'm not sure whether this is an issue with Attu (version 2.3.10) or not. With both Attu 2.3.8 and 2.3.10, the segment section always displays 0 rows.

@douglarek added the kind/bug and needs-triage labels Apr 11, 2024
@douglarek changed the title from "[Bug]: newly inserted data cannot be written to disk" to "[Bug]: newly inserted data does not display the correct segment row numbers" Apr 11, 2024
@yanliang567
Contributor

/assign @shanghaikid
Please help check whether this is an Attu issue.
/unassign

@yanliang567 added the triage/needs-information label and removed the needs-triage label Apr 11, 2024
@xiaofan-luan
Contributor

@douglarek I suspect this is more likely a problem with how you are using Milvus.
Could you share the code showing how you implemented it?

@douglarek
Author

douglarek commented Apr 12, 2024

> Could you share the code showing how you implemented it?

Which part of the code would you like me to share?

By the way, I have a collection of 500,000 vectors. Loading and releasing it is not an issue, but the segment display is problematic.

image

Is it normal that a collection of 500,000 vectors doesn't generate any segments?

image

And next, how should I go about debugging to identify where the issue lies?

@douglarek
Author

douglarek commented Apr 12, 2024

Update: It is now essentially confirmed that the issue is not related to Attu. After I killed the querynode with excessively high memory usage, the displayed count of loaded collections dropped straight to 0, making retrieval impossible. I'm curious whether high-dimensional collections have ever been tested at a large data scale.

@xiaofan-luan
Contributor

@douglarek
You have to flush manually, wait 24 hours, or fill a whole segment before the rows show up there.
This does not affect your searches, though.
If you want the exact number of entities in the collection, use count.
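
As an illustration of the count suggestion (not part of the original comment), here is a minimal pymilvus sketch. It assumes the ORM-style `Collection` API, a loaded collection, and a Milvus endpoint at `localhost:19530`; the helper names, host, and port are hypothetical placeholders.

```python
def exact_entity_count(collection):
    """Return the exact number of entities via a count(*) query.

    `collection` is expected to behave like a loaded pymilvus `Collection`.
    """
    rows = collection.query(expr="", output_fields=["count(*)"])
    return rows[0]["count(*)"]


def connect_and_count(name, host="localhost", port="19530"):
    """Connect, load the collection, and count its entities.

    host/port are assumed defaults; adjust them for your deployment.
    """
    # Imported lazily so exact_entity_count can be used without pymilvus installed.
    from pymilvus import Collection, connections

    connections.connect(alias="default", host=host, port=port)
    collection = Collection(name)
    collection.load()  # count(*) queries need the collection to be loaded
    return exact_entity_count(collection)
```

Calling `connect_and_count("my_collection")` (a placeholder name) should return the count as seen by the query path, independent of the per-segment row numbers shown in a UI.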

@douglarek
Author

> @douglarek You have to flush manually, wait 24 hours, or fill a whole segment before the rows show up there. This does not affect your searches, though. If you want the exact number of entities in the collection, use count.

Why do I have to flush manually? Since Milvus 2.x, the official recommendation has been not to flush manually. And what is the mechanism behind waiting 24 hours?

@xiaofan-luan
Contributor

If you want to check the correct number of entities, a flush can help.
A better way is to use count.

@douglarek
Author

> If you want to check the correct number of entities, a flush can help. A better way is to use count.

Please take a look at the first image I posted in this issue. Even though I flushed it and it shows how many segments there are, each segment has zero rows, which is unsettling.

@xiaofan-luan
Contributor

What number do you get if you use the pymilvus client?

If it is also 0, could you use birdwatcher to check how many entities are recorded in the metadata of your segments?
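
One way to run the pymilvus side of this check is sketched below (an illustration, not part of the original comment). It assumes `utility.get_query_segment_info` from the pymilvus ORM API, whose entries expose `segmentID` and `num_rows`; the helper names and the `localhost:19530` endpoint are hypothetical. Note that this reports segments as seen by the query nodes, which is not the same as the stored segment metadata that birdwatcher inspects.

```python
def summarize_segments(segments):
    """Return (total_rows, ids_of_segments_reporting_zero_rows)."""
    total_rows = sum(seg.num_rows for seg in segments)
    empty_ids = [seg.segmentID for seg in segments if seg.num_rows == 0]
    return total_rows, empty_ids


def compare_counts(name, host="localhost", port="19530"):
    """Compare a count(*) query with the row numbers reported per segment.

    host/port are assumed defaults; adjust them for your deployment.
    """
    # Lazy import: requires pymilvus and a reachable Milvus instance.
    from pymilvus import Collection, connections, utility

    connections.connect(alias="default", host=host, port=port)
    collection = Collection(name)
    collection.load()

    counted = collection.query(expr="", output_fields=["count(*)"])[0]["count(*)"]
    segments = utility.get_query_segment_info(name)
    total_rows, empty_ids = summarize_segments(segments)
    return {
        "count(*)": counted,
        "segment_rows": total_rows,
        "empty_segments": empty_ids,
    }
```

If `count(*)` is non-zero while `segment_rows` is 0, the discrepancy lies in the reported segment statistics rather than in the UI, which would make the metadata check with birdwatcher the natural next step.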


stale bot commented May 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale added the stale label May 18, 2024