[Bug]: [major compaction] Search failed with error "User Field(embeddings) is not loaded" when set vector field as clustering key field and enable partition key field together #32343

Open
1 task done
binbinlv opened this issue Apr 16, 2024 · 1 comment
Labels: kind/bug, stale, triage/accepted

binbinlv commented Apr 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: lru_dev branch, latest
- Deployment mode(standalone or cluster): both
- MQ type(rocksmq, pulsar or kafka): all
- SDK version(e.g. pymilvus v2.0.0rc2): dev latest
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

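Search fails with the error "User Field(embeddings) is not loaded" when a vector field is set as the clustering key and a partition key field is enabled on the same collection. In the reproduction script below, the failing search runs after major compaction has completed.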

RPC error: [search], <MilvusException: (code=65535, message=fail to search on QueryNode 1: worker(1) query failed:  => User Field(embeddings) is not loaded)>, <Time:{'RPC start': '2024-04-16 16:18:30.022587', 'RPC error': '2024-04-16 16:18:30.747710'}>
Traceback (most recent call last):
  File "/home/major_compaction/./prepare_data_insert_search_vector_clustring_after_index_load_cloud.py", line 134, in <module>
    res1 = hello_milvus.search(vectors[:nq], "embeddings", default_search_params, 10, "count >= 0")
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/orm/collection.py", line 799, in search
    resp = conn.search(
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/decorators.py", line 147, in handler
    raise e from e
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/decorators.py", line 143, in handler
    return func(*args, **kwargs)
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/decorators.py", line 182, in handler
    return func(self, *args, **kwargs)
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/decorators.py", line 122, in handler
    raise e from e
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/decorators.py", line 87, in handler
    return func(*args, **kwargs)
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 797, in search
    return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 738, in _execute_search
    raise e from e
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 731, in _execute_search
    check_status(response.status)
  File "/home/major_compaction/testenv/lib/python3.10/site-packages/pymilvus/client/utils.py", line 62, in check_status
    raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=65535, message=fail to search on QueryNode 1: worker(1) query failed:  => User Field(embeddings) is not loaded)>
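
For triage it can help to pull the QueryNode ID and the field name out of the error message, for example when scanning many test logs for the same failure. The following is a minimal sketch that works on the message string only; the helper name `parse_not_loaded` and the regular expression are assumptions made for illustration and are not part of pymilvus or Milvus.

```python
import re
from typing import NamedTuple, Optional


class NotLoadedError(NamedTuple):
    """Fields extracted from a 'User Field(...) is not loaded' message."""
    query_node: int
    field: str


# Hypothetical pattern, written against the message text quoted in this issue.
_PATTERN = re.compile(
    r"fail to search on QueryNode (?P<node>\d+):.*?"
    r"User Field\((?P<field>[^)]+)\) is not loaded"
)


def parse_not_loaded(message: str) -> Optional[NotLoadedError]:
    """Return the QueryNode ID and field name, or None if the message differs."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return NotLoadedError(int(match.group("node")), match.group("field"))


# Message text copied from the traceback above.
message = (
    "fail to search on QueryNode 1: worker(1) query failed:  "
    "=> User Field(embeddings) is not loaded"
)
print(parse_not_loaded(message))
```

The message format is not a documented contract, so a pattern like this may need adjusting for other Milvus versions.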

Expected Behavior

Search succeeds.

Steps To Reproduce

import os
import time
import random
import string
import numpy as np
from pymilvus import (
    connections,
    utility,
    FieldSchema, CollectionSchema, DataType,
    Collection,
)

fmt = "\n=== {:30} ===\n"
dim = 128

print(fmt.format("start connecting to Milvus"))
host = os.environ.get('MILVUS_HOST')
if host is None:
    host = ""
print(fmt.format(f"Milvus host: {host}"))
connections.connect()

default_fields = [
    FieldSchema(name="count", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="key", dtype=DataType.INT64, is_partition_key=True),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="var", dtype=DataType.VARCHAR, max_length=10000, is_primary=False),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim, is_clustering_key=True)
]
default_schema = CollectionSchema(fields=default_fields, description="test clustering-key collection")
collection_name = "major_compaction_collection_enable_scalar_clustering_key_after_index"

if utility.has_collection(collection_name):
    collection = Collection(name=collection_name)
    collection.drop()
    print("drop the original collection")
hello_milvus = Collection(name=collection_name, schema=default_schema)

print("Starting major compaction")
start = time.time()
hello_milvus.compact(is_major=True)
res = hello_milvus.get_compaction_state(is_major=True)
print(res)
print("Waiting for major compaction complete")
hello_milvus.wait_for_compaction_completed(is_major=True)
end = time.time()
print("Major compaction complete in %f s" %(end - start))
res = hello_milvus.get_compaction_state(is_major=True)
print(res)


nb = 1000

rng = np.random.default_rng(seed=19530)
random_data = rng.random(nb).tolist()

vec_data = [[random.random() for _ in range(dim)] for _ in range(nb)]
_len = int(20)
_str = string.ascii_letters + string.digits
_s = _str
print("_str size ", len(_str))

for i in range(int(_len / len(_str))):
    _s += _str
    print("append str ", i)
values = [''.join(random.sample(_s, _len - 1)) for _ in range(nb)]
index = 0
while index < 100:
    # insert data
    data = [
        [index * nb + i for i in range(nb)],
        [random.randint(0,100) for i in range(nb)],
        random_data,
        values,
        vec_data,
    ]
    start = time.time()
    res = hello_milvus.insert(data)
    end = time.time() - start
    print("insert %d %d done in %f" % (index, nb, end))
    index += 1
    hello_milvus.flush()

print(f"Number of entities in Milvus: {hello_milvus.num_entities}")  # check num_entities

# create index
print(fmt.format("Start Creating index AUTOINDEX"))
index = {
    "index_type": "AUTOINDEX",
    "metric_type": "L2",
    "params": {},
}

print("creating index")
hello_milvus.create_index("embeddings", index)
print("waiting for index completed")
utility.wait_for_index_building_complete(collection_name)
res = utility.index_building_progress(collection_name)
print(res)

print(fmt.format("Load"))
hello_milvus.load()

res = utility.get_query_segment_info(collection_name)

print("before major compaction")
print(res)

# major compact

print("Starting major compaction")
start = time.time()
hello_milvus.compact(is_major=True)
res = hello_milvus.get_compaction_state(is_major=True)
print(res)
print("Waiting for major compaction complete")
hello_milvus.wait_for_compaction_completed(is_major=True)
end = time.time()
print("Major compaction complete in %f s" %(end - start))
res = hello_milvus.get_compaction_state(is_major=True)
print(res)

res = utility.get_query_segment_info(collection_name)
print("after major compaction")
print(res)

nb = 1
vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]

nq = 1

default_search_params = {"metric_type": "L2", "params": {}}
res1 = hello_milvus.search(vectors[:nq], "embeddings", default_search_params, 10, "count >= 0")

print(res1[0].ids)
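
The schema in the script above combines two settings: a partition key field (`key`) and a vector field used as the clustering key (`embeddings`). The sketch below flags that combination in a schema described as plain dictionaries, which may be useful when checking which test collections could hit this failure. The function name `matches_reported_combo`, the dictionary layout, and the `VECTOR_DTYPES` set are assumptions for illustration only; none of them is pymilvus API. The issue does not establish whether either setting alone triggers the error.

```python
from typing import Dict, List

# Illustrative subset of vector type names (assumption, not exhaustive).
VECTOR_DTYPES = {"FLOAT_VECTOR", "BINARY_VECTOR"}


def matches_reported_combo(fields: List[Dict]) -> bool:
    """True if the schema has a partition key AND a vector clustering key."""
    has_partition_key = any(f.get("is_partition_key", False) for f in fields)
    has_vector_clustering_key = any(
        f.get("is_clustering_key", False) and f.get("dtype") in VECTOR_DTYPES
        for f in fields
    )
    return has_partition_key and has_vector_clustering_key


# Mirrors default_fields from the reproduction script, as plain dicts.
reported_fields = [
    {"name": "count", "dtype": "INT64", "is_primary": True},
    {"name": "key", "dtype": "INT64", "is_partition_key": True},
    {"name": "random", "dtype": "DOUBLE"},
    {"name": "var", "dtype": "VARCHAR"},
    {"name": "embeddings", "dtype": "FLOAT_VECTOR", "is_clustering_key": True},
]

# Same schema with the partition key flag removed.
no_partition_key = [
    {k: v for k, v in f.items() if k != "is_partition_key"}
    for f in reported_fields
]

print(matches_reported_combo(reported_fields))
print(matches_reported_combo(no_partition_key))
```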

Milvus Log

https://grafana-4am.zilliz.cc/explore?orgId=1&panes=%7B%22h8f%22:%7B%22datasource%22:%22vhI6Vw67k%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D%5C%22devops%5C%22,namespace%3D%5C%22chaos-testing%5C%22,pod%3D~%5C%22major-compaction-eaqtp.%2A%5C%22%7D%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22vhI6Vw67k%22%7D%7D%5D,%22range%22:%7B%22from%22:%22now-1h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1

Anything else?

collection name: major_compaction_collection_enable_scalar_clustering_key_after_index

binbinlv added the kind/bug and triage/accepted labels Apr 16, 2024
binbinlv added this to the 2.4.1 milestone Apr 16, 2024

stale bot commented May 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale bot added the stale label May 18, 2024