Cannot append fields of type "dense_vector" #658

Open
walkingmug opened this issue Feb 1, 2024 · 0 comments

Description:
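Appending a pandas DataFrame whose column is mapped as "dense_vector" to an existing Elasticsearch index that already has the same field type fails with "TypeError: unhashable type: 'dict'". The first upload, which creates the index, succeeds; the error occurs only on the second call, when the index already exists.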

Reproduction:

  1. Install requirements:
    pip install elasticsearch eland pandas numpy
  2. Imports:
    from elasticsearch import Elasticsearch
    import eland as ed
    import pandas as pd
    import numpy as np
  3. Connect to Elasticsearch:
    client = Elasticsearch(HOST, timeout=120)
  4. Create vector dataframes:
    vector1 = np.random.rand(512)
    vector2 = np.random.rand(512)
    df_1 = pd.DataFrame({
        'vector_column': [vector1, vector2]
    })

    vector3 = np.random.rand(512)
    vector4 = np.random.rand(512)
    df_2 = pd.DataFrame({
        'vector_column': [vector3, vector4]
    })
  5. ✅ Upload first dataframe:
    # upload df_1 to elasticsearch
    ed.pandas_to_eland(
      pd_df=df_1,
      es_client=client,
      es_dest_index='test-upload',
      es_if_exists="append",
      es_refresh=True,
      es_type_overrides={
          "vector_column": {
              "type": "dense_vector",
              "dims": 512,
              "index": True,
              "similarity": "cosine"
          },
      },
      chunksize=100
    )
  6. ❌ Append second dataframe to first dataframe:
    # upload df_2 to elasticsearch
    ed.pandas_to_eland(
      pd_df=df_2,
      es_client=client,
      es_dest_index='test-upload',
      es_if_exists="append",
      es_refresh=True,
      es_type_overrides={
          "vector_column": {
              "type": "dense_vector",
              "dims": 512,
              "index": True,
              "similarity": "cosine"
          },
      },
      chunksize=100
    )

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-b0e5aa8d561e> in <cell line: 2>()
      1 # upload df_2 to elasticsearch
----> 2 ed.pandas_to_eland(
      3   pd_df=df_2,
      4   es_client=client,
      5   es_dest_index='test-upload',

1 frames
/usr/local/lib/python3.10/dist-packages/eland/field_mappings.py in verify_mapping_compatibility(ed_mapping, es_mapping, es_type_overrides)
    919         key_type = es_type_overrides.get(key, key_def["type"])
    920         es_key_type = es_props[key]["type"]
--> 921         if key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
    922             key_type, ()
    923         ):

TypeError: unhashable type: 'dict'
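
Possible cause, judging only from the traceback lines above: when an entry in es_type_overrides is a dict (as it is for "dense_vector"), key_type is that dict, and passing it to ES_COMPATIBLE_TYPES.get() fails because a dict cannot be used as a dictionary key. The sketch below reproduces the comparison without Elasticsearch or eland. The contents of ES_COMPATIBLE_TYPES and the "object" default are stand-ins, and override_type is a hypothetical helper showing one way the lookup could be made to tolerate dict overrides; it is not eland's actual fix.

```python
# Stand-in for eland's ES_COMPATIBLE_TYPES. The real contents are not shown in
# the traceback, so this mapping is an assumption for illustration only.
ES_COMPATIBLE_TYPES = {"long": ("integer", "short"), "double": ("float",)}

# Same override as in the reproduction steps.
es_type_overrides = {
    "vector_column": {
        "type": "dense_vector",
        "dims": 512,
        "index": True,
        "similarity": "cosine",
    }
}

# Type string as it would be stored in the existing index mapping.
es_key_type = "dense_vector"


def is_incompatible(key_type, es_key_type):
    """Mirror the comparison shown on line 921 of the traceback."""
    return key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
        key_type, ()
    )


def override_type(override):
    """Hypothetical helper: reduce a dict override to its "type" string."""
    return override["type"] if isinstance(override, dict) else override


# "object" stands in for key_def["type"], which the traceback does not show.
key_type = es_type_overrides.get("vector_column", "object")

# With the raw dict override, the lookup raises TypeError.
try:
    is_incompatible(key_type, es_key_type)
    error = None
except TypeError as exc:
    error = str(exc)
print("raw override:", error)

# With the override reduced to its type string, the comparison completes.
incompatible = is_incompatible(override_type(key_type), es_key_type)
print("normalized override incompatible:", incompatible)
```

If this reading is right, it would also explain why the first upload succeeds: the compatibility check only has something to compare against once the index already exists.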