Text_expansion from InferenceConfig is not supported upon creating a Pipeline #667

Open
adrianpbv opened this issue Sep 13, 2023 · 2 comments
Labels
Area: Specification Related to the API spec used to generate client code


adrianpbv commented Sep 13, 2023

Java API client version

8.9.1

Java version

8

Elasticsearch Version

8.9.1

Problem description

I'm trying to create a pipeline that ingests data with the ELSER model. When the pipeline JSON is loaded into a PutPipelineRequest, InferenceConfig does not accept the "text_expansion" value from the JSON file, and an error is thrown:

co.elastic.clients.json.JsonpMappingException: Error deserializing co.elastic.clients.elasticsearch.ingest.InferenceConfig: Unknown field 'text_expansion' (JSON path: processors[0].inference.inference_config.text_expansion)

It seems that InferenceConfig only supports the regression and classification types. Because the ELSER model was introduced only recently, text_expansion is missing from Java API client v8.9.1, even though that client is supposed to cover the features released in Elasticsearch v8.9.1.
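
To make the failure mode concrete, here is a simplified model of how a strict tagged-union deserializer behaves. It is not the client's actual code: the class name, method name, and the set of known variants are assumptions based only on the observation above and on the error message quoted in this report.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a strict deserializer; not part of the Java API client.
class StrictVariantCheck {

    // Assumption: the only variants the 8.9.1 client recognizes, per the observation above.
    static final Set<String> KNOWN_VARIANTS =
            new LinkedHashSet<>(Arrays.asList("regression", "classification"));

    // Rejects any variant name that is not in the known set, mimicking the reported message.
    static void requireKnownVariant(String variant, String jsonPath) {
        if (!KNOWN_VARIANTS.contains(variant)) {
            throw new IllegalArgumentException(
                    "Unknown field '" + variant + "' (JSON path: " + jsonPath + "." + variant + ")");
        }
    }

    public static void main(String[] args) {
        String path = "processors[0].inference.inference_config";
        requireKnownVariant("regression", path); // accepted
        try {
            requireKnownVariant("text_expansion", path); // rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, any variant added to Elasticsearch after the client's specification was last updated is rejected on the client side before the request is ever sent.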

The code that creates the pipeline programmatically in Java is:

InputStream input = new ClassPathResource("elser_pipeline_config.json").getInputStream();

PutPipelineRequest request = PutPipelineRequest.of(pl -> pl
        .id("my_pipeline_test")
        .withJson(input)
);
elasticsearchClient.ingest().putPipeline(request);

// elser_pipeline_config.json :
{
  "description":"My test elser pipeline",
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml",
        "field_map": { 
          "text": "text_field"
        },
        "inference_config": {
          "text_expansion": { 
            "results_field": "tokens"
          }
        }
      }
    }
  ]
}

As a temporary solution, I created this workaround in case someone runs into the same error.
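
For readers who cannot open the linked workaround, one possible way to sidestep the typed deserialization is to send the pipeline JSON directly to the Elasticsearch REST endpoint `PUT /_ingest/pipeline/<id>`. This is a minimal sketch and not necessarily the approach used in the link. The class and method names are hypothetical, the base URL is an assumption, and it uses only the JDK so that it works on Java 8. A secured cluster would also need TLS trust configuration and a real `Authorization` header value.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the Elasticsearch Java API client.
class RawPipelinePut {

    // Builds the ingest pipeline endpoint. Assumes the pipeline id is already URL-safe.
    static String pipelineUrl(String baseUrl, String pipelineId) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return trimmed + "/_ingest/pipeline/" + pipelineId;
    }

    // Sends the JSON body unchanged and returns the HTTP status code.
    // 'authorization' may be null for an unsecured test cluster.
    static int putPipeline(String baseUrl, String pipelineId, String json, String authorization)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(pipelineUrl(baseUrl, pipelineId)).openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            if (authorization != null) {
                conn.setRequestProperty("Authorization", authorization);
            }
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same pipeline definition as elser_pipeline_config.json in this report.
        String json = "{\"description\":\"My test elser pipeline\",\"processors\":[{\"inference\":{"
                + "\"model_id\":\".elser_model_1\",\"target_field\":\"ml\","
                + "\"field_map\":{\"text\":\"text_field\"},"
                + "\"inference_config\":{\"text_expansion\":{\"results_field\":\"tokens\"}}}}]}";

        // Assumption: pass the cluster base URL as the first argument to actually send the request.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:9200";
        System.out.println("Target: " + pipelineUrl(baseUrl, "my_pipeline_test"));
        if (args.length > 0) {
            System.out.println("HTTP status: " + putPipeline(baseUrl, "my_pipeline_test", json, null));
        }
    }
}
```

Because the body is never mapped onto InferenceConfig, the server decides whether `text_expansion` is valid. The trade-off is that the request loses the client's type checking and response parsing.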


wnm3 commented Apr 8, 2024

I have a similar problem when trying to use the intfloat__multilingual-e5-base model, which requires an inferenceConfig for text_embedding. See this thread for details.

Thank you for the workaround.

@l-trotta l-trotta added the Area: Specification Related to the API spec used to generate client code label Apr 23, 2024
l-trotta (Contributor) commented

Hello, thank you for reporting this. The definitions of both InferencePipelineAggregation and InferenceConfig seem outdated; we will update them in the API specification used to generate the Java code. Once that is fixed, the Java client code will be regenerated to resolve this.
