RFC: different Elastic configurations #2821

Draft
wants to merge 1 commit into
base: 13.0

@grossmannmartin (Member) commented Sep 20, 2023

| Q | A |
| --- | --- |
| Description, reason for the PR | This PR is an RFC (Request for Comments) and is not intended to be merged. |
| New feature | Yes/No |
| BC breaks | Yes/No |
| Fixes issues | ... |
| Have you read and signed our License Agreement for contributions? | Yes/No |

I have tried to present several different ways to configure Elasticsearch indices. Currently, we use a separate JSON file for each domain. This causes several problems:

  • it's tedious to add a new field (the number of files to update grows with the number of domains)
  • there is a lot of copy-pasted code
  • a change to a field is easily forgotten in one of the files
  • due to the file names, it's hard to find the right one (see Unfriendly filepath for Elasticsearch configuration #1286)
  • it's not possible to set different index settings for different environments

I have these proposals:

1. Use a single YAML file for configuration

See project-base/app/src/Resources/definition/product.yaml

The configuration is split into three main parts:

  • index
    • settings for the index (number of shards, number of replicas, ...)
  • analysis
    • analysis settings: analyzers, filters, tokenizers, ...
  • mappings
    • field mappings: which fields we use and with what types

Only one file per index type is used. Differences between domains/locales/environments can be expressed with an @something suffix.
For example, the analysis may differ between languages, so

analysis@locale-en:
    ....

will be used for every domain with English locale, while

analysis@domain-1:
    ....

will be used only for the first domain.

The @ suffixes are allowed on the root configurations (index, analysis, mappings) and on individual fields in the mappings configuration:

mappings:
    searching_names:
        type: text
    searching_names@domain-1:
        type: text
        analyzer: stemming

Suffixes are resolved in this priority order: domain, then locale, then environment, then no suffix. So a domain-specific setting takes priority over a locale-specific one, and so on.
Configurations are always overwritten, never merged: when the same configuration exists for both the domain and the locale, only the domain one is used and nothing from the locale one is merged in.
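
The resolution rules above can be sketched roughly as follows. This is only an illustration in Python, not the implementation from this PR: the function name `resolve`, its parameters, and the `environment-...` suffix spelling are assumptions (the RFC only shows `@domain-1` and `@locale-en`), and the input dictionary stands in for the parsed YAML file. Recursing into every nested dictionary is also a simplification; the RFC allows suffixes only on the root sections and on fields in mappings.

```python
def resolve(config, domain_id, locale, environment):
    """Collapse every "name@suffix" key to "name" for one domain (hypothetical helper)."""
    # Candidate suffixes from highest to lowest priority; None means "no suffix".
    priority = [
        f"domain-{domain_id}",
        f"locale-{locale}",
        f"environment-{environment}",  # assumed spelling, not shown in the RFC
        None,
    ]

    # Group the variants of each base key, e.g.
    # {"analysis": {None: {...}, "locale-en": {...}}}
    variants = {}
    for key, value in config.items():
        base, _, suffix = key.partition("@")
        variants.setdefault(base, {})[suffix or None] = value

    resolved = {}
    for base, by_suffix in variants.items():
        for suffix in priority:
            if suffix in by_suffix:
                value = by_suffix[suffix]
                # The winning variant replaces the others entirely (no merging).
                # We recurse only to resolve suffixed keys nested inside it.
                if isinstance(value, dict):
                    value = resolve(value, domain_id, locale, environment)
                resolved[base] = value
                break
    return resolved


config = {
    "analysis": {"analyzer": {"default": {"type": "standard"}}},
    "analysis@locale-en": {"analyzer": {"default": {"type": "english"}}},
    "mappings": {
        "searching_names": {"type": "text"},
        "searching_names@domain-1": {"type": "text", "analyzer": "stemming"},
    },
}

domain_1 = resolve(config, domain_id=1, locale="en", environment="prod")
domain_2 = resolve(config, domain_id=2, locale="cs", environment="prod")
```

Here `domain_1` gets the English analysis and the stemming analyzer on `searching_names`, while `domain_2` falls back to the unsuffixed variants. A key that exists only with a non-matching suffix is simply dropped for that domain.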

Since YAML can easily be converted to JSON, almost any configuration that is in the current index definitions can still be expressed. YAML, however, lets us use a single (and more readable) file while still allowing differences between domains.

2. Use a PHP configuration

This idea uses PHP for the configuration, which expects some kind of builder.
We can use a builder object to add fields (addFields) and to set analyzers and other configuration.

We can use plain PHP to adjust the configuration based on the domain/environment/locale.

The configuration example is available in the project-base/app/src/Resources/definition/ProductIndexMapping.php file.

This configuration may be a little verbose, and it may be harder to add uncommon configurations.
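
To make the builder idea more concrete, here is a rough sketch of what such an API could look like. It is written in Python purely for illustration; the real proposal is PHP, and the class `IndexMappingBuilder`, its methods, and `build_product_index` are all hypothetical names that do not come from ProductIndexMapping.php. The output follows the usual Elasticsearch create-index body with `settings` and `mappings.properties`.

```python
class IndexMappingBuilder:
    """Hypothetical fluent builder for an index definition."""

    def __init__(self):
        self._settings = {}
        self._analysis = {}
        self._properties = {}

    def set_setting(self, name, value):
        self._settings[name] = value
        return self  # returning self allows method chaining

    def add_analyzer(self, name, definition):
        self._analysis.setdefault("analyzer", {})[name] = definition
        return self

    def add_field(self, name, field_type, **options):
        self._properties[name] = {"type": field_type, **options}
        return self

    def build(self):
        return {
            "settings": {"index": self._settings, "analysis": self._analysis},
            "mappings": {"properties": self._properties},
        }


def build_product_index(domain_id, environment):
    """Ordinary control flow expresses domain/environment differences."""
    builder = (
        IndexMappingBuilder()
        .set_setting("number_of_shards", 1)
        .set_setting("number_of_replicas", 0 if environment == "dev" else 1)
        .add_field("name", "text")
    )
    if domain_id == 1:
        # Only the first domain gets the stemming analyzer (illustrative rule).
        builder.add_analyzer(
            "stemming",
            {"type": "custom", "tokenizer": "standard", "filter": ["lowercase", "porter_stem"]},
        )
        builder.add_field("searching_names", "text", analyzer="stemming")
    else:
        builder.add_field("searching_names", "text")
    return builder.build()
```

The sketch also shows the trade-off mentioned above: every setting needs a call, and anything the builder has no method for requires extending the builder first.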

3. Use Attributes on the object

We may use a class, which could then be used in the productExportRepository instead of the array.
Configuration for Elasticsearch could be added as attributes on the fields.
That way we ensure that everything we export is properly mapped.
It's a concept similar to the one we use to configure Doctrine.

An example is available in project-base/app/src/Model/Product/Elasticsearch/ElasticProduct.php

It may be harder to add an uncommon configuration (a new attribute would be needed).
I currently have no idea how to implement differences between domains/locales.
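
As a loose analogy for the attribute approach, the sketch below attaches mapping metadata to the fields of a Python dataclass and derives the mapping from it. PHP attributes work differently in detail, and the helper `es`, the function `mapping_for`, and the fields shown are assumptions for illustration, not the contents of ElasticProduct.php. It does not address per-domain or per-locale differences, which remain an open question.

```python
from dataclasses import asdict, dataclass, field, fields


def es(field_type, **options):
    """Hypothetical helper: attach an Elasticsearch mapping to a dataclass field."""
    return field(metadata={"es": {"type": field_type, **options}})


@dataclass
class ElasticProduct:
    # Each exported field declares its own mapping next to its definition.
    id: int = es("integer")
    name: str = es("text")
    searching_names: str = es("text", analyzer="stemming")


def mapping_for(cls):
    """Derive the "mappings" section from the field metadata."""
    return {"properties": {f.name: f.metadata["es"] for f in fields(cls)}}


product = ElasticProduct(id=1, name="Phone", searching_names="phone mobile")
exported = asdict(product)              # the document sent to the index
mapping = mapping_for(ElasticProduct)   # the mapping generated from the same class
```

Because the exported document and the mapping come from the same class, their keys cannot drift apart, which is the guarantee the proposal is after.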


@TomasLudvik (Member) commented

Hi, I have looked at this and think the best way might be a combination of JSON or YAML and PHP. I would like most of the configuration to stay as it is, but in a single file shared by all domains, with optional PHP files holding the changes for particular languages/domains. This would be similar to the frontend API, where we have types.yaml and then do some remapping in Mapper classes.
