Elasticsearch index migration fails during snapshots #3101

Open
techi602 opened this issue Apr 11, 2024 · 1 comment
Labels
Help wanted We need your help or opinion how to resolve this

Comments

@techi602
Contributor

If you change the index schema and run php phing elasticsearch-export while a snapshot is in progress, the migration fails, because an index cannot be deleted while it is being snapshotted. (AWS takes hourly snapshots and you cannot choose their exact timing, so the framework should be prepared for this scenario.)

{"error":{"root_cause":[{"type":"snapshot_in_progress_exception","reason":"Cannot delete indices that are being snapshotted:

https://github.com/shopsys/framework/blob/15.0/src/Component/Elasticsearch/IndexFacade.php#L228

This results in two indexes (old and new) pointing to the same alias.
This state breaks the Shopsys app in several ways:
1 - running another export fails because multiple indexes exist. The old index then has to be deleted manually by devops.
2 - queries to the alias return duplicate and outdated results (because both indexes are queried)
3 - updates fail with the error: no write index is defined for alias [product_1]. The write index may be explicitly disabled using is_write_index=false or the alias points to multiple indices without one being designated as a write index
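
To check whether a cluster is already in this state, one option is to inspect the response of `GET /_alias/product_1`, which is keyed by index name. The sketch below only parses such a response. The helper names and the sample index names (`product_1_old`, `product_1_new`) are made up for illustration and are not part of the framework.

```python
def indices_behind_alias(alias_response):
    """Return the index names found in a parsed GET /_alias/<alias> response."""
    return sorted(alias_response.keys())


def write_index(alias_response, alias):
    """Return the index that would receive writes through the alias, or None.

    An index is used for writes if it is the only one marked
    is_write_index=true, or if it is the only index behind the alias and
    is_write_index was not explicitly set to false.
    """
    explicit = [
        name
        for name, info in alias_response.items()
        if info.get("aliases", {}).get(alias, {}).get("is_write_index") is True
    ]
    if len(explicit) == 1:
        return explicit[0]
    if len(alias_response) == 1:
        name, info = next(iter(alias_response.items()))
        if info.get("aliases", {}).get(alias, {}).get("is_write_index") is not False:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical response after a failed migration: two indexes share the alias.
    broken = {
        "product_1_old": {"aliases": {"product_1": {}}},
        "product_1_new": {"aliases": {"product_1": {}}},
    }
    print(indices_behind_alias(broken))
    print(write_index(broken, "product_1"))
```

More than one index in the first result, together with `None` from the second, corresponds to symptoms 2 and 3 above.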

To Reproduce

  1. Change any index schema (Resources/definition/product/1.json)
  2. Run php phing elasticsearch-export while a snapshot is in progress

Expected behavior

The safe way to handle Elasticsearch migrations during a snapshot is to delete the alias rather than the index, and to delete the index later.
Unlike an index, an alias can be safely removed from the old index while a snapshot is running.
The aliases should also be swapped (removed from the old index, added to the new index) in the same request, so that there is never an inconsistent state.
Deletion of old indexes should be handled separately and safely skipped during snapshots.
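
A minimal sketch of the proposed behavior, assuming the swap goes through the Elasticsearch `POST /_aliases` endpoint, which applies all listed actions atomically. The function names and index names are hypothetical, and this is not how IndexFacade currently works. The second helper shows one way to recognize the snapshot error shown above, so that deletion of the old index can be skipped and retried later instead of failing the migration.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves the alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def is_snapshot_in_progress(error_body):
    """Return True if a parsed error response reports snapshot_in_progress_exception."""
    error = error_body.get("error", {})
    if not isinstance(error, dict):
        # Some error responses carry a plain string instead of an object.
        return False
    types = {cause.get("type") for cause in error.get("root_cause", [])}
    types.add(error.get("type"))
    return "snapshot_in_progress_exception" in types


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    body = build_alias_swap("product_1", "product_1_old", "product_1_new")
    print(json.dumps(body, indent=2))
```

With this split, the export would succeed as soon as the alias points to the new index, and a separate cleanup step could call `DELETE` on the old index and treat `is_snapshot_in_progress(...)` as "try again later".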

@pk16011990 pk16011990 added the Help wanted We need your help or opinion how to resolve this label Apr 26, 2024
@pk16011990
Member

Hi, I understand the problem. The existing approach cannot be used in this infrastructure design.

However, as we don't currently have a similar infrastructure, we're finding it challenging to address this effectively. We're fully committed to finding a solution that takes into account other requirements, such as failover capability or minimum deployment time.

We would be happy if you created a PR (I'll be glad to include it in the next version), or we can agree on a paid solution on our side.
