Elasticsearch index migration fails during snapshots #3101

techi602 · 2024-04-11T12:55:48Z

If you change the index schema and run php phing elasticsearch-export while a snapshot is in progress, the migration fails, because an index cannot be deleted during a snapshot. (AWS takes hourly snapshots and you cannot specify their exact timing, so the framework should be prepared for this scenario.)

{"error":{"root_cause":[{"type":"snapshot_in_progress_exception","reason":"Cannot delete indices that are being snapshotted:

https://github.com/shopsys/framework/blob/15.0/src/Component/Elasticsearch/IndexFacade.php#L228

This results in two indexes (old and new) pointing to the same alias.
This state breaks the Shopsys app for multiple reasons:
1 - running another export fails because multiple indexes exist for the alias. The old index then has to be deleted manually by devops.
2 - queries against the alias return duplicate and outdated results (because both indexes are queried)
3 - updates fail with the error: no write index is defined for alias [product_1]. The write index may be explicitly disabled using is_write_index=false or the alias points to multiple indices without one being designated as a write index
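The broken state described above can be detected before an export runs. The following is a minimal sketch, not the framework's code: it assumes the response shape of Elasticsearch's GET /_alias/<alias> endpoint (a map of index name to its aliases), and the index names in the example are hypothetical.

```python
from __future__ import annotations


def indexes_behind_alias(alias_response: dict, alias: str) -> list[str]:
    """Return the sorted names of all indexes that carry the given alias.

    `alias_response` is assumed to be the decoded JSON body of
    GET /_alias/<alias>, e.g. {"some_index": {"aliases": {"product_1": {}}}}.
    """
    return sorted(
        index_name
        for index_name, body in alias_response.items()
        if alias in body.get("aliases", {})
    )


def is_alias_ambiguous(alias_response: dict, alias: str) -> bool:
    """True when more than one index carries the alias (the broken state)."""
    return len(indexes_behind_alias(alias_response, alias)) > 1


# Hypothetical index names, for illustration only.
broken_state = {
    "product_1_old": {"aliases": {"product_1": {}}},
    "product_1_new": {"aliases": {"product_1": {}}},
}
healthy_state = {
    "product_1_new": {"aliases": {"product_1": {}}},
}
```

With these inputs, is_alias_ambiguous reports the two-index state as ambiguous and the single-index state as healthy, which could be used to fail fast with a clear message instead of the write-index error.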

To Reproduce

Change any index schema (Resources/definition/product/1.json)
Run php phing elasticsearch-export during a snapshot

Expected behavior

The safe way to handle Elasticsearch migrations during a snapshot is to delete the alias rather than the index (and delete the index safely later).
Unlike the index itself, the alias can be safely removed from the old index during a snapshot.
The aliases should also be swapped (alias removed from the old index and added to the new index) in a single request, so that there is never an inconsistent state.
Deleting old indexes should be handled separately and safely skipped while a snapshot is running.
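The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the framework's implementation: it only builds the request body for Elasticsearch's POST /_aliases endpoint, which applies its actions atomically, and classifies the error response from a later index deletion. The function names and index names are hypothetical; the error type string is the one quoted in this issue.

```python
from __future__ import annotations


def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that moves `alias` in one request.

    Both actions are sent together, so the alias never points to two
    indexes (or to none) in between.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def is_snapshot_in_progress(error_response: dict) -> bool:
    """True when a failed index deletion was rejected because of a snapshot.

    `error_response` is assumed to be the decoded JSON error body, shaped
    like the one quoted in this issue.
    """
    causes = error_response.get("error", {}).get("root_cause", [])
    return any(
        cause.get("type") == "snapshot_in_progress_exception"
        for cause in causes
    )


# Hypothetical index names, for illustration only.
swap_body = build_alias_swap("product_1", "product_1_old", "product_1_new")

delete_error = {
    "error": {
        "root_cause": [
            {
                "type": "snapshot_in_progress_exception",
                "reason": "Cannot delete indices that are being snapshotted",
            }
        ]
    }
}
```

A cleanup step could then treat is_snapshot_in_progress(delete_error) as "skip for now and retry later" rather than as a failed migration, since the alias already points only to the new index.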


pk16011990 · 2024-04-30T13:54:17Z

Hi, I understand the problem. The existing approach cannot be used in this infrastructure design.

However, as we don't currently have a similar infrastructure, it is challenging for us to address this effectively. We're fully committed to finding a solution that takes into account other requirements, such as failover capability and minimal deployment time.

We would be happy if you created a PR (I'll be glad to include it in the next version), or we can agree on a paid solution on our side.

pk16011990 added the "Help wanted" label (We need your help or opinion how to resolve this) on Apr 26, 2024
