Very slow to do a dry-run with many indexes #1631

Open
jgough opened this issue Dec 14, 2021 · 2 comments

Comments

@jgough

jgough commented Dec 14, 2021

I'm finding that running Curator with --dry-run can be extremely slow.

As far as I can tell, this is because my cluster has many indexes and it takes roughly 10s to get the full list. Curator fetches this list before EVERY action it attempts, so with 100 actions the dry-run takes over 15 minutes, most of which is spent querying Elasticsearch for the same list of indexes over and over.

In normal mode this makes sense, as previous actions can have knock-on effects on later actions. In --dry-run mode, however, it might make a lot more sense to cache the list of indexes instead of repeatedly querying Elasticsearch for it. It would be a huge speedup and would make testing Curator actions this way less of a chore.
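
A minimal sketch of the caching idea, assuming the index list is obtained through a single slow call. `IndexListProvider` and the `fetch` callable are hypothetical names for illustration, not Curator internals:

```python
from typing import Callable, List, Optional


class IndexListProvider:
    """Hand out the index list, refetching on every call unless dry_run is set.

    Hypothetical helper for illustration; not part of Curator.
    """

    def __init__(self, fetch: Callable[[], List[str]], dry_run: bool) -> None:
        self._fetch = fetch  # assumed: the slow call that lists all indices
        self._dry_run = dry_run
        self._cached: Optional[List[str]] = None

    def get(self) -> List[str]:
        if not self._dry_run:
            # Normal mode: earlier actions may have changed the cluster,
            # so always ask again.
            return self._fetch()
        if self._cached is None:
            # Dry run: fetch once, then reuse for every later action.
            self._cached = self._fetch()
        # Return a copy so one action's filtering cannot alter the cache.
        return list(self._cached)


if __name__ == "__main__":
    fetch_calls = []

    def fake_fetch() -> List[str]:
        # Stand-in for the real (slow) request to Elasticsearch.
        fetch_calls.append(1)
        return ["logs-2021.12.13", "logs-2021.12.14"]

    provider = IndexListProvider(fake_fetch, dry_run=True)
    for _ in range(100):  # 100 actions, each asking for the index list
        provider.get()
    print(len(fetch_calls))  # number of real fetches performed
```

With `dry_run=False` the provider falls through to `fetch` every time, so normal runs keep their current behaviour.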

@untergeek
Member

There is no effective way to accurately "cache" the list of indices in the way described, as it would mean writing fake code for dry runs. Without that, if action 1 deletes indices, the "cached" list would still show indices in action 2 that would otherwise have been deleted. I've considered this approach many times, and cannot comfortably jump the hurdle of "what if the cluster state changes from some other action while action X was being performed," especially if action X is a long-running one, like a force-merge or a snapshot.

I won't close the issue, but I cannot see an easy way of doing this. As a long-time Elasticsearch user and engineer, I also worry if I hear a cluster has so many indices that it takes 10s to get the full list of indices. Part of the time in Curator is extracting the index stats at index collection time. One of the only ways to change this behavior and make things faster would be to have Curator only pull the stats on indices it needs to pull stats from. This is a major re-code, but more possible than what you describe, and it would speed things up some.
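
As a rough illustration of "only pull the stats on indices it needs", the sketch below filters by name first and requests stats only for the survivors. It is an assumption-laden toy, not how Curator is structured: `stats_for_matching` and the `fetch_stats` callable are hypothetical, and a shell-style name pattern stands in for Curator's real filters.

```python
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable, List


def stats_for_matching(
    names: Iterable[str],
    pattern: str,
    fetch_stats: Callable[[List[str]], Dict[str, dict]],
) -> Dict[str, dict]:
    """Fetch stats only for index names that match a shell-style pattern.

    Hypothetical helper: fetch_stats stands in for whatever call returns
    per-index stats for the given list of names.
    """
    # Cheap step first: narrow the candidates using names alone.
    matching = [name for name in names if fnmatchcase(name, pattern)]
    if not matching:
        # Nothing survived the filter, so skip the stats request entirely.
        return {}
    # Expensive step last: ask for stats on the survivors only.
    return fetch_stats(matching)
```

The saving depends on how many indices the name-based step rules out; filters that need the stats themselves (size or document count, for example) would still have to fetch them.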

@jgough
Author

jgough commented Mar 3, 2023

Thanks for your reply. Note that I'm not talking about accurately simulating what would happen with a delete-index dry run. I'm just saying that the cluster state remains unchanged during a dry run, so there's no need to refresh the list of indices every time. I don't know the internals of Curator specifically, but if it could grab this list at the start and reuse it, that would speed things up considerably. Even with 2s to retrieve the list and 100 actions, we're still talking over 3 minutes to run, assuming no other actions at all.

I've sped things up significantly by putting a caching reverse proxy in front of ES when I want to do dry runs. It cuts the time in half, but it could still be faster.
