
Only Allow Ready Hashring Replicas in Generated ConfigMap #89

Conversation

michael-burt commented May 23, 2022

This PR adds a CLI flag (`allow-only-ready-replicas`) which filters pods that are not in the Ready condition out of the generated hashring configuration. I have been testing this at scale for about 6 weeks, and it does reduce the frequency of replication and forwarding errors in the hashring.
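For illustration only, a minimal sketch of what the flag does conceptually (this is not the controller's actual code; the pod-IP endpoints and port 10901 are assumptions made to keep the example short):

```go
package hashring

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// podIsReady reports whether the Pod's Ready condition is True.
func podIsReady(pod corev1.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status == corev1.ConditionTrue
		}
	}
	return false
}

// readyEndpoints builds the hashring endpoint list, skipping pods that are
// not Ready when the flag is enabled. Pod IPs and port 10901 are used here
// purely for the sketch.
func readyEndpoints(pods []corev1.Pod, allowOnlyReady bool) []string {
	var endpoints []string
	for _, pod := range pods {
		if allowOnlyReady && !podIsReady(pod) {
			continue
		}
		endpoints = append(endpoints, fmt.Sprintf("%s:%d", pod.Status.PodIP, 10901))
	}
	return endpoints
}
```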


matej-g commented Aug 22, 2022

Hey @michael-burt, sorry this did not get proper attention. I'd actually like to add this change. Do you have any objections?


michael-burt commented Aug 23, 2022

Hi @matej-g, I have no objections to merging this; however, there is a big caveat. In high-churn environments where pods frequently get rescheduled, this PR causes the hashring configuration to change frequently. Each change triggers a redistribution of metric time series (MTS) across the hashring pods, which can lead to a large increase in memory consumption. This contributes to cascading failures: hashring pods get OOMKilled and removed from the hashring configuration, which triggers another redistribution of MTS and puts further load on the remaining healthy pods, eventually causing them to be OOMKilled as well.

In the end, I found it was more stable to just run a static hashring config and remove this operator from our environment completely. I think we should include a caveat in the documentation describing these failure modes if we are going to merge this.

The failures I describe above were observed in an environment with ~50M MTS and 30-40 ingestor pods with a replication factor of 3.
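For context, the static alternative mentioned above is a hand-maintained hashring file mounted into the receivers instead of a controller-generated ConfigMap; the endpoint names below are made up for illustration:

```json
[
  {
    "hashring": "default",
    "tenants": [],
    "endpoints": [
      "thanos-receive-0.thanos-receive.monitoring.svc.cluster.local:10901",
      "thanos-receive-1.thanos-receive.monitoring.svc.cluster.local:10901",
      "thanos-receive-2.thanos-receive.monitoring.svc.cluster.local:10901"
    ]
  }
]
```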


matej-g commented Aug 24, 2022

Hey @michael-burt, thanks for your response. I've actually looked at the code now and I get what you're saying. My initial impression was that this change only affects the case where we scale up and add new replicas to the hashring, where I think it makes sense to wait for the replicas to be ready before they are added.

However, for the second case, where we check replica readiness on each sync, as you said I don't believe this is a good idea in general. As you pointed out, any intermittent issue where one or more receiver replicas go down would change the hashring, which means frequent hashring changes and the problems you mentioned, which is exactly what we want to avoid. Even though there are some changes upstream (a more uniform algorithm for redistributing series in the hashring, as well as not flushing the TSDB), I still don't think removing replicas on every un-readiness is a good idea.

WDYT about only including the first change (on scale-up, wait for new replicas to be ready) but keeping the current behavior with regard to existing replicas? I'd be happy to take over this PR and possibly document this behavior a bit better as well.
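A rough sketch of that split (not actual controller code; it reuses the podIsReady helper and the assumed endpoint format from the sketch above): a pod joins the hashring only once it is Ready, but a pod that is already a member stays even while NotReady, so restarts do not reshuffle the ring.

```go
// desiredEndpoints adds pods to the hashring only once they are Ready, but
// keeps pods that are already members even if they are currently NotReady.
// Pods that no longer exist never appear in the pod list, so they still drop
// out on scale-down.
func desiredEndpoints(currentMembers map[string]bool, pods []corev1.Pod) []string {
	var endpoints []string
	for _, pod := range pods {
		ep := fmt.Sprintf("%s:%d", pod.Status.PodIP, 10901) // assumed endpoint format
		if currentMembers[ep] || podIsReady(pod) {
			endpoints = append(endpoints, ep)
		}
	}
	return endpoints
}
```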


matej-g commented Aug 26, 2022

@michael-burt I'll take this over. I'll probably open a new PR, but I'll reuse some bits from this, thank you!

michael-burt commented:

Sounds good, thanks @matej-g
