remote write 2.0 - add flag to override remote write header value (for rollout/rollback) #13922

Draft
wants to merge 1 commit into
base: remote-write-2.0

alexgreenbank

Prototype for a flag to allow the Remote Write header X-Prometheus-Remote-Write-Version to be overridden with a specified value to aid rollout/rollback situations.

For example, say you are rolling out Remote Write 2.x across many servers behind a load balancer. You don't want any one server to respond to a request saying it supports "2.0;snappy", only for the subsequent 2.0 request to be routed to a server that has not yet been upgraded.

The plan would be:

    1. Existing servers that don't support Remote Write 2.0
    2. Rollout servers running with --remote-write-format 1 and --remote-write-header-advertise-override "0.1.0"
    3. When all servers behind the load balancer have been upgraded you can restart individual servers and remove the --remote-write-advertise-override flag.

Between steps 1 and 2 no servers will announce they support Remote Write 2.0.
Between steps 2 and 3 all servers will support Remote Write 2.0 even if they do not announce it.

The flag would also be required if servers are upgraded to support a new compression/encoding, e.g. if 2.0;gzip were added.

    1. Existing servers all supporting base Remote Write 2.0 (e.g. 2.0;snappy,0.1.0)
    2. Upgrade servers with gzip support but run with --remote-write-header-advertise-override "2.0;snappy,0.1.0"
    3. When all servers behind the load balancer have been upgraded to the version that supports gzip you can restart individual servers and remove the --remote-write-advertise-override flag.

At the end all servers will then return 2.0;gzip,2.0;snappy,0.1.0 (assuming that is the preferred order).
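
To illustrate why the announced value matters during the gzip rollout, here is a rough sketch of how a sender might choose a format from the header. The selection rule (take the first announced entry the sender also supports) is an assumption based on the "preferred order" remark above, and `pickVersion` is a hypothetical helper, not an existing Prometheus function.

```go
package main

import (
	"fmt"
	"strings"
)

// pickVersion walks the comma-separated announced list in order and
// returns the first entry the sender also supports. The boolean is
// false when there is no overlap.
func pickVersion(advertised string, senderSupports map[string]bool) (string, bool) {
	for _, entry := range strings.Split(advertised, ",") {
		entry = strings.TrimSpace(entry)
		if senderSupports[entry] {
			return entry, true
		}
	}
	return "", false
}

func main() {
	// A sender that already supports the hypothetical gzip variant.
	sender := map[string]bool{"2.0;gzip": true, "2.0;snappy": true, "0.1.0": true}

	// During the rollout the override hides gzip, so the sender stays on snappy.
	v, _ := pickVersion("2.0;snappy,0.1.0", sender)
	fmt.Println(v)

	// After the override is removed, gzip is announced first and gets picked.
	v, _ = pickVersion("2.0;gzip,2.0;snappy,0.1.0", sender)
	fmt.Println(v)
}
```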

Still to do:

  • Add some more tests on the write_handler_test.go side

Signed-off-by: Alex Greenbank <alex.greenbank@grafana.com>
@bwplotka
Member

Nice, thanks! But I wonder if we can revisit the semantics of format and this flag to give easy, safe and explicit configuration. E.g. in this PR the user might not know:

  • Is it for sending or receiving?
  • Why not simply set RW 1.0 for this case?

Perhaps the following settings might help:

  1. Sending RW protocol + compression preference (array of enums), similar to scrape protocols
  2. Receiving RW protocol + compression preferences

A config file setting, as suggested by @cstyan, might also be better.

@bwplotka
Member

Then "migration" does not need any specific header values, and the user does not need to know them. The user would just set a 1.0:snappy preference (only one thing) to limit announcing 2.0 capability to clients, and it's good to go?

@cstyan
Member

cstyan commented Apr 15, 2024

Yep, I think if we do anything here this should be a config file flag, as it is something users might want to change at runtime.

However, I don't know if we really need something like this in Prometheus itself. Prometheus still isn't really trying to solve the HA/multiple-backends domain of problems, and the remote write receiver isn't meant to replace scraping as the main way of getting data into the TSDB.

Maybe we should instead propose that --web.enable-remote-write-receiver be renamed to --web.enable-remote-write-receiver-1.0 and that we also add --web.enable-remote-write-receiver-2.0. Alternatively, we could have a new config file section, with a big notice saying it is ONLY USED IF --web.enable-remote-write-receiver is enabled, that specifies which RW proto versions are accepted.

@bwplotka
Member

Well, a supported protocol "preference" on the server side generally solves all the use cases and is not too difficult to grasp, so a single flag --web.remote-write-receive.protocols="1.0,2.0;snappy,2.0;gzip" could work? 🤔

And for the sending/client side, this could be done in the remote write config section of the YAML.
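
As a rough sketch of how the single-flag idea could behave on the receiving side: the flag value is split into an ordered list, and an incoming request is accepted only if its protocol appears in that list. The flag is only a proposal in this thread, and `parseProtocols` and `accepts` are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseProtocols splits a comma-separated flag value such as
// "1.0,2.0;snappy,2.0;gzip" into an ordered list, skipping empty entries.
func parseProtocols(flagValue string) []string {
	var out []string
	for _, p := range strings.Split(flagValue, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// accepts reports whether the requested protocol is in the configured list.
func accepts(protocols []string, requested string) bool {
	for _, p := range protocols {
		if p == requested {
			return true
		}
	}
	return false
}

func main() {
	// Mid-rollout configuration: only 1.0 is enabled.
	rollout := parseProtocols("1.0")
	fmt.Println(accepts(rollout, "2.0;snappy"))

	// Final configuration from the comment above.
	final := parseProtocols("1.0,2.0;snappy,2.0;gzip")
	fmt.Println(accepts(final, "2.0;snappy"))
}
```

Under this scheme the rollout case needs no separate override: restricting the list to 1.0 both stops the server from announcing 2.0 and stops it from accepting it.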

@bwplotka
Member

Alternative approach #13968
