Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upgrade go-msgpack to 2.1.0 #575

Closed
wants to merge 1 commit into from
Closed

upgrade go-msgpack to 2.1.0 #575

wants to merge 1 commit into from

Conversation

raskchanky
Copy link
Contributor

This is a bit of a yak shave, but it starts with me working on hashicorp/vault#21460
I keep running into issues on that PR because raft-wal imports go-msgpack v1.1.5 which has been retracted. That leads to all sorts of weird go import loops and terrible things.

I tried upgrading go-msgpack to the latest (v2.1.0) in raft-wal, but raft-wal doesn't actually use that lib. It's only there because it's brought in by this library. So I figured maybe upgrading go-msgpack here is the right thing to do.

Then I ran go mod tidy, which might have been a mistake. It gave me this helpful message:

CleanShot 2023-09-27 at 16 48 10

So I ran go mod tidy -compat=1.17 and here is the result. If there's any part of this that I screwed up, I'm happy to undo it. My main goal was just to get go-msgpack off of the retracted version.

@raskchanky raskchanky requested a review from a team as a code owner September 28, 2023 00:07
@raskchanky raskchanky requested review from freddygv and banks and removed request for a team September 28, 2023 00:07
@banks
Copy link
Member

banks commented Sep 28, 2023

Hmmm. I wonder why raft-wal indirectly imports 1.15 🤔

In Consul we have explicit excludes for 1.1.5 and 1.1.6 because the broke Consul's wire compatibility by merging upstream time encoding changes (the whole reason we have a fork of the codec in the first place). @swenson did an awesome job updating that lib and then making 2.0.0 as a backwards-compatible version that pulls in other upstream improvements.

That said, I'm not 100% sure what the impact of this upgrade for all of Raft is for Consul, or for other users of this library who might have the same compatibility issue with wire or on-disk encoded messages. 2.1.0 AFAIK does have compatibility for decoding those old messages, but I'm not sure if Consul folks are ready to upgrade yet as they are pinned to go-msgpack 0.5.5 still and may not have tested the upgraded version.

Does us changing this impact others? Why is raft-wal pulling in 1.1.5 at all (it doesn't need it? Can we fix this for vault just by excluding 1.1.5? (Vault is currently pinned to v1.1.5 which uses the new time encoding but it should be safe to upgrade Vault in general and I think we only use this for on-the-wire encoding of Raft RPCs via NetworkTransport in Vault whereas Consul also uses it for all internal RPCs as well as some persistent Log operation and Snapshot message encodings.)

All that said, it should be safe to upgrade this and is desirable, I'm just not sure if that has immediate implications for other library users that they would like more control over (e.g. time to plan and test the change to 2.1.0 encoding more widely). If upgrading wouldn't impact Consul or other users anyway as they would remain pinned to the older version then this seems lower risk.

If @mkeeler @dhiaayachi @swenson or anyone else have opinions on this I'd be interested.

@tgross
Copy link
Member

tgross commented Sep 28, 2023

Nomad hit problems the wire format when we tried to upgrade to go-msgpack v2 as well, and had to revert: hashicorp/nomad#17047 We haven't had time to go back and figure out what the underlying problem is, given that the wire formats are supposedly identical.

@banks
Copy link
Member

banks commented Sep 28, 2023

@tgross I recall now from internal discussion, v2.1.0 is backwards compatible for decoding but encoding defaults to the new Time encoding which is what broke Consul and Nomad all those years ago in the first place. See in the release notes:

If users want to encode into the old binary marshalling time.Time format, it is still available with the codec.BasicHandle.TimeNotBuiltin option. Setting this option could help with migrating to the v2 versions.

So if Nomad or Consul or any other user of this library that uses our NetworkTransport and predates go-msgpack 1.1.5 wants to upgrade, they need to do one of two things:

  • upgrade, but keep the old time encoding with that option perpetually
  • upgrade, and make the time encoding optional so that you can upgrade all your cluster nodes (and any other nodes they talk to using the same codec over RPC) and only then flip the config to start encoding with the new format.

For this reason I don't think we should upgrade the new version in the raft library in general without a bit more thought. I suspect internally Nomad and Consul are not ready to upgrade. They wouldn't have to I guess, but they would need to carefully manage the versions and exclusions to ensure they stay correct. For me the bigger argument is that other users of this lib who aren't aware of all this might be caught out by it too? How would the PR impact other users who might not be aware of the upgrade issues? Could they get automatically bumped up without realising the incompatibility? I've not figured that out yet.

@swenson
Copy link

swenson commented Sep 28, 2023

@banks @tgross thanks for the information.

@raskchanky I was planning on tackling this during my next sustaining rotation in a few weeks. :) I'd probably go with adding an option to the raft library to change the default time encoding, but I haven't dug in quite yet.

I also am going to do some benchmarking so we can see the performance impact of upgrading -- I had removed a bunch of the unsafe fastpath code paths, since they were something that can break with no notice if Go changes their internals.

@raskchanky
Copy link
Contributor Author

Ok so it sounds like this was not the right version to change to. My main priority is just to get off of the version that was retracted. I'll close this and open a new PR with a different version that hopefully won't cause any issues.

@raskchanky raskchanky closed this Oct 9, 2023
@raskchanky raskchanky deleted the upgrade-go-msgpack branch October 10, 2023 03:11
@raskchanky
Copy link
Contributor Author

Turns out I mostly had this wrong anyway. This library is depending on go-msgpack v0.5.5, which is fine. It turns out, after much wringing of hands and gnashing of teeth, I discovered that https://github.com/benmathews/bench (which is a dep of raft-wal) is what was bringing in the bad version of go-msgpack. I have a PR up on that repo to update their deps, and assuming that's merged, I can update raft-wal to use the newer version of it. This library can remain totally unaffected. Thanks for the great discussion here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants