Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: remove invalid string2code mappings #137

Merged
merged 4 commits into from Apr 21, 2022
Merged

Conversation

lidel
Copy link
Member

@lidel lidel commented Mar 23, 2022

Problem: string2codec maps are a mess

The current state is not sustainable, string mappings are invalid or missing, and we don't have correct ones for core IPLD formats like dag-pb, dag-json and dag-cbor 🙈

Solution: cleanup

Problem 2: cleanup may introduce silent bugs into people's codebases

da55247 demonstrates the map changes we would have to do to restore sanity: fixes mapping, and adds IPLD formats along with non-dag generic variants for Protobuf, JSON and CBOR, so go-cid finally match the source of truth.

This approach would change the meaning of code 0x70 and 0x71 and strings cbor and protobuf:

  • protobuf string now correctly points at code 0x50 (was 0x70, which is dag-pb)
    • 0x70 code is mapped to dag-pb (was protobuf before)
  • cbor string now correctly points at code 0x51 (was 0x71, which is dag-cbor)
    • 0x71 code is now mapped to dag-cbor (was cbor before)

Pretty risky – people dont read release notes, this could cause silent errors in people's software.
Also, we have proper code2string mappings in go-multicodec (generated, and not written by hand)

Solution

Remove invalid/redundant string2code mappings from go-ipfs, point ecosystem at using go-multicodec directly.

BREAKING CHANGE

Implemented Approach B: Remove invalid mappings entirely 💚

I applied this in 6ca2617

We already have go-multicodec which is generated from the source of truth.

We remove hardcoded maps, then people will be forced to refactor their code to use go-multicodec (multiformats/go-multicodec#65) – forcing people to refactor makes it less likely they will just blindly bump the dependency without reading release notes.


Refactor guide

Converting code to string and vice versa will no longer be done with go-cid, go-multicodec should be used instead

Old way (go-cid)

cid "github.com/ipfs/go-cid"

// string → code
code := cid.Codecs["libp2p-key"]

// code → string
string := cid.CodecToStr[code]

New way (go-multicodec)

mc "github.com/multiformats/go-multicodec"

// string → code
var mc.Code code 
code.Set("libp2p-key")

// code → string
code.String()

Codec table was missing dag-json and it had invalid code for dag-cbor.
It also had invalid string representation of dag-pb -- it was using
'protobuf' which is a totally different code.

I fixed those bugs, and added generic variants for Protobuf, JSON and
CBOR, so go-cid finally match the source of truth at
https://github.com/multiformats/multicodec/blob/master/table.csv
@lidel lidel marked this pull request as ready for review March 23, 2022 22:36
@BigLep
Copy link

BigLep commented Mar 24, 2022

2022-03-24 conversation:

  1. We want to delete the table. We generate this programmatically already in https://github.com/multiformats/go-multicodec

@BigLep BigLep added this to the go-ipfs 0.13 milestone Mar 24, 2022
@lidel lidel changed the title fix: codec table cleanup (dag-pb, dag-json and dag-cbor) refactor: remove invalid codec mapping table Mar 24, 2022
@lidel
Copy link
Member Author

lidel commented Mar 24, 2022

I went with Approach B (Remove mappings).
Filled multiformats/go-multicodec#65 to figure out the best way existing users could migrate, will ship this as soon we figure that out.

@lidel lidel changed the title refactor: remove invalid codec mapping table refactor: remove invalid string2code mapping tables Mar 25, 2022
@lidel
Copy link
Member Author

lidel commented Mar 25, 2022

Based on multiformats/go-multicodec#65 (comment) I wrote migration guide (will include it in release notes).

This is now ready to merge as soon I get some reviews.

@lidel lidel changed the title refactor: remove invalid string2code mapping tables refactor: remove invalid string2code mappings Mar 25, 2022
Copy link
Contributor

@mvdan mvdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM with the understanding that we're okay with the breaking change. An alternative would be to have go-cid import the new go-multicodec, filling the global map at init time - it would avoid breaking users, but longer term it's a worse approach.

If you're breaking users, have you thought about removing the constants and pointing users towards go-multicodec directly? It might be a bit too much in terms of breakage, but if we're already replacing the from/to string APIs, replacing the constants as well would seem consistent.

@aschmahmann
Copy link

@mvdan the reason to break users with the maps but not the constants is because the maps are wrong. The most egregious issue is that "cbor" is mapped to dag-cbor (0x71) instead of to cbor (0x51). There's also the protobuf vs dag-pb issue, but that's less terrible. If we just fixed the maps then people would silently break which is IMO the worst of the options.

If you're breaking users, have you thought about removing the constants and pointing users towards go-multicodec directly?

Why would you choose to break users when you could just not at trivial cost? We could add Deprecated markers pointing people at go-multicodec though.

It might be a bit too much in terms of breakage, but if we're already replacing the from/to string APIs, replacing the constants as well would seem consistent.

With some basic searching:

So breaking the maps seems to cause much less trouble for people than also breaking the constants.

@mvdan
Copy link
Contributor

mvdan commented Mar 25, 2022

ACK, I hadn't caught the detail about the bad map entries.

Copy link

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. While generally, it'd be nice to see some PRs in the downstream breakages (e.g. go-ipfs) to see how this plays out I suspect it's fine here to merge. This repo doesn't get too many changes so it's fair to assume we won't make more before we deal with the breakages in go-ipfs and related repos and we can figure this out later.

Whether you want to tag now, before we have PRs resolving the breakages in the few effected projects is your call.

cid.go Outdated Show resolved Hide resolved
cid.go Show resolved Hide resolved
@lidel
Copy link
Member Author

lidel commented Mar 28, 2022

note to self: before I merge this, I'll test this in go-ipfs, see how painful refactor is etc
(I'll do that as part of review of ipfs/kubo#8568)

@lidel lidel added the status/blocked Unable to be worked further until needs are met label Mar 28, 2022
lidel added a commit to libp2p/go-libp2p-core that referenced this pull request Mar 30, 2022
The mappings in go-cid were maintained by hand and are invalid.
More details in ipfs/go-cid#137

This is switching to go-multicodec which has correct mappings
that are generated, not written by hand.
@lidel
Copy link
Member Author

lidel commented Mar 31, 2022

lidel added a commit to libp2p/go-libp2p-core that referenced this pull request Mar 31, 2022
The mappings in go-cid were maintained by hand and are invalid.
More details in ipfs/go-cid#137

This is switching to go-multicodec which has correct mappings
that are generated, not written by hand.

Co-authored-by: Daniel Martí <mvdan@mvdan.cc>
@BigLep
Copy link

BigLep commented Mar 31, 2022

2022-03-31 conversation: the work isn't blocked, but it won't get merged and released until the libp2p resource manager work for ipfs.

@lidel lidel removed the status/blocked Unable to be worked further until needs are met label Apr 21, 2022
@lidel
Copy link
Member Author

lidel commented Apr 21, 2022

This is unblocked and ok to merge:

TLDR: if you've found this PR because you need to map CODE to STRING or STRING TO CODE, see usage https://github.com/multiformats/go-multicodec#usage

@github-actions
Copy link

Suggested version: v0.2.0
Comparing to: v0.1.0 (diff)

Changes in go.mod file(s):

diff --git a/go.mod b/go.mod
index e307ff9..6e3b580 100644
--- a/go.mod
+++ b/go.mod
@@ -4,7 +4,17 @@ require (
 	github.com/multiformats/go-multibase v0.0.3
 	github.com/multiformats/go-multihash v0.0.15
 	github.com/multiformats/go-varint v0.0.6
+)
+
+require (
+	github.com/klauspost/cpuid/v2 v2.0.4 // indirect
+	github.com/minio/blake2b-simd v0.0.0-20160723061019-3f5f724cb5b1 // indirect
+	github.com/minio/sha256-simd v1.0.0 // indirect
+	github.com/mr-tron/base58 v1.2.0 // indirect
+	github.com/multiformats/go-base32 v0.0.3 // indirect
+	github.com/multiformats/go-base36 v0.1.0 // indirect
 	golang.org/x/crypto v0.0.0-20210506145944-38f3c27a63bf // indirect
+	golang.org/x/sys v0.0.0-20210309074719-68d13333faf2 // indirect
 )
 
-go 1.15
+go 1.17

gorelease says:

# github.com/ipfs/go-cid
## incompatible changes
CodecToStr: removed
Codecs: removed
## compatible changes
DagJSON: added

# summary
Suggested version: v0.2.0

gocompat says:

(empty)

@lidel lidel merged commit b2064d7 into master Apr 21, 2022
@lidel lidel deleted the fix/codec-tables branch April 21, 2022 20:39
@MarcoPolo MarcoPolo mentioned this pull request Jul 7, 2022
41 tasks
marten-seemann pushed a commit to libp2p/go-libp2p that referenced this pull request Aug 17, 2022
The mappings in go-cid were maintained by hand and are invalid.
More details in ipfs/go-cid#137

This is switching to go-multicodec which has correct mappings
that are generated, not written by hand.

Co-authored-by: Daniel Martí <mvdan@mvdan.cc>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

5 participants