Proposal: CID+Block=Multiblock #37

Open
Gozala opened this issue Sep 18, 2020 · 2 comments
Labels
status/ready Ready to be worked

Comments

@Gozala
Contributor

Gozala commented Sep 18, 2020

We already have multibase and multihash, which in a nutshell are both metadata+data. We do not, however, have a similar thing for blocks, so given only a block's bytes it is impossible to derive which codec to use to decode it.

In the past, when I was working on https://github.com/gozala/ipdf/, I came up with a CID+block construct that I called inline blocks, so that graphs could contain encrypted and concealed sub-graphs that would only reveal themselves to the key holder.

CAR format seems to also pair CID+blocks.

I think this thread, #36 (comment), also illustrates the lack of such an abstraction.

Ironically, the JS Block instance also contains CID+Block, but once you encode it you can no longer decode it back without additionally providing 'codec' information.

I think if we do formalize such a building block, it would allow for nice, composable libraries around it.
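
To illustrate the codec problem described above, here is a minimal sketch of how a decoder could pick the codec from the CID when the CID travels with the block bytes. It assumes binary CIDv1 with a sha2-256 multihash; `DECODERS`, `codec_of`, and `decode_block` are hypothetical names, not an existing library API. The codec codes (`0x55` for raw, `0x0200` for json) come from the multicodec table.

```python
import hashlib
import json

RAW = 0x55        # multicodec code for raw bytes
JSON = 0x0200     # multicodec code for json
SHA2_256 = 0x12   # multihash code for sha2-256


def encode_varint(n):
    """Encode an unsigned integer as a multiformats varint (LEB128)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, offset=0):
    """Decode a varint starting at offset; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def cid_v1(codec, data):
    """Build a binary CIDv1: version, codec, then a sha2-256 multihash."""
    digest = hashlib.sha256(data).digest()
    return (encode_varint(1) + encode_varint(codec)
            + encode_varint(SHA2_256) + encode_varint(len(digest)) + digest)


def codec_of(cid):
    """Read the codec code out of a binary CIDv1."""
    version, offset = decode_varint(cid, 0)
    if version != 1:
        raise ValueError("this sketch only handles CIDv1")
    codec, _ = decode_varint(cid, offset)
    return codec


# Hypothetical decoder registry keyed by codec code.
DECODERS = {
    RAW: lambda b: b,
    JSON: lambda b: json.loads(b.decode("utf-8")),
}


def decode_block(cid, data):
    """Decode block bytes using the codec named by the accompanying CID."""
    codec = codec_of(cid)
    if codec not in DECODERS:
        raise ValueError("no decoder registered for codec 0x%x" % codec)
    return DECODERS[codec](data)
```

The same bytes decode differently depending on which CID accompanies them, which is why the bytes alone are not enough.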

@mikeal
Contributor

mikeal commented Sep 19, 2020

So, conceptually we have this, which is that a block is a pair of [CID, Data].

As for a standardized binary representation, the reason we haven’t needed one yet is that the block store abstractions are already key/value storage engines. We have something that sort of matches this description in the CAR file format because we needed a binary representation of a block.

Since CIDs can already be linearly parsed, the most compact binary representation would be [ CID, VARINT(DIGEST_LENGTH), DIGEST ].
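
A minimal sketch of such a layout follows. It assumes binary CIDv1 with sha2-256, and it reads the trailing length-prefixed field as the block's bytes, which is an interpretation rather than something this thread settles. `encode_multiblock` and `decode_multiblock` are hypothetical names.

```python
import hashlib

RAW = 0x55        # multicodec code for raw bytes
SHA2_256 = 0x12   # multihash code for sha2-256


def encode_varint(n):
    """Encode an unsigned integer as a multiformats varint (LEB128)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, offset=0):
    """Decode a varint starting at offset; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def parse_cid_v1(buf, offset=0):
    """Linearly parse a CIDv1; return (fields, next_offset)."""
    version, offset = decode_varint(buf, offset)
    if version != 1:
        raise ValueError("this sketch only handles CIDv1")
    codec, offset = decode_varint(buf, offset)
    hash_code, offset = decode_varint(buf, offset)
    digest_len, offset = decode_varint(buf, offset)
    digest = buf[offset:offset + digest_len]
    if len(digest) != digest_len:
        raise ValueError("truncated CID")
    fields = {"codec": codec, "hash_code": hash_code, "digest": digest}
    return fields, offset + digest_len


def encode_multiblock(codec, data):
    """Assumed layout: CID, varint(len(data)), data."""
    digest = hashlib.sha256(data).digest()
    cid = (encode_varint(1) + encode_varint(codec)
           + encode_varint(SHA2_256) + encode_varint(len(digest)) + digest)
    return cid + encode_varint(len(data)) + data


def decode_multiblock(buf, offset=0):
    """Return (cid_fields, data, next_offset), verifying sha2-256 digests."""
    cid, offset = parse_cid_v1(buf, offset)
    length, offset = decode_varint(buf, offset)
    data = buf[offset:offset + length]
    if len(data) != length:
        raise ValueError("truncated block data")
    if cid["hash_code"] == SHA2_256 and hashlib.sha256(data).digest() != cid["digest"]:
        raise ValueError("data does not match the CID digest")
    return cid, data, offset + length
```

Because every field is either a varint or length-prefixed, a reader never needs to look ahead or know the total size in advance.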

If we’re going to standardize this, one thing you’re going to want to include is the proper representation of an inline CID w/ identity multihash.
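
For reference, a sketch of what an inline CID looks like in binary form, assuming CIDv1: the identity multihash (code `0x00`) carries the block's bytes in place of a digest, so the CID alone already contains the data. `inline_cid` and `inline_data` are hypothetical helper names.

```python
IDENTITY = 0x00   # multihash code for identity (the "digest" is the data itself)
RAW = 0x55        # multicodec code for raw bytes


def encode_varint(n):
    """Encode an unsigned integer as a multiformats varint (LEB128)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, offset=0):
    """Decode a varint starting at offset; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def inline_cid(codec, data):
    """Build a CIDv1 whose identity multihash embeds the data directly."""
    return (encode_varint(1) + encode_varint(codec)
            + encode_varint(IDENTITY) + encode_varint(len(data)) + data)


def inline_data(cid):
    """Return the embedded bytes for an identity CID, or None otherwise."""
    version, offset = decode_varint(cid, 0)
    if version != 1:
        raise ValueError("this sketch only handles CIDv1")
    _codec, offset = decode_varint(cid, offset)
    hash_code, offset = decode_varint(cid, offset)
    length, offset = decode_varint(cid, offset)
    if hash_code != IDENTITY:
        return None
    return cid[offset:offset + length]
```

A CID+Block format would need to decide whether a block with an inline CID repeats the data after the CID or leaves that field empty.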

@rvagg
Member

rvagg commented Sep 24, 2020

I think it was Steven that pointed out that the CAR format is almost a complete pattern if it weren't for the header not having a prefix. I suspect if we had a formality around a CID+Block representation then it may have been more natural to make it a simple stream of CID+Binary patterns, including the header, with no particular special handling of the header at the encode/decode layer.

I'm not sure how you would create such a specification without having something to apply it to, though. Unless we pick up the CAR version and say "this thing is now a spec on its own, go over here to see how [ varint | CID | binary ] is a fully specified beast that we call Bob". Beyond that though, can you outline some tools that would make use of this thing that would make it worth specifying?
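
As a rough sketch of the [ varint | CID | binary ] pattern as a plain stream: in CAR the leading varint covers the combined length of the CID and the data, and the reader finds the CID boundary by parsing the CID itself. The sketch assumes CIDv1 with sha2-256 only and omits the CAR header; `write_section` and `read_sections` are hypothetical names.

```python
import hashlib

RAW = 0x55        # multicodec code for raw bytes
SHA2_256 = 0x12   # multihash code for sha2-256


def encode_varint(n):
    """Encode an unsigned integer as a multiformats varint (LEB128)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, offset=0):
    """Decode a varint starting at offset; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def cid_v1(codec, data):
    """Build a binary CIDv1 with a sha2-256 multihash."""
    digest = hashlib.sha256(data).digest()
    return (encode_varint(1) + encode_varint(codec)
            + encode_varint(SHA2_256) + encode_varint(len(digest)) + digest)


def cid_length(buf, offset):
    """Length in bytes of the CIDv1 that starts at offset."""
    start = offset
    version, offset = decode_varint(buf, offset)
    if version != 1:
        raise ValueError("this sketch only handles CIDv1")
    _codec, offset = decode_varint(buf, offset)
    _hash_code, offset = decode_varint(buf, offset)
    digest_len, offset = decode_varint(buf, offset)
    return offset + digest_len - start


def write_section(cid, data):
    """One section: varint(len(cid) + len(data)), then cid, then data."""
    return encode_varint(len(cid) + len(data)) + cid + data


def read_sections(stream):
    """Yield (cid, data) pairs from a concatenation of sections."""
    offset = 0
    while offset < len(stream):
        total, offset = decode_varint(stream, offset)
        end = offset + total
        if end > len(stream):
            raise ValueError("truncated section")
        n = cid_length(stream, offset)
        yield stream[offset:offset + n], stream[offset + n:end]
        offset = end
```

If the header were written as one more section with its own CID, the reader above would handle it with no special case.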


3 participants