New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sstable: add writer option to remove a common prefix #3242
base: master
Are you sure you want to change the base?
Conversation
I'm guessing the elided-prefix prop is what's making the readers 16 bytes bigger, via the extra string header in the props? We don't strictly need this prop so I could... skip it? I figured it'd be nice to have so we could tell by looking at an sst if someone hacked some prefix out of its content during creation, but I suppose when everything works as intended we'll know this based on other metadata. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion (waiting on @dt, @itsbilal, and @RaduBerinde)
sstable/writer.go
line 1237 at r1 (raw file):
} key.UserKey = key.UserKey[len(w.elidePrefix):] }
what about the encoded keyspan.Span.End
that is contained in value
? Don't we need to confirm that it has the elidePrefix
and strip the prefix?
Previously, sumeerbhola wrote…
It would be unfortunate to decode/reencode the values and account for all cases in here, when we just encoded them before calling this function. I wonder if we should replace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion (waiting on @itsbilal and @RaduBerinde)
sstable/writer.go
line 1237 at r1 (raw file):
It would be unfortunate to decode/reencode the values and account for all cases in here, when we just encoded them before calling this function.
IIRC, there are two paths:
- where fragmentation is done in the writer. These call into
addRangeKeySpan
. - fragmentation is done outside the writer. Compactions and suffix rewriting are using this path. The compaction case is a very shallow use
rangekey.Encode(&v, tw.AddRangeKey)
, so we could expose aWriter.AddRangeKeyPrefragmented(*keyspan.Span)
and do the encoding here. Suffix rewriting is similarly shallow. I think this is similar to what you are suggesting, except we can make do with one method.
No description provided.