Spec: reference to vbyte implementation #6

Open
barthanssens opened this issue Aug 11, 2016 · 0 comments

@barthanssens

The spec references

[VByte] Hugh E. Williams and Justin Zobel. Compressing integers for fast file access.

The paper http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.3782&rep=rep1&type=pdf mentions

  • using bits 7-1 as payload and bit 0 (the rightmost bit) to indicate whether more bytes follow
  • a flag of 0 marks the last byte of a multi-byte value
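To make the layout described above concrete, here is a minimal Python sketch of it. The function names are hypothetical, and emitting the least-significant 7-bit group first is an assumption; the bullets only fix where the payload and the flag sit within a byte, not the order of the bytes.

```python
def encode_low_flag(n: int) -> bytes:
    """Payload in bits 7-1, continuation flag in bit 0 (1 = more bytes, 0 = last byte).

    Assumption: least-significant 7-bit group is written first.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        group = n & 0x7F          # next 7 payload bits
        n >>= 7
        more = 1 if n else 0      # flag bit: 1 while further groups remain
        out.append((group << 1) | more)
        if not more:
            return bytes(out)


def decode_low_flag(data: bytes) -> int:
    """Inverse of encode_low_flag; stops at the first byte whose bit 0 is 0."""
    value = 0
    shift = 0
    for b in data:
        value |= (b >> 1) << shift  # payload lives in bits 7-1
        shift += 7
        if b & 1 == 0:              # flag 0 marks the last byte
            return value
    raise ValueError("input ended before a terminating byte")
```

Under these assumptions, 300 encodes to the two bytes `0x59 0x04`: the first carries payload 44 with the flag set, the second carries payload 2 with the flag clear.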

However, it seems that the hdt-java implementation uses another approach and references http://nlp.stanford.edu/IR-book/html/htmledition/variable-byte-codes-1.html

  • using bits 6-0 as payload and bit 7 (the leftmost bit) to indicate whether more bytes follow
  • a flag of 1 marks the last byte of a multi-byte value
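For comparison, a minimal Python sketch of this second layout, following the pseudocode on the nlp.stanford page (most-significant group first, 128 added to the final byte). The function names are hypothetical, and the group order is taken from that page rather than from hdt-java, so it should be checked against the implementation before relying on it.

```python
def encode_high_flag(n: int) -> bytes:
    """Payload in bits 6-0, flag in bit 7 (1 = last byte, 0 = more bytes follow).

    Assumption: most-significant 7-bit group first, as in the IR-book pseudocode.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = []
    while True:
        groups.insert(0, n & 0x7F)  # prepend the low 7 bits
        if n < 128:
            break
        n >>= 7
    groups[-1] |= 0x80              # set bit 7 on the final byte only
    return bytes(groups)


def decode_high_flag(data: bytes) -> int:
    """Inverse of encode_high_flag; stops at the first byte with bit 7 set."""
    value = 0
    for b in data:
        value = (value << 7) | (b & 0x7F)
        if b & 0x80:                # flag 1 marks the last byte
            return value
    raise ValueError("input ended before a terminating byte")
```

With this layout, 824 encodes to `0x06 0xB8` and 5 to the single byte `0x85`, which matches the worked example on the nlp.stanford page.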

So perhaps the spec should reference the nlp.stanford page instead of the original paper?
