Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_bytes_le on BigUInt strips trailing zero(es) #201

Open
aknarts opened this issue Apr 4, 2021 · 3 comments
Open

to_bytes_le on BigUInt strips trailing zero(es) #201

aknarts opened this issue Apr 4, 2021 · 3 comments

Comments

@aknarts
Copy link

aknarts commented Apr 4, 2021

Found a strange edge case with one number I was working with, this is a binary saved sha1 hash for those curious and is throwing off everything I am doing. Providing a python3 example code to show what the value should be.

Basically this should result in vector of length 20 but ends up being length 19 because of the last element being 0 I suspect this line being the issue https://github.com/rust-num/num-bigint/blob/master/src/biguint/convert.rs#L598. I have no inner knowledge how you store your data so I have no idea how and why the check is done this way.

Reproduction

let num = BigUint::parse_bytes(b"4888659412758205060988492900023565086307197678", 10).unwrap();
println!("{:?}", num.to_bytes_le())

[238, 34, 129, 137, 3, 17, 254, 16, 242, 218, 149, 34, 154, 39, 150, 206, 14, 55, 219]

Working example in python

(4888659412758205060988492900023565086307197678).to_bytes(160//8, 'little')

b'\xee"\x81\x89\x03\x11\xfe\x10\xf2\xda\x95"\x9a'\x96\xce\x0e7\xdb\x00'

@cuviper
Copy link
Member

cuviper commented Apr 5, 2021

BigUint always trims zeros from the most-significant digits, both for the internal representation and for converted output. One reason to justify this on conversions is that we don't want different output on targets with different internal digit sizes -- such zero-padding may be different for 32 or 64-bit digits. For your 160-bit hash, that will internally have either 5 32-bit digits (exactly 160 bits) or 3 64-bit digits (192 bits) -- so if we didn't trim zeros, you would get 24 bytes out of the latter with 5 zeros.

Python can even tell you that your number's bit_length() is 152, which is exactly 19 bytes. It only gave you 20 because you explicitly requested so with 160//8, but it's also happy if you try 19 there.

You can pad your Rust byte vector to the desired size with bytes.resize(160 / 8, 0).

@aknarts
Copy link
Author

aknarts commented Apr 5, 2021

Thanks for the quick reply. I did end up fixing it by padding the vector just as you said. This just caught me off guard because I could not find any mention of this anywhere in the docs, though I am not sure where I would put this. If this is expected behavior feel free to close the issue.

@benma
Copy link

benma commented Aug 15, 2021

It would be great if this library added the functions to_bytes_le_padded() and to_bytes_be_padded() which takes a length argument and pads the vector with zeroes. There are different options for what to do if the vector is bigger than the requested size, but any of those would be helpful so one does not have to manually pad the output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants