Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modified AsRef impl for U-type macros #196

Merged
merged 4 commits into from
Sep 9, 2019
Merged

Conversation

fubuloubu
Copy link
Contributor

  • Potential bug with current implementation that led to unused API
  • Replaced with useful internal getter for &[u64] types

Fixes: #195

- Potential bug with current implementation that led to unused API
- Replaced with useful internal getter for &[u64] types

Fixes: paritytech#195
@parity-cla-bot
Copy link

It looks like @fubuloubu signed our Contributor License Agreement. 👍

Many thanks,

Parity Technologies CLA Bot

uint/src/uint.rs Outdated Show resolved Hide resolved
Co-Authored-By: Wei Tang <accounts@that.world>
@dvdplm dvdplm requested a review from ordian August 7, 2019 12:53
Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't know the reason why it was implemented the way it was (cc @debris, paritytech/primitives#25), but this feels more consistent with fixed-hash. OTOH, it's probably a breaking change...

uint/src/uint.rs Outdated Show resolved Hide resolved
@ordian
Copy link
Member

ordian commented Aug 7, 2019

Or maybe we can keep both implementations, actually (AsRef<$name> and AsRef<[u8]>).

Co-Authored-By: Andronik Ordian <write@reusable.software>
@fubuloubu
Copy link
Contributor Author

@ordian looks like the corresponding function in fixed-hash was removed at some point: 50ecdb3

I question if that function was ever even useful based on the commit message.

@fubuloubu
Copy link
Contributor Author

Also seems like U-types are missing AsMut trait implementations, which would be helpful.

@ordian ordian requested a review from Robbepop August 7, 2019 13:55
Copy link
Contributor

@Robbepop Robbepop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is good to remove the old AsRef implementation - as far as I can remember I did the same some time ago for the fixed_hash crate and type.
However, while the AsRef<[u8]> makes total sense for the hash types I am not sure about the Uint types here since they have endianess that is not communicated through this API.
Maybe it would be better to add digits_be and digits_le instead.

@fubuloubu
Copy link
Contributor Author

Hmm, I was trying to leverage the AsRef trait in a more generic fashion, and this was going to help me solve that in a way I don't think your suggestion would allow. What I'm really trying to do is convert the integer into a stream of bits using BitVec, so maybe there's an oppurtunity for an API that returns bitvec::bits::Bits we could discuss separately? BitVec has endianness baked in.

For my own knowledge, why does integer types have the concept of endian-ness whereas bytes types do not?

@Robbepop
Copy link
Contributor

Robbepop commented Aug 7, 2019

For my own knowledge, why does integer types have the concept of endian-ness whereas bytes types do not?

That's because integer types do have the property of total order.
So if you look at an integer instance on a bit-level: 0xdead_beef_0101_1234 you need to know if the most significant bit is in the leftmost d or the rightmost 4 pack of 4 bits.
So if you "serialize" an integer type into a stream of bytes or bits you need to tell in which endian these bits got streamed, otherwise you are missing this information and cannot deserialize it properly at the destination.
If an API does not communicate properly which endianess it is using you can not really guarantee anything about serialization or deserialization of those bytes and bits.

@fubuloubu
Copy link
Contributor Author

fubuloubu commented Aug 7, 2019

Just noticed that macro seems to assume Little Endian. Is this correct?

( @construct $(#[$attr:meta])* $visibility:vis struct $name:ident ( $n_words:tt ); ) => {
/// Little-endian large integer type
#[repr(C)]
$(#[$attr])*
#[derive(Copy, Clone, Eq, PartialEq, Hash)]
$visibility struct $name (pub [u64; $n_words]);
impl AsRef<[u64]> for $name {
#[inline]
fn as_ref(&self) -> &[u64] {
&self.0
}
}

@ordian
Copy link
Member

ordian commented Aug 7, 2019

Just noticed that macro seems to assume Little Endian. Is this correct?

Yes, this is intended for U-types.

What I'm really trying to do is convert the integer into a stream of bits using BitVec, so maybe there's an oppurtunity for an API that returns bitvec::bits::Bits we could discuss separately?

bitvec::bits::Bits is implemented for [u64; n] and since U-types wraps [u64; n] with pub, you can do

use ethereum_types::U256;
use bitvec::prelude::*;

fn main() {
    let n = U256::from(0b0101);
    let bits = n.0.as_bitslice::<LittleEndian>();
    println!("{}{}{}{}", bits[3] as u8, bits[2] as u8, bits[1] as u8, bits[0] as u8);
}

@fubuloubu
Copy link
Contributor Author

My use case is trying to genericize U256 to be any integer type with a static size

@Robbepop
Copy link
Contributor

Robbepop commented Aug 7, 2019

My use case is trying to genericize U256 to be any integer type with a static size

You mean represent them at runtime? - if yes, then there are certain crates already available - shameless pug: https://crates.io/crates/apint
For a use-case of apint crate go here: https://github.com/Robbepop/stevia/blob/master/bitvec/src/vec.rs

If no: Then that's what U-types are already doing for you.

@fubuloubu
Copy link
Contributor Author

So, is the lack of specification of endian-ness the blocker of this PR? If so, could we just assume it is LE (as it currently is) for implementing the AsRef trait, given that it appears most other trait implementations assume LE?

@Robbepop
Copy link
Contributor

Robbepop commented Aug 8, 2019

So, is the lack of specification of endian-ness the blocker of this PR? If so, could we just assume it is LE (as it currently is) for implementing the AsRef trait, given that it appears most other trait implementations assume LE?

From a software technical point of view I would not recommend doing so since the AsRef is meant to be used as a very generic trait implementation that generally has no notion of endianess. This would be different if the type would be something like LittleEndianSlice instead of a generic U-int.

On the other handside I am not the one in charge for parity-common, so I do not want to be the show stopper.

@fubuloubu
Copy link
Contributor Author

Is there a better type I could use for the conversion that's in primitive-types? I can't find another way to do this without using nightly

@ordian
Copy link
Member

ordian commented Aug 9, 2019

@fubuloubu if you want to convert between little-endian U-types and big-endian H-types, take a look at this macro in ethereum-types:

macro_rules! impl_uint_conversions {
($hash: ident, $uint: ident) => {
impl BigEndianHash for $hash {
type Uint = $uint;
fn from_uint(value: &$uint) -> Self {
let mut ret = $hash::zero();
value.to_big_endian(ret.as_bytes_mut());
ret
}
fn into_uint(&self) -> $uint {
$uint::from(self.as_ref() as &[u8])
}
}
}
}
impl_uint_conversions!(H64, U64);
impl_uint_conversions!(H128, U128);
impl_uint_conversions!(H256, U256);
impl_uint_conversions!(H512, U512);

If you want to use a little-endian bit iterator on U-types, you can create a trait

trait AsBitsliceLittleEndian {
    fn as_bitslice_le(&self) -> &BitSlice<LittleEndian, u64>;
}

and implemented it for U-types e.g. via macro.

Does that make sense?

@fubuloubu
Copy link
Contributor Author

fubuloubu commented Aug 9, 2019

@ordian wouldn't it be simpler to implement bitvec::bits::Bits trait via macro for U-types (and H-types)? Bitslice is fairly unambiguous about ordering since it's an iterator over bits, referencing whatever way they are stored in memory (BigEndian or LittleEndian, of which U-types are stored LE)

I can implement this trait in parity-common if that is acceptable.

@ordian
Copy link
Member

ordian commented Aug 9, 2019

@ordian wouldn't it be simpler to implement bitvec::bits::Bits trait via macro for U-types (and H-types)? Bitslice is fairly unambiguous about ordering since it's an iterator over bits, referencing whatever way they are stored in memory (BigEndian or LittleEndian, of which U-types are stored LE)

But that means you can create uint.as_bitslice::<BigEndian>(), which doesn't make much sense, since uint's words are little-endian. That's why I suggested a limited AsBitsliceLittleEndian trait.
Unless you're using endian-ness only to ensure serialization order? Can you elaborate on your use-case?

@fubuloubu
Copy link
Contributor Author

Unless you're using endian-ness only to ensure serialization order

Yes, I want to iterate over the bits of a key from MSB to LSB in order to determine the ordering of hashes in a sparse merkle tree for that given key.

Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand, I agree with @Robbepop that AsRef doesn't communicate clearly the word order for uint-types, otoh, one can already access the underlying array with .0, and with a bit of documentation added to impl AsRef it doesn't make things any worse (we could also add an example with bitvec to uint/examples). @sorpaas WDYT?

uint/src/uint.rs Show resolved Hide resolved
@Robbepop
Copy link
Contributor

Robbepop commented Aug 9, 2019

What would be the downside of having explicit methods that do what you want AsRef to do?
Something that resembles the to_be_bytes etc. methods from the primitive u32 etc. types.
So for U-types we could for example have:

  • fn to_be_digits(&self) -> impl Iterator<u64>
  • fn to_le_digits(&self) -> impl Iterator<u64>
  • fn to_be_bytes(&self) -> impl Iterator<u8>
  • fn to_le_bytes(&self) -> impl Iterator<u8>

Note we use Iterator here instead of slices in order to not leak to much details about the underlying implementation. This guarantees stable interfaces even though we might change the internal u64 to something else in the future.

@fubuloubu
Copy link
Contributor Author

The methods would not be available as a trait, meaning it would not be possible to leverage them in a generic fashion.

@ordian
Copy link
Member

ordian commented Aug 9, 2019

Note we use Iterator here instead of slices in order to not leak to much details about the underlying implementation. This guarantees stable interfaces even though we might change the internal u64 to something else in the future.

The underlying array is marked as pub, so if we change the internal u64 to something else, it would be a breaking change anyway. Also, AFAIK, bitvec::bits::Bits is only implemented for fixed-sized arrays.

Co-Authored-By: Andronik Ordian <write@reusable.software>
@Robbepop
Copy link
Contributor

Robbepop commented Aug 9, 2019

The methods would not be available as a trait, meaning it would not be possible to leverage them in a generic fashion.

You could simply expose them as traits:

pub trait UintEndianEx {
    fn to_be_digits(&self) -> impl Iterator<u64>
    fn to_le_digits(&self) -> impl Iterator<u64>
    fn to_be_bytes(&self) -> impl Iterator<u8>
    fn to_le_bytes(&self) -> impl Iterator<u8>
}

And then implement them for all U-types.

@fubuloubu
Copy link
Contributor Author

I'm still a bit new to Rust, but you would have to impl UintEndianEx for every U-type, and then pull that trait in from primitive-types in order to reference it, correct? I was trying to avoid the additional dependency so it would be useful for other libraries.

@ordian
Copy link
Member

ordian commented Sep 6, 2019

@fubuloubu what do you think is the best way forward here?

@fubuloubu
Copy link
Contributor Author

I agree with your assessment that having AsRef<[u8]> return the internal value directly (which is assumed LE) and leaving it up to the programmer to ensure they use it correctly would be the most ideal. That is already implemented here in this PR.

Another PR could add a better API for accessing it in LE or BE slices, although many external libraries (such as bitvec mentioned above) have easy APIs for performing this functionality already. However, these alternative suggestions do not fit the AsRef trait signature, and therefore would not work to meeting the issue as described in #195.

Copy link
Contributor

@dvdplm dvdplm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think uint is not properly engineered for handling LE/BE. I also agree with @ordian that at a minimum we should be explicit about out assumptions and I think this PR is better than what we have. Merging.

@dvdplm dvdplm merged commit 5bc6781 into paritytech:master Sep 9, 2019
@fubuloubu fubuloubu deleted the as-ref branch September 9, 2019 12:29
ordian added a commit that referenced this pull request Sep 16, 2019
* master:
  Bump version (#219)
  Find PendingIterator in Transaction Pool (#218)
  Modified AsRef impl for U-type macros (#196)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add AsRef<[u64]> trait + impl to all U-type integers
6 participants