Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polishing/stabilization roadmap? #60

Open
Kixunil opened this issue Jan 5, 2022 · 9 comments
Open

Polishing/stabilization roadmap? #60

Kixunil opened this issue Jan 5, 2022 · 9 comments

Comments

@Kixunil
Copy link

Kixunil commented Jan 5, 2022

It'd be great to polish this crate and make it stable with idiomatic API and good documentation. Since this crate appears stale I'm willing to help maintain it to make the progress a bit faster.

I'm still unsure what all those traits do, they seem to be meant as extension traits?
Thinking about ideal API, I identified these basic building blocks:

  • u5 type
  • converting stream of u8 to stream of u5 and back
  • calculating checksum
  • converting stream of u8 to ASCII and back

I think each deserves its own module, maybe even crate. The commonly used things can still be flattened for consumers.

u5 type

I think this is good as is. Could add impl TryFrom<*> for u5 and arithmetic functions for completeness.

u8-u5 conversion

The API should enable to perform this conversion without allocation by having wrappers around iterators.
Convenience functions for slices and Vecs can be added but not sure how important given .iter() and .collect() are easy to call.
This should also provide free-standing functions to calculate encoded/decoded lengths.

checksum calculation

The API should be similar to hashers with possible convenience to compute from iterators.
It should also provide a wrapper for the case when checksum is calculated together with conversions.

u8-ASCII conversion

Should be similar to u8-u5 conversion. Might be useful to provide wrappers returning both ASCII u8 and char to avoid overhead when writing to writer. Also maybe provide a method to construct String without unnecessary UTF-8 check (using unsafe internally).

Facade

Convenience functions for converting stream of u8 to strings (with/without checksum) and back should be provided.

Also for consideration: exploit the invariant of u5 to skip bounds checks when converting to char.

tcharding added a commit to tcharding/rust-bech32 that referenced this issue Mar 3, 2023
Currently we use `Error::InvalidData(u8)` to attempt to cache an invalid
value encountered while converting to a `u8`. This is buggy because we
cast to the `u8` to stash the value so we loose valuable bits.

Implement `TryFrom` by doing:
- Add an `Error::Overflow` variant.
- Use `Error::Overflow` for conversion _outside_ of `TryFrom` impls
- Add a new `TryFromError` and nest it in `Error::TryFrom`
- Implement `TryFrom<T> for u5` impls for all standard integer types.

Inspired by issue rust-bitcoin#60 and the stdlib macros of the same name.
tcharding added a commit to tcharding/rust-bech32 that referenced this issue Mar 3, 2023
Currently we use `Error::InvalidData(u8)` to attempt to cache an invalid
value encountered while converting to a `u8`. This is buggy because we
cast to the `u8` to stash the value so we loose valuable bits.

Implement `TryFrom` by doing:
- Add an `Error::Overflow` variant.
- Use `Error::Overflow` for conversion _outside_ of `TryFrom` impls
- Add a new `TryFromError` and nest it in `Error::TryFrom`
- Implement `TryFrom<T> for u5` impls for all standard integer types.

Inspired by issue rust-bitcoin#60 and the stdlib macros of the same name.
apoelstra added a commit that referenced this issue Mar 4, 2023
25462c9 Implement TryFrom various integer types (Tobin C. Harding)
7ab12fc Use Error variants locally (Tobin C. Harding)

Pull request description:

  Patch 1 is preparatory clean up.

  From the commit log of patch 2:

      Currently we use `Error::InvalidData(u8)` to attempt to cache an invalid
      value encountered while converting to a `u8`. This is buggy because we
      cast to the `u8` to stash the value so we loose valuable bits.

      Implement `TryFrom` by doing:
      - Add an `Error::Overflow` variant.
      - Use `Error::Overflow` for conversion _outside_ of `TryFrom` impls
      - Add a new `TryFromError` and nest it in `Error::TryFrom`
      - Implement `TryFrom<T> for u5` impls for all standard integer types.

      Inspired by issue #60 and the stdlib macros of the same name.

  Please note, there is expected error work to come soon on this crate, can we keep perfect error code out of scope for this PR please (obviously if it has gaping problems then yell at me)? Note that the same derives are used here as for `Error` even though I think they are wrong.

  Done as part of stabilising (#60).

ACKs for top commit:
  apoelstra:
    ACK 25462c9

Tree-SHA512: 830c6d0e1a8fa8badc983497b9eb6f0e38f0d81dc4f33dd5cd882d3e34836cfca277b4b1af9d98ea26e8cc0e53df9e708d8bc7d84757104cd05054fcbec169e0
@tcharding
Copy link
Member

Would be great to get your input @Kixunil on how close you think the new code is to being ready for v1.0.0.

Current plan is (only from informal discussion between @apoelstra and myself, so still requires input from @clarkmoody):

  1. Merge Re-write crate level API #128
  2. Release v0.10.0
  3. Let that sit in the wild for six months or so

Seems we are then pretty close to doing a v1.0.0

@Kixunil
Copy link
Author

Kixunil commented Sep 22, 2023

Great timing, I'm back!

I only took a quick look now and I still have a bunch of things.

  • Why do we have WriteBase256? Looks like it should be just the Write trait, possibly from lgio
  • _unchecked in Rust usually means "UB if wrong" or "panic if wrong" but has different semantics here. Rename it?
  • Didn't look into error types yet but I think we should have some global standard around them.
  • Should bech32::segwit go to bitcoin::address and Fe32 into its own crate?
  • We probably should have buffered writing like in hex

@apoelstra
Copy link
Member

apoelstra commented Sep 22, 2023

I also want to

  • update rust--codex32 to use this crate (which has a pretty different usage pattern than rust-bitcoin) to see how it works
  • similarly update rust-miniscript to use it
  • implement error correction

@Kixunil

  • WriteBase256 is a holdover from an earlier attempt at making this library nostd. It will be removed in the next major rev (or do we need to deprecate it? Need to check, but I don't think it got into a non-alpha release.)
  • I don't think _unchecked means "UB if wrong" unless it's combined with an unsafe marker. Without unsafe I understand it to mean "does the wrong thing if the input is invalid". Panicking is one form of "does the wrong thing" but I think anything is acceptable. (I guess this is "undefined behavior" in a sense :)).
  • There is an issue in rust-bitcoin about error types I believe. This crate may be a good one to test ideas on because it's so limited in scope. But I also don't want to hold up this library on that.
  • No, bech32::segwit is an implementation of the segwit stuff in BIP 173/350 which is explicitly a goal of this crate. Fe32 is a dead simple type that I don't think is worth having a whole extra crate for. Especially as this particular representation is designed for use with BCH codes and it is lacking much of the generic functionality you might expect from a GF32 element.
  • +1 to buffered writing :). This was a lot of non-API-visible work that we didn't get to.

Edit also, regarding the segwit module, I don't think we can actually separate it out into rust-bitcoin without a lot of pain. The segwit-specific stuff is actually pretty invasive and layer-violating since it introduces a "witness version" character which is encoded differently at the bit level.

@Kixunil
Copy link
Author

Kixunil commented Sep 22, 2023

  • "UB if wrong" unless it's combined with an unsafe marker.

Indeed, what I meant is the terminology is mostly used in that context. But maybe it's really fine, I just find it a bit more scary than it needs to be.

BCH

Screw this name, my mind got instantly triggered... :D

segwit-specific stuff is actually pretty invasive and layer-violating since it introduces a "witness version" character which is encoded differently at the bit level.

😢

@tcharding
Copy link
Member

Kiiiiiiiix! Welcome back!

@tcharding
Copy link
Member

As part of the stabalization process would it be useful to hack up code that exercises every error code path and check the output to see if it is correct and useful for debugging? Or is this going too far, is this a v2.0 thing?

@tcharding
Copy link
Member

tcharding commented Oct 17, 2023

@clarkmoody is there anything you would like to see done before we release v1.0.0 of this crate? Said another way, do you have any concerns, what so ever, with what we did over the last few months, things you'd like me to look into? Cheers

FTR @apoelstra is going to look into the error detection before we consider pushing out the v1.0

@Kixunil
Copy link
Author

Kixunil commented Oct 21, 2023

@tcharding I wouldn't do automated string comparison but writing it as a demo to manually check sounds like a good idea.

going to look into the error detection before we consider pushing out the v1.0

If the internals of errors are private we don't have to rush this. But I really want it soon. ;)

@tcharding
Copy link
Member

tcharding commented Oct 22, 2023

writing it as a demo to manually check sounds like a good idea

Please see #151

Note, I had in mind while doing this that stabalizing this crate is a test run for doing rust-bitcoin so if this error demo stuff is useful we should do it there as well. It will be quite a lot of work but I think its useful, it made me find #150 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants