Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encode_engine_slice should probably return a &str instead of usize #209

Open
upsuper opened this issue Dec 29, 2022 · 4 comments
Open

encode_engine_slice should probably return a &str instead of usize #209

upsuper opened this issue Dec 29, 2022 · 4 comments

Comments

@upsuper
Copy link

upsuper commented Dec 29, 2022

I suggest that encode_engine_slice should have a signature like

pub fn encode_engine_slice<'a, E: Engine, T: AsRef<[u8]>>(
    input: T,
    output_buf: &'a mut [u8],
    engine: &E
) -> &'a mut str;

Apparently the encoded result is always a valid string, but the current API leaves a dilemma for some users:

  • encode_engine: provides convenient str but allocates,
  • use encode_engine_slice: doesn't allocate but doesn't provide str
    • which means you either need to run a meaningless check, or use an unsafe.

If encode_engine_slice can return a str, that dilemma would be resolved. The length can still be derived from .len() of the returned str.

There is also a precedent in the Rust std: char::encode_utf8 encodes into a buffer and returns a str.

This is a breaking change, though.

@marshallpierce
Copy link
Owner

Unfortunately this isn't really viable as far as I can tell.

  • This library doesn't use unsafe, so doing str::from_utf8_unchecked for the user is out.
  • Doing the check safely for all users is also not a great idea, as it's unneeded overhead for users that don't care.
  • char::encode_utf8 isn't very helpful in this case. While we could conceivably create a mapping of base64 tokens to chars and then use that, it's variable length, which would destroy the performance of encoding.

@upsuper
Copy link
Author

upsuper commented Dec 30, 2022

It seems that encode_engine already does a UTF-8 check on the output.

I wonder if I can build a crate around ASCII chars which exposes a safe API to construct ASCII strings, would you accept a PR making use of that to eliminate the checks?

@Nugine
Copy link

Nugine commented Dec 30, 2022

base64_simd::Base64::encode_as_str is exactly what you want.
base64-simd also supports encoding/decoding with an uninitialized buffer, which is relatively difficult for a 100%-safe library.

@upsuper
Copy link
Author

upsuper commented Dec 30, 2022

@Nugine That looks great. I'll give it a try. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants