API rework #207

merged 17 commits into from Jan 6, 2023
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "base64"
version = "0.20.0"
version = "0.21.0-beta.1"
authors = ["Alice Maz <alice@alicemaz.com>", "Marshall Pierce <marshall@mpierce.org>"]
description = "encodes and decodes base64 as bytes or utf8"
repository = "https://github.com/marshallpierce/rust-base64"
14 changes: 0 additions & 14 deletions README.md
@@ -14,20 +14,6 @@ e.g. `decode_engine_slice` decodes into an existing `&mut [u8]` and is pretty fa
whereas `decode_engine` allocates a new `Vec<u8>` and returns it, which might be more convenient in some cases, but is
slower (although still fast enough for almost any purpose) at 2.1 GiB/s.

## Example

```rust
use base64::{encode, decode};

fn main() {
let a = b"hello world";
let b = "aGVsbG8gd29ybGQ=";

assert_eq!(encode(a), b);
assert_eq!(a, &decode(b).unwrap()[..]);
}
```

See the [docs](https://docs.rs/base64) for all the details.

## FAQ
113 changes: 96 additions & 17 deletions RELEASE-NOTES.md
@@ -1,22 +1,90 @@
# 0.20.1

## Breaking changes

- `FastPortable` was only meant to be an interim name, and shouldn't have shipped in 0.20. It is now `GeneralPurpose` to
make its intended usage more clear.
- `GeneralPurpose` and its config are now `pub use`'d in the `engine` module for convenience.
- Change a few `from()` functions to be `new()`. `from()` causes confusing compiler errors because it is easily mistaken
for `From::from`, and is a little misleading because some of those invocations are not as cheap as one would
usually expect from a `from` call.
- `encode*` and `decode*` top level functions are now methods on `Engine`.
- `DEFAULT_ENGINE` was replaced by `engine::general_purpose::STANDARD`
- Predefined engine consts `engine::general_purpose::{STANDARD, STANDARD_NO_PAD, URL_SAFE, URL_SAFE_NO_PAD}`
- These are `pub use`d into `engine` as well
- The `*_slice` decode/encode functions now return an error instead of panicking when the output slice is too small
- As part of this, there is currently no public way to decode into a slice _exactly_ the size needed for inputs that
aren't multiples of 4 tokens. If having to add up to 2 bytes so that the decode buffer is always a multiple of 3 bytes
is a problem, file an issue.

## Other changes

- `decoded_len_estimate()` is provided to make it easy to size decode buffers correctly.
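
The sizing rule implied by the `*_slice` note above (round the input up to whole groups of 4 tokens, 3 bytes per group) can be sketched with the standard library alone. `conservative_decoded_len` is a hypothetical helper written for illustration; it is not the crate's `decoded_len_estimate()`, and it assumes only the arithmetic described in that note.

```rust
/// Hypothetical helper (not part of base64): a conservative size for a decode
/// buffer, assuming every started group of 4 tokens may decode to 3 bytes.
fn conservative_decoded_len(encoded_len: usize) -> usize {
    // Round up to a whole number of 4-token groups, then allow 3 bytes per group.
    (encoded_len / 4 + usize::from(encoded_len % 4 != 0)) * 3
}

fn main() {
    // 16 tokens are exactly 4 groups, so 12 bytes is also the exact size.
    assert_eq!(conservative_decoded_len(16), 12);
    // 15 unpadded tokens decode to 11 bytes; the buffer is 1 byte larger.
    assert_eq!(conservative_decoded_len(15), 12);
    // 2 unpadded tokens decode to 1 byte; the buffer is 2 bytes larger.
    assert_eq!(conservative_decoded_len(2), 3);
}
```

The overshoot is never more than 2 bytes, which matches the trade-off the note describes.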

## Migration

### Functions

| < 0.20 function | 0.21 equivalent |
|-------------------------|-----------------------------|
| `encode()` | `engine::STANDARD.encode()` |
| `encode_config()` | `engine.encode()` |
| `encode_config_buf()` | `engine.encode_string()` |
| `encode_config_slice()` | `engine.encode_slice()` |
| `decode()` | `engine::STANDARD.decode()` |
| `decode_config()` | `engine.decode()` |
| `decode_config_buf()` | `engine.decode_vec()` |
| `decode_config_slice()` | `engine.decode_slice()` |

The short-lived 0.20 functions were the 0.13 functions with `config` replaced with `engine`.

### Padding

If applicable, use the preset engines `engine::STANDARD`, `engine::STANDARD_NO_PAD`, `engine::URL_SAFE`,
or `engine::URL_SAFE_NO_PAD`.
The `NO_PAD` ones require that padding is absent when decoding, and the others require that
canonical padding is present.

If you need the < 0.20 behavior that did not care about padding, or want to recreate < 0.20.0's predefined `Config`s
precisely, see the following table.

| 0.13.1 Config | 0.20.0+ alphabet | `encode_padding` | `decode_padding_mode` |
|-----------------|------------------|------------------|-----------------------|
| STANDARD | STANDARD | true | Indifferent |
| STANDARD_NO_PAD | STANDARD | false | Indifferent |
| URL_SAFE | URL_SAFE | true | Indifferent |
| URL_SAFE_NO_PAD | URL_SAFE | false | Indifferent |
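
To make the three decode padding behaviours concrete, here is a simplified, stand-alone model of the check. `PaddingMode` and `padding_acceptable` are hypothetical names written only for this note. The variant names follow `DecodePaddingMode`, but this is not the crate's implementation, and `Indifferent` is modelled simply as "padding is not checked".

```rust
/// Local stand-in for illustration; variant names follow `DecodePaddingMode`.
#[derive(Clone, Copy, Debug)]
enum PaddingMode {
    Indifferent,
    RequireCanonical,
    RequireNone,
}

/// Hypothetical check: is the trailing `=` padding of `encoded` acceptable under `mode`?
/// Only padding is inspected; symbols and overall length validity are not.
fn padding_acceptable(encoded: &str, mode: PaddingMode) -> bool {
    let pad = encoded.bytes().rev().take_while(|&b| b == b'=').count();
    let tokens = encoded.len() - pad;
    // Canonical padding fills the last group out to 4 symbols.
    let canonical = match tokens % 4 {
        0 => Some(0),
        2 => Some(2),
        3 => Some(1),
        _ => None, // a lone trailing token is never valid base64
    };
    match mode {
        PaddingMode::Indifferent => true,
        PaddingMode::RequireCanonical => canonical == Some(pad),
        PaddingMode::RequireNone => pad == 0,
    }
}

fn main() {
    let padded = "aGVsbG8gd29ybGQ=";
    let unpadded = "aGVsbG8gd29ybGQ";
    assert!(padding_acceptable(padded, PaddingMode::RequireCanonical));
    assert!(!padding_acceptable(unpadded, PaddingMode::RequireCanonical));
    assert!(padding_acceptable(unpadded, PaddingMode::RequireNone));
    assert!(!padding_acceptable(padded, PaddingMode::RequireNone));
    assert!(padding_acceptable(padded, PaddingMode::Indifferent));
    assert!(padding_acceptable(unpadded, PaddingMode::Indifferent));
}
```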

# 0.20.0

### Breaking changes
## Breaking changes

- Update MSRV to 1.57.0
- Decoding can now either ignore padding, require correct padding, or require no padding. The default is to require correct padding.
- The `NO_PAD` config now requires that padding be absent when decoding.
- Decoding can now either ignore padding, require correct padding, or require no padding. The default is to require
correct padding.
- The `NO_PAD` config now requires that padding be absent when decoding.

## 0.20.0-alpha.1

### Breaking changes
- Extended the `Config` concept into the `Engine` abstraction, allowing the user to pick different encoding / decoding implementations.
- What was formerly the only algorithm is now the `FastPortable` engine, so named because it's portable (works on any CPU) and relatively fast.
- This opens the door to a portable constant-time implementation ([#153](https://github.com/marshallpierce/rust-base64/pull/153), presumably `ConstantTimePortable`?) for security-sensitive applications that need side-channel resistance, and CPU-specific SIMD implementations for more speed.
- Standard base64 per the RFC is available via `DEFAULT_ENGINE`. To use different alphabets or other settings (padding, etc), create your own engine instance.
- `CharacterSet` is now `Alphabet` (per the RFC), and allows creating custom alphabets. The corresponding tables that were previously code-generated are now built dynamically.
- Since there are already multiple breaking changes, various functions are renamed to be more consistent and discoverable.

- Extended the `Config` concept into the `Engine` abstraction, allowing the user to pick different encoding / decoding
implementations.
- What was formerly the only algorithm is now the `FastPortable` engine, so named because it's portable (works on
any CPU) and relatively fast.
- This opens the door to a portable constant-time
implementation ([#153](https://github.com/marshallpierce/rust-base64/pull/153),
presumably `ConstantTimePortable`?) for security-sensitive applications that need side-channel resistance, and
CPU-specific SIMD implementations for more speed.
- Standard base64 per the RFC is available via `DEFAULT_ENGINE`. To use different alphabets or other settings (
padding, etc), create your own engine instance.
- `CharacterSet` is now `Alphabet` (per the RFC), and allows creating custom alphabets. The corresponding tables that
were previously code-generated are now built dynamically.
- Since there are already multiple breaking changes, various functions are renamed to be more consistent and
discoverable.
- MSRV is now 1.47.0 to allow various things to use `const fn`.
- `DecoderReader` now owns its inner reader, and can expose it via `into_inner()`. For symmetry, `EncoderWriter` can do the same with its writer.
- `DecoderReader` now owns its inner reader, and can expose it via `into_inner()`. For symmetry, `EncoderWriter` can do
the same with its writer.
- `encoded_len` is now public so you can size encode buffers precisely.

# 0.13.1
@@ -28,8 +96,11 @@
- Config methods are const
- Added `EncoderStringWriter` to allow encoding directly to a String
- `EncoderWriter` now owns its delegate writer rather than keeping a reference to it (though refs still work)
- As a consequence, it is now possible to extract the delegate writer from an `EncoderWriter` via `finish()`, which returns `Result<W>` instead of `Result<()>`. If you were calling `finish()` explicitly, you will now need to use `let _ = foo.finish()` instead of just `foo.finish()` to avoid a warning about the unused value.
- When decoding input that has both an invalid length and an invalid symbol as the last byte, `InvalidByte` will be emitted instead of `InvalidLength` to make the problem more obvious.
- As a consequence, it is now possible to extract the delegate writer from an `EncoderWriter` via `finish()`, which
returns `Result<W>` instead of `Result<()>`. If you were calling `finish()` explicitly, you will now need to
use `let _ = foo.finish()` instead of just `foo.finish()` to avoid a warning about the unused value.
- When decoding input that has both an invalid length and an invalid symbol as the last byte, `InvalidByte` will be
emitted instead of `InvalidLength` to make the problem more obvious.

# 0.12.2

@@ -47,23 +118,31 @@
- A minor performance improvement in encoding

# 0.11.0

- Minimum rust version 1.34.0
- `no_std` is now supported via the two new features `alloc` and `std`.

# 0.10.1

- Minimum rust version 1.27.2
- Fix bug in streaming encoding ([#90](https://github.com/marshallpierce/rust-base64/pull/90)): if the underlying writer didn't write all the bytes given to it, the remaining bytes would not be retried later. See the docs on `EncoderWriter::write`.
- Fix bug in streaming encoding ([#90](https://github.com/marshallpierce/rust-base64/pull/90)): if the underlying writer
didn't write all the bytes given to it, the remaining bytes would not be retried later. See the docs
on `EncoderWriter::write`.
- Make it configurable whether or not to return an error when decoding detects excess trailing bits.

# 0.10.0

- Remove line wrapping. Line wrapping was never a great conceptual fit in this library, and other features (streaming encoding, etc) either couldn't support it or could support only special cases of it with a great increase in complexity. Line wrapping has been pulled out into a [line-wrap](https://crates.io/crates/line-wrap) crate, so it's still available if you need it.
- `Base64Display` creation no longer uses a `Result` because it can't fail, which means its helper methods for common
configs that `unwrap()` for you are no longer needed
- Remove line wrapping. Line wrapping was never a great conceptual fit in this library, and other features (streaming
encoding, etc) either couldn't support it or could support only special cases of it with a great increase in
complexity. Line wrapping has been pulled out into a [line-wrap](https://crates.io/crates/line-wrap) crate, so it's
still available if you need it.
- `Base64Display` creation no longer uses a `Result` because it can't fail, which means its helper methods for
common
configs that `unwrap()` for you are no longer needed
- Add a streaming encoder `Write` impl to transparently base64 as you write.
- Remove the remaining `unsafe` code.
- Remove whitespace stripping to simplify `no_std` support. No out of the box configs use it, and it's trivial to do yourself if needed: `filter(|b| !b" \n\t\r\x0b\x0c".contains(b)`.
- Remove whitespace stripping to simplify `no_std` support. No out of the box configs use it, and it's trivial to do
yourself if needed: `filter(|b| !b" \n\t\r\x0b\x0c".contains(b))`.
- Detect invalid trailing symbols when decoding and return an error rather than silently ignoring them.
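
The whitespace filter mentioned above can be applied to input before it reaches a decoder. A minimal sketch using only the standard library; `strip_whitespace` is a hypothetical helper name, and the byte set is the one quoted in the note.

```rust
/// Hypothetical helper: drop ASCII whitespace (space, \n, \t, \r, VT, FF)
/// from base64 input before handing it to a decoder.
fn strip_whitespace(input: &[u8]) -> Vec<u8> {
    input
        .iter()
        .copied()
        .filter(|b| !b" \n\t\r\x0b\x0c".contains(b))
        .collect()
}

fn main() {
    // Line-wrapped and indented input collapses to a single base64 run.
    let wrapped = b"aGVs\nbG8g\r\n d29y\tbGQ=";
    assert_eq!(strip_whitespace(wrapped), b"aGVsbG8gd29ybGQ=");
}
```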

# 0.9.3
40 changes: 18 additions & 22 deletions benches/benchmarks.rs
@@ -1,36 +1,34 @@
#[macro_use]
extern crate criterion;

use base64::display;
use base64::{
decode, decode_engine_slice, decode_engine_vec, encode, encode_engine_slice,
encode_engine_string, write,
display,
engine::{Engine, STANDARD},
write,
};

use base64::engine::DEFAULT_ENGINE;
use criterion::{black_box, Bencher, BenchmarkId, Criterion, Throughput};
use rand::{Rng, SeedableRng};
use std::io::{self, Read, Write};

fn do_decode_bench(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

b.iter(|| {
let orig = decode(&encoded);
let orig = STANDARD.decode(&encoded);
black_box(&orig);
});
}

fn do_decode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
b.iter(|| {
decode_engine_vec(&encoded, &mut buf, &DEFAULT_ENGINE).unwrap();
STANDARD.decode_vec(&encoded, &mut buf).unwrap();
black_box(&buf);
buf.clear();
});
@@ -39,28 +37,28 @@ fn do_decode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
fn do_decode_bench_slice(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
buf.resize(size, 0);
b.iter(|| {
decode_engine_slice(&encoded, &mut buf, &DEFAULT_ENGINE).unwrap();
STANDARD.decode_slice(&encoded, &mut buf).unwrap();
black_box(&buf);
});
}

fn do_decode_bench_stream(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size * 3 / 4);
fill(&mut v);
let encoded = encode(&v);
let encoded = STANDARD.encode(&v);

let mut buf = Vec::new();
buf.resize(size, 0);
buf.truncate(0);

b.iter(|| {
let mut cursor = io::Cursor::new(&encoded[..]);
let mut decoder = base64::read::DecoderReader::from(&mut cursor, &DEFAULT_ENGINE);
let mut decoder = base64::read::DecoderReader::new(&mut cursor, &STANDARD);
decoder.read_to_end(&mut buf).unwrap();
buf.clear();
black_box(&buf);
@@ -71,7 +69,7 @@ fn do_encode_bench(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
b.iter(|| {
let e = encode(&v);
let e = STANDARD.encode(&v);
black_box(&e);
});
}
@@ -80,7 +78,7 @@ fn do_encode_bench_display(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
b.iter(|| {
let e = format!("{}", display::Base64Display::from(&v, &DEFAULT_ENGINE));
let e = format!("{}", display::Base64Display::new(&v, &STANDARD));
black_box(&e);
});
}
@@ -90,7 +88,7 @@ fn do_encode_bench_reuse_buf(b: &mut Bencher, &size: &usize) {
fill(&mut v);
let mut buf = String::new();
b.iter(|| {
encode_engine_string(&v, &mut buf, &DEFAULT_ENGINE);
STANDARD.encode_string(&v, &mut buf);
buf.clear();
});
}
@@ -101,9 +99,7 @@ fn do_encode_bench_slice(b: &mut Bencher, &size: &usize) {
let mut buf = Vec::new();
// conservative estimate of encoded size
buf.resize(v.len() * 2, 0);
b.iter(|| {
encode_engine_slice(&v, &mut buf, &DEFAULT_ENGINE);
});
b.iter(|| STANDARD.encode_slice(&v, &mut buf).unwrap());
}

fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
@@ -114,7 +110,7 @@ fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
buf.reserve(size * 2);
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderWriter::from(&mut buf, &DEFAULT_ENGINE);
let mut stream_enc = write::EncoderWriter::new(&mut buf, &STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
});
@@ -125,7 +121,7 @@ fn do_encode_bench_string_stream(b: &mut Bencher, &size: &usize) {
fill(&mut v);

b.iter(|| {
let mut stream_enc = write::EncoderStringWriter::from(&DEFAULT_ENGINE);
let mut stream_enc = write::EncoderStringWriter::new(&STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
let _ = stream_enc.into_inner();
@@ -139,7 +135,7 @@ fn do_encode_bench_string_reuse_buf_stream(b: &mut Bencher, &size: &usize) {
let mut buf = String::new();
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderStringWriter::from_consumer(&mut buf, &DEFAULT_ENGINE);
let mut stream_enc = write::EncoderStringWriter::from_consumer(&mut buf, &STANDARD);
stream_enc.write_all(&v).unwrap();
stream_enc.flush().unwrap();
let _ = stream_enc.into_inner();
8 changes: 4 additions & 4 deletions examples/base64.rs
@@ -61,21 +61,21 @@ fn main() {
};

let alphabet = opt.alphabet.unwrap_or_default();
let engine = engine::fast_portable::FastPortable::from(
let engine = engine::GeneralPurpose::new(
&match alphabet {
Alphabet::Standard => alphabet::STANDARD,
Alphabet::UrlSafe => alphabet::URL_SAFE,
},
engine::fast_portable::PAD,
engine::general_purpose::PAD,
);

let stdout = io::stdout();
let mut stdout = stdout.lock();
let r = if opt.decode {
let mut decoder = read::DecoderReader::from(&mut input, &engine);
let mut decoder = read::DecoderReader::new(&mut input, &engine);
io::copy(&mut decoder, &mut stdout)
} else {
let mut encoder = write::EncoderWriter::from(&mut stdout, &engine);
let mut encoder = write::EncoderWriter::new(&mut stdout, &engine);
io::copy(&mut input, &mut encoder)
};
if let Err(e) = r {
2 changes: 1 addition & 1 deletion fuzz/fuzzers/decode_random.rs
@@ -11,5 +11,5 @@ fuzz_target!(|data: &[u8]| {

// The data probably isn't valid base64 input, but as long as it returns an error instead
// of crashing, that's correct behavior.
let _ = decode_engine(data, &engine);
let _ = engine.decode(data);
});
6 changes: 3 additions & 3 deletions fuzz/fuzzers/roundtrip.rs
@@ -2,10 +2,10 @@
#[macro_use] extern crate libfuzzer_sys;
extern crate base64;

use base64::engine::DEFAULT_ENGINE;
use base64::{Engine as _, engine::STANDARD};

fuzz_target!(|data: &[u8]| {
let encoded = base64::encode_engine(data, &DEFAULT_ENGINE);
let decoded = base64::decode_engine(&encoded, &DEFAULT_ENGINE).unwrap();
let encoded = STANDARD.encode(data);
let decoded = STANDARD.decode(&encoded).unwrap();
assert_eq!(data, decoded.as_slice());
});
10 changes: 5 additions & 5 deletions fuzz/fuzzers/roundtrip_no_pad.rs
@@ -3,15 +3,15 @@
extern crate libfuzzer_sys;
extern crate base64;

use base64::engine::{self, fast_portable};
use base64::{Engine as _, engine::{self, general_purpose}};

fuzz_target!(|data: &[u8]| {
let config = fast_portable::FastPortableConfig::new()
let config = general_purpose::GeneralPurposeConfig::new()
.with_encode_padding(false)
.with_decode_padding_mode(engine::DecodePaddingMode::RequireNone);
let engine = fast_portable::FastPortable::from(&base64::alphabet::STANDARD, config);
let engine = general_purpose::GeneralPurpose::new(&base64::alphabet::STANDARD, config);

let encoded = base64::encode_engine(data, &engine);
let decoded = base64::decode_engine(&encoded, &engine).unwrap();
let encoded = engine.encode(data);
let decoded = engine.decode(&encoded).unwrap();
assert_eq!(data, decoded.as_slice());
});