Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect invalid padding #198

Merged
merged 10 commits into from Dec 10, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
70 changes: 58 additions & 12 deletions README.md
Expand Up @@ -8,7 +8,11 @@ Made with CLion. Thanks to JetBrains for supporting open source!

It's base64. What more could anyone want?

This library's goals are to be *correct* and *fast*. It's thoroughly tested and widely used. It exposes functionality at multiple levels of abstraction so you can choose the level of convenience vs performance that you want, e.g. `decode_engine_slice` decodes into an existing `&mut [u8]` and is pretty fast (2.6GiB/s for a 3 KiB input), whereas `decode_engine` allocates a new `Vec<u8>` and returns it, which might be more convenient in some cases, but is slower (although still fast enough for almost any purpose) at 2.1 GiB/s.
This library's goals are to be *correct* and *fast*. It's thoroughly tested and widely used. It exposes functionality at
multiple levels of abstraction so you can choose the level of convenience vs performance that you want,
e.g. `decode_engine_slice` decodes into an existing `&mut [u8]` and is pretty fast (2.6GiB/s for a 3 KiB input),
whereas `decode_engine` allocates a new `Vec<u8>` and returns it, which might be more convenient in some cases, but is
slower (although still fast enough for almost any purpose) at 2.1 GiB/s.

## Example

Expand All @@ -32,7 +36,8 @@ See the [docs](https://docs.rs/base64) for all the details.

Remove non-base64 characters from your input before decoding.

If you have a `Vec` of base64, [retain](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.retain) can be used to strip out whatever you need removed.
If you have a `Vec` of base64, [retain](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.retain) can be used to
strip out whatever you need removed.

If you have a `Read` (e.g. reading a file or network socket), there are various approaches.

Expand All @@ -43,15 +48,45 @@ If you have a `Read` (e.g. reading a file or network socket), there are various

[line-wrap](https://crates.io/crates/line-wrap) does just that.

### I want canonical base64 encoding/decoding.

First, don't do this. You should no more expect Base64 to be canonical than you should expect compression algorithms to
produce canonical output across all usage in the wild (hint: they don't).
However, [people are drawn to their own destruction like moths to a flame](https://eprint.iacr.org/2022/361), so here we
are.

There are two opportunities for non-canonical encoding (and thus, detection of the same during decoding): the final bits
of the last encoded token in two or three token suffixes, and the `=` token used to inflate the suffix to a full four
tokens.

The trailing bits issue is unavoidable: with 6 bits available in each encoded token, 1 input byte takes 2 tokens,
with the second one having some bits unused. Same for two input bytes: 16 bits, but 3 tokens have 18 bits. Unless we
decide to stop shipping whole bytes around, we're stuck with those extra bits that a sneaky or buggy encoder might set
to 1 instead of 0.

The `=` pad bytes, on the other hand, are entirely a self-own by the Base64 standard. They do not affect decoding other
than to provide an opportunity to say "that padding is incorrect". Exabytes of storage and transfer have no doubt been
wasted on pointless `=` bytes. Somehow we all seem to be quite comfortable with, say, hex-encoded data just stopping
when it's done rather than requiring a confirmation that the author of the encoder could count to four. Anyway, there
are two ways to make pad bytes predictable: require canonical padding to the next multiple of four bytes as per the RFC,
or, if you control all producers and consumers, save a few bytes by requiring no padding (especially applicable to the
url-safe alphabet).

All `Engine` implementations must at a minimum support treating non-canonical padding of both types as an error, and
optionally may allow other behaviors.

## Rust version compatibility

The minimum required Rust version is 1.57.0.
The minimum supported Rust version is 1.57.0.

# Contributing

Contributions are very welcome. However, because this library is used widely, and in security-sensitive contexts, all PRs will be carefully scrutinized. Beyond that, this sort of low level library simply needs to be 100% correct. Nobody wants to chase bugs in encoding of any sort.
Contributions are very welcome. However, because this library is used widely, and in security-sensitive contexts, all
PRs will be carefully scrutinized. Beyond that, this sort of low level library simply needs to be 100% correct. Nobody
wants to chase bugs in encoding of any sort.

All this means that it takes me a fair amount of time to review each PR, so it might take quite a while to carve out the free time to give each PR the attention it deserves. I will get to everyone eventually!
All this means that it takes me a fair amount of time to review each PR, so it might take quite a while to carve out the
free time to give each PR the attention it deserves. I will get to everyone eventually!

## Developing

Expand All @@ -63,13 +98,22 @@ rustup run nightly cargo bench

## no_std

This crate supports no_std. By default the crate targets std via the `std` feature. You can deactivate the `default-features` to target core instead. In that case you lose out on all the functionality revolving around `std::io`, `std::error::Error` and heap allocations. There is an additional `alloc` feature that you can activate to bring back the support for heap allocations.
This crate supports no_std. By default the crate targets std via the `std` feature. You can deactivate
the `default-features` to target `core` instead. In that case you lose out on all the functionality revolving
around `std::io`, `std::error::Error`, and heap allocations. There is an additional `alloc` feature that you can activate
to bring back the support for heap allocations.

## Profiling

On Linux, you can use [perf](https://perf.wiki.kernel.org/index.php/Main_Page) for profiling. Then compile the benchmarks with `rustup nightly run cargo bench --no-run`.
On Linux, you can use [perf](https://perf.wiki.kernel.org/index.php/Main_Page) for profiling. Then compile the
benchmarks with `rustup nightly run cargo bench --no-run`.

Run the benchmark binary with `perf` (shown here filtering to one particular benchmark, which will make the results easier to read). `perf` is only available to the root user on most systems as it fiddles with event counters in your CPU, so use `sudo`. We need to run the actual benchmark binary, hence the path into `target`. You can see the actual full path with `rustup run nightly cargo bench -v`; it will print out the commands it runs. If you use the exact path that `bench` outputs, make sure you get the one that's for the benchmarks, not the tests. You may also want to `cargo clean` so you have only one `benchmarks-` binary (they tend to accumulate).
Run the benchmark binary with `perf` (shown here filtering to one particular benchmark, which will make the results
easier to read). `perf` is only available to the root user on most systems as it fiddles with event counters in your
CPU, so use `sudo`. We need to run the actual benchmark binary, hence the path into `target`. You can see the actual
full path with `rustup run nightly cargo bench -v`; it will print out the commands it runs. If you use the exact path
that `bench` outputs, make sure you get the one that's for the benchmarks, not the tests. You may also want
to `cargo clean` so you have only one `benchmarks-` binary (they tend to accumulate).

```bash
sudo perf record target/release/deps/benchmarks-* --bench decode_10mib_reuse
Expand All @@ -81,7 +125,10 @@ Then analyze the results, again with perf:
sudo perf annotate -l
```

You'll see a bunch of interleaved rust source and assembly like this. The section with `lib.rs:327` is telling us that 4.02% of samples saw the `movzbl` aka bit shift as the active instruction. However, this percentage is not as exact as it seems due to a phenomenon called *skid*. Basically, a consequence of how fancy modern CPUs are is that this sort of instruction profiling is inherently inaccurate, especially in branch-heavy code.
You'll see a bunch of interleaved rust source and assembly like this. The section with `lib.rs:327` is telling us that
4.02% of samples saw the `movzbl` aka bit shift as the active instruction. However, this percentage is not as exact as
it seems due to a phenomenon called *skid*. Basically, a consequence of how fancy modern CPUs are is that this sort of
instruction profiling is inherently inaccurate, especially in branch-heavy code.

```text
lib.rs:322 0.70 : 10698: mov %rdi,%rax
Expand All @@ -103,10 +150,10 @@ You'll see a bunch of interleaved rust source and assembly like this. The sectio
0.00 : 106ab: je 1090e <base64::decode_config_buf::hbf68a45fefa299c1+0x46e>
```


## Fuzzing

This uses [cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz). See `fuzz/fuzzers` for the available fuzzing scripts. To run, use an invocation like these:
This uses [cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz). See `fuzz/fuzzers` for the available fuzzing scripts.
To run, use an invocation like these:

```bash
cargo +nightly fuzz run roundtrip
Expand All @@ -115,7 +162,6 @@ cargo +nightly fuzz run roundtrip_random_config -- -max_len=10240
cargo +nightly fuzz run decode_random
```


## License

This project is dual-licensed under MIT and Apache 2.0.
Expand Down
6 changes: 5 additions & 1 deletion RELEASE-NOTES.md
@@ -1,8 +1,12 @@
# 0.20.0

## Next
## 0.20.0-beta.1

### Breaking changes

- Update MSRV to 1.57.0
- Decoding can now either ignore padding, require correct padding, or require no padding. The default is to require correct padding.
- The `NO_PAD` config now requires that padding be absent when decoding.

## 0.20.0-alpha.1

Expand Down
3 changes: 2 additions & 1 deletion fuzz/Cargo.toml
Expand Up @@ -11,7 +11,8 @@ cargo-fuzz = true
[dependencies]
rand = "0.6.1"
rand_pcg = "0.1.1"
ring = "0.13.5"
sha2 = "0.10.6"

[dependencies.base64]
path = ".."
[dependencies.libfuzzer-sys]
Expand Down
9 changes: 6 additions & 3 deletions fuzz/fuzzers/roundtrip_no_pad.rs
@@ -1,11 +1,14 @@
#![no_main]
#[macro_use] extern crate libfuzzer_sys;
#[macro_use]
extern crate libfuzzer_sys;
extern crate base64;

use base64::engine::fast_portable;
use base64::engine::{self, fast_portable};

fuzz_target!(|data: &[u8]| {
let config = fast_portable::FastPortableConfig::new().with_encode_padding(false);
let config = fast_portable::FastPortableConfig::new()
.with_encode_padding(false)
.with_decode_padding_mode(engine::DecodePaddingMode::RequireNone);
let engine = fast_portable::FastPortable::from(&base64::alphabet::STANDARD, config);

let encoded = base64::encode_engine(&data, &engine);
Expand Down
23 changes: 16 additions & 7 deletions fuzz/fuzzers/utils.rs
@@ -1,18 +1,20 @@
extern crate rand;
extern crate rand_pcg;
extern crate ring;
extern crate sha2;

use base64::{alphabet, engine::fast_portable};
use base64::{alphabet, engine::{self, fast_portable}};
use self::rand::{Rng, SeedableRng};
use self::rand_pcg::Pcg32;
use self::ring::digest;
use self::sha2::Digest as _;

pub fn random_engine(data: &[u8]) -> fast_portable::FastPortable {
// use sha256 of data as rng seed so it's repeatable
let sha = digest::digest(&digest::SHA256, data);
let mut hasher = sha2::Sha256::new();
hasher.update(data);
let sha = hasher.finalize();

let mut seed: [u8; 16] = [0; 16];
seed.copy_from_slice(&sha.as_ref()[0..16]);
seed.copy_from_slice(&sha.as_slice()[0..16]);

let mut rng = Pcg32::from_seed(seed);

Expand All @@ -22,9 +24,16 @@ pub fn random_engine(data: &[u8]) -> fast_portable::FastPortable {
alphabet::STANDARD
};

let encode_padding = rng.gen();
let decode_padding = if encode_padding {
engine::DecodePaddingMode::RequireCanonical
} else {
engine::DecodePaddingMode::RequireNone
};
let config = fast_portable::FastPortableConfig::new()
.with_encode_padding(rng.gen())
.with_decode_allow_trailing_bits(rng.gen());
.with_encode_padding(encode_padding)
.with_decode_allow_trailing_bits(rng.gen())
.with_decode_padding_mode(decode_padding);

fast_portable::FastPortable::from(&alphabet, config)
}
21 changes: 15 additions & 6 deletions src/decode.rs
Expand Up @@ -13,6 +13,7 @@ use std::error;
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum DecodeError {
/// An invalid byte was found in the input. The offset and offending byte are provided.
/// Padding characters (`=`) interspersed in the encoded form will be treated as invalid bytes.
InvalidByte(usize, u8),
/// The length of the input is invalid.
/// A typical cause of this is stray trailing whitespace or other separator bytes.
Expand All @@ -22,9 +23,12 @@ pub enum DecodeError {
InvalidLength,
/// The last non-padding input symbol's encoded 6 bits have nonzero bits that will be discarded.
/// This is indicative of corrupted or truncated Base64.
/// Unlike InvalidByte, which reports symbols that aren't in the alphabet, this error is for
/// Unlike `InvalidByte`, which reports symbols that aren't in the alphabet, this error is for
/// symbols that are in the alphabet but represent nonsensical encodings.
InvalidLastSymbol(usize, u8),
/// The nature of the padding was not as configured: absent or incorrect when it must be
/// canonical, or present when it must be absent, etc.
InvalidPadding,
}

impl fmt::Display for DecodeError {
Expand All @@ -35,6 +39,7 @@ impl fmt::Display for DecodeError {
Self::InvalidLastSymbol(index, byte) => {
write!(f, "Invalid last symbol {}, offset {}.", byte, index)
}
Self::InvalidPadding => write!(f, "Invalid padding"),
}
}
}
Expand All @@ -46,6 +51,7 @@ impl error::Error for DecodeError {
Self::InvalidByte(_, _) => "invalid byte",
Self::InvalidLength => "invalid length",
Self::InvalidLastSymbol(_, _) => "invalid last symbol",
Self::InvalidPadding => "invalid padding",
}
}

Expand Down Expand Up @@ -192,10 +198,12 @@ pub fn decode_engine_slice<E: Engine, T: AsRef<[u8]>>(
#[cfg(test)]
mod tests {
use super::*;
use crate::{encode::encode_engine_string, tests::assert_encode_sanity};

use crate::engine::Config;
use crate::tests::random_engine;
use crate::{
alphabet,
encode::encode_engine_string,
engine::{fast_portable, fast_portable::FastPortable, Config},
tests::{assert_encode_sanity, random_engine},
};
use rand::{
distributions::{Distribution, Uniform},
Rng, SeedableRng,
Expand Down Expand Up @@ -350,12 +358,13 @@ mod tests {

#[test]
fn decode_engine_estimation_works_for_various_lengths() {
let engine = FastPortable::from(&alphabet::STANDARD, fast_portable::NO_PAD);
for num_prefix_quads in 0..100 {
for suffix in &["AA", "AAA", "AAAA"] {
let mut prefix = "AAAA".repeat(num_prefix_quads);
prefix.push_str(suffix);
// make sure no overflow (and thus a panic) occurs
let res = decode_engine(prefix, &DEFAULT_ENGINE);
let res = decode_engine(prefix, &engine);
assert!(res.is_ok());
}
}
Expand Down
2 changes: 1 addition & 1 deletion src/encode.rs
Expand Up @@ -123,7 +123,7 @@ pub fn encode_engine_string<E: Engine, T: AsRef<[u8]>>(
/// &base64::engine::DEFAULT_ENGINE);
///
/// // shorten our vec down to just what was written
/// buf.resize(bytes_written, 0);
/// buf.truncate(bytes_written);
///
/// assert_eq!(s, base64::decode(&buf).unwrap().as_slice());
/// ```
Expand Down