Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(encode): vec alloc & utf8 unchecked #180

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

AaronO
Copy link

@AaronO AaronO commented Mar 6, 2022

Improves throughput by 10-20%

We avoid zeroing the allocated vec since we'll write into it and we don't need to check utf8 when building the String since all alphabets emit ascii/utf8

Benches

❯ rustup run nightly cargo bench -- '^encode_small_input/encode/'
   Compiling base64 v0.20.0-alpha.1 (/Users/aaron/git/rust-base64)
    Finished bench [optimized + debuginfo] target(s) in 3.43s
     Running unittests (target/release/deps/base64-a91f049e61d3804b)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 105 filtered out; finished in 0.00s

     Running unittests (target/release/deps/benchmarks-0b92500c69ee3caa)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Gnuplot not found, using plotters backend
encode_small_input/encode/3
                        time:   [31.275 ns 31.304 ns 31.337 ns]
                        thrpt:  [91.300 MiB/s 91.394 MiB/s 91.480 MiB/s]
                 change:
                        time:   [-12.333% -12.192% -12.057%] (p = 0.00 < 0.05)
                        thrpt:  [+13.710% +13.884% +14.068%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
encode_small_input/encode/50
                        time:   [46.873 ns 46.923 ns 46.977 ns]
                        thrpt:  [1015.0 MiB/s 1016.2 MiB/s 1017.3 MiB/s]
                 change:
                        time:   [-10.129% -9.9667% -9.8193%] (p = 0.00 < 0.05)
                        thrpt:  [+10.889% +11.070% +11.271%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
encode_small_input/encode/100
                        time:   [62.462 ns 62.484 ns 62.520 ns]
                        thrpt:  [1.4896 GiB/s 1.4905 GiB/s 1.4910 GiB/s]
                 change:
                        time:   [-12.490% -12.286% -12.124%] (p = 0.00 < 0.05)
                        thrpt:  [+13.797% +14.007% +14.273%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe
encode_small_input/encode/500
                        time:   [214.46 ns 214.56 ns 214.69 ns]
                        thrpt:  [2.1690 GiB/s 2.1703 GiB/s 2.1713 GiB/s]
                 change:
                        time:   [-11.604% -11.382% -11.148%] (p = 0.00 < 0.05)
                        thrpt:  [+12.547% +12.843% +13.128%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe
encode_small_input/encode/3072
                        time:   [1.0080 us 1.0097 us 1.0116 us]
                        thrpt:  [2.8282 GiB/s 2.8334 GiB/s 2.8382 GiB/s]
                 change:
                        time:   [-16.244% -16.081% -15.926%] (p = 0.00 < 0.05)
                        thrpt:  [+18.942% +19.163% +19.394%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Improves throughput by 10-20%

We avoid zeroing the allocated vec since we'll write into it and we don't need to check utf8 when building the String since all alphabets emit ascii/utf8
let mut buf = vec![0; encoded_size];
let mut buf = Vec::with_capacity(encoded_size);
unsafe {
buf.set_len(encoded_size);
Copy link

@jonasbb jonasbb Mar 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safety documentation of set_len is pretty clear that this is UB. Namely it states

The elements at old_len..new_len must be initialized.

I think it might be safe to call after encode_with_padding. If you need uninitialized memory you probably want to use MaybeUninit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants