Improve performance of encoding #64

@taiki-e taiki-e commented Aug 28, 2021

This change makes encoding about 7x faster than the current implementation, on my machine.

Before:

hex_encode              time:   [77.149 us 77.581 us 78.015 us]

After:

hex_encode              time:   [10.214 us 10.270 us 10.336 us]

Benchmarks per commit

Current main branch (aa8f300)

hex_encode              time:   [77.149 us 77.581 us 78.015 us]

First commit of this PR: Adjust #[inline] on encoding functions (97abf03)

hex_encode              time:   [37.251 us 37.412 us 37.583 us]                        
                        change: [-52.047% -51.748% -51.412%] (p = 0.00 < 0.05)
                        Performance has improved.

Second commit of this PR: Use encode_to_slice in encode (0129182)

hex_encode              time:   [27.029 us 27.230 us 27.476 us]                        
                        change: [-27.711% -27.244% -26.782%] (p = 0.00 < 0.05)
                        Performance has improved.

Third commit of this PR: Use chunks_exact_mut instead of generate_iter (19b333d)

hex_encode              time:   [10.214 us 10.270 us 10.336 us]                        
                        change: [-62.198% -61.839% -61.444%] (p = 0.00 < 0.05)
                        Performance has improved.

This also adds encode_to_slice_upper, which is needed to improve the performance of encode_upper. (Fixes #45.)
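For readers who want to see the shape of the third commit's approach, here is a minimal sketch of slice-based encoding built on `chunks_exact_mut`. It is not the code from this PR: `encode_to_slice_with`, `encode_upper_sketch`, `LengthMismatch`, and the two table constants are hypothetical names chosen for illustration, and the crate's real signatures and error type may differ.

```rust
// Hypothetical lookup tables; the names are illustrative, not the crate's.
const HEX_LOWER: &[u8; 16] = b"0123456789abcdef";
const HEX_UPPER: &[u8; 16] = b"0123456789ABCDEF";

/// Hypothetical error type for a wrongly sized output buffer.
#[derive(Debug, PartialEq)]
struct LengthMismatch;

/// Writes the hex encoding of `input` into `output` using `table`.
/// `output` must be exactly twice as long as `input`.
fn encode_to_slice_with(
    input: &[u8],
    output: &mut [u8],
    table: &[u8; 16],
) -> Result<(), LengthMismatch> {
    if output.len() != input.len() * 2 {
        return Err(LengthMismatch);
    }
    // Each input byte fills exactly one 2-byte chunk of the output.
    for (&byte, pair) in input.iter().zip(output.chunks_exact_mut(2)) {
        pair[0] = table[(byte >> 4) as usize];
        pair[1] = table[(byte & 0x0f) as usize];
    }
    Ok(())
}

/// Uppercase encoding into a freshly allocated String, via the slice encoder.
fn encode_upper_sketch(data: &[u8]) -> String {
    let mut out = vec![0u8; data.len() * 2];
    encode_to_slice_with(data, &mut out, HEX_UPPER).expect("buffer is sized to 2 * len");
    String::from_utf8(out).expect("hex digits are ASCII")
}
```

Passing the table as a parameter is one way to share the loop between the lower and upper variants; whether the PR does it this way is an assumption.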

let data = data.as_ref();
let mut out = vec![0; data.len() * 2];
encode_to_slice(data, &mut out).unwrap();
String::from_utf8(out).unwrap()

Using from_utf8_unchecked here improves performance by about 8%. (It is safe because we emit only hex characters.)
However, I didn't apply that change because I don't know this crate's policy on unsafe code.
If using unsafe code is okay, I'll add that change.

hex_encode              time:   [9.4555 us 9.4828 us 9.5106 us]                        
                        change: [-9.2447% -8.3200% -7.4582%] (p = 0.00 < 0.05)
                        Performance has improved.
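To make the safety argument concrete, here is a rough sketch of what the unchecked variant could look like. The function name `encode_unchecked_sketch` and the local `HEX` table are hypothetical and this is not the code that was benchmarked; the only point is that every output byte comes from an ASCII table, so the UTF-8 validation can be skipped.

```rust
/// Hypothetical lowercase encoder that skips UTF-8 validation.
fn encode_unchecked_sketch(data: &[u8]) -> String {
    const HEX: &[u8; 16] = b"0123456789abcdef";

    let mut out = Vec::with_capacity(data.len() * 2);
    for &byte in data {
        out.push(HEX[(byte >> 4) as usize]);
        out.push(HEX[(byte & 0x0f) as usize]);
    }

    // SAFETY: every byte pushed above is taken from `HEX`, which contains
    // only ASCII characters, so `out` is always valid UTF-8.
    unsafe { String::from_utf8_unchecked(out) }
}
```

The trade-off is that the invariant now has to be maintained by hand: any later change that writes a byte from outside the table would make the `unsafe` block unsound.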

There is some overlap with #66 here, I think. In particular, I suspect the combination of ExactSizeIterator will allow us to collect into a String efficiently without going through a Vec.
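As an illustration of that idea, here is a sketch of a char iterator that reports its exact length and is collected straight into a String. `HexChars` is a hypothetical type written for this example and is not taken from #66. The sketch relies on the standard library's `String` collection reserving capacity from the lower bound of `size_hint`, which should give a single allocation for ASCII output.

```rust
const HEX: &[u8; 16] = b"0123456789abcdef";

/// Hypothetical iterator yielding two lowercase hex chars per input byte.
struct HexChars<'a> {
    bytes: std::slice::Iter<'a, u8>,
    // Low-nibble char of the current byte, waiting to be emitted.
    pending: Option<char>,
}

impl<'a> HexChars<'a> {
    fn new(data: &'a [u8]) -> Self {
        HexChars { bytes: data.iter(), pending: None }
    }
}

impl Iterator for HexChars<'_> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        if let Some(c) = self.pending.take() {
            return Some(c);
        }
        let byte = *self.bytes.next()?;
        self.pending = Some(HEX[(byte & 0x0f) as usize] as char);
        Some(HEX[(byte >> 4) as usize] as char)
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // Two chars per remaining byte, plus one if a low nibble is pending.
        let n = self.bytes.len() * 2 + usize::from(self.pending.is_some());
        (n, Some(n))
    }
}

// The exact `size_hint` above is what makes this impl valid.
impl ExactSizeIterator for HexChars<'_> {}

/// Collects directly into a String, with no intermediate Vec.
fn encode_via_iter(data: &[u8]) -> String {
    HexChars::new(data).collect()
}
```

Whether this is actually faster than the slice-based path would need to be measured; no benchmark numbers are claimed for this sketch.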

@yjhmelody

Nice PR! But the owner of the repo seems to be inactive.

Successfully merging this pull request may close these issues.

encode_to_slice with _upper and _lower variants