Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize name parsing #1388

Merged
merged 4 commits into from Mar 8, 2021
Merged

Conversation

saethlin
Copy link
Contributor

@saethlin saethlin commented Feb 24, 2021

This is deliberately a draft. Looking to investigate an approach like flatvec too, but my instinct is that this approach is better (at the expense of making Name larger) because it avoid allocations entirely for a lot of common domain names; 4 labels or less all 16 bytes or less. We're going with the flatvec-style approach to make better use of the expanded stack footprint.

Before:

test bench_parse_real_message          ... bench:       4,219 ns/iter (+/- 38)

After:

test bench_parse_real_message          ... bench:       1,780 ns/iter (+/- 16)

@codecov
Copy link

codecov bot commented Feb 24, 2021

Codecov Report

Merging #1388 (c6ac502) into main (a6eb537) will decrease coverage by 0.01%.
The diff coverage is 96.47%.

@@            Coverage Diff             @@
##             main    #1388      +/-   ##
==========================================
- Coverage   85.21%   85.20%   -0.01%     
==========================================
  Files         153      152       -1     
  Lines       15016    15020       +4     
==========================================
+ Hits        12795    12797       +2     
- Misses       2221     2223       +2     

@saethlin
Copy link
Contributor Author

saethlin commented Mar 1, 2021

At this point, I'm stuck.

This implementation grows the size of Name significantly, because it adds stack buffers to avoid heap allocations. Avoiding allocations is the primary cause of improved performance here.

However, this size increase has caused ResolveErrorKind to grow significantly (the size was already too large, this PR makes it much worse). We could paper over the impact on the size of ResolveError by giving it the same #[cold] + Box treatment, but that doesn't address the lint, which is firing on an enum whose representation is public, so boxing its components would be a breaking change.

I also cannot change the representation of Name to use the flatvec approach, which would significantly improve the stack footprint. Name needs to contain a collection of Label somewhere because of its Index impl (Name::iter is fine because it yields slices of bytes). If I had my way I'd toss the Index impls.

@bluejekyll Do you have any suggestions?

crates/proto/src/error.rs Outdated Show resolved Hide resolved
.github/workflows/test.yml Outdated Show resolved Hide resolved
@djc
Copy link
Collaborator

djc commented Mar 2, 2021

Ditching the Index impls sounds good to me provided we don't rely on them too much in tree?

@djc
Copy link
Collaborator

djc commented Mar 4, 2021

BTW, I know smallvec is currently used but maybe we should take the chance to migrate to tinyvec. smallvec has a pretty bad security record, with 5 RustSec-reported vulnerabilities reported over the past three years.

@saethlin
Copy link
Contributor Author

saethlin commented Mar 4, 2021

I agree, but if we're looking to decrease the memory footprint of Name, going to tinyvec takes us in the wrong direction. If we're not worried overmuch about that, I'm all in. Perhaps I'm just reacting to a lint that we should add an allow for in this PR then fix soon after when we actually address the situation with errors. TinyVec can accept an inline buffer up to 24 bytes without growing, but itself has a minimum size of 32, where SmallVec has a minimum size of 24, the same as Vec.

@djc
Copy link
Collaborator

djc commented Mar 4, 2021

I'm personally fine with prioritizing security over memory usage.

@saethlin
Copy link
Contributor Author

saethlin commented Mar 5, 2021

As a nice side effect, this is also ~5% faster in a microbenchmark.

@saethlin saethlin marked this pull request as ready for review March 5, 2021 04:34
@bluejekyll bluejekyll mentioned this pull request Mar 6, 2021
@bluejekyll
Copy link
Member

Where are we on this? I reviewed. I like the changes better than the one I had been working on. Is there still a concern on additional stack space?

The previous implementation used Rc to represent Label, then composed
those in an array to represent Name. That produced a large number of
small allocations in the parsing code path. This new implementation
avoids allocations entirely for small names, and unless the name has a
very large number of labels, it is stored entirely in one allocation.

This also removes the Index impl for Name. Since we no longer contain
any Labels, we cannot implement that (a common problem with Index leaking
implementation details).
@saethlin
Copy link
Contributor Author

saethlin commented Mar 6, 2021

@bluejekyll I think this is good to go. I've rebased this into two commits to erase the previous implementation strategy and preserve the benchmark on its own so people can check it out and run against the old code. I also bumped one of the inline buffers up to 24 bytes because it's free.

I do not think there are any outstanding concerns about stack footprint. I think the current implementation is a good compromise, and the consequence it has on the size of the error types that was tripping lints is a bigger problem that I'm tackling in another PR.

Copy link
Collaborator

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great! Commented on some minor nits, but I definitely think we should merge something like this.

crates/proto/src/rr/domain/name.rs Outdated Show resolved Hide resolved
crates/proto/src/rr/domain/name.rs Show resolved Hide resolved
crates/proto/src/rr/domain/name.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to submit this earlier.

crates/resolver/src/error.rs Show resolved Hide resolved
@djc
Copy link
Collaborator

djc commented Mar 7, 2021

@bluejekyll I think this can just be merged, but it doesn't look like I have privileges to do so because CI failed (from what I can tell, spuriously). (Maybe remove the requirement that CI passes necessarily?)

@saethlin
Copy link
Contributor Author

saethlin commented Mar 7, 2021

CI failed (from what I can tell, spuriously). (Maybe remove the requirement that CI passes necessarily?)

I'm not convinced this is totally spurious? It's failed in the same way the last two commits in this PR 🤷

@bluejekyll
Copy link
Member

I kicked the CI jobs again, we'll see what happens.

@bluejekyll
Copy link
Member

FYI, I reran the tests twice, I think we're good. Merging in. Thanks for this PR and all the perf improvements!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants