
Trim duplicates #324

Merged
merged 4 commits on May 10, 2022

Commits on May 10, 2022

  1. serialize: benchmark and test TrimDuplicates

    This is in preparation for improving the performance of that function.
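    A minimal sketch of what such a benchmark could look like, assuming the
    variadic TrimDuplicates signature described in the next commit (this is an
    illustration, not the benchmark actually added here; the input shapes are
    assumptions):

    package serialize

    import "testing"

    // BenchmarkTrimDuplicates merges a WithValues-style slice with call
    // parameters that contain one duplicate key ("b").
    func BenchmarkTrimDuplicates(b *testing.B) {
        withValues := []interface{}{"a", 1, "b", 2}
        params := []interface{}{"b", 3, "c", 4}
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            TrimDuplicates(withValues, params)
        }
    }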
    pohly committed May 10, 2022 · e6d69a5
  2. serialize: add MergeKVs as faster TrimDuplicates alternative

    TrimDuplicates is unnecessarily complex: it takes and returns a variable number
    of slices, when in practice it is always called with a pair of slices and the
    result is always merged into a single slice. It is also slow and performs a lot
    of allocations.
    
    MergeKVs is a simpler implementation that is good enough for merging the
    WithValues key/value slice with the key/value parameters. It intentionally
    skips some unnecessary error checking (the WithValues key/value slice was
    already sanitized), does not handle individual slices that themselves contain
    duplicates (because that can be avoided through code review and/or static code
    analysis), and does not handle all potential duplicates (in practice, all keys
    are string constants).
    
    Because of this, it is considerably faster:
    
    name                                   old time/op    new time/op    delta
    TrimDuplicates/0x0/no-duplicates-36       177ns ± 3%       2ns ± 0%   -98.76%  (p=0.008 n=5+5)
    ...
    TrimDuplicates/9x9/end-duplicate-36      14.1µs ± 1%     3.0µs ± 1%   -78.39%  (p=0.008 n=5+5)
    
    name                                   old alloc/op   new alloc/op   delta
    TrimDuplicates/0x0/no-duplicates-36       48.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
    ...
    TrimDuplicates/9x9/no-duplicates-36      4.35kB ± 0%    0.90kB ± 0%   -79.42%  (p=0.008 n=5+5)
    TrimDuplicates/9x9/all-duplicates-36     2.03kB ± 0%    0.90kB ± 0%   -55.91%  (p=0.008 n=5+5)
    TrimDuplicates/9x9/start-duplicate-36    4.71kB ± 0%    0.96kB ± 0%   -79.56%  (p=0.008 n=5+5)
    TrimDuplicates/9x9/end-duplicate-36      4.71kB ± 0%    0.96kB ± 0%      ~     (p=0.079 n=4+5)
    
    name                                   old allocs/op  new allocs/op  delta
    TrimDuplicates/0x0/no-duplicates-36        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
    ...
    TrimDuplicates/9x9/end-duplicate-36        39.0 ± 0%       2.0 ± 0%   -94.87%  (p=0.008 n=5+5)
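    A minimal sketch of the kind of merge described above, assuming string keys
    and letting the second slice override the first (an illustrative
    approximation, not klog's actual MergeKVs):

    package serialize

    func mergeKVs(first, second []interface{}) []interface{} {
        // Fast paths: nothing to merge, return the input unchanged.
        if len(first) == 0 {
            return second
        }
        if len(second) == 0 {
            return first
        }
        merged := make([]interface{}, 0, len(first)+len(second))
        for i := 0; i+1 < len(first); i += 2 {
            // Skip keys that the second slice overrides. Only string keys
            // are compared; in practice all keys are string constants.
            if key, ok := first[i].(string); ok && containsKey(second, key) {
                continue
            }
            merged = append(merged, first[i], first[i+1])
        }
        return append(merged, second...)
    }

    func containsKey(kvs []interface{}, key string) bool {
        for i := 0; i+1 < len(kvs); i += 2 {
            if s, ok := kvs[i].(string); ok && s == key {
                return true
            }
        }
        return false
    }

    The early returns for empty inputs would explain the zero allocations in the
    0x0 case above.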
    pohly committed May 10, 2022 · e2432e8
  3. klogr: simplify test failure message

    The mismatch is easier to see when the outputs are printed one above
    the other instead of side-by-side.
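    A hypothetical sketch of that style of failure message (t, expected, and
    actual come from the surrounding test and are assumptions here):

    if actual != expected {
        t.Errorf("output mismatch:\nexpected:\n%s\nactual:\n%s", expected, actual)
    }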
    pohly committed May 10, 2022 · 6f47a32
  4. use faster MergeKVs

    Performance is better. This slightly changes the output of the traditional
    klogr format:
    - all key/value pairs get sorted, not just each slice (an improvement?)
    - a single space is printed where previously two were printed
    
    klog no longer de-duplicates when the same key is passed twice in the same
    call (WithValues, InfoS, etc.). That should not occur in practice and can be
    prevented via code review and/or static code analysis, so we don't need to pay
    the price of checking for it at runtime.
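    A hypothetical example of the pattern that is no longer de-duplicated,
    passing the same key twice within a single call:

    package main

    import "k8s.io/klog/v2"

    func main() {
        defer klog.Flush()
        // The "status" key appears twice in one InfoS call. klog now emits
        // both pairs instead of dropping one; catch this in code review or
        // with static analysis.
        klog.InfoS("pod updated", "status", "Pending", "status", "Running")
    }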
    pohly committed May 10, 2022 · 4c3943c