zstd: Write table clearing in a way that the compiler recognizes #702

Merged (2 commits) on Nov 30, 2022

Commits on Nov 29, 2022

  1. zstd: Write table clearing in a way that the compiler recognizes

    Benchmark results on amd64 below. These do not take into account klauspost#701.
    They're for Go 1.19; Go 1.20 produces slightly better asm for the old
    code, but still produces pretty bad asm on 32-bit platforms.
    
    See also golang/go#56954.
    
    name                                 old speed      new speed       delta
    Encoder_EncodeAllXML-8                283MB/s ± 1%    284MB/s ± 0%     ~     (p=0.026 n=30+20)
    Encoder_EncodeAllSimple/fastest-8     111MB/s ± 0%    111MB/s ± 1%     ~     (p=0.011 n=28+20)
    Encoder_EncodeAllSimple/default-8    78.4MB/s ± 1%   78.3MB/s ± 1%     ~     (p=0.572 n=30+19)
    Encoder_EncodeAllSimple/better-8     65.9MB/s ± 1%   66.2MB/s ± 1%   +0.53%  (p=0.009 n=30+20)
    Encoder_EncodeAllSimple/best-8       11.1MB/s ± 1%   11.6MB/s ± 3%   +4.42%  (p=0.000 n=27+28)
    Encoder_EncodeAllSimple4K/fastest-8   911MB/s ± 1%    914MB/s ± 1%   +0.31%  (p=0.004 n=29+20)
    Encoder_EncodeAllSimple4K/default-8  73.1MB/s ± 1%   73.6MB/s ± 1%   +0.67%  (p=0.000 n=29+20)
    Encoder_EncodeAllSimple4K/better-8   60.5MB/s ± 1%   62.7MB/s ± 1%   +3.64%  (p=0.000 n=29+17)
    Encoder_EncodeAllSimple4K/best-8     8.62MB/s ± 3%  10.11MB/s ± 1%  +17.24%  (p=0.000 n=30+27)
    Encoder_EncodeAllHTML-8               133MB/s ± 1%    133MB/s ± 1%     ~     (p=0.101 n=30+19)
    Encoder_EncodeAllTwain-8             84.8MB/s ± 1%   86.2MB/s ± 3%   +1.63%  (p=0.000 n=24+20)
    Encoder_EncodeAllPi-8                62.6MB/s ± 1%   62.7MB/s ± 0%     ~     (p=0.102 n=30+20)
    Random4KEncodeAllFastest-8           2.50GB/s ± 1%   2.50GB/s ± 1%     ~     (p=0.449 n=29+20)
    Random10MBEncodeAllFastest-8         2.39GB/s ± 2%   2.52GB/s ± 6%   +5.23%  (p=0.000 n=27+20)
    
    name                                 old alloc/op   new alloc/op    delta
    Encoder_EncodeAllXML-8                  0.00B           0.00B          ~     (all equal)
    Encoder_EncodeAllSimple/fastest-8       2.73B ±27%      3.00B ± 0%     ~     (p=0.018 n=30+18)
    Encoder_EncodeAllSimple/default-8       4.00B ± 0%      4.00B ± 0%     ~     (all equal)
    Encoder_EncodeAllSimple/better-8        5.00B ± 0%      5.00B ± 0%     ~     (all equal)
    Encoder_EncodeAllSimple/best-8          19.5B ± 3%      19.0B ± 0%   -2.40%  (p=0.000 n=30+24)
    Encoder_EncodeAllSimple4K/fastest-8     0.00B           0.00B          ~     (all equal)
    Encoder_EncodeAllSimple4K/default-8     0.00B           0.00B          ~     (all equal)
    Encoder_EncodeAllSimple4K/better-8      0.00B           0.00B          ~     (all equal)
    Encoder_EncodeAllSimple4K/best-8        2.00B ± 0%      1.43B ±40%  -28.33%  (p=0.000 n=30+30)
    Encoder_EncodeAllHTML-8                 2.37B ±27%      2.25B ±33%     ~     (p=0.398 n=30+20)
    Encoder_EncodeAllTwain-8                0.00B           0.00B          ~     (all equal)
    Encoder_EncodeAllPi-8                   12.4B ± 5%      12.2B ± 6%     ~     (p=0.283 n=30+20)
    Random4KEncodeAllFastest-8              0.00B           0.00B          ~     (all equal)
    Random10MBEncodeAllFastest-8           31.9kB ± 2%     30.5kB ± 9%   -4.27%  (p=0.002 n=28+20)
    greatroar committed Nov 29, 2022
    9f90c56
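
The commit message above says the table clearing was rewritten so the compiler recognizes it, but the diff itself is not shown here. The following is only a minimal sketch of two clearing forms that are commonly used in Go, not the code from this PR. `tableEntry`, `tableSize`, `encoder`, `clearLoop`, `clearAssign`, and `allZero` are hypothetical names chosen for illustration. Whether a given form is lowered to a bulk memory clear depends on the Go version and target platform, which can be checked by inspecting the output of `go build -gcflags=-S`.

```go
package main

import "fmt"

// tableEntry and tableSize are hypothetical stand-ins for an encoder's
// hash-table types; they are not copied from the zstd package.
type tableEntry struct {
	val    uint32
	offset int32
}

const tableSize = 1 << 10

type encoder struct {
	table [tableSize]tableEntry
}

// clearLoop zeroes the table by ranging over a slice of it and assigning
// the zero value to each element.
func (e *encoder) clearLoop() {
	t := e.table[:]
	for i := range t {
		t[i] = tableEntry{}
	}
}

// clearAssign zeroes the table by assigning the array type's zero value.
func (e *encoder) clearAssign() {
	e.table = [tableSize]tableEntry{}
}

// allZero reports whether every entry equals the zero tableEntry.
func allZero(e *encoder) bool {
	for _, v := range e.table {
		if v != (tableEntry{}) {
			return false
		}
	}
	return true
}

func main() {
	e := &encoder{}

	e.table[7] = tableEntry{val: 42, offset: 3}
	e.clearLoop()
	fmt.Println(allZero(e))

	e.table[7] = tableEntry{val: 42, offset: 3}
	e.clearAssign()
	fmt.Println(allZero(e))
}
```

Both functions leave the table in the same state, so the choice between them is purely about the machine code the compiler emits, which is the kind of difference the benchmark table measures.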

Commits on Nov 30, 2022

  1. 9583c33