Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zstd: asm decode - use SSE instruction #544

Merged
merged 1 commit into from Mar 24, 2022

Conversation

WojciechMula
Copy link
Contributor

Part of task #515.

Benchmarks from an Ice Lake machine. It's a bit better.

Benchmark_seqdec_decodeNoBMI/n-12286-lits-13914-prev-9869-1990358-3296656-win-4194304.blk-16        128501        125128        -2.62%
Benchmark_seqdec_decodeNoBMI/n-12485-lits-6960-prev-976039-2250252-2463561-win-4194304.blk-16       134702        130647        -3.01%
Benchmark_seqdec_decodeNoBMI/n-14746-lits-14461-prev-209-8-1379909-win-4194304.blk-16               120971        119333        -1.35%
Benchmark_seqdec_decodeNoBMI/n-1525-lits-1498-prev-2009476-797934-2994405-win-4194304.blk-16        12804         12794         -0.08%
Benchmark_seqdec_decodeNoBMI/n-3478-lits-3628-prev-895243-2104056-2119329-win-4194304.blk-16        29078         28750         -1.13%
Benchmark_seqdec_decodeNoBMI/n-8422-lits-5840-prev-168095-2298675-433830-win-4194304.blk-16         83494         83368         -0.15%
Benchmark_seqdec_decodeNoBMI/n-1000-lits-1057-prev-21887-92-217-win-8388608.blk-16                  8079          7952          -1.57%
Benchmark_seqdec_decodeNoBMI/n-15134-lits-20798-prev-4882976-4884216-4474622-win-8388608.blk-16     164592        162022        -1.56%
Benchmark_seqdec_decodeNoBMI/n-2-lits-0-prev-620601-689171-848-win-8388608.blk-16                   52.4          52.7          +0.63%
Benchmark_seqdec_decodeNoBMI/n-90-lits-67-prev-19498-23-19710-win-8388608.blk-16                    791           789           -0.21%
Benchmark_seqdec_decodeNoBMI/n-931-lits-1179-prev-36502-1526-1518-win-8388608.blk-16                7793          7782          -0.14%
Benchmark_seqdec_decodeNoBMI/n-2898-lits-4062-prev-335-386-751-win-8388608.blk-16                   23069         22699         -1.60%
Benchmark_seqdec_decodeNoBMI/n-4056-lits-12419-prev-10792-66-309849-win-8388608.blk-16              32626         32118         -1.56%
Benchmark_seqdec_decodeNoBMI/n-8028-lits-4568-prev-917-65-920-win-8388608.blk-16                    75567         75241         -0.43%
Benchmark_seqdec_decode/n-12286-lits-13914-prev-9869-1990358-3296656-win-4194304.blk-16             111331        108619        -2.44%
Benchmark_seqdec_decode/n-12485-lits-6960-prev-976039-2250252-2463561-win-4194304.blk-16            114481        111280        -2.80%
Benchmark_seqdec_decode/n-14746-lits-14461-prev-209-8-1379909-win-4194304.blk-16                    108650        105387        -3.00%
Benchmark_seqdec_decode/n-1525-lits-1498-prev-2009476-797934-2994405-win-4194304.blk-16             11454         11091         -3.17%
Benchmark_seqdec_decode/n-3478-lits-3628-prev-895243-2104056-2119329-win-4194304.blk-16             26245         25501         -2.83%
Benchmark_seqdec_decode/n-8422-lits-5840-prev-168095-2298675-433830-win-4194304.blk-16              71110         69874         -1.74%
Benchmark_seqdec_decode/n-1000-lits-1057-prev-21887-92-217-win-8388608.blk-16                       7320          7221          -1.35%
Benchmark_seqdec_decode/n-15134-lits-20798-prev-4882976-4884216-4474622-win-8388608.blk-16          131987        130904        -0.82%
Benchmark_seqdec_decode/n-2-lits-0-prev-620601-689171-848-win-8388608.blk-16                        49.5          49.8          +0.63%
Benchmark_seqdec_decode/n-90-lits-67-prev-19498-23-19710-win-8388608.blk-16                         701           696           -0.74%
Benchmark_seqdec_decode/n-931-lits-1179-prev-36502-1526-1518-win-8388608.blk-16                     6886          6828          -0.84%
Benchmark_seqdec_decode/n-2898-lits-4062-prev-335-386-751-win-8388608.blk-16                        21162         20897         -1.25%
Benchmark_seqdec_decode/n-4056-lits-12419-prev-10792-66-309849-win-8388608.blk-16                   29856         29612         -0.82%
Benchmark_seqdec_decode/n-8028-lits-4568-prev-917-65-920-win-8388608.blk-16                         64311         63216         -1.70%

Benchmarks from an Ice Lake machine:

Benchmark_seqdec_decodeNoBMI/n-12286-lits-13914-prev-9869-1990358-3296656-win-4194304.blk-16        128501        125128        -2.62%
Benchmark_seqdec_decodeNoBMI/n-12485-lits-6960-prev-976039-2250252-2463561-win-4194304.blk-16       134702        130647        -3.01%
Benchmark_seqdec_decodeNoBMI/n-14746-lits-14461-prev-209-8-1379909-win-4194304.blk-16               120971        119333        -1.35%
Benchmark_seqdec_decodeNoBMI/n-1525-lits-1498-prev-2009476-797934-2994405-win-4194304.blk-16        12804         12794         -0.08%
Benchmark_seqdec_decodeNoBMI/n-3478-lits-3628-prev-895243-2104056-2119329-win-4194304.blk-16        29078         28750         -1.13%
Benchmark_seqdec_decodeNoBMI/n-8422-lits-5840-prev-168095-2298675-433830-win-4194304.blk-16         83494         83368         -0.15%
Benchmark_seqdec_decodeNoBMI/n-1000-lits-1057-prev-21887-92-217-win-8388608.blk-16                  8079          7952          -1.57%
Benchmark_seqdec_decodeNoBMI/n-15134-lits-20798-prev-4882976-4884216-4474622-win-8388608.blk-16     164592        162022        -1.56%
Benchmark_seqdec_decodeNoBMI/n-2-lits-0-prev-620601-689171-848-win-8388608.blk-16                   52.4          52.7          +0.63%
Benchmark_seqdec_decodeNoBMI/n-90-lits-67-prev-19498-23-19710-win-8388608.blk-16                    791           789           -0.21%
Benchmark_seqdec_decodeNoBMI/n-931-lits-1179-prev-36502-1526-1518-win-8388608.blk-16                7793          7782          -0.14%
Benchmark_seqdec_decodeNoBMI/n-2898-lits-4062-prev-335-386-751-win-8388608.blk-16                   23069         22699         -1.60%
Benchmark_seqdec_decodeNoBMI/n-4056-lits-12419-prev-10792-66-309849-win-8388608.blk-16              32626         32118         -1.56%
Benchmark_seqdec_decodeNoBMI/n-8028-lits-4568-prev-917-65-920-win-8388608.blk-16                    75567         75241         -0.43%
Benchmark_seqdec_decode/n-12286-lits-13914-prev-9869-1990358-3296656-win-4194304.blk-16             111331        108619        -2.44%
Benchmark_seqdec_decode/n-12485-lits-6960-prev-976039-2250252-2463561-win-4194304.blk-16            114481        111280        -2.80%
Benchmark_seqdec_decode/n-14746-lits-14461-prev-209-8-1379909-win-4194304.blk-16                    108650        105387        -3.00%
Benchmark_seqdec_decode/n-1525-lits-1498-prev-2009476-797934-2994405-win-4194304.blk-16             11454         11091         -3.17%
Benchmark_seqdec_decode/n-3478-lits-3628-prev-895243-2104056-2119329-win-4194304.blk-16             26245         25501         -2.83%
Benchmark_seqdec_decode/n-8422-lits-5840-prev-168095-2298675-433830-win-4194304.blk-16              71110         69874         -1.74%
Benchmark_seqdec_decode/n-1000-lits-1057-prev-21887-92-217-win-8388608.blk-16                       7320          7221          -1.35%
Benchmark_seqdec_decode/n-15134-lits-20798-prev-4882976-4884216-4474622-win-8388608.blk-16          131987        130904        -0.82%
Benchmark_seqdec_decode/n-2-lits-0-prev-620601-689171-848-win-8388608.blk-16                        49.5          49.8          +0.63%
Benchmark_seqdec_decode/n-90-lits-67-prev-19498-23-19710-win-8388608.blk-16                         701           696           -0.74%
Benchmark_seqdec_decode/n-931-lits-1179-prev-36502-1526-1518-win-8388608.blk-16                     6886          6828          -0.84%
Benchmark_seqdec_decode/n-2898-lits-4062-prev-335-386-751-win-8388608.blk-16                        21162         20897         -1.25%
Benchmark_seqdec_decode/n-4056-lits-12419-prev-10792-66-309849-win-8388608.blk-16                   29856         29612         -0.82%
Benchmark_seqdec_decode/n-8028-lits-4568-prev-917-65-920-win-8388608.blk-16                         64311         63216         -1.70%
Copy link
Owner

@klauspost klauspost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything counts :)

@klauspost klauspost merged commit 0309165 into klauspost:master Mar 24, 2022
@WojciechMula WojciechMula deleted the asm-decode-use-sse-move branch April 1, 2022 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants