io_buffer.rb, request.rb - improve handling with a body array/enumeration #2696

Closed
wants to merge 1 commit into from

Conversation

@MSP-Greg MSP-Greg commented Sep 11, 2021

Description

Improves the speed of assembling the response. Tested using the script in PR #2695, running three sets with rackup files that generate an array body response, a chunked body response, and a single-string-element body response. All three bodies were approximately 50 kB; the first two had about 50 'elements'.
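
For readers unfamiliar with the three body types, here is a minimal sketch in plain Ruby. It is not the content of the rackup files used in the tests; the constant names, element size, and the `body_bytesize` helper are hypothetical and chosen only to match the sizes described above (about 50 kB, about 50 elements).

```ruby
# Hypothetical stand-ins for the three rackup files. Names and sizes are
# assumptions, not taken from the repository.
ELEMENT = ("x" * 1023 + "\n").freeze # one 1 kB element
COUNT   = 50

# Array body: Content-Length is known, body is an Array of Strings.
ARRAY_APP = lambda do |_env|
  body = Array.new(COUNT) { ELEMENT }
  headers = { "Content-Type" => "text/plain",
              "Content-Length" => body.sum(&:bytesize).to_s }
  [200, headers, body]
end

# Chunked body: no Content-Length and the body only responds to #each,
# so a server would typically fall back to chunked transfer encoding.
CHUNKED_APP = lambda do |_env|
  body = Enumerator.new { |y| COUNT.times { y << ELEMENT } }
  [200, { "Content-Type" => "text/plain" }, body]
end

# String body: a single-element Array holding the whole payload.
STRING_APP = lambda do |_env|
  payload = ELEMENT * COUNT
  headers = { "Content-Type" => "text/plain",
              "Content-Length" => payload.bytesize.to_s }
  [200, headers, [payload]]
end

# Total number of bytes a server has to assemble for a given body.
def body_bytesize(body)
  total = 0
  body.each { |part| total += part.bytesize }
  total
end
```

All three apps produce the same number of bytes; only the number of parts the server has to iterate over differs, which is where the array and chunked cases diverge from the string case.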

Note that the current code has a small memory leak with the array and chunked rackup files, which does not occur with the PR code.

Summarizing requests per second (RPS), the improvement is significant, especially for array and chunked bodies:

        Array   Chunked   String
PR      12210    11017     12207
Master   2717     1312     10507
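
The speedup implied by the summary table can be checked with a few lines of Ruby. The RPS figures are copied from the table; the `RPS` and `SPEEDUPS` names are only for illustration.

```ruby
# RPS averages copied from the summary table above.
RPS = {
  "Array"   => { pr: 12_210, master: 2_717 },
  "Chunked" => { pr: 11_017, master: 1_312 },
  "String"  => { pr: 12_207, master: 10_507 }
}.freeze

# Ratio of PR throughput to Master throughput, rounded to two decimals.
SPEEDUPS = RPS.transform_values { |v| (v[:pr].to_f / v[:master]).round(2) }

SPEEDUPS.each { |type, ratio| puts format("%-8s %5.2fx", type, ratio) }
# Array     4.49x
# Chunked   8.40x
# String    1.16x
```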

The detail below also shows improvement in wrk's 'Request time distribution'.

wrk overload results
benchmarks/local/bench_overload_wrk.sh -s tcp -w2 -t5:5 -b50 -c5 -R test/rackup/ci_array.ru

PR
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50    12119    1.2   21.1   33.1   41.0   51.1   0.09   183071    183071     0
 13    65    12187    1.2   29.2   45.1   55.1   64.0   0.19   184105    184105     0
 17    85    12205    1.3   39.7   61.0   73.8   81.6   0.16   184399    184399     0
 23   115    12286    1.5   55.6   84.8  102.3  111.9   0.15   185727    185727     0
 30   150    12253    1.8   74.1  112.7  135.7  153.9   0.26   185162    185162     0
             12210                                    Totals   922464    922464
══════════════════════════════════════════════════════════════════════════════════════

Master
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50     2692    5.6   81.8  131.6  157.2  173.5   1.03    40681     40681     0
 13    65     2726    5.7  114.1  179.0  212.3  229.0   1.92    41190     41190     0
 17    85     2719    6.0  156.5  244.7  286.1  298.4   1.39    41104     41104     0
 23   115     2732    6.9  221.7  342.3  396.4  414.0   2.29    41310     41310     0
 30   150     2719    7.0  298.8  458.7  528.1  552.1   1.15    41252     41252     0
              2717                                    Totals   205537    205537
══════════════════════════════════════════════════════════════════════════════════════


benchmarks/local/bench_overload_wrk.sh -s tcp -w2 -t5:5 -b50 -c5 -R test/rackup/ci_chunked.ru

PR
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50    10978    1.3   23.3   36.6   45.3   53.9   0.14   165852    165852     0
 13    65    11025    1.4   32.3   49.9   60.9   79.8   0.07   166580    166580     0
 17    85    10988    1.5   44.1   67.8   82.1   91.5   0.17   166002    166002     0
 23   115    11050    1.6   61.6   94.1  113.4  129.2   0.16   166977    166977     0
 30   150    11044    1.8   81.8  124.3  149.9  161.6   0.24   166912    166912     0
             11017                                    Totals   832323    832323
══════════════════════════════════════════════════════════════════════════════════════

Master
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50     1298     10    167    272    324    352   1.31    19631     19631     0
 13    65     1298     11    237    373    444    469   1.14    19654     19654     0
 17    85     1294     11    327    512    599    625   1.68    19593     19593     0
 23   115     1321     12    455    705    822    847   1.44    20006     20006     0
 30   150     1350     12    602    880   1080   1120   2.23    20482     20482     0
              1312                                    Totals    99366     99366
══════════════════════════════════════════════════════════════════════════════════════


benchmarks/local/bench_overload_wrk.sh -s tcp -w2 -t5:5 -b50 -c5 (uses ci_string.ru)

PR
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50    12131    1.2   21.1   33.2   41.1   50.1   0.02   183259    183259     0
 13    65    12195    1.2   29.2   45.2   55.2   62.9   0.22   184263    184263     0
 17    85    12188    1.4   39.9   61.2   74.3   84.2   0.17   184156    184156     0
 23   115    12272    1.5   55.7   85.0  102.5  118.6   0.23   185429    185429     0
 30   150    12249    1.8   74.2  113.0  135.9  149.5   0.10   185193    185193     0
             12207                                    Totals   922300    922300
══════════════════════════════════════════════════════════════════════════════════════

Master
────────wrk────────  ─Request─time─distribution─(ms)─  Worker─requests  ─wrk─requests─
 -t    -c   req/sec   50%    75%    90%    99%   100%  spread   total     total   bad
 10    50    10490    1.4   24.5   38.6   48.0   57.7   0.42   158487    158487     0
 13    65    10535    1.5   34.2   52.9   64.9   77.7   0.13   159147    159147     0
 17    85    10509    1.6   46.7   71.8   87.3   96.7   0.20   158796    158796     0
 23   115    10489    1.8   65.4   99.7  120.6  131.6   0.09   158505    158505     0
 30   150    10512    2.3   87.4  132.8  160.2  173.0   0.48   158869    158869     0
             10507                                    Totals   793804    793804
══════════════════════════════════════════════════════════════════════════════════════

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@MSP-Greg
Member Author

PR #2595 (08-Apr) included code to test performance with three types of response bodies: array, chunked, and string. Quite a few code optimizations later, I've run tests on the code here.

The code was run with -w2 -t5:5, using an abstract UNIXSocket, which isn't available with wrk. The speed increase varies, but is significant with larger array or chunked bodies.

Note that the code uses files from both #2694 and #2695, so I'll wait for those...

Speed comparison PR vs Master

This PR:

Request / Response time distribution in ms
10000 requests - 10 loops of 100 clients * 10 requests per client

───────────────────────────────────────────────────────────────────────────────────    1kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   9823  0.300  0.330  0.386  0.425  0.551  0.852  0.994  1.209  1.356  1.742
 chunk   9902  0.293  0.325  0.388  0.458  0.629  0.865  0.995  1.187  1.320  1.618
string   9909  0.294  0.326  0.384  0.434  0.582  0.862  1.007  1.226  1.364  1.678

───────────────────────────────────────────────────────────────────────────────────   10kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   9650  0.314  0.346  0.399  0.434  0.509  0.851  1.011  1.238  1.381  1.716
 chunk   9559  0.310  0.345  0.408  0.465  0.619  0.886  1.022  1.222  1.362  1.674
string   9696  0.307  0.341  0.400  0.443  0.556  0.866  1.011  1.216  1.368  1.664

───────────────────────────────────────────────────────────────────────────────────  100kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   6185  0.830  0.890  0.980  1.020  1.070  1.330  1.570  1.840  2.020  2.800
 chunk   5754  0.920  0.990  1.070  1.110  1.180  1.520  1.720  1.960  2.110  2.460
string   6254  0.830  0.900  0.980  1.020  1.070  1.390  1.600  1.860  2.020  2.490

300 requests - 10 loops of 15 clients * 2 requests per client
─────────────────────────────────────────────────────────────────────────────────── 2050kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array     98   87.3   89.9   92.9   94.3   96.2  100.8  107.9  123.8  131.8  142.4
 chunk     98   86.6   89.9   94.7   96.6   98.2  103.2  107.1  112.5  117.9  222.8
string    111   77.5   81.0   84.3   85.6   87.4   90.7   94.4   96.7   98.0  102.9

Master:

Request / Response time distribution in ms
10000 requests - 10 loops of 100 clients * 10 requests per client

───────────────────────────────────────────────────────────────────────────────────    1kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   8722  0.306  0.349  0.571  0.726  0.816  0.973  1.185  1.391  1.554  1.941
 chunk   7284  0.327  0.494  0.810  0.898  1.074  1.384  1.639  1.852  1.965  2.243
string   9427  0.302  0.337  0.413  0.541  0.712  0.902  1.045  1.251  1.395  1.643

───────────────────────────────────────────────────────────────────────────────────   10kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   6590  0.340  0.430  0.890  1.090  1.280  1.590  1.840  2.040  2.210  2.570
 chunk   3963  0.724  1.287  1.893  2.149  2.375  2.959  3.416  3.800  4.043  4.604
string   9261  0.314  0.349  0.424  0.539  0.711  0.917  1.067  1.284  1.424  1.707

───────────────────────────────────────────────────────────────────────────────────  100kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array   1744   2.96   3.93   4.96   5.37   5.79   6.70   7.40   8.01   8.41   9.11
 chunk    600  12.40  13.70  15.50  16.20  17.00  18.80  20.10  21.20  21.90  23.30
string   6150   0.84   0.90   0.98   1.02   1.09   1.46   1.64   1.89   2.05   2.44

300 requests - 10 loops of 15 clients * 2 requests per client
─────────────────────────────────────────────────────────────────────────────────── 2050kB Body
       req/sec   10%    20%    40%    50%    60%    80%    90%    95%    97%    99%
 array     74  117.7  121.9  126.1  127.9  130.5  135.7  139.4  142.2  146.7  157.9
 chunk     31  274.4  298.5  311.9  316.8  320.9  330.5  339.5  346.4  350.9  371.5
string    117   74.2   76.8   79.9   81.3   82.9   86.2   89.1   90.8   92.9   94.3
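
As a reading aid for the tables above, this sketch shows how the request totals follow from the loop, client, and request counts, and one way percentile columns like these could be derived from a list of request times. The nearest-rank method and the `percentile` helper are assumptions for illustration; the test script may compute its distribution differently.

```ruby
# Request totals as described in the table headers.
SMALL_BODY_TOTAL = 10 * 100 * 10 # 10 loops of 100 clients * 10 requests
LARGE_BODY_TOTAL = 10 * 15 * 2   # 10 loops of 15 clients * 2 requests

# Nearest-rank percentile over an already sorted Array of timings.
# Assumption: the actual script may use a different definition.
def percentile(sorted, pct)
  return nil if sorted.empty?
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Usage with made-up timings of 1..100 ms, purely to show the call shape.
timings = (1..100).to_a
row = [10, 50, 99].map { |pct| percentile(timings, pct) }
```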

@MSP-Greg
Member Author

MSP-Greg commented Jan 1, 2022

I've been working on all these files, adding comments, renaming methods, fixing some bad initial decisions for parameters, etc.

Closing, almost finished with updates. Happy New Year!

@MSP-Greg MSP-Greg closed this Jan 1, 2022
@MSP-Greg MSP-Greg deleted the 00-response-refactor branch June 5, 2022 14:56
Labels
perf refactor waiting-for-review Waiting on review from anyone