Benchmark refactor - argparse CLI #533
Conversation
Codecov Report

```diff
@@           Coverage Diff           @@
##             main     #533   +/-   ##
=======================================
  Coverage   91.75%   91.75%
=======================================
  Files           6        6
  Lines        1819     1819
=======================================
  Hits         1669     1669
  Misses        150      150
=======================================
```

Continue to review full report at Codecov.
```console
> python -VV
Python 3.10.1
> python -c 'import cpuinfo'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'cpuinfo'
```

Are you sure it's standard lib?
I suppose I was mistaken. It looks like cpuinfo is coming from torch. It's not important enough to add a dependency. I'll remove that.
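As an aside, if the benchmark ever wanted to report CPU details only when `cpuinfo` happens to be installed, a minimal sketch using the standard library's `importlib.util.find_spec` could look like this (the guard here is illustrative, not part of this PR):

```python
import importlib.util

# find_spec returns None when a top-level module is not installed,
# so the script can skip CPU reporting instead of crashing on import.
spec = importlib.util.find_spec("cpuinfo")
if spec is None:
    print("cpuinfo not available; skipping CPU details")
else:
    # spec.origin reveals which installed package actually provides it
    print(f"cpuinfo provided by: {spec.origin}")
```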
I think a marginally nicer way to eliminate the global `LIBRARIES` variable would be to have a class which takes `libraries` as a parameter and stores it as an attribute, then make all the functions that now take a `libraries` parameter methods of that class instead. But only marginally nicer. Honestly, if I were that allergic to OOP-destroying global variables then I would have abandoned Python altogether in favour of pedantically object oriented, but otherwise underwhelming at every turn, Java 🙃 .
I was considering a class that stored the parameters as an alternative. But as marginally nice as that is, this script isn't run as part of a larger system, so the globals only bother me on an aesthetic level; they don't have any major downside that I can see. When I get around to working on #532 to get nice graphs for the benchmarks (and hopefully t-tests to indicate when there is a high probability of regression), I'll be rewriting most of this, so I figured I'd just keep the changes here minimal.
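For reference, the class-based alternative discussed above could look roughly like this; `BenchmarkRunner` and its method names are hypothetical, not the script's actual API:

```python
class BenchmarkRunner:
    """Sketch: libraries becomes constructor state instead of a global."""

    def __init__(self, libraries):
        # The list of JSON library names the user asked to benchmark.
        self.libraries = libraries

    def run(self):
        # Former module-level functions become methods reading self.libraries,
        # so no global LIBRARIES variable is needed.
        return [f"benchmarking {name}" for name in self.libraries]


runner = BenchmarkRunner(["ujson", "json"])
print(runner.run())  # → ['benchmarking ujson', 'benchmarking json']
```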
Thanks!
Oops, this failed on the CI: https://github.com/ultrajson/ultrajson/runs/6154659419?check_suite_focus=true

Please could you check it? Let's also add this to run the benchmark on PRs changing the benchmark script or workflow file:

```yaml
on:
  push:
    branches:
      - main
  pull_request:
    paths:
      - ".github/workflows/benchmark.yml"
      - "tests/benchmark.py"
```
@bwoodsend This is a pared down version of #532 that is ready for review / merging. There are two main features added here:

1. Uses the standard library's (whoops) `cpuinfo` module to print CPU details when reporting the results of the benchmarks. This will help ensure that users don't compare incomparable data and can serve as a reminder of which benchmarks can be compared against each other.
2. Adds an argparse CLI so the user can specify particular modules to disable. I personally don't care about the other JSON modules that aren't a drop-in replacement for Python's `json` module. The user can also specify a `--factor` as a fraction to speed up the benchmarks for easier hacking.

To add these two features I had to do a little bit of refactoring. I wanted to make it easy to add / remove any new implementation that might come onto the scene, so I registered each benchmark with its associated library name using a decorator, and I pass the list of library names that the user requested to test to the benchmark functions. (I was very tempted to factor out the global variables, but I restrained myself.)
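The decorator-based registration described above might be sketched as follows; the names (`BENCHMARKS`, `benchmark`, `run`) are illustrative stand-ins, not the script's real identifiers:

```python
# Registry mapping each benchmark function to its library name.
BENCHMARKS = []


def benchmark(library):
    """Decorator that registers a benchmark under a library name."""
    def decorator(func):
        BENCHMARKS.append((library, func))
        return func
    return decorator


@benchmark("ujson")
def encode_sample():
    # A real benchmark would time ujson.dumps here.
    return "encoded"


def run(requested):
    # Only run benchmarks whose library the user asked for on the CLI.
    return [(library, func()) for library, func in BENCHMARKS
            if library in requested]


print(run(["ujson"]))  # → [('ujson', 'encoded')]
```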
Speaking of global variables, I made it so that the external libraries aren't imported until the script knows they are needed. I use `importlib` to implement this as a loop, and then I set those modules as global variables so the rest of the benchmark structure can work without modification. Lastly, I added a note about what units the benchmarks are in, because I always confuse myself with that.
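The lazy-import loop described here could look something like this minimal sketch; `import_libraries` is a hypothetical name, and only the stdlib `json` module is imported so the example stays self-contained:

```python
import importlib


def import_libraries(names):
    # Import each requested library on demand and expose it as a
    # module-level global so existing benchmark code works unchanged.
    for name in names:
        globals()[name] = importlib.import_module(name)


import_libraries(["json"])
# json is now available as a global in this module without a top-level import.
print(json.dumps({"works": True}))  # → {"works": true}
```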