Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configdata.yml performance issues / switch away from PyYAML #2777

Open
The-Compiler opened this issue Jul 4, 2017 · 23 comments
Open

configdata.yml performance issues / switch away from PyYAML #2777

The-Compiler opened this issue Jul 4, 2017 · 23 comments
Labels
component: config Issues related to configuration. component: performance Issues with performance/benchmarking. priority: 1 - middle Issues which should be done at some point, but aren't that important.
Milestone

Comments

@The-Compiler
Copy link
Member

With the current new-config branch, test_init_benchmark in test_configdata.py takes (median):

  • ~20ms with the PyYAML C extension on my machine
  • ~200ms without the C extension on my machine
  • ~20s (?!) on Travis (without C extension, I think)

Even if this is a travis only thing, 200ms+ on each start is quite a hard hit to take. I can think of a couple of solutions:

  • Try if something improves with ruamel.yaml (Switch to ruamel.yaml #2745)
  • Log a warning if C extensions weren't found (where are they available? Where not? Can we install a wheel from PyPI with them?)
  • "Compile" the YAML to a Python file at development time. Since we already need to regenerate the docs anyways, probably not a too big issue, and easiest for users
  • "Compile" the YAML to a Python (or JSON, if loading that is fast?) file at the first start. That needs some code to detect when configdata.yml has changed and stuff though.
  • "Compile" the YAML to a Python file when packaging, similar to asciidoc2html.py. If it's unavailable, load the yml file and show warnings as per above. Kind of difficult to figure out whether data is outdated or not.
@The-Compiler The-Compiler added config revolution component: performance Issues with performance/benchmarking. priority: 1 - middle Issues which should be done at some point, but aren't that important. labels Jul 4, 2017
@The-Compiler The-Compiler added this to the v1.0.0 milestone Jul 4, 2017
@The-Compiler The-Compiler added this to Backlog in Config revolution Jul 6, 2017
The-Compiler added a commit that referenced this issue Oct 4, 2017
@The-Compiler
Copy link
Member Author

This is still quite an issue, and I don't know of a solution as long as we're keeping the YAML file.

However, I think .ini files could actually work quite nicely for this purpose!

[bindings.key_mappings]
default = {
    "<Ctrl-[>": "<Escape>",
    "<Ctrl-6>": "<Ctrl-^>",
    "<Ctrl-M>": "<Return>",
    "<Ctrl-J>": "<Return>",
    "<Shift-Return>": "<Return>",
    "<Enter>": "<Return>",
    "<Shift-Enter>": "<Return>",
    "<Ctrl-Enter>": "<Ctrl-Return>"
  }
type_name = Dict
type_keytype = Key
type_valtype = Key
desc = This setting can be used to map keys to other keys.

    When the key used as dictionary-key is pressed, the binding for the key used
    as dictionary-value is invoked instead.

    This is useful for global remappings of keys, for example to map Ctrl-[ to
    Escape.

    Note that when a key is bound (via `bindings.default` or
    `bindings.commands`), the mapping is ignored.
>>> import configparser
>>> cp = configparser.ConfigParser()
>>> cp.read('configdata.ini')
['configdata.ini']
>>> import json
>>> json.loads(cp['bindings.key_mappings']['default'])
{'<Ctrl-[>': '<Escape>', '<Ctrl-6>': '<Ctrl-^>', '<Ctrl-M>': '<Return>', '<Ctrl-J>': '<Return>', '<Shift-Return>': '<Return>', '<Enter>': '<Return>', '<Shift-Enter>': '<Return>', '<Ctrl-Enter>': '<Ctrl-Return>'}
>>> print(cp['bindings.key_mappings']['desc'])
This setting can be used to map keys to other keys.

When the key used as dictionary-key is pressed, the binding for the key used
as dictionary-value is invoked instead.

This is useful for global remappings of keys, for example to map Ctrl-[ to
Escape.

Note that when a key is bound (via `bindings.default` or
`bindings.commands`), the mapping is ignored.

It requires a bit more parsing on our side (decoding default as JSON, and handling type_* keys), but it might be the least worst solution and still is easy to write.

@The-Compiler
Copy link
Member Author

The-Compiler commented Oct 10, 2017

I still hate the idea of making this an .ini (or .json or .py), I want to dig a bit deeper first.

Looks like the following environments have C extensions:

  • PyPI-installs on Travis
  • Archlinux docker containers (fixed by installing gcc and libyaml)

And the following don't:

  • macOS on Travis
  • AppVeyor

However, they took <3s to load the configdata.yml. It's still a lot, but miles away from the 20s I saw earlier. It looks like they're loading configdata more often than needed, though.

The-Compiler added a commit to qutebrowser/docker-ci that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
The-Compiler added a commit that referenced this issue Oct 11, 2017
@The-Compiler
Copy link
Member Author

The master branch now shows a warning when it take longer than 1s (or 3s on the CI), pointing to this issue. Let's leave it at that for v1.0.0 for now and see if it's a problem in practice.

Also, all environments on Travis now have C extensions for PyYAML available.

@The-Compiler The-Compiler removed this from the v1.0.0 milestone Oct 11, 2017
@The-Compiler
Copy link
Member Author

In #3177 (comment) apparently @lejar has seen this while running the testsuite.

@lejar Do you see this when starting qutebrowser as well, and does it actually take 14+ seconds? What if you start it via ./.tox/py36/bin/python -m qutebrowser --temp-basedir?

@lejar
Copy link
Contributor

lejar commented Oct 30, 2017

@The-Compiler Nope when I run it normally it is pretty snappy (start < 2 sec). Do note though that I've been doing all of this in a vmware image of ubuntu on a windows 10 host, with the vm being stored on a 7200 rpm hdd (though I'd hope that most of what the tests need is already in ram). It has 4 cores and 4gb ram.

I ran the tests again with htop open and by the time the tests get to the yaml errors, it's using about 13.5% ram and 100% of one core. On the windows (host) side I'm monitoring disk IO but it is really minimal.

One thing I did notice is that the reported failure duration increases with every failure:

Failed: Got logging message on logger misc with level WARNING: YAML load took unusually long, please report this at https://github.com/qutebrowser/qutebrowser/issues/2777 duration: 2.535158s
5.506342s
8.392097s
11.435129s
14.344682s
17.371652s
20.29147s

@The-Compiler
Copy link
Member Author

Hm, that is a very interesting observation! Can you please pastebin the full output you get (e.g. via tox -e py36 -- --qute-bdd-webengine 2>&1 | tee tox.log or so)?

@lejar
Copy link
Contributor

lejar commented Oct 31, 2017

I killed it after a couple hours. tox.log

@The-Compiler
Copy link
Member Author

@lejar I've taken a first look at the log but I haven't figured out what's going on yet. I plan to get back to it somewhen this weekend though.

@seelaman
Copy link
Contributor

same issue

Ubuntu 17.10, fresh install with tox

$ git rev-parse HEAD
2ef6e74

tox.log

@mjauvin
Copy link
Contributor

mjauvin commented Jan 26, 2018

I got this warning:

E       Failed: Got logging message on logger misc with level WARNING: YAML load took unusually long, please report this at https://github.com/qutebrowser/qutebrowser/issues/2777
E       duration: 70.800697s
E       PyYAML version: 3.12
E       C extension: False
E       Stack:
E       
E         File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
E           "__main__", mod_spec)
E         File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
E           exec(code, run_globals)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pytest.py", line 73, in <module>
E           raise SystemExit(pytest.main())
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/config.py", line 59, in main
E           return config.hook.pytest_cmdline_main(config=config)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 617, in __call__
E           return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 222, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 216, in <lambda>
E           firstresult=hook.spec_opts.get('firstresult'),
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/callers.py", line 180, in _multicall
E           res = hook_impl.function(*args)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/main.py", line 134, in pytest_cmdline_main
E           return wrap_session(config, _main)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/main.py", line 103, in wrap_session
E           session.exitstatus = doit(config, session) or 0
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/main.py", line 141, in _main
E           config.hook.pytest_runtestloop(session=session)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 617, in __call__
E           return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 222, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 216, in <lambda>
E           firstresult=hook.spec_opts.get('firstresult'),
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/callers.py", line 180, in _multicall
E           res = hook_impl.function(*args)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/main.py", line 164, in pytest_runtestloop
E           item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 617, in __call__
E           return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 222, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 216, in <lambda>
E           firstresult=hook.spec_opts.get('firstresult'),
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/callers.py", line 180, in _multicall
E           res = hook_impl.function(*args)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 62, in pytest_runtest_protocol
E           runtestprotocol(item, nextitem=nextitem)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 70, in runtestprotocol
E           rep = call_and_report(item, "setup", log)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 155, in call_and_report
E           call = call_runtest_hook(item, when, **kwds)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 175, in call_runtest_hook
E           return CallInfo(lambda: ihook(item=item, **kwds), when=when)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 189, in __init__
E           self.result = func()
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 175, in <lambda>
E           return CallInfo(lambda: ihook(item=item, **kwds), when=when)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 617, in __call__
E           return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 222, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 216, in <lambda>
E           firstresult=hook.spec_opts.get('firstresult'),
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/callers.py", line 180, in _multicall
E           res = hook_impl.function(*args)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 100, in pytest_runtest_setup
E           item.session._setupstate.prepare(item)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/runner.py", line 493, in prepare
E           col.setup()
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/python.py", line 1175, in setup
E           fixtures.fillfixtures(self)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 242, in fillfixtures
E           request._fillfixtures()
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 385, in _fillfixtures
E           item.funcargs[argname] = self.getfixturevalue(argname)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 427, in getfixturevalue
E           return self._get_active_fixturedef(argname).cached_result[0]
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 453, in _get_active_fixturedef
E           result = self._getfixturevalue(fixturedef)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 518, in _getfixturevalue
E           val = fixturedef.execute(request=subrequest)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 791, in execute
E           return hook.pytest_fixture_setup(fixturedef=self, request=request)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 617, in __call__
E           return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 222, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/__init__.py", line 216, in <lambda>
E           firstresult=hook.spec_opts.get('firstresult'),
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/pluggy/callers.py", line 180, in _multicall
E           res = hook_impl.function(*args)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 822, in pytest_fixture_setup
E           result = call_fixture_func(fixturefunc, request, kwargs)
E         File "/home/marc/src/qutebrowser/.tox/py36-pyqt59/lib/python3.6/site-packages/_pytest/fixtures.py", line 714, in call_fixture_func
E           res = fixturefunc(**kwargs)
E         File "/home/marc/src/qutebrowser/tests/unit/config/test_config.py", line 38, in configdata_init
E           configdata.init()
E         File "/home/marc/src/qutebrowser/qutebrowser/config/configdata.py", line 246, in init
E           DATA, MIGRATIONS = _read_yaml(utils.read_file('config/configdata.yml'))
E         File "/home/marc/src/qutebrowser/qutebrowser/config/configdata.py", line 193, in _read_yaml
E           data = utils.yaml_load(yaml_data)
E         File "/home/marc/src/qutebrowser/qutebrowser/utils/utils.py", line 902, in yaml_load
E           ''.join(traceback.format_stack())))
E       !

tests/helpers/logfail.py:61: Failed

edited by @The-Compiler to add proper code tags

@The-Compiler
Copy link
Member Author

I think I finally tracked this down. It only affects the testsuite and shouldn't have affected real usage.

The-Compiler added a commit that referenced this issue Feb 10, 2018
@hjek
Copy link

hjek commented Jul 20, 2018

Got a message telling me to report this issue here. Running Qutebrowser 1.3.3 on FreeBSD.

23:57:51 WARNING: YAML load took unusually long, please report this at https://github.com/qutebrowser/qutebrowser/issues/2777                                                                              
duration: 5.25404s                                                                                    PyYAML version: 3.12                                                                                 
C extension: False                                                                                    Stack:                                                                                               
                                                                                                        File "/usr/local/bin/qutebrowser", line 11, in <module>                                            
    load_entry_point('qutebrowser==1.3.3', 'gui_scripts', 'qutebrowser')()                              File "/usr/local/lib/python3.6/site-packages/qutebrowser/qutebrowser.py", line 188, in main        
    return app.run(args)                                                                                File "/usr/local/lib/python3.6/site-packages/qutebrowser/app.py", line 101, in run                 
    configinit.early_init(args)                                                                         File "/usr/local/lib/python3.6/site-packages/qutebrowser/config/configinit.py", line 40, in early_in
it                                                                                                        configdata.init()                                                                                
  File "/usr/local/lib/python3.6/site-packages/qutebrowser/config/configdata.py", line 257, in init       DATA, MIGRATIONS = _read_yaml(utils.read_file('config/configdata.yml'))                          
  File "/usr/local/lib/python3.6/site-packages/qutebrowser/config/configdata.py", line 200, in _read_yaml                                                                                                  
    data = utils.yaml_load(yaml_data)                                                                   File "/usr/local/lib/python3.6/site-packages/qutebrowser/utils/utils.py", line 657, in yaml_load   
    ''.join(traceback.format_stack()))) 

@The-Compiler
Copy link
Member Author

Reopening as this is still an issue for some people. Probably time to switch to TOML or JSON. Maybe we might want to get rid of PyYAML entirely seeing that it repeatedly had issues with its maintainance so far.

@The-Compiler The-Compiler reopened this Aug 17, 2018
@The-Compiler The-Compiler changed the title configdata.yml performance issues configdata.yml performance issues / switch away from PyYAML Aug 17, 2018
@jgkamat
Copy link
Member

jgkamat commented Jan 30, 2019

I would vote for json as well, if we pass some flags before dumping it to a file, it should be as easy to read in plaintext as well. With it being in core, we can count on it being fast everywhere (I hope...).

@hjek way too late for a response, but if you change build options to build with --with-libyaml, the problem should go away for you. I'm guessing that's PORT_OPTIONS:MLIBYAML.

@The-Compiler
Copy link
Member Author

The-Compiler commented Feb 17, 2019

Dumping some more thoughts here.

@jgkamat
Copy link
Member

jgkamat commented Feb 17, 2019 via email

@The-Compiler
Copy link
Member Author

Maybe we should try profiling both of them against the session/configdata.yml
files and see which one performs better, since they are both noticeable in
overall qutebrowser profiles.

I quickly talked to @jgkamat in IRC about this - I agree for configdata.yml performance is quite important (as it's rather big), but for the sessions, that's just the rather broken autosave. With #4545/#4087 and #3918 that shouldn't really matter anymore.

@zabbal
Copy link

zabbal commented Jul 10, 2020

TOML might be nice, not sure about popularity

It's used for systemd units so it's almost impossible to find GNU/Linux machine nowadays without some TOML on it. Given that python seems to be converging on pyproject.toml it'll become even more commonly used in the near future.

@The-Compiler
Copy link
Member Author

@zabbal systemd uses something ini-like which is not TOML. Also, I was mainly talking about the toml Python library above.

@The-Compiler
Copy link
Member Author

While I still think this important, I don't think it should block v2.0.0. We'll need PyYAML as a dependency for some while even after #5359 is done, so this can be done at a later point as well (with then fully dropping PyYAML in v3.0.0).

@The-Compiler The-Compiler modified the milestones: v2.0.0, v3.0.0 Jan 14, 2021
@The-Compiler The-Compiler added the component: config Issues related to configuration. label Apr 24, 2021
@gadefox
Copy link

gadefox commented Aug 22, 2021

Any idea? Debian 11, the libyaml package is installed..

WARNING: YAML load took unusually long, please report this at #2777
duration: 8.551703s
PyYAML version: 5.4.1
C extension: False
Stack:
File "/home/ren/.local/bin/qutebrowser", line 8, in sys.exit(main())
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/qutebrowser.py", line 220, in main return app.run(args)
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/app.py", line 91, in run configinit.early_init(args)
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/config/configinit.py", line 42, in early_init configdata.init()
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/config/configdata.py", line 275, in init DATA, MIGRATIONS = _read_yaml(resources.read_file('config/configdata.yml'))
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/config/configdata.py", line 217, in _read_yaml data = utils.yaml_load(yaml_data)
File "/home/ren/.local/lib/python3.9/site-packages/qutebrowser/utils/utils.py", line 692, in yaml_load ''.join(traceback.format_stack())))

@toofar
Copy link
Member

toofar commented Aug 23, 2021

You might need libyaml-dev installed (and then re-install PyYAML in the virtualenv) for it to build the C extensions

@gadefox
Copy link

gadefox commented Aug 23, 2021

Thanks.. I used the old fashion way (python3 setup.py build), because pip downloads the latest version (5.4.1) from PyPi.org and the tarball doesn't contain source code at all (so what?!). Download from pyyaml.org instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: config Issues related to configuration. component: performance Issues with performance/benchmarking. priority: 1 - middle Issues which should be done at some point, but aren't that important.
Projects
No open projects
Development

No branches or pull requests

9 participants