SQLite based local cache #10326

Closed
dkarlovi opened this issue Dec 2, 2021 · 4 comments · Fixed by #10336

Comments


dkarlovi commented Dec 2, 2021

Testing the completion feature from #10320, the one thing that stands out is completion for available packages being rather slow, at times even seeming like it's not working (a second or two pass between the interaction and the response).

This is somewhat expected since it's based on the current search feature. Testing it, I typically get responses in about 800 ms, going up to 1500 ms. It works, but it's not great. This might improve if the search gets a "vendor only" mode as discussed in #10325, but we'll still need to use the current name-only search once the vendor is established.

I'm wondering if it would be possible to have some sort of SQLite-based, actionable local metadata index for simpler operations, like DNF and APT have? It would be an optional extension, enabled and used only if ext-sqlite is found.

It would work sort of like:

  • pull the basic information (like vendors and packages); this would be used for a while, and occasionally (say, 1h TTL) a diff would be fetched from Packagist
  • this could be enough info to power most of the general stuff: vendor and name-only searches, maybe other things too
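
A minimal sketch of the proposed index, written in Python for brevity rather than Composer's PHP. The table layout and the helper names (open_index, refresh, is_stale, complete) are hypothetical, and refresh replaces the whole list instead of applying a diff, which is a simplification of the proposal. Prefix completion is done as a range scan on the primary key, so SQLite can use its index.

```python
import sqlite3
import time

TTL_SECONDS = 3600  # the "say, 1h TTL" from the proposal


def open_index(path=":memory:"):
    """Open (or create) the hypothetical local index."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS packages (name TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value REAL)")
    return db


def refresh(db, names, now=None):
    """Replace the stored names and record when they were fetched."""
    now = time.time() if now is None else now
    with db:  # one transaction for the whole refresh
        db.execute("DELETE FROM packages")
        db.executemany(
            "INSERT OR IGNORE INTO packages (name) VALUES (?)",
            [(name,) for name in names],
        )
        db.execute(
            "INSERT OR REPLACE INTO meta (key, value) VALUES ('fetched_at', ?)",
            (now,),
        )


def is_stale(db, now=None):
    """True if nothing was fetched yet or the TTL has expired."""
    now = time.time() if now is None else now
    row = db.execute("SELECT value FROM meta WHERE key = 'fetched_at'").fetchone()
    return row is None or now - row[0] > TTL_SECONDS


def complete(db, prefix):
    """Return names starting with prefix, using a range scan on the key."""
    upper = prefix + "\uffff"  # sorts after any ASCII continuation of prefix
    rows = db.execute(
        "SELECT name FROM packages WHERE name >= ? AND name < ? ORDER BY name",
        (prefix, upper),
    )
    return [row[0] for row in rows]
```

A caller would check is_stale before each completion, call refresh with a freshly downloaded list when it returns true, and otherwise answer from complete without touching the network.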

The alternative is, of course, to improve the search speed on Packagist itself, but I'd expect that isn't exactly trivial. Interestingly, name-only search is 50% slower for me than regular search; I guess the latter is powered by Algolia?

Contributor

GromNaN commented Dec 3, 2021

I benched the command composer search --only-name ^composer/ -vvv.

Full output
Running 2.2-dev+f5ffedfe60b5b0043c368b91e656288517aad0d9 (2021-11-30 13:33:38) with PHP 8.0.12 on Darwin / 20.2.0
Loading config file /root/.composer/config.json
Loading config file /root/.composer/auth.json
Reading /root/.composer/composer.json
Loading config file /root/.composer/config.json
Loading config file /root/.composer/auth.json
Loading config file /root/.composer/composer.json (/root/.composer/composer.json)
Loading config file /root/.composer/auth.json
Reading /root/.composer/auth.json
Checked CA file /usr/local/etc/ca-certificates/cert.pem: valid
Executing command (/root/.composer): git branch -a --no-color --no-abbrev -v
Executing command (/root/.composer): git describe --exact-match --tags
Executing command (CWD): git --version
Executing command (/root/.composer): git log --pretty="%H" -n1 HEAD --no-show-signature
Executing command (/root/.composer): hg branch
Executing command (/root/.composer): fossil branch list
Executing command (/root/.composer): fossil tag list
Executing command (/root/.composer): svn info --xml
Reading /root/.composer/vendor/composer/installed.json
Loading plugin PackageVersions\Installer (from composer/package-versions-deprecated)
Loading config file /root/.composer/config.json
Loading config file /root/.composer/auth.json
Executing command (/root): git branch -a --no-color --no-abbrev -v
Executing command (/root): git describe --exact-match --tags
Executing command (/root): git log --pretty="%H" -n1 HEAD --no-show-signature
Executing command (/root): hg branch
Executing command (/root): fossil branch list
Executing command (/root): fossil tag list
Executing command (/root): svn info --xml
Reading /root/.composer/composer.json
Loading config file /root/.composer/config.json
Loading config file /root/.composer/auth.json
Loading config file /root/.composer/composer.json (/root/.composer/composer.json)
Loading config file /root/.composer/auth.json
Reading /root/.composer/auth.json
Reading /root/.composer/vendor/composer/installed.json
Loading plugin PackageVersions\Installer_composer_tmp0 (from composer/package-versions-deprecated, installed globally)
Downloading https://repo.packagist.org/packages.json
[200] https://repo.packagist.org/packages.json
Writing /root/.composer/cache/repo/https---repo.packagist.org/packages.json into cache
Downloading https://packagist.org/packages/list.json
[200] https://packagist.org/packages/list.json
composer/ca-bundle
composer/composer
composer/installers
composer/metadata-minifier
composer/package-versions-deprecated
composer/pcre
composer/satis
composer/semver
composer/spdx-licenses
composer/xdebug-handler

Blackfire trace: https://blackfire.io/profiles/ba2b0131-4cc5-4d30-af89-98bc2a92490c/graph

  1. The listing is always re-downloaded, even on consecutive runs.
  • https://repo.packagist.org/packages.json is not cacheable (cache-control: private, max-age=0, no-cache), yet the file is written to the local cache (and never read?).
  • https://packagist.org/packages/list.json has a 5-minute shared cache (cache-control: public, s-maxage=300) but nothing local.
  • It should be possible to add etag or last-modified in order to leverage 304 responses.
  2. The 324,200 calls to preg_match take 309 ms. For completion usage, preg_match can be replaced by str_starts_with (or the polyfill).
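
To illustrate the second point: for a query that is just an anchored literal such as ^composer/, a plain prefix test selects the same names as the regular expression. The sketch below uses Python's re and str.startswith as stand-ins for PHP's preg_match and str_starts_with; the function names are hypothetical and it makes no claim about the actual speed-up in Composer.

```python
import re


def search_regex(names, pattern):
    """Filter names with a regular expression, as the current search does."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


def search_prefix(names, prefix):
    """Filter names with a plain prefix test, which is enough for completion."""
    return [name for name in names if name.startswith(prefix)]
```

The swap is only valid when the user input is treated as a literal prefix; a query containing regex metacharacters would still need the regex path.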

@dkarlovi
Author

dkarlovi commented Dec 3, 2021

@GromNaN nice! Would be a nice tweak for this specific search special case indeed, maybe fast enough to not require anything else.

@dkarlovi
Author

dkarlovi commented Dec 3, 2021

Having said that, the SQLite-based approach would likely take a fraction of the time even with this fix, since it would have an index prepared.

@Seldaek
Member

Seldaek commented Dec 3, 2021

The search stuff was definitely not built with such a high poll rate in mind, so yeah, it's not surprising you're hitting some roadblocks, but I think most of it can be fixed fairly easily.

This should help a bit already. Ideally we'd also cache the full list.json file locally, but I'm in a rush now, so I can do that some other time.
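
One way such a local cache could look, sketched in Python rather than Composer's PHP. It assumes the server would send an ETag (which, per the findings earlier in this thread, would first have to be added) and that list.json has the shape {"packageNames": [...]}. The fetch callable is hypothetical: it takes the cached ETag (or None) and returns a (status, etag, body) tuple, which keeps the sketch independent of any HTTP library.

```python
import json
import os


def load_cached_list(cache_path, fetch):
    """Return package names, re-using the local copy on a 304 response."""
    cached = None
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as fh:
            cached = json.load(fh)

    # Ask the server for the list, passing our ETag if we have one.
    status, etag, body = fetch(cached["etag"] if cached else None)
    if status == 304 and cached is not None:
        return cached["names"]  # unchanged: serve from the local cache

    # Assumed response shape: {"packageNames": [...]}
    names = json.loads(body)["packageNames"]
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump({"etag": etag, "names": names}, fh)
    return names
```

With this shape, a repeated completion request costs one small conditional request instead of a full download of the list.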

I think once we have that, and possibly an optimization to search by vendor first, it should be fine without introducing SQLite (which sounds way overkill to me here... it's an array of names).

Once we have the vendor name, btw, we can also optimize ComposerRepository::search() to use https://packagist.org/packages/list.json?vendor=x, which should return way less data, although if we cache the whole list maybe it's still faster to work from the local cache.
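
As a rough illustration of that vendor optimization (in Python, not the actual ComposerRepository::search() code), the helpers below pick the vendor-scoped URL once the typed query contains a slash. The names split_vendor and list_url_for are hypothetical; only the URL and its vendor parameter come from the comment above.

```python
from urllib.parse import urlencode

LIST_URL = "https://packagist.org/packages/list.json"


def split_vendor(query):
    """Split 'vendor/partial' into (vendor, partial); vendor is None without a slash."""
    vendor, sep, rest = query.partition("/")
    return (vendor, rest) if sep else (None, query)


def list_url_for(query):
    """Use the vendor-scoped list once the vendor part of the query is known."""
    vendor, _ = split_vendor(query)
    if vendor is None:
        return LIST_URL
    return LIST_URL + "?" + urlencode({"vendor": vendor})
```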

@Seldaek Seldaek added this to the 2.2 milestone Dec 3, 2021
@Seldaek Seldaek closed this as completed in cc32ebc Dec 8, 2021