[RFC] Improve speed of create-project by caching resulting lock files #7453

Closed
Toflar opened this issue Jul 9, 2018 · 7 comments
@Toflar
Contributor

Toflar commented Jul 9, 2018

The Symfony Flex skeleton discussion in symfony/skeleton#66 has recently sparked quite a bit of debate about the time needed for create-project.

It seems that the Symfony core team has now decided to commit the composer.lock to their skeletons and develop a bot that updates it regularly, so that create-project does not have to resolve the dependencies but instead installs from the lock file. This considerably improves the time to first success. I agree that this is very important for onboarding newcomers, but I still think it's the wrong approach because, for example, it does not take your platform into account.
What I really don't like about that approach, though, is that Symfony has to build its own solution. Any project benefits from a good time to first success, not just Symfony! On top of that, it might even be interesting for packagist.org, because a cache would reduce the number of complete package list requests.

So here's just a quick idea I came up with:

  1. create-project generates a hash of the PlatformRepository.
  2. It asks the repository whether there is a composer.lock for package A that matches the hash. If so, that lock file is used instead of resolving. Otherwise resolving takes place normally and the result is then pushed to the repository.
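
To illustrate step 1, here is a minimal sketch of how a platform hash could be derived. The function name `platformHash` and the input format (a map of platform package names to versions) are assumptions made for this example, not anything Composer provides today.

```php
<?php
// Hypothetical helper: derive a stable hash from a platform description.
// The input format (package name => version) is an assumption for illustration.
function platformHash(array $platformPackages): string
{
    // Sort by name so the same platform always yields the same hash,
    // regardless of the order in which the packages were discovered.
    ksort($platformPackages);

    return hash('sha256', json_encode($platformPackages));
}

$a = platformHash(['php' => '7.2.7', 'ext-json' => '1.6.0']);
$b = platformHash(['ext-json' => '1.6.0', 'php' => '7.2.7']);
$c = platformHash(['php' => '7.1.0', 'ext-json' => '1.6.0']);

var_dump($a === $b); // same platform in a different order: hashes match
var_dump($a === $c); // different PHP version: hashes differ
```

Hashing exact versions makes every patch release of PHP a separate cache key, which is one reason the hit rate discussed further down is uncertain.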

This would solve all of my concerns:

  1. The platform is considered
  2. It stays up-to-date
  3. It works for every package not just symfony/skeleton

Let's get a bit more technical maybe on a possible implementation:

We need an interface that has to be implemented by both ComposerRepository and CompositeRepository.

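interface ProjectSkeletonCacheProviderInterface
{
    // Returns the cached composer.lock contents, or null on a cache miss.
    public function getLockFileForPackage(PackageInterface $package, string $platformHash): ?string;

    // Stores the lock file produced by a normal resolve for later installs.
    public function uploadLockFileForPackage(string $lockFile, string $platformHash): void;
}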

The CreateProjectCommand would check whether the repository implements that interface and, if so, ask it for the lock file. If one is present, it is written to disk. If not, the command does what it normally does and, at the end, uploads the resulting lock file for the next user.
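
The flow described above can be sketched as follows. This is a simplified, hypothetical variant of the proposed interface: packages are plain strings so the example runs without Composer, the cache lives in memory, and `obtainLockFile` and the `$resolve` callback are illustrative stand-ins for the real command and resolver.

```php
<?php
// Simplified, hypothetical variant of the proposed interface: packages are
// plain strings here so the sketch runs without Composer installed.
interface SkeletonCacheInterface
{
    public function getLockFile(string $package, string $platformHash): ?string;
    public function uploadLockFile(string $package, string $platformHash, string $lockFile): void;
}

final class InMemorySkeletonCache implements SkeletonCacheInterface
{
    private $entries = [];

    public function getLockFile(string $package, string $platformHash): ?string
    {
        return $this->entries[$package . '|' . $platformHash] ?? null;
    }

    public function uploadLockFile(string $package, string $platformHash, string $lockFile): void
    {
        $this->entries[$package . '|' . $platformHash] = $lockFile;
    }
}

// $resolve stands in for the normal dependency resolution step.
function obtainLockFile(SkeletonCacheInterface $cache, string $package, string $platformHash, callable $resolve): string
{
    $cached = $cache->getLockFile($package, $platformHash);
    if ($cached !== null) {
        return $cached; // cache hit: skip resolving entirely
    }

    $lockFile = $resolve();
    $cache->uploadLockFile($package, $platformHash, $lockFile); // share with the next user

    return $lockFile;
}

$cache = new InMemorySkeletonCache();
$resolverCalls = 0;
$resolve = function () use (&$resolverCalls): string {
    $resolverCalls++;
    return '{"packages": []}';
};

obtainLockFile($cache, 'symfony/skeleton', 'abc123', $resolve);
obtainLockFile($cache, 'symfony/skeleton', 'abc123', $resolve);
echo $resolverCalls, PHP_EOL; // prints 1: the second call was served from the cache
```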

On the packagist.org end this needs two new endpoints:

  • GET /api/packages/{package}/skeletons/{packageVersion}/{platformHash}
  • PUT /api/packages/{package}/skeletons/{packageVersion}/{platformHash}

To ensure you cannot PUT arbitrary composer.lock contents, we can use the content-hash to validate the request (it should be switched to sha256 for that, which would be a good idea anyway).
During the PUT request, the new skeleton needs to be tagged with the names of all the packages present in it. The Updater then invalidates all cache entries tagged with the package it is currently updating. This ensures a skeleton is only valid as long as none of its packages has been updated.
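
A minimal sketch of the tagging idea, assuming an in-memory store. The class `TaggedSkeletonCache` and its method names are hypothetical; the only detail taken from Composer itself is that a composer.lock contains a `packages` array whose entries have a `name`.

```php
<?php
final class TaggedSkeletonCache
{
    private $lockFiles = []; // cache key => lock file contents
    private $keysByTag = []; // package name => set of cache keys

    public function put(string $key, string $lockFile): void
    {
        $this->lockFiles[$key] = $lockFile;

        // Tag the entry with every package name found in the lock file.
        $data = json_decode($lockFile, true);
        foreach ($data['packages'] ?? [] as $package) {
            $this->keysByTag[$package['name']][$key] = true;
        }
    }

    public function get(string $key): ?string
    {
        return $this->lockFiles[$key] ?? null;
    }

    // Called when a package is updated: drop every skeleton that contains it.
    public function invalidateTag(string $packageName): void
    {
        foreach (array_keys($this->keysByTag[$packageName] ?? []) as $key) {
            unset($this->lockFiles[$key]);
        }
        unset($this->keysByTag[$packageName]);
    }
}

$cache = new TaggedSkeletonCache();
$cache->put('skeleton-a', '{"packages":[{"name":"symfony/console"},{"name":"symfony/flex"}]}');
$cache->put('skeleton-b', '{"packages":[{"name":"monolog/monolog"}]}');

$cache->invalidateTag('symfony/console');

var_dump($cache->get('skeleton-a')); // NULL: it contained the updated package
var_dump($cache->get('skeleton-b') !== null); // bool(true): unaffected
```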

What I'm not sure about is how often create-project is really executed and whether that cache would get many hits. I have no clue how often e.g. create-project symfony/skeleton is executed. If it's twice a day, the whole effort is not really worth it, because packages will get updated too often and the chances of having exactly the same platform are close to zero. If it's a few hundred times an hour, we might be talking about something that benefits both the users and packagist.org.

I'm sure there are other things we'd have to consider (for example, it only works if you use packagist.org and no other repositories), but I just wanted to float a suggestion. I'd love to see us working on ideas that benefit the PHP community as a whole, not just Symfony.
Maybe @Seldaek or @naderman can throw some stats in here so we know about the number of requests?

Or is this whole idea just nonsense? 😄

/cc @nicolas-grekas @stof @javiereguiluz

@Toflar
Contributor Author

Toflar commented Jul 9, 2018

Well, the content-hash could be tampered with, so I'm not sure there's a way we can allow clients to push cache entries. Maybe someone has a smart idea here?

@alcohol
Member

alcohol commented Jul 10, 2018

This looks to me like a lot of effort for very little gain.

@Toflar
Contributor Author

Toflar commented Jul 10, 2018

BTW: The push could be verified against some composer-skeleton-whitelist.json file that contains all valid packages. This would need to be updated once a new dependency is added (or any transitive dependency adds a new one) but that should not happen too often.
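
One way such a check could look, as a rough sketch: every package named in the pushed lock file must appear in the whitelist. The function name `lockFileMatchesWhitelist` and the flat list-of-names whitelist format are assumptions for illustration; the comment above does not specify the file's structure.

```php
<?php
// Hypothetical check: accept a pushed lock file only if every package it
// names is on the whitelist. The whitelist is assumed to be a flat list of
// package names, e.g. decoded from composer-skeleton-whitelist.json.
function lockFileMatchesWhitelist(string $lockFile, array $whitelist): bool
{
    $data = json_decode($lockFile, true);
    if (!is_array($data) || !isset($data['packages']) || !is_array($data['packages'])) {
        return false; // not a usable lock file
    }

    foreach ($data['packages'] as $package) {
        if (!isset($package['name']) || !in_array($package['name'], $whitelist, true)) {
            return false; // unknown or malformed package entry
        }
    }

    return true;
}

$whitelist = ['symfony/console', 'symfony/flex'];

$ok = lockFileMatchesWhitelist('{"packages":[{"name":"symfony/flex"}]}', $whitelist);
$bad = lockFileMatchesWhitelist('{"packages":[{"name":"evil/package"}]}', $whitelist);

var_dump($ok);  // bool(true)
var_dump($bad); // bool(false)
```

A name-only check like this limits which packages can appear, but it says nothing about versions or download URLs inside the lock file, so it would not by itself settle the tampering concern raised earlier.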

> This looks to me like a lot of effort for very little gain.

  • Time to first success is reduced quite a bit
  • A lot of memory saved because the amount of resolving processes gets smaller
  • Likely saves us traffic on packagist.org because the amount of resolving processes gets smaller

But I agree, it's work 😄

@alcohol
Member

alcohol commented Jul 10, 2018

> Time to first success is reduced quite a bit

Sure, if you look at pure numbers, installing from scratch vs installing from a lock file is significantly more expensive. But we're still talking about seconds, not minutes. It barely gives you the time to grab a coffee. And considering it is not an action you do like 50 times a day, I don't see a big win here.

> A lot of memory saved because the amount of resolving processes gets smaller

No solving necessary you mean. But regardless, memory is cheap, this is not really a solid argument in my opinion.

> Likely saves us traffic on packagist.org because the amount of resolving processes gets smaller

Traffic is not something we are concerned about. So no gain there.

@stof
Contributor

stof commented Jul 10, 2018

> > Likely saves us traffic on packagist.org because the amount of resolving processes gets smaller
>
> Traffic is not something we are concerned about. So no gain there.

Besides, this would replace traffic hitting static files with traffic likely to hit the Symfony app (I doubt it makes much sense to dump the API to static files, especially if we need invalidation logic too). So it might actually be worse traffic.

@Seldaek
Member

Seldaek commented Jul 20, 2018

Yeah, I think this is a little overkill for something that wouldn't benefit that many projects IMO. I don't know how often create-project is used, but I would much rather see people deal with their lock file themselves if they want this than have us build magic around it.

@Seldaek Seldaek closed this as completed Jul 20, 2018