Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setuptools consuming excessive CI resources #3093

Open
jaraco opened this issue Feb 11, 2022 · 16 comments
Open

Setuptools consuming excessive CI resources #3093

jaraco opened this issue Feb 11, 2022 · 16 comments

Comments

@jaraco
Copy link
Member

jaraco commented Feb 11, 2022

@jaraco I think the large # of jobs this introduces to setuptools' GitHub workflows is slowing down builds for other PyPA projects. There is a limit of 20 concurrent jobs org-wide for our plan, and setuptools has gone from 36 jobs in #3078 to 119 jobs here. I've noticed a significant backlog over at https://github.com/pypa/warehouse/actions over the last few days and suspect this might be why.

Any thoughts on how we could get this down to a more reasonable number of concurrent jobs per workflow run?

(cc @ewdurbin)

Originally posted by @di in #2923 (comment)

@jaraco
Copy link
Member Author

jaraco commented Feb 11, 2022

I too have noticed that Setuptools tests now take forever to run, and turning down concurrency, if that would help the other projects would only throttle Setuptools.

Maybe the new platform tests could be set up to be run only for tagged commits (releases) or for PRs with a particular tag to drastically limit the scope of these platform tests.

@jaraco
Copy link
Member Author

jaraco commented Feb 11, 2022

I believe that push should limit the execution of these tests. Let's see if that improves the situation.

@di
Copy link
Sponsor Member

di commented Feb 11, 2022

Thanks!

@blink1073
Copy link
Contributor

I'm happy to open a PR to add Using concurrency to cancel any in-progress job or run as well

@jaraco
Copy link
Member Author

jaraco commented Feb 11, 2022

Sure. Let’s give it a try!

@jaraco
Copy link
Member Author

jaraco commented Feb 12, 2022

Something else I'd like to consider - the matrix of tests is doubled by the SETUPTOOLS_USE_DISTUTILS=local/stdlib. Soon I'd like to remove that matrix, maybe to replace it with a single job to capture regressions in that space.

@blink1073
Copy link
Contributor

You can leave it out of the matrix and add an explicit include entry with that option.

@blink1073
Copy link
Contributor

(I'm happy to do that as well)

@jaraco
Copy link
Member Author

jaraco commented Feb 16, 2022

It seems the release process is broken due to concurrency cancellation:

image

Because a tagged commit is also a mainline commit with the same SHA hash, the second one without the tag takes precedence, so the tagged commit gets cancelled and no release happens.

I'm not sure what needs to be done to address the issue.

@blink1073
Copy link
Contributor

We were discussing this on the original PR. I can move the release job to a separate workflow. We lose the needs safety check, but tagging is an explicit action.

@abravalheri
Copy link
Contributor

abravalheri commented Feb 16, 2022

@blink1073, @jaraco is a new workflow the best approach here or would it be better to add, for example github.ref_type to the concurrency.group key?

separate workflow

  • pros: the tests don't need to run twice.
  • cons: currently the integration tests just run on tags, so it might cause a situation where the tag is being published with integration tests never being checked...

github.ref_type in concurrency.group

  • pros: it makes sure the package is published after all the tests run
  • cons: the tests will run twice, one time for the push action and one for the tag.

@blink1073
Copy link
Contributor

We could move the integration tests to that new workflow

@abravalheri
Copy link
Contributor

abravalheri commented Feb 16, 2022

We could move the integration tests to that new workflow

Yeah, that would work!

I am trying to think if there is another situation when tests other than integration might be skipped.

I suppose if someone creates a few commits in a row, create a tag and than pushes that can happen.

So this change would imply people with commit rights follow a more strict discipline, which is very reasonable. But probably it is good to check Jason's opinion first (if you plan to submit a PR, maybe it is good to highlight that so it can addressed in the review?).

@blink1073
Copy link
Contributor

I can submit a PR, but I'll be afk for the next couple of hours.

@jaraco
Copy link
Member Author

jaraco commented Feb 16, 2022

I have a couple of objections to creating a separate workflow.

  • This project attempts to align with the principles in jaraco/skeleton. It can deviate, but I'd like to avoid creating a whole different experience here than with other projects derived from skeleton.
  • I don't want the release workflow to bypass checks. The test suite should be checked and pass on the code before it's released and shouldn't rely on a separate PR workflow. I do sometimes push code directly to main or merge a PR optimistically and rely on the test suite to catch such failed tests before release.
  • The proposed approach feels non-standard. If I saw that Github recommended creating a separate workflow for releases and provided guidance on how projects should adopt that, I'd consider it (especially if it didn't require a lot of per-repo configuration). In my experience, workflows can't be made dependent, so the only way to have dependent jobs is to have a single workflow like is implemented here, and that's Github's recommended best practice.
  • This new introduction of a feature for concurrency group is now driving other divergent changes, so tracking and reconciling these concerns in the future may prove difficult.

Can we explore tweaking the concurrency group instead?

@webknjaz
Copy link
Member

  • The proposed approach feels non-standard. If I saw that Github recommended creating a separate workflow for releases and provided guidance on how projects should adopt that, I'd consider it (especially if it didn't require a lot of per-repo configuration). In my experience, workflows can't be made dependent, so the only way to have dependent jobs is to have a single workflow like is implemented here, and that's Github's recommended best practice.

I've got a few comments on this.

  1. I've seen other people using a separate release workflow, with a justification that they cut a release on a commit that already has passing checks. But that only works if there's branch protection in place. Although, my concern is always that the artifact released may not match the artifact published, which is why I don't favor this approach.
  2. There's a way to trigger one workflow after the other (there's an event for that) but there's no coordination artifact-wise.
  3. There are reusable workflows. I haven't checked lately, but when/if they start supporting referencing same-commit repo-local definitions, this'll address your exact concern — you'd be able to organize a workflow that consists of several separately managed chunks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants