Zack Galbreath edited this page Jul 21, 2023 · 1 revision

Attendees

  • Aashish Chaudhary
  • Alec Scott
  • Dan LaManna
  • Jacob Nesbitt
  • Mike VanDenburgh
  • Ryan Krattiger
  • Scott Wittenburg
  • Todd Gamblin
  • Zack Galbreath

Cluster Maintenance

  • This week's upgrade of gitlab.spack.io went smoothly. We're updating documentation about our upgrade process as we go.
    • Before the next upgrade, we would like to replace GitLab's MinIO pod with S3. This will make the upgrade process even easier.
  • Alec opened spack-infra PR #579 to reorganize the way our runners are deployed (fewer replicas, more concurrency).

Metrics & Dashboarding

  • For timing statistics, Ryan has restructured the format of the documents that are getting uploaded to OpenSearch.

Buildcache Pruning

  • This is more-or-less ready to go! Our current implementation has a manual review step & maintains records of when each spec was pruned.
  • We plan to fully automate this system once we're comfortable that it is working as intended.

CI Status

  • Earlier this week Scott & Luke fixed some failures on develop that were caused by unsigned specs.
    • As follow-on work, Scott opened spack PR #38995 to fail early if a protected job does not have a signing key available.
  • Scott is also working to improve the reliability of our "protected-publish" jobs, possibly by moving this functionality from GitLab CI to spackbot (spack PRs #38866 and #39008).
  • We have a PR open (spack #38882) to fix the "copy-only" job for the darwin-ml stack.
  • Jake is experimenting with Karpenter's newly added support for Windows.
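
The fail-early idea behind spack PR #38995 can be illustrated with a minimal sketch. This is not the code from that PR: `require_signing_key`, `MissingSigningKeyError`, and the way key availability is passed in are all hypothetical names chosen for illustration. It only shows the general shape of checking for a key before any work starts, rather than discovering the problem when unsigned specs reach develop.

```python
from typing import Sequence


class MissingSigningKeyError(RuntimeError):
    """Hypothetical error raised when a protected job has no signing key."""


def require_signing_key(is_protected: bool, signing_keys: Sequence[str]) -> None:
    """Abort immediately if a protected job cannot sign what it builds.

    `is_protected` and `signing_keys` are assumed inputs; how a real job
    discovers them is outside the scope of this sketch.
    """
    if is_protected and not signing_keys:
        raise MissingSigningKeyError(
            "protected job has no signing key available; refusing to build"
        )
```

Calling this check at the very start of a job means an unprotected job, or a protected job with at least one key, proceeds as usual, while a protected job without a key stops before building anything.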

Priorities

  • Wrap up timing statistics work and generate some first-draft dashboards
  • Update gitlab.spack.io to use S3 and ElastiCache rather than MinIO and Redis.
  • Continue experimenting with Karpenter for Windows CI
  • Investigate why our GitLab Sidekiq pods die and get restarted somewhat frequently. Perhaps increasing resource requests will reduce this error rate?
  • Update the sync script to merge topic branches against their base branch instead of assuming that it is always develop (necessary for release branch PRs).
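
As a rough illustration of the sync script change in the last item, the sketch below picks the merge target from the PR's own base branch and only falls back to develop when none is given. The real sync script is not shown in these notes, so `merge_base_branch`, `sync_commands`, the local branch naming, and the remote name `origin` are all assumptions made for this example.

```python
from typing import List, Optional

DEFAULT_BASE = "develop"  # assumed fallback when a PR reports no base branch


def merge_base_branch(pr_base_ref: Optional[str]) -> str:
    """Return the branch a PR should be merged against."""
    return pr_base_ref or DEFAULT_BASE


def sync_commands(
    pr_number: int, head_sha: str, pr_base_ref: Optional[str]
) -> List[List[str]]:
    """Build the git commands for merging a PR head with its base branch.

    The commands are returned rather than executed so that the branch
    selection logic can be inspected without touching a repository.
    """
    base = merge_base_branch(pr_base_ref)
    local_branch = f"pr{pr_number}_merge"  # hypothetical naming scheme
    return [
        ["git", "fetch", "origin", base],
        ["git", "checkout", "-B", local_branch, head_sha],
        ["git", "merge", "--no-edit", f"origin/{base}"],
    ]
```

With this shape, a PR opened against a release branch would be merged with that release branch, while a PR that reports no base still behaves as it does today.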