Zack Galbreath edited this page Apr 28, 2023 · 1 revision

Attendees

  • Aashish Chaudhary
  • Alec Scott
  • Jacob Nesbitt
  • Massimiliano Culpo
  • Michael VanDenburgh
  • Scott Wittenburg
  • Todd Gamblin
  • Zack Galbreath

We are currently working on the following improvements to cache.spack.io:

  • Improving the overall aesthetic
  • Adding a global toolbar for filtering by version and stack
  • Adding navigation breadcrumbs
  • Improving the display of OS information

Todd suggested that we also update the landing page to focus more on the "confused user" perspective:

  • What is this?
  • What do I get out of it?
  • How do I use it?

Release process improvements

AWS cost reduction & monitoring

  • Cluster rebuild for NAT gateway per AZ currently in progress!
  • We discussed pursuing an "oversubscription" strategy that packs build jobs more tightly onto our EKS nodes to improve utilization. To do this, we will need the following per-package statistics:
    • Memory high water mark
    • CPU load average
    • Build duration (min/avg/max)
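
As a rough illustration of the statistics listed above, here is a minimal Python sketch that aggregates per-package numbers from individual build records. The `BuildRecord` type, its field names, and the units are assumptions made for this example; they do not describe how our CI currently records builds.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class BuildRecord:
    """One observed build job (hypothetical schema for illustration)."""
    package: str
    peak_memory_mb: float  # memory high water mark for this job
    cpu_load_avg: float    # CPU load average observed during this job
    duration_s: float      # wall-clock build time in seconds


def summarize(records):
    """Group build records by package and compute the per-package stats."""
    by_package = defaultdict(list)
    for record in records:
        by_package[record.package].append(record)

    summary = {}
    for package, recs in by_package.items():
        durations = [r.duration_s for r in recs]
        summary[package] = {
            # Worst case seen across all builds of this package.
            "memory_high_water_mb": max(r.peak_memory_mb for r in recs),
            # Mean of the per-job load averages.
            "cpu_load_avg": mean(r.cpu_load_avg for r in recs),
            "duration_min_s": min(durations),
            "duration_avg_s": mean(durations),
            "duration_max_s": max(durations),
        }
    return summary
```

A summary like this could feed into choosing resource requests per package, for example by sizing the memory request from the observed high water mark.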

Cluster secrets management

Rolling releases

We discussed our desire for rolling releases (periodic snapshot buildcaches). A primary motivation is that our current develop buildcache has a reproducibility problem: it is constantly changing, and it contains multiple builds of the same spec, which confuses the solver and makes its results non-deterministic. A snapshot buildcache would represent a consistent state for a given date. It would contain exactly one version of each spec installed by our GitLab CI stacks.
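
To make the "exactly one version of each spec" property concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a snapshot is selected out of the develop buildcache using the set of hashes that one pipeline installed. `CacheEntry`, `select_snapshot`, and the field names are hypothetical and are not part of Spack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    """One binary in a buildcache (hypothetical schema for illustration)."""
    spec: str      # human-readable spec, e.g. "zlib@1.2.13"
    dag_hash: str  # hash identifying one specific build of that spec


def select_snapshot(develop_entries, pipeline_hashes):
    """Keep only the entries installed by one pipeline, one build per spec.

    Raises ValueError if the selected hashes would still leave two
    different builds of the same spec in the snapshot.
    """
    snapshot = {}
    for entry in develop_entries:
        if entry.dag_hash not in pipeline_hashes:
            continue  # an older or unrelated build; leave it out
        existing = snapshot.get(entry.spec)
        if existing is not None and existing.dag_hash != entry.dag_hash:
            raise ValueError(f"multiple builds selected for {entry.spec}")
        snapshot[entry.spec] = entry
    return snapshot
```

With a develop cache that holds two builds of `zlib@1.2.13`, passing only the hash from the chosen pipeline yields a snapshot with a single `zlib@1.2.13` entry, so there is no ambiguity between builds.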

Some questions that were raised:

  • A snapshot buildcache could be generated after any successful develop pipeline. How frequently should we do so? Daily, weekly?
  • When creating a snapshot buildcache, should we rebuild everything, or do an environment-aware copy from develop?
  • How long should we keep a snapshot buildcache? 1 month? Does this need to match the interval at which we prune the larger develop buildcache?

Our goal is to start creating rolling releases and integrate them with cache.spack.io before ISC (May 21).

Zack & Scott plan to draft & share a design document early next week.

Priorities

  • Rolling release design & implementation
  • Redesign cache.spack.io
  • Consider improvements to how we currently backport fixes to patch releases
  • Capture better per-package stats so we can make informed decisions about how to modify our Kubernetes resource requests