Elizabeth's WG use case

Citibeth Use Case

I am building a system composed of three parts: a climate model (modele), an ice model (pism) and a coupler linking the two (icebin). I need to build these components in a variety of configurations:

  1. modele standalone
  2. pism standalone
  3. icebin standalone (w/ Python interface)
  4. icebin - pism standalone
  5. modele - icebin - pism together
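
One plausible way these configurations might be spelled as Spack specs is sketched below. The +python variant and the ^pism / ^icebin constraints are assumptions about these packages' spec syntax, not taken from their actual package.py files; and as noted under the wrinkles below, modele itself is not built by spack install directly.

```sh
# Hypothetical spec spellings for the five configurations.
spack install modele                 # 1. modele standalone (but see wrinkle 2 below)
spack install pism                   # 2. pism standalone
spack install icebin+python          # 3. icebin standalone, with Python interface
spack install icebin ^pism           # 4. icebin built against pism
spack install modele ^icebin ^pism   # 5. the full coupled stack
```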

Other wrinkles here:

  1. Sometimes a development version (not yet checked into git) is required; sometimes an officially released version.

  2. modele cannot be built directly by Spack; there is a special program that builds modele, using spack setup under the hood. Therefore, what I really need to build in this use case is modele's dependencies, not modele itself (see the sketch after this list).

  3. The overall software stack is about 15-90 packages, depending on the configuration. In some cases I want to build some of icebin, modele and pism manually: that lets me quickly change one package, rebuild, and see the result in the overall finished product. In other cases, I just want to spack install a released version. This need is probably common when one is developing software: you build the development version manually, even while an officially released version remains installed for other users.
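
A sketch of how wrinkles 1 and 2 map onto Spack commands. The @develop version is a common Spack convention assumed here for these packages (the release number is made up), and --only dependencies assumes a Spack new enough to have that option.

```sh
# Wrinkle 1: released version vs. development version of the same package.
spack install pism@1.0         # an official release (version number made up)
spack install pism@develop     # development sources, if the package defines them

# Wrinkle 2: build only modele's dependencies, leaving modele itself to the
# external build program.
spack install --only dependencies modele
spack setup modele             # also generates spconfig.py for the manual build
```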

These use cases bring up a number of needs:

  1. It is necessary to have a multi-package spack setup. Why? Suppose that A->B (A depends on B), and I want to set both of them up to build manually. If I say spack setup B, there is no guarantee that the version of B Spack sets up will be the same as the version of B needed when I say spack setup A. In practice, this is a significant problem that results in endless fiddling in packages.yaml for little gain. Having Spack set up A and B simultaneously, to work together, is far more robust.

  2. In theory, CMake builds work without any particular environment. In practice, they only work if make is run with the same environment variables that cmake was run with. Therefore, spack setup as it stands is not enough: currently it creates a spconfig.py file that sets the environment properly and calls CMake, but then I'm left on my own when I run make. I "solve" this problem by creating a "spack environment" and loading appropriate modules before running make. This means that in order to build A, I need to load modules for all of A's dependencies. Not ideal, just because of the busywork involved. Better would be for spack setup to produce, in addition to spconfig.py, a similar wrapper for make; or an "environment" shell script wrapper that sets the environment properly, then calls any shell command (see the sketch after this list).

  3. Due to the wide variety of configurations required for the same project, per-project scope is absolutely essential here. This stuff needs to be settable from the command line.

  4. Currently, creating a config scope requires one to create a directory, plus a number of files inside it. That's fine for the current built-in scopes. But for scopes that will be specified on the command line, I would like it to be possible to define the scope in a single file.
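
A minimal sketch of the wrapper proposed in item 2, under the assumption that spack setup would generate it alongside spconfig.py with the real build environment baked in; the two exports below are placeholders standing in for that generated content.

```sh
#!/bin/sh
# env.sh -- hypothetical companion to spconfig.py.
# spack setup would fill in the same environment it gives cmake; these
# exports are placeholders for that generated content.
export PATH="$SPACK_DEP_BIN_DIRS:$PATH"
export CMAKE_PREFIX_PATH="$SPACK_DEP_PREFIXES"
exec "$@"    # run any shell command (e.g. make) in that environment
```

With this in place, ./env.sh make -j8 would see exactly the environment that cmake saw, with no modules to load by hand.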

Discussion

My first thought with "Software Developers Use Case 1" is that we should consider "stacked Spacks", a feature I've long thought would be key to letting groups of people collaborate through Spack. The idea is that when I create a Spack instance, I can specify that it look for pre-built binaries inside one or more other Spack instances before building anything itself. My local Spack is "stacked" on top of a global Spack administered by sysadmins. Our two Spacks don't have to be identical; but they DO have to share hashes.

Stacked hashes would be helpful in use case 1 by letting the user build G1, G2, G3 in their own Spack, depending on E1, E2, E3 in a centrally-built Spack. The assumption here is that ALL packages are built within SOME Spack --- presumably using spack setup, if the package is built manually. That is really much easier than hand-building your own packages on top of an auto-built Spack.

This is a very "safe" thing to do because Spack is the ultimate functional-language no-side-effect system. Think of spack as a function that goes f(spec) --> installed hash: if the hash exists, then it is functionally equivalent to what Spack would build anyway, were it to build right now.

Features like spack activate break this no-side-effects property, making many cool things like this harder. For example, if spack activate is used, then there is no guarantee that two Python packages in different Spacks are functionally equivalent, even if they have the same hash. That is one reason I do not use spack activate. A better approach is to create Spack views or other kinds of environments, which achieve the same effect of combining (say) Python and a bunch of Python modules.
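
The view-based alternative is already expressible with spack view; a sketch, with the Python package names chosen purely as examples:

```sh
# Combine python and a set of python modules into one symlinked tree,
# instead of activating the modules into python's own prefix. Package
# names here (py-numpy, py-scipy) are just examples.
spack view symlink ~/pyview python py-numpy py-scipy
# As noted later on this page, a view may still need environment variables
# (e.g. PYTHONPATH pointing into ~/pyview) set by hand to be fully usable.
```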

He would like to take an executable from the central G3 and link it against his G2 library, without having to rebuild G3.

I ALMOST see a way to support this. As long as the user's G2 library has the same hash as the central G2 library (which it could, if it was configured properly with spack setup), then G3 will at least have the right hash for G2. The problem is that the RPATHs in G3 will still point into the central Spack, not the user's Spack. Of course, the user could copy the G3 binary and overwrite its RPATHs. But that's a lot of programming work, just to save some rebuilding.
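
For scale, the RPATH overwriting being dismissed here would look roughly like the following with a tool such as patchelf; all paths are made-up placeholders.

```sh
# Copy the central G3 binary, then rewrite its RPATH so the G2 entry
# points at the user's Spack. $CENTRAL and $MINE are placeholder
# prefixes; a real rewrite would preserve all the other RPATH entries.
cp "$CENTRAL/g3/bin/g3" ~/bin/g3
patchelf --print-rpath ~/bin/g3               # still points into the central Spack
patchelf --set-rpath "$MINE/g2/lib" ~/bin/g3  # replaces the whole RPATH
```

And this handles only the one binary; every executable and shared library in G3 would need the same treatment, which is the "lot of programming work" above.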

Spack Environment

In current prototypes, a Spack Environment consists of:

  1. A stack of command-line configuration scopes.
  2. A set of specs
  3. Instructions on how to install each spec: should it be done with spack install or spack setup? Should the top of the DAG be built, or only everything underneath it?
  4. Instructions on how to generate module load commands from each spec. (i.e. Should module load commands be generated for every node in this DAG? Or just the top node? Or all nodes except the top node? Or some more complex pattern?) Note that the questions here on recursion are similar to those in (3).
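
Purely as illustration of those four components, a hypothetical on-disk form of such an environment might look like the following; the prototypes' actual format is not specified here, so every key below is an assumption.

```sh
# Hypothetical environment description mirroring components 1-4 above.
cat > env.yaml <<'EOF'
config_scopes: [./scope1, ./scope2]      # 1. stack of command-line config scopes
specs:                                   # 2. the set of specs
  - icebin+python
  - pism
install:                                 # 3. how to install each spec
  icebin: {method: setup, top: manual}   #    build deps; set up top for hand-building
  pism:   {method: install}
modules:                                 # 4. which DAG nodes get module load lines
  icebin: {loads: all-but-top}
EOF
```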

The following operations are possible in a Spack environment:

  1. Install: Do a spack install on each spec.
  2. Generate (once install is complete): Generate a set of module load commands to load the environment. Alternatively, generate should also be able to produce a Spack view of the packages. (The two are not equivalent: a Spack view may require additional environment variables to be set manually before it is useful.)
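
For the module-load flavor of Generate, existing commands already come close; a sketch, noting that the exact flags vary across Spack versions and that icebin stands in for any spec in the environment:

```sh
# Emit module load lines for a spec and its whole DAG, then source them.
spack module loads --dependencies icebin > env-modules.sh
source env-modules.sh

# The view flavor of Generate, for comparison:
spack view symlink ./envview icebin
```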