hermetic/remoteable/cacheable process executions #20407
Replies: 3 comments
-
It's not clear yet how useful this would be. The most useful measurements to get first would require picking a package with no dependencies, then asking for that package:
Then, spack can be made to use
Then, in parallel:
-
It's also totally possible to do this without using the pants CLI tools or
-
I believe I was extremely concerned about being annoying earlier and made up an excuse to close this issue. It describes a fundamentally possible path spack can traverse at some point, even if it's not feasible to consider right now.
-
Summary: use the pants CLI tools to insert file/directory contents from build steps into a local db, and allow execution against a backend for the bazel remote execution API (remexec API).
Rationale
The remexec API is a single-file protobuf API with two parts:
A client can send/receive filesystem data to a server via CAS (content-addressable storage) API methods.
A client can provide a process execution request to a server, which is then executed.
This works remotely because the process execution request includes, as a merkle tree, every platform-appropriate file required to execute the process.
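To make the merkle tree idea concrete, here is a minimal sketch of digesting a directory so that its root digest covers every file beneath it. The function names and the text encoding of directory nodes are assumptions made for illustration; the real remexec API serializes directories as protobuf messages, which this sketch does not attempt to reproduce.

```python
import hashlib
import os


def digest_bytes(data):
    """Return a (sha256 hex, size) pair, the shape of key a CAS stores blobs under."""
    return hashlib.sha256(data).hexdigest(), len(data)


def digest_tree(root, cas):
    """Store every file and directory under `root` in `cas`; return the root digest.

    A directory is encoded as sorted "kind name hash size" lines, so its digest
    changes whenever anything beneath it changes. This encoding is illustrative
    only and is NOT the remexec wire format.
    """
    lines = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            h, size = digest_tree(path, cas)
            lines.append(f"dir {name} {h} {size}")
        else:
            with open(path, "rb") as f:
                data = f.read()
            h, size = digest_bytes(data)
            cas[h] = data
            lines.append(f"file {name} {h} {size}")
    encoded = "\n".join(lines).encode()
    h, size = digest_bytes(encoded)
    cas[h] = encoded
    return h, size
```

Because the root digest depends on every byte below it, a server that receives only that digest can fetch exactly the inputs the process needs from the CAS.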
Spack is able to build ~all of its dependencies from scratch, so it would be the only implementor of the remexec API which doesn't have a significant implicit dependency on the Linux environment that a process executes in. Other build tools would need to spend a huge amount of effort to achieve this -- they might as well just use Spack directly.
Description
There was a lot more written in this issue before github lost it.
This is feasible because pants provides two command-line tools which together implement the CAS and process execution portions of the API:
fs_util, which ingests local directories into checksummed merkle trees and materializes them back out.
process_executor, which retrieves the directory contents from a given checksum and executes a process with the given command line and environment variables.
These tools can be made into a spack package: a mostly-complete first attempt is here.
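The division of labor between the two tools can be modeled in a few lines of Python. This is a toy stand-in, not the pants tools themselves: `ToyStore`, `ingest`, `materialize`, and `execute` are hypothetical names, the store lives in memory, and file permissions are not preserved.

```python
import hashlib
import json
import os
import subprocess
import tempfile


class ToyStore:
    """In-memory stand-in for a local CAS (hypothetical; not the pants store)."""

    def __init__(self):
        self.blobs = {}

    def _put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self.blobs[key] = data
        return key

    def ingest(self, root):
        """Role played by fs_util when ingesting: directory on disk -> digest."""
        entries = {}
        for name in sorted(os.listdir(root)):
            path = os.path.join(root, name)
            if os.path.isdir(path):
                entries[name] = ["dir", self.ingest(path)]
            else:
                with open(path, "rb") as f:
                    entries[name] = ["file", self._put(f.read())]
        return self._put(json.dumps(entries, sort_keys=True).encode())

    def materialize(self, digest, dest):
        """Role played by fs_util when materializing: digest -> directory on disk."""
        os.makedirs(dest, exist_ok=True)
        for name, (kind, key) in json.loads(self.blobs[digest]).items():
            path = os.path.join(dest, name)
            if kind == "dir":
                self.materialize(key, path)
            else:
                with open(path, "wb") as f:
                    f.write(self.blobs[key])


def execute(store, input_digest, argv, env):
    """Role played by process_executor: run argv in a directory holding only the inputs."""
    with tempfile.TemporaryDirectory() as sandbox:
        store.materialize(input_digest, sandbox)
        proc = subprocess.run(argv, cwd=sandbox, env=env, capture_output=True)
        # Re-ingest the sandbox so outputs are addressable by digest too.
        return proc.returncode, proc.stdout, store.ingest(sandbox)
```

The process sees only what the input digest names, which is the property that lets the same request run on another machine.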
Additional information: Symbolic Filesystem Changes and the upc Library
Pants uses a very idiosyncratic API to manipulate directory contents (e.g. merging directories) "symbolically", modifying only the CAS and not the local fs. We should be able to avoid creating a whole new API to work with files in general, though, because Spack has a pretty well-defined and decoupled "build step" concept. This hopefully allows for these symbolic manipulations to occur in a single, small part of the codebase, without affecting most other code.
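As an illustration of what a "symbolic" merge means, the sketch below combines two directory digests by reading and writing only CAS entries, never the local filesystem. It is not the Pants API: the JSON directory encoding and the names `put`, `put_dir`, and `merge` are assumptions chosen to keep the example short.

```python
import hashlib
import json


def put(cas, data):
    """Store a blob under its sha256 and return the key."""
    key = hashlib.sha256(data).hexdigest()
    cas[key] = data
    return key


def put_dir(cas, entries):
    """Store a directory node mapping name -> [kind, digest]."""
    return put(cas, json.dumps(entries, sort_keys=True).encode())


def merge(cas, left, right):
    """Merge two directory digests into a new digest, touching only the CAS."""
    if left == right:
        return left  # identical subtrees need no work
    a = json.loads(cas[left])
    b = json.loads(cas[right])
    merged = dict(a)
    for name, (kind, key) in b.items():
        if name not in a:
            merged[name] = [kind, key]
            continue
        a_kind, a_key = a[name]
        if a_kind == kind == "dir":
            # Recurse into directories present on both sides.
            merged[name] = ["dir", merge(cas, a_key, key)]
        elif (a_kind, a_key) != (kind, key):
            raise ValueError(f"conflicting entries for {name!r}")
    return put_dir(cas, merged)
```

File contents are never read during the merge; only small directory nodes are rewritten, which is why this can be much faster than copying files on disk.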
fs_util does not expose any API to perform these symbolic operations. At first, they could be replaced with real filesystem operations whose results are then handed to fs_util, but this will be slow in many cases compared to symbolic manipulation. However, I have already prototyped a runner (entry point) for these kinds of process execution requests, which specify a digest (virtual fs). The process execution part doesn't work for spack: it doesn't use FUSE, it's JVM-only, and it requires changing the application to use different methods for file operations. However, it is very likely to be useful for mapping out the problem.
TODO
Make fs_util and process_executor into a spack package.
strace should be able to provide the exact information, but that won't be possible to automate over every configuration of every package, so there will be a long tail of manual effort here.
Use fs_util to read every file spack intends to use, and perform symbolic filesystem manipulations to form a coherent input_digest for the tool. Read all of a build step's output files with fs_util as well.
The upc library may be necessary for this step.
Run process_executor locally.
Write a process_executor wrapper to create a remexec server, which can be launched as a separate process.
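For the strace item above, a rough sketch of turning a trace into lists of files read and written might look like the following. It assumes the information wanted is the set of files a build step opens, and that the log was captured with something like `strace -f -e trace=openat -o build.log <command>`. The regular expression and the `files_touched` helper are hypothetical, handle only complete `openat` lines, and ignore other syscalls such as `execve`.

```python
import re

# Matches e.g.: 1234 openat(AT_FDCWD, "/usr/include/stdio.h", O_RDONLY|O_CLOEXEC) = 4
OPENAT = re.compile(
    r'openat\([^,]+, "(?P<path>[^"]*)", (?P<flags>[A-Z_|]+)[^)]*\)\s*= (?P<ret>-?\d+)'
)


def files_touched(log_lines):
    """Return (reads, writes): paths successfully opened read-only vs. for writing."""
    reads, writes = set(), set()
    for line in log_lines:
        m = OPENAT.search(line)
        if m is None or int(m.group("ret")) < 0:
            continue  # not an openat line, or the open failed (e.g. ENOENT)
        flags = m.group("flags").split("|")
        if "O_WRONLY" in flags or "O_RDWR" in flags:
            writes.add(m.group("path"))
        else:
            reads.add(m.group("path"))
    return reads, writes
```

The read set approximates what must go into a step's input digest and the write set approximates its outputs; anything a build probes conditionally will still need the manual effort mentioned above.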