Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TemporaryDirectoryForBigFiles() can still ignore $TMPDIR #2197

Open
primeos-work opened this issue Nov 23, 2023 · 2 comments
Open

TemporaryDirectoryForBigFiles() can still ignore $TMPDIR #2197

primeos-work opened this issue Nov 23, 2023 · 2 comments

Comments

@primeos-work
Copy link

primeos-work commented Nov 23, 2023

Update: I ran into this bug with Podman 4.4.1 on a RHEL 9.2 system. After writing this issue but before submitting it I realized that I only looked a the source-code (old vs. current) and didn't try to reproduce it with a more recent Podman version. It turns out that Podman 4.7.2 on Fedora 38 might not be affected by this issue (the temporary files end up in $TMPDIR and are prefixed container_images_storage instead of storage but it's also a very different setup/code-path as I use a systemd service running as root on RHEL while on Fedora I just run Podman as unprivileged user without connecting to a Podman service - I suspect that the difference comes from not connecting to a Podman service and I'll try to reproduce it with Podman 4.7.2 but IMO this issue applies regardless as the current source code of this repository still contains the problematic code).

Update2: I could reproduce the issue with Podman 4.8.0 (most recent release) on Fedora 38 - the key is indeed the Podman service / using --remote.


The temporary directory can be very important when writing big files. See #495 for a previous issue on this topic. The situation got improved with #747 but relying on SystemContext.BigFilesTemporaryDir doesn't seem to be sufficient as sys isn't always set. I currently have to build Dockerfiles that produce huge images (unfortunately) on a RHEL 9 system where not only /tmp but also /var/tmp is on a tmpfs. This results in such errors:

# Edit: I have the following environment variable set (it's likely important to connect to a Podman
# service and it could even be relevant that the service runs as root):
$ export CONTAINER_HOST=unix://run/podman/podman.sock
$ podman build .
[...]
Error: committing container for step {...}: copying layers and metadata for container "...": writing blob: storing blob to file "/var/tmp/storage4208335389/1": write /var/tmp/storage4208335389/1: no space left on device

The TMPDIR environment variable is set and working (already got me a few build steps further and I can see temporary files prefixed with buildah and libpod_builder in $TMPDIR) but the temporary files prefixed with storage for committing image layers still end up in /var/tmp.

The relevant source code should be this:

// unixTempDirForBigFiles is the directory path to store big files on non Windows systems.
// You can override this at build time with
// -ldflags '-X github.com/containers/image/v5/internal/tmpdir.unixTempDirForBigFiles=$your_path'
var unixTempDirForBigFiles = builtinUnixTempDirForBigFiles
// builtinUnixTempDirForBigFiles is the directory path to store big files.
// Do not use the system default of os.TempDir(), usually /tmp, because with systemd it could be a tmpfs.
// DO NOT change this, instead see unixTempDirForBigFiles above.
const builtinUnixTempDirForBigFiles = "/var/tmp"
const prefix = "container_images_"
// TemporaryDirectoryForBigFiles returns a directory for temporary (big) files.
// On non Windows systems it avoids the use of os.TempDir(), because the default temporary directory usually falls under /tmp
// which on systemd based systems could be the unsuitable tmpfs filesystem.
func temporaryDirectoryForBigFiles(sys *types.SystemContext) string {
if sys != nil && sys.BigFilesTemporaryDir != "" {
return sys.BigFilesTemporaryDir
}
var temporaryDirectoryForBigFiles string
if runtime.GOOS == "windows" {
temporaryDirectoryForBigFiles = os.TempDir()
} else {
temporaryDirectoryForBigFiles = unixTempDirForBigFiles
}
return temporaryDirectoryForBigFiles
}

#747 added this part but apparently that still isn't enough:

if sys != nil && sys.BigFilesTemporaryDir != "" {
return sys.BigFilesTemporaryDir
}

The documentation of the os.TempDir() function is as follows (https://pkg.go.dev/os#TempDir):

On Unix systems, it returns $TMPDIR if non-empty, else /tmp. On Windows, it uses GetTempPath, returning the first non-empty value from %TMP%, %TEMP%, %USERPROFILE%, or the Windows directory. On Plan 9, it returns /tmp.

Apparently the GetTempPath function from Windows also considers environment variables:
https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-gettemppatha#remarks

The GetTempPath function checks for the existence of environment variables in the following order and uses the first path found:

  1. The path specified by the TMP environment variable.
  2. The path specified by the TEMP environment variable.
  3. The path specified by the USERPROFILE environment variable.
  4. The Windows directory.

I therefore propose that we also check for $TMPDIR on the non-Windows code path.
The following implementation from Podman seems good to me:

https://github.com/containers/podman/blob/b4eb88fca4c48e90067a3558db48eccfa3580261/pkg/util/utils.go#L1042-L1049

So basically something like this (+ updated comments):

 func temporaryDirectoryForBigFiles(sys *types.SystemContext) string { 
 	if sys != nil && sys.BigFilesTemporaryDir != "" { 
 		return sys.BigFilesTemporaryDir 
 	} 
 	var temporaryDirectoryForBigFiles string 
 	if runtime.GOOS == "windows" { 
 		temporaryDirectoryForBigFiles = os.TempDir() 
 	} else { 
		tmpdir := os.Getenv("TMPDIR")
		if tmpdir == "" {
	 		temporaryDirectoryForBigFiles = unixTempDirForBigFiles 
		} else {
 			temporaryDirectoryForBigFiles = tmpdir
		}
 	} 
 	return temporaryDirectoryForBigFiles 
 } 

IMO that would be the correct behaviour but unfortunately that change could also cause regressions and should probably at least be documented as a breaking change.
Alternatives would be to use a different environment variable (should be less desirable for obvious reasons) or to change the code of Podman to ensure that sys.BigFilesTemporaryDir is always set when calling the function (that would work well but it's error prone and the chances of regressions is much higher).

What do you think?

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 25, 2023

TMPDIR is explicitly defined to be a /tmp equivalent: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap10.html . I don’t think it makes any sense to define a c/image-specific semantics of TMPDIR which is different, causing users to possibly have a general TMPDIR setting and a Podman-specific TMPDIR override.

Yes, Podman did decide to interpret TMPDIR that way, and *shrug*, that can’t be resolved in this c/image repo.

I think this needs to be handled in Podman proper; it is not too onerous to structure the SystemContext creation in a way that there are some values shared process-wide, consistently, without unmaintainable code duplication.

In fact, looking at the code as it is right now, AFAICS the relevant code path uses vendor/github.com/containers/buildah/pkg/parse/parse.go:SystemContextFromFlagSet, which always sets BigFilesTemporaryDir to some value. So adding an extra environment read into temporaryDirectoryForBigFiles would not change the behavior at all, it seems to me. (There could well be other code paths that don’t work exactly the same way, and some things might need fixing; just not this one.)

@primeos-work
Copy link
Author

primeos-work commented Nov 28, 2023

Thanks for the reply! :)

TMPDIR is explicitly defined to be a /tmp equivalent: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap10.html.

Yes, but (AFAIK) /var/tmp isn't a POSIX standard at all, so I guess it should be no surprise that /var/tmp isn't mentioned there, right?
Also: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 uses a generic definition (that doesn't exclude /var/tmp, at least IMO):

TMPDIR: This variable shall represent a pathname of a directory made available for programs that need a place to create temporary files.

I'm not sure where /var/tmp originates from (maybe it was first formalized in the FHS?) but standards that are aware of it do state that $TMPDIR should be used:

https://www.freedesktop.org/software/systemd/man/latest/file-hierarchy.html:

/var/tmp/
The place for larger and persistent temporary files. In contrast to /tmp/, this directory is usually mounted from a persistent physical file system and can thus accept larger files. (Use /tmp/ for small ephemeral files.) This directory is generally not flushed at boot-up, but time-based cleanup of files that have not been accessed for a certain time is applied.

If applications find the environment variable $TMPDIR set, they should use the directory specified in it instead of /var/tmp/ (see environ(7) for details).

The same security restrictions as with /tmp/ apply: mkstemp(3), mkdtemp(3), and similar calls should be used. For further details about this directory, see Using /tmp/ and /var/tmp/ Safely.

Added in version 215.

https://systemd.io/TEMPORARY_DIRECTORIES/:

If the $TMPDIR environment variable is set, use that path, and neither use /tmp/ nor /var/tmp/ directly

So I would argue that, at least on modern Linux+systemd systems, the correct (/ standard conformant) behavior would actually be to use $TMPDIR over /var/tmp (and this should also make it easier and more intuitive for users to change the temporary files directory - IMO there should be a way to override the default path, even if $TMPDIR wouldn't be used but I guess sys.BigFilesTemporaryDir should be sufficient in that case).
It should also be more in line with the Windows specific logic and the user could easily ensure that all temporary files end up in the desired $TMPDIR.

I think this needs to be handled in Podman proper

IMO it would ideally be handled in both repos/projects but it should certainly be possible to only change it there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants