Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow explicit exports of host environment variables #730

Closed
henryiii opened this issue Jun 22, 2021 · 14 comments
Closed

Allow explicit exports of host environment variables #730

henryiii opened this issue Jun 22, 2021 · 14 comments

Comments

@henryiii
Copy link
Contributor

One thing I immediately needed when adding config support to boost-histogram is the ability to export an environment variable. Specifically, I would like to do this:

[tool.cibuildwheel.environment]
SETUPTOOLS_SCM_PRETEND_VERSION = "$SETUPTOOLS_SCM_PRETEND_VERSION"

Would that be something we could support for exporting an environment variable? The function to do this in Python seems to be called os.path.expandvars, though I don't think it has anything special to do with paths.

Related to #117

@henryiii
Copy link
Contributor Author

A quick test for this shows that "PATH=$PATH:/usr/local/bin" would change - currently, it literally stores the $PATH in the string, while with this change, $PATH would become the host path. We could disable certain variables (like PATH) from being picked up from the host, and we could allow \$PATH to consume the \ and produce $PATH instead of the expansion (like bash does). This would require a custom expanders, but it's not too hard, I think. Bash-style ${} should be supported too. Thoughts?

@henryiii
Copy link
Contributor Author

For that specific one, there's also a potential problem with the way the built-in expandvars works. I don't want it to be set if SETUPTOOLS_SCM_PRETEND_VERSION isn't set in the host. So I'd want SETUPTOOLS_SCM_PRETEND_VERSION = "$SETUPTOOLS_SCM_PRETEND_VERSION" to become export SETUPTOOLS_SCM_PRETEND_VERSION "" if this was not defined. I think actually setting it to the string with the $ would break setuptools-scm.

An alternative solution would be as mentioned in #117, there could be a list of exported variables, like:

[tool.cibuildwheel]
export-host = ["SETUPTOOLS_SCM_PRETEND_VERSION"]

I think the above solution is more elegant and less to remember/teach, but pointing out options.

@henryiii
Copy link
Contributor Author

Actually, "Or, can we import all the host variables, except ones that are already defined in the container, like PATH." sounds really simple and easy to teach (linux would behave much like windows/macOS already does), and wouldn't require adding anything - then CIBW_ENVIRONMENT/tool.cibuildwheel.environment would be additive (the latter would be especially useful). Is there anything wrong with making a list of variables not allowed, writing any variables present and not in that list to a file, and then using --env-file on the docker command?

@joerick
Copy link
Contributor

joerick commented Jun 23, 2021

So to my eyes there are two options here:

  1. CIBW_ENVIRONMENT_PASS, e.g. CIBW_ENVIRONMENT_PASS="CFLAGS SETUPTOOLS_SCM_PRETEND_VERSION". Defines the environment variables that are passed into the Docker container. This happens before CIBW_ENVIRONMENT is evaluated, so if I did CIBW_ENVIRONMENT="ANOTHER_CONFIG=$SETUPTOOLS_SCM_PRETEND_VERSION" that would work.

  2. "import all the host variables, except ones that are already defined in the container, like PATH" - Probably a good option, 95% of the time. I'm not sure about the 5% though. Would we need to provide a way to customise this if it did something wrong?

    Thinking about it more, environment variables often contain paths, those would not be accessible from the container (just looking at my own environment, I have things like NVM_DIR=/Users/joerick/.nvm ZSH=/Users/joerick/.oh-my-zsh, VIRTUAL_ENV=/Users/joerick/Projects/cibuildwheel/env. Would these create problems in some cases? I don't think we can rule it out.

Then, maybe the right approach would be to have an option, but by default, pass through everything that's not defined in the container. So, something like CIBW_ENVIRONMENT_PASS=auto does (2), but if I set CIBW_ENVIRONMENT_PASS="SETUPTOOLS_SCM_PRETEND_VERSION", then I'm whitelisting only that env var. What do you think @henryiii ?

@henryiii
Copy link
Contributor Author

Ideas for names: CIBW_ENVIRONMENT_PASS, CIBW_PASSENV, CIBW_PASS_ENVIRONMENT, CIBW_EXPOSE, CIBW_ENVIRONMENT_EXPOSE.

Sadly, there's no good way to disable this completely, CIBW_ENVIRONMENT_PASS="" would need to be the "auto" default. But you can just do CIBW_ENVIRONMENT_PASS="not_a_variable" to disable passing all the rest of the environment. CIBW_ENVIRONMENT_PASS="auto" would try to pass a variable called "auto".

2.0 seems like a good time to do this, since it changes the default (to something less surprising to new users).

@joerick
Copy link
Contributor

joerick commented Jun 23, 2021

For completeness, there is another way to do this, without adding an extra option, that's the HOST_ prefix idea from #117. That is, expose all the host environment variables with a HOST_ prefix. So to pass though a variable you'd do CIBW_ENVIRONMENT_LINUX="CFLAGS=$HOST_CFLAGS".

Worth considering too, as it would be more explicit, and wouldn't require a new option.

@joerick
Copy link
Contributor

joerick commented Jun 23, 2021

Looking at it again, this is a dupe of #117, I think?

@henryiii
Copy link
Contributor Author

henryiii commented Jun 23, 2021

That's not very discoverable, though. And I think trying to match the macOS/Windows behavior by default seems like a good change. I would hope that for 95% of the users, they would not need the new option, and would just not have to use CIBW_ENVIRONMENT as often (just for adding a variable, which is handy).

Yes, this didn't start that way, but now it's a dupe.

@henryiii
Copy link
Contributor Author

How about CIBW_HOST_PREFIX. It defaults to "", which copies the environment with the same names and does some filtering (like PATH). But if you set a prefix, like CIBW_HOST_PREFIX: "HOST_", the copied environment has a HOST_ prefix (and filtering is removed, such as on PATH).

@joerick
Copy link
Contributor

joerick commented Jun 24, 2021

Thinking about this more. I'm not sure that passing host environment variables through by default is a good idea, to be honest. We don't control what the CI sets and there's no way to know how those might impact users' builds.

For example, this is what Github sets:

SELENIUM_JAR_PATH=/usr/share/java/selenium-server-standalone.jar
CONDA=/usr/share/miniconda
GITHUB_WORKSPACE=/home/runner/work/ping/ping
JAVA_HOME_11_X64=/usr/lib/jvm/adoptopenjdk-11-hotspot-amd64
GITHUB_PATH=/home/runner/work/_temp/_runner_file_commands/add_path_9def4aab-2e04-4004-9efb-b7636aa27be4
GITHUB_ACTION=run
JAVA_HOME=/usr/lib/jvm/adoptopenjdk-11-hotspot-amd64
GITHUB_RUN_NUMBER=1
GRADLE_HOME=/usr/share/gradle-7.1
XDG_CONFIG_HOME=/home/runner/.config
DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1
ANT_HOME=/usr/share/ant
JAVA_HOME_8_X64=/usr/lib/jvm/adoptopenjdk-8-hotspot-amd64
HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew"
HOMEBREW_CLEANUP_PERIODIC_FULL_DAYS=3650
BOOTSTRAP_HASKELL_NONINTERACTIVE=1
***
PIPX_BIN_DIR=/opt/pipx_bin
DEPLOYMENT_BASEPATH=/opt/runner
GITHUB_ACTIONS=true
ANDROID_NDK_LATEST_HOME=/usr/local/lib/android/sdk/ndk/22.1.7171670
GITHUB_SHA=683ef5c856591cdfbc06918e3117637f03ea7759
POWERSHELL_DISTRIBUTION_CHANNEL=GitHub-Actions-ubuntu20
DOTNET_MULTILEVEL_LOOKUP=0
GITHUB_REF=refs/pull/2/merge
RUNNER_OS=Linux
HOME=/home/runner
GITHUB_API_URL=https://api.github.com
LANG=C.UTF-8
RUNNER_TRACKING_ID=github_356cb84c-7a6b-4018-89bb-06ebbcd3e58f
RUNNER_TEMP=/home/runner/work/_temp
GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_9def4aab-2e04-4004-9efb-b7636aa27be4
GITHUB_EVENT_PATH=/home/runner/work/_temp/_github_workflow/event.json
INVOCATION_ID=fd9c1051a49b4ecf83f6cf66430b41c0
GITHUB_EVENT_NAME=pull_request
GITHUB_RUN_ID=968254576
ANDROID_NDK_HOME=/usr/local/lib/android/sdk/ndk-bundle
HOMEBREW_NO_AUTO_UPDATE=1
GITHUB_ACTOR=joerick
NVM_DIR=/home/runner/.nvm
ANDROID_HOME=/usr/local/lib/android/sdk
GOROOT_1_14_X64=/opt/hostedtoolcache/go/1.14.15/x64
GITHUB_GRAPHQL_URL=https://api.github.com/graphql
ACCEPT_EULA=Y
RUNNER_USER=runner
USER=runner
GITHUB_SERVER_URL=https://github.com
HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar"
PIPX_HOME=/opt/pipx
GECKOWEBDRIVER=/usr/local/share/gecko_driver
CHROMEWEBDRIVER=/usr/local/share/chrome_driver
SHLVL=1
ANDROID_SDK_ROOT=/usr/local/lib/android/sdk
VCPKG_INSTALLATION_ROOT=/usr/local/share/vcpkg
RUNNER_TOOL_CACHE=/opt/hostedtoolcache
HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew"
ImageVersion=20210621.1
DOTNET_NOLOGO=1
GRAALVM_11_ROOT=/usr/local/graalvm/graalvm-ce-java11-21.1.0
GITHUB_JOB=build
AZURE_EXTENSION_DIR=/opt/az/azcliextensions
PERFLOG_LOCATION_SETTING=RUNNER_PERFLOG
GITHUB_REPOSITORY=joerick/ping
CHROME_BIN=/usr/bin/google-chrome
ANDROID_NDK_ROOT=/usr/local/lib/android/sdk/ndk-bundle
GITHUB_RETENTION_DAYS=90
JOURNAL_STREAM=8:20230
RUNNER_WORKSPACE=/home/runner/work/ping
LEIN_HOME=/usr/local/lib/lein
LEIN_JAR=/usr/local/lib/lein/self-installs/leiningen-2.9.6-standalone.jar
GITHUB_ACTION_REPOSITORY=
PATH=/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/runner/.local/bin:/opt/pipx_bin:/usr/share/rust/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
RUNNER_PERFLOG=/home/runner/perflog
GITHUB_BASE_REF=master
CI=true
SWIFT_PATH=/usr/share/swift/usr/bin
ImageOS=ubuntu20
GITHUB_REPOSITORY_OWNER=joerick
GITHUB_HEAD_REF=ci-test
GITHUB_ACTION_REF=
GITHUB_WORKFLOW=test
DEBIAN_FRONTEND=noninteractive
GOROOT_1_15_X64=/opt/hostedtoolcache/go/1.15.13/x64
AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache
GOROOT_1_16_X64=/opt/hostedtoolcache/go/1.16.5/x64
_=/usr/bin/env

Sooo many paths that would break inside the container. And we'd also have to check all the other CIs to find what they set, too.

I'm currently preferring the HOST_* option, myself.

Another thing I was thinking was that if a user did want to use a path from their host, they could do so by doing CIBW_ENVIRONMENT="DATA_FILES=/host/$HOST_DATA_FILES" i.e. importing from host and changing it in one operation.

Or conversely, if we did the PASSENV approach and the user wanted to use, I don't know... LANG from the host, they would have to choose between the container's version, or the host's version. With the HOST_ prefix, both are available for use.

@henryiii
Copy link
Contributor Author

henryiii commented Jun 24, 2021

How about CIBW_HOST_PREFIX. It defaults to "HOST_", but a user could set it to "" to pass everything. :)

Most of those variables will not be used - GITHUB_PATH, for example, is very unlikely to be referenced by a wheel build! Building a wheel in most cases should not pull paths from environment variables. You don't want to depend on those even for macOS and Windows - we don't unset them on those platforms (yes, they refer to real paths, but you still don't want to burn CI paths into a wheel). There's a small set of variables that influence the build, like PATH, but setting other random CI variables should have no effect.

But, yeah, just setting HOST_ doesn't sound too bad either. I just like that the empty approach mimics what the non-linux builds do already. So setting something like SETUPTOOLS_SCM_PRETEND_VERSION on the host would just work without any explicit passthrough on the linux build only. This seems like the most natural usage, so defaulting to CIBW_HOST_PREFIX="" but allowing it to be prefixed to completely control things by hand seems reasonable, still.

@henryiii
Copy link
Contributor Author

henryiii commented Jun 24, 2021

I could see variables like CXX being set on the host, which could cause problems. Having a settable CIBW_HOST_PREFIX that defaults to "HOST_" could allow us to work on building up a useful ignore list over time (the ignore list would only be used if the CIBW_HOST_PREFIX setting was empty).

Or we could just pass everything as HOST_* and that would just be something that would always have to be handled explicitly by users for Linux only.

@JesseFarebro
Copy link

Are there any updates with regard to exporting host environment variables? Now with the toml override syntax, this becomes even more important. Using Github Actions the only way to do this was something like:

env:
    CIBW_ENVIRONMENT: "GITHUB_REF=${{ env.GITHUB_REF }}"

But now if someone wants to use the new toml configuration and specifically overrides (because this isn't supported outside of toml), you can't access the host environment variables. If I specify CIBW_ENVIRONMENT it'll override the environment set in the toml config.

In #799 (reply in thread) it sounds like @henryiii might be open to the possibility of allowing environment variables to get merged into a single set. That would solve this issue as you could still forward environment variables as I describe above.

What I'd really like to see is the ability to merge environment variables (especially important for the override syntax as without it you're copying potentially many environment variables between all your overrides) with the ability to pass through host environment variables.

@joerick
Copy link
Contributor

joerick commented Nov 26, 2021

This was implemented in #914.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants