Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add env variable to indicate the build type. #9532

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

jameshilliard
Copy link
Contributor

Build scripts in some cases need to know if the build is a cross build or a native build.

Set the CARGO_BUILD_TYPE env variable for build scripts so that they can properly determine the build type being performed.

@rust-highfive
Copy link

r? @Eh2406

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 2, 2021
@Eh2406
Copy link
Contributor

Eh2406 commented Jun 15, 2021

I don't use cross compiling much, why would this be useful?

@joshtriplett
Copy link
Member

We talked about this in the @rust-lang/cargo meeting today. We were generally in favor of adding this. However, today we don't actually support this very well; cargo might, for instance, not always run build.rs twice with different values for CARGO_BUILD_TYPE.

So, we'd like to first fix Cargo's support for cross-compilation (which we're entirely in favor of doing), and then after that we can consider how we want to provide that information to build scripts.

@jameshilliard
Copy link
Contributor Author

I don't use cross compiling much, why would this be useful?

Mostly useful if a build script needs to have special cased handling for cross compilation scenarios, pyo3 for example needs to know if it is cross compiling.

@jameshilliard
Copy link
Contributor Author

However, today we don't actually support this very well; cargo might, for instance, not always run build.rs twice with different values for CARGO_BUILD_TYPE.

Hmm, under what circumstances could that be an issue? If you have a reproducer I can try and track that issue down.

So, we'd like to first fix Cargo's support for cross-compilation (which we're entirely in favor of doing), and then after that we can consider how we want to provide that information to build scripts.

Is there something else still broken? I wasn't aware of any issues left there other than stabilizing the fixes in #9322 and this shouldn't really depend on the fixes being stabilized either.

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 15, 2021

What does it do differently for cross compiling?

@jameshilliard
Copy link
Contributor Author

What does it do differently for cross compiling?

It has to use different heuristics for linking to the python interpreter when cross compiling, see here.

@alexcrichton
Copy link
Member

Many cross-compile builds can already be detected because the host is not equal to the target (via env vars to the build script). The only remaining case is then building for the same target as the host but wanting a different configuration than the host. Can you describe more how pyo3 fits in this use case? For example what is pyo3 specifically doing when it's compiling for the same host triple that it would otherwise not want to do?

To echo @joshtriplett's comment about landing this now, the problem can be made evident by adding a path dependency such as:

[dependencies]
shared = { path = 'shared' }

[build-dependencies]
shared = { path = 'shared' }

If shared has a build script then you can see it's run twice with cargo build --target wasm32-wasi and cargo build --target $the_host_target, but it's only run once with cargo build. This is the canonical problem in Cargo described in previous PRs that cargo build isn't modeled to do a "cross compile" in the manner of cargo build --target $the_host_target. By leaning more and more into exposing this information (such as a cross compile flag) then it gets more important that Cargo is fixed to treat cargo build as a cross compile natively by default.

r? @alexcrichton

@rust-highfive rust-highfive assigned alexcrichton and unassigned Eh2406 Jun 16, 2021
@jameshilliard
Copy link
Contributor Author

The only remaining case is then building for the same target as the host but wanting a different configuration than the host. Can you describe more how pyo3 fits in this use case? For example what is pyo3 specifically doing when it's compiling for the same host triple that it would otherwise not want to do?

Well I think it would potentially get linked against the host python instead of the target python if cross compilation isn't detected properly.

@alexcrichton
Copy link
Member

Does pyo3 have some sort of custom configuration as well for what the host python is and what a target python is?

All-in-all one thing that's also worth mentioning is that I don't think the term "cross" here or the way this PR exposes it is the right way to do this. Not to say that something like this doesn't need to be done, only that I would prefer we do it differently (since this will be a stable interface after all).

Today Cargo unconditionally sets HOST and TARGET env vars, but that leaves the HOST == TARGET case ambiguous. An idea @joshtriplett had was that perhaps a better design would be to always set HOST but optionally specify TARGET. If the TARGET env var is unset then the build script knows that its outputs are only ever going to be used on the host itself, for example as part of another build script or part of a procedural macro. This means that if both env vars are present it's equivalent to the "cross" build type you're specifying in this PR.

We can't implement that design because it's backwards incompatible unfortunately. What we can do, though, is perhaps pick different env vars and do the same thing (e.g. CARGO_BUILD_{HOST,TARGET}). That way build scripts which want to can use these env vars to realize if they're doing host-only or target-specific compilation.

One of things we were curious about in the discussion is why build scripts might do this (independently of how Cargo configures them to run). I think the linkage part makes sense, where you may have some target libraries for some other system and you don't want to accidentally pick up the host libraries. This does, however, require the build script to have some other form of configuration for where to find the libraries I think? So I'd like to confirm that.

@jameshilliard
Copy link
Contributor Author

Does pyo3 have some sort of custom configuration as well for what the host python is and what a target python is?

Well there's the setuptools extension which would itself be run by the host interpreter but is configured to invoke pyo3 to build against an alternate target interpreter when cross compiling.

Today Cargo unconditionally sets HOST and TARGET env vars, but that leaves the HOST == TARGET case ambiguous. An idea @joshtriplett had was that perhaps a better design would be to always set HOST but optionally specify TARGET. If the TARGET env var is unset then the build script knows that its outputs are only ever going to be used on the host itself, for example as part of another build script or part of a procedural macro. This means that if both env vars are present it's equivalent to the "cross" build type you're specifying in this PR.

Well I think for build scripts we generally don't care much about the HOST configs once they are compiled, splitting the configs is mostly just so that the build scripts can be compiled in the first place. That suggested approach sounds confusing to me since you technically have a target in native mode as well, it's just that the target is the same as the host so I think using a separate indicator env variable like CARGO_BUILD_TYPE is probably more clear/useful.

One of things we were curious about in the discussion is why build scripts might do this (independently of how Cargo configures them to run). I think the linkage part makes sense, where you may have some target libraries for some other system and you don't want to accidentally pick up the host libraries. This does, however, require the build script to have some other form of configuration for where to find the libraries I think? So I'd like to confirm that.

Yeah, so from my understanding pyo3 uses different build configuration logic for cross compilation vs native, so it mostly just needs to know if it was run in cross or native mode so that it uses the right python interpreter config.

@jameshilliard
Copy link
Contributor Author

If shared has a build script then you can see it's run twice with cargo build --target wasm32-wasi and cargo build --target $the_host_target, but it's only run once with cargo build. This is the canonical problem in Cargo described in previous PRs that cargo build isn't modeled to do a "cross compile" in the manner of cargo build --target $the_host_target. By leaning more and more into exposing this information (such as a cross compile flag) then it gets more important that Cargo is fixed to treat cargo build as a cross compile natively by default.

I added a few tests that attempt to cover all these scenarios but I'm not seeing any issues that affect CARGO_BUILD_TYPE, do those tests I just added look like they are replicating those scenarios properly?

@jameshilliard jameshilliard force-pushed the is-cross branch 2 times, most recently from 8cc60d2 to 449cd0c Compare June 16, 2021 21:52
use std::env;
fn main() {
println!("cargo:foo=bar");
assert_eq!(env::var("CARGO_BUILD_TYPE").unwrap(), "native");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bug that I was mentioning. This build script is run once instead of twice, although there are two possible compilations in this scenario.

We want to move towards a world where cargo build is the same as cargo build --target $host, but this is a case where they are not the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bug that I was mentioning. This build script is run once instead of twice, although there are two possible compilations in this scenario.

But at least for this CARGO_BUILD_TYPE it's still giving expected results...so not sure why that bug would be an issue for the CARGO_BUILD_TYPE feature.

We want to move towards a world where cargo build is the same as cargo build --target $host, but this is a case where they are not the same.

Isn't this still somewhat problematic for cases where one needs different logic for cross compilations due to the inability to execute binaries built for the target on the host?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For CARGO_BUILD_TYPE this would be an incorrect value. The model that Cargo wants to get to is that every single compilation is a "cross" compilation where there's a clear distinction between host/target artifacts and you can separately configure both. That means that the build script here should either run once in "cross" mode or twice, once in "cross" and once in "native"

(FWIW I'm not personally a fan of the "build type" and "cross/native" terminology to get exposed here. I think their purpose is fine but eventually I'd want to bikeshed on changing the interface. For now I'd like to get agreement on what needs to be implemented first though)

If --target doesn't equal the host platform, then the binary naturally won't run on the host, but if it's omitted or equals the $host then it's up to build configuration as to whether the binary runs on the host. By default the output of cargo build would be able to run on the host.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The model that Cargo wants to get to is that every single compilation is a "cross" compilation where there's a clear distinction between host/target artifacts and you can separately configure both.

While being able to separately configure host/targets is needed for cross compilation it's not really exclusive to cross compilation scenarios and can be also done for native compilation(for example using different optimization flags for build scripts vs the actual target binaries).

That means that the build script here should either run once in "cross" mode or twice, once in "cross" and once in "native"

Best I can tell cross vs native is a somewhat different issue from host vs target configuration. In rust/cargo they seem to be fairly closely tied but for build scripts that need to say build against non-rust dependencies like python there's additional implications where having this env variable is more important so that the configuration logic works correctly.

(FWIW I'm not personally a fan of the "build type" and "cross/native" terminology to get exposed here. I think their purpose is fine but eventually I'd want to bikeshed on changing the interface. For now I'd like to get agreement on what needs to be implemented first though)

Yeah, I figured that would be the case, this was just the first thing I came up with.

If --target doesn't equal the host platform, then the binary naturally won't run on the host, but if it's omitted or equals the $host then it's up to build configuration as to whether the binary runs on the host.

Some build systems like meson have wrapper capabilities to work around these limitations. There's a good bit of unusual configurations you can get here in general, for example in buildroot we have to use a qemu wrapper for gobject-introspection cross compilation(although that actually isn't implemented using the native meson exe wrapper but rather by overriding specific gobject-introspection executables with buildroot managed wrapper scripts, it's rather complex). But I think that's somewhat separate from this feature which is mostly just for indicating if the target was overridden to the build script, what the build script actually does with that info would be highly specific to the build script use case.

By default the output of cargo build would be able to run on the host.

Sure, and in the pyo3 case I think exposing if it was run with default vs an explicit target to the build script is helpful, however it does seem there may be some unclear edge cases here so not entirely sure if this is the best approach. Right now pyo3 just uses an target == host check(which is obviously broken when cross compiling for the same target arch as the host) to determine if it is cross compiling, although there are some override env variables that I added there for now which can override that check. I figured at least this way the build script can know that the target was overridden.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we're roughly talking about the same thing but it seems to be straying from the original comment I made about the value of native on this line I don't think is correct. To restate some properties I believe Cargo wants to achieve, but doesn't today:

  • cargo build should be the same as cargo build --target $host
  • Cargo should support separate host/target configuration.
    • As corrolary of the above point cargo build should use the host/target configuration as it does in the same manner of cargo build --target $host
  • All compilations should be "cross" compilations (see the first point) but the term "cross" isn't quite right because cargo build isn't building anything for a different architecture.

Cargo does not today treat cargo build the same as cargo build --target $host. Cargo has recently picked up support for separate host/target configuration but it's going to be blocked on fixing this.

Projects like pyo3 which want to link to different python interpreters or something like that will always need some means of auxiliary configuration. If Cargo informs pyo3 that the build is happening for the host ("native" in this PR more-or-less) then it probably wants to disregard the configuration and look at the host to see what it can do. If Cargo informs pyo3 that the build is happening for the target (roughly "cross" as-is here, but not completely because this should show up in the cargo build case too) then pyo3 would look for configuration. If it finds no configuration and host==target then it would assume to link to the current system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we're roughly talking about the same thing but it seems to be straying from the original comment I made about the value of native on this line I don't think is correct.

Well I'm seeing essentially two different things here, making cargo's internal logic handle native builds in the same way as cross is the first one, but that's not really what I'm trying to do here. The issue I'm trying to address here is regarding passing of invocation flag information on whether a specific target was specified when invoking cargo to the build scripts as they may integrate with externally managed dependencies(like python in the pyo3 case) that need to be informed of that.

Cargo does not today treat cargo build the same as cargo build --target $host. Cargo has recently picked up support for separate host/target configuration but it's going to be blocked on fixing this.

I don't think this is really blocked by support for splitting host/target configurations at all as this should not change based on that, essentially all I'm trying to do here is inform the build script if the user set a custom target flag for the cargo build, whether the target config should apply to the host build script is a separate concern and should be independent of whether or not a target flag was passed to cargo, if the build script doesn't actually use the cross vs native information for anything then there would not be any behavior change. We may in addition want to add a variable that informs the build script if a target config applies to the host so that the build script heuristics can be further improved, and that would actually be dependent on the separate host/target configuration changes.

Projects like pyo3 which want to link to different python interpreters or something like that will always need some means of auxiliary configuration.

Yeah, so the idea is that this is just additional information to improve the build script auxiliary configuration heuristics. Keep in mind for the pyo3 case the target configuration params are effectively chained from a setuptools plugin, we mostly just want to preserve that information when it gets passed to the build script.

Essentially a typical cross compile invoked from the meta build system like buildroot would look something like this I think:

  1. buildroot invokes host-python build against a python package like python-cryptography with cross config in env or cargo config file with config file location in env
  2. host-python invokes setuptools-rust
  3. setuptools-rust running in host-python interpreter invokes cargo while forwarding the target config to cargo
  4. cargo invokes the pyo3 build script while passing target configs to build script
  5. pyo3 build script picks up target configuration passed by cargo during execution
  6. pyo3 build script reads python sysconfigdata to determine how to link against target-python based on information from cargo(this particular change is helpful as it helps the build script know how to appropriately pick up target-python's sysconfigdata)
  7. the extension module is compiled by cargo configured against target-python's sysconfigdata params

What I'm trying to ensure here is that target flag invocation information is available through the entire cross compilation call chain here. Keep in mind that we don't invoke cargo directly when using pyo3 typically, cargo is effectively always being managed by setuptools-rust which in turn may or may not be managed by another meta build system(buildroot in this case).

In addition it's conceivable that the cargo build script may in addition need to do runtime python invocations for configuration and in that case it would need to properly differentiate the host-python from the target-python interpreter depending on build requirements, we even have some exceptionally complex/exotic scenarios in buildroot involving GIR code-generation which requires target python invocations in qemu wrappers that may require additional special cased handling. So a build script may need to be using both host and target configs depending on the requirements.

If Cargo informs pyo3 that the build is happening for the host ("native" in this PR more-or-less) then it probably wants to disregard the configuration and look at the host to see what it can do.

Well this seems to be highly dependent on the specific cross compilation requirements of the build script I think, which is why I didn't want it to directly change any of the other cross or host config variables that get passed to the build script or any behavior directly.

If Cargo informs pyo3 that the build is happening for the target (roughly "cross" as-is here, but not completely because this should show up in the cargo build case too) then pyo3 would look for configuration.

Well the build script looks like it really should be able to know how the cargo build was invoked, even if cargo's own internal logic should not change based on if a target flag was set. Essentially the cross param is a hint to the build script that it may in some cases depending on requirements would need to adjust its configuration logic due to a target flag having been passed to cargo. This hint could also be used by the build script merely for doing sanity checks that emit warnings.

This isn't really intended to be used for host vs target TargetConfig build target variant differentiation but rather to inform the build script how cargo was originally invoked.

If it finds no configuration and host==target then it would assume to link to the current system.

Yeah, so one of the motivations here is to provide a way for the build script to better deal with this particular edge case, currently pyo3 is using this sort of host==target comparison, however this is unreliable since host==target alone is not enough to determine if a target flag was passed to cargo or if one is cross compiling in cases where the host and target use the same triple, if a target flag was passed to cargo it's highly likely that the user is cross compiling and the build script should thus be informed of that so it can appropriately handle the situation or at least warn the user that additional information may be required for a successful build, it would generally be expected that the build script provide escape hatch env variables regardless for overriding behavior.

Hopefully that makes sense, I'm trying to cover additional background info for clarity as some of the more exotic scenarios may be non-obvious.

use std::env;
fn main() {
println!("cargo:foo=bar");
assert_eq!(env::var("CARGO_BUILD_TYPE").unwrap(), "cross");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is also a bug because this build script should be run twice, once with the host and once with the target.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, any idea how I should reproduce this issue in a test so that it fails?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have the build script produce different values based on CARGO_BUILD_TYPE and both of those could get fed into the final binary. The final binary could then print things out and assert that they're equal to their expected values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further review this current behavior appears to be what is intended, CARGO_BUILD_TYPE is for indicating if a target flag was passed to cargo, it's not intended to be used for determining if a build script is a host or target dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I think I managed to reproduce this issue while adding a CARGO_BUILD_DEPENDENCY_TYPE env variable and while attempting to filter CARGO_BUILD_TYPE to only be present for target dependency types.

@alexcrichton
Copy link
Member

@jameshilliard I'm not really sure what to do. It's been pointed out that the behavior you've implemented here is not the behavior the Cargo team would like to see implemented. The Cargo team does not want to have cargo build and cargo build --target $host behave differently, and merging this would be going a step further in that direction we do not want to take.

I'm not trying to dissuade you from this PR or say that your use case isn't valid, I've been trying to explain how this can't land right now until underlying bugs have been fixed. If you'd rather not fix the underlying bug, that's ok, but I'd like to be clear that this won't land in the meantime.

@jameshilliard
Copy link
Contributor Author

It's been pointed out that the behavior you've implemented here is not the behavior the Cargo team would like to see implemented.

I think what we would want for differentiating there is a different flag, I mean to me that looks to be effectively a separate issue than what I'm trying to fix here.

The Cargo team does not want to have cargo build and cargo build --target $host behave differently, and merging this would be going a step further in that direction we do not want to take.

I'm aware, however this is really not something cargo itself or build scripts dealing with pure rust projects should be picking up at all in general, it's for passing build type hints to build scripts that integrate with external dependencies that have different cross build requirements which are fundamentally not under cargo's direct control(like python for example). I understand that's the direction cargo wants to go but this is for better integrating with non-cargo dependencies effectively, at least I'm not seeing another good way to deal with this particular problem, this is merely to provide build scripts with that additional information so they can better handle exotic external integrations.

I'm not trying to dissuade you from this PR or say that your use case isn't valid, I've been trying to explain how this can't land right now until underlying bugs have been fixed. If you'd rather not fix the underlying bug, that's ok, but I'd like to be clear that this won't land in the meantime.

I mean I'm aware of the bug but it seems out of scope here to me. I agree we should fix that as well but I'm still not seeing how it is intertwined with what I'm trying to do here. The purpose of this particular PR is to inform all the build scripts being executed(regardless of if they are invoked to build for a host vs a target dependency) of whether or not a target flag was passed to cargo...nothing more...that's something that is expected to always be the same for all dependent build scripts as there's still only one top level cargo invocation here.

What you're looking for if I'm understanding correctly is really a different flag that informs the build script if it is being invoked to build for a host vs a target dependency, since this would actually change depending on if the build script is a host vs a target dependency. But that's not the purpose of what I'm trying to do here, that's a separate problem however I think and does require the fixes you are mentioning. I think in some scenarios build scripts would want to have both of these flags available. I'll try to investigate that fix more as I agree it is useful but it looks like it's a somewhat more complex and separate issue than the one I'm trying to deal with here.

@alexcrichton
Copy link
Member

I've tried to explain why the issues are unfortunately intertwined, but to reiterate this is further exposing the ability for build scripts and packages to realize that cargo build is not the same as cargo build --target. The behavior of build script execution differs in those two environments. The Cargo team doesn't want to leak more abilities to see how these two compilations are structurally different, because it increases the difficulty of actually fixing the issue because more projects may be relying on the old behavior.

For exotic configurations I'd recommend custom project-specific env vars for now. It's not great but it should at least get the job done and buys time until we can fix the underlying issues in Cargo.

@jameshilliard
Copy link
Contributor Author

I've tried to explain why the issues are unfortunately intertwined, but to reiterate this is further exposing the ability for build scripts and packages to realize that cargo build is not the same as cargo build --target.

I mean, it's intended to expose if a target flag is passed. But yeah I see the issue you're referring to.

The behavior of build script execution differs in those two environments.

Sure, but this would be intended to just inform the script if a target flag is passed to cargo, even if the build script execution is otherwise the same.

The Cargo team doesn't want to leak more abilities to see how these two compilations are structurally different, because it increases the difficulty of actually fixing the issue because more projects may be relying on the old behavior.

This is really intended just for cases where the build script is integrating with external dependencies where a build type hint is useful for determining how to tie in with those dependencies. One option I just thought of may be to only expose this variable when the script is being run for target dependencies and not host dependencies.

For exotic configurations I'd recommend custom project-specific env vars for now. It's not great but it should at least get the job done and buys time until we can fix the underlying issues in Cargo.

Well that's what I did for the moment in pyo3, however this isn't really something that can be fixed in cargo alone from what I can tell, the pyo3 build script needs to be able to differentiate between native and cross compilation scenarios from my understanding regardless.

@jameshilliard
Copy link
Contributor Author

The Cargo team does not want to have cargo build and cargo build --target $host behave differently, and merging this would be going a step further in that direction we do not want to take.

I think I got this mostly fixed in #9634, needs a good bit of cleanup though.

@bors
Copy link
Collaborator

bors commented Jul 20, 2021

☔ The latest upstream changes (presumably #9601) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton
Copy link
Member

Thanks for the PR, but I'm going to be stepping down from the Cargo team so I'm going to un-assign myself from this. The Cargo team will help review this when they get a chance.

Build scripts in some cases need to know if the build is a cross
build or a native build.

Set the CARGO_BUILD_TYPE env variable for build scripts so that
they can properly determine the build type being performed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants