Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration of code generators #204

Open
jschwe opened this issue Aug 28, 2022 · 19 comments · May be fixed by #247
Open

Integration of code generators #204

jschwe opened this issue Aug 28, 2022 · 19 comments · May be fixed by #247
Labels
enhancement New feature or request

Comments

@jschwe
Copy link
Collaborator

jschwe commented Aug 28, 2022

I implemented that approach of explicitly telling the code generator where to output the headers by setting an environment variable and it works well. I have written a proposal for upstreaming this approach into cxx-builder. That would allow Corrosion to automatically set that environment variable so using cxx with CMake wouldn't require anything different than using Corrosion with any other Rust library. @jschwe I would appreciate your input on that proposal.

Originally posted by @Be-ing in #150 (comment)

@jschwe
Copy link
Collaborator Author

jschwe commented Aug 28, 2022

@Be-ing I opened a new issue to discuss your request.

As far as I can see you wish for the following:

  1. Corrosion should set an environment variable with a path to a location where code generators may generate code into, probably something like ${CMAKE_BINARY_DIR}/generated.
  2. Corrosion should add the path to include directories.

I think in general, this is not only interesting for your specific project, but also for e.g. a potential cbindgen integration, or a bindgen extension that generates additional .h/.c files to make C static inline functions available to rust.

However, I think providing support for 2. is not so easy, because:

  • Corrosion would not know which CMake targets need to get the additional include directories. I would not like to add them globally or on a directory level.
  • If a CMake C/C++ target depends on generated headers, it should have a CMake dependency on the code generator, which in your case would be some Rust crate. How would corrosion know which crates generate code?

@Be-ing
Copy link

Be-ing commented Aug 28, 2022

If you don't want to unconditionally add the include paths, an option could be added to corrosion_import_crates to do so. For example:

corrosion_import_crates(MANIFEST_PATH rust/src/Cargo.toml CRATES rust-crate-that-uses-cxx CXX)

In case someone wants to import multiple crates, some of which use the code generator and some do not, they could make multiple calls to corrosion_import_crates with and without the option for using the code generator.

I don't think this is feasible in a general case; how to do this depends on the implementation details of the specific code generator. I am hoping we can reach a consensus how to handle this for cxx in dtolnay/cxx#462. If that is not accepted, Corrosion could still add a CXX_QT option for corrosion_import_crates, but I would prefer to have one CXX option that works for both cxx-build and cxx-qt-build.

@Be-ing
Copy link

Be-ing commented Aug 28, 2022

To clarify, for cxx and cxx-qt, I am only concerned with adding include paths for the generated headers. The code generator that runs in crates' build.rs not only generates C++, it also compiles it with the cc crate. This is important because otherwise the code generation would have to run at CMake configure time if CMake was building the generated C++ code.

@jschwe
Copy link
Collaborator Author

jschwe commented Aug 29, 2022

I am only concerned with adding include paths for the generated headers

I still believe this is hard to do without the user specifying which targets should get the additional include paths. include_directories() only affects the current directory and downwards, so depending on your directory structure, so this would only work if the directories are structured accordingly. I prefer the user/downstream specifying which targets should have the additional include directories.

The code generator that runs in crates' build.rs not only generates C++, it also compiles it with the cc crate. This is important because otherwise the code generation would have to run at CMake configure time if CMake was building the generated C++ code.

At work we also have a custom code generator, which is run at build time. Is there something special about your project that requires running it at configure time? Have you looked at add_custom_command()?

I don't think this is feasible in a general case; how to do this depends on the implementation details of the specific code generator.

Cmake >= 3.18 supports indirect function calls, so I think we may be able to provide some interface where downstream can customize the generator behavior via their own function. I have not thought this through at all, but maybe corrosion could provide something along the lines of:

corrosion_generate_headers(TARGETS <targets which should receive updated include directories>  
   INCLUDE_VISIBILITY <PUBLIC|PRIVATE|INTERFACE>
   OUTPUT_DIRECTORY <directory where the generator should generates the files too>
   GENERATOR_FN <function to call downstream defined function, which defines one or multiple custom 
                              commands to create some output header file. A list of output files is returned back to 
                             `corrosion_generate_headers` via a parameter.>
   GENERATOR_CUSTOM_ARGUMENTS <additional arguments which should be passed to GENERATOR_FN.>

   DEPENDS <>

@Be-ing
Copy link

Be-ing commented Aug 29, 2022

I still believe this is hard to do without the user specifying which targets should get the additional include paths.

I don't understand what the issue is with the user specifying which targets use the code generator.

@Be-ing
Copy link

Be-ing commented Aug 29, 2022

Cmake >= 3.18 supports indirect function calls, so I think we may be able to provide some interface where downstream can customize the generator behavior via their own function.

I think that would be overcomplicated. Users may as well add target_include_directories after corrosion_import_crates as they can already do.

@Be-ing
Copy link

Be-ing commented Aug 29, 2022

At work we also have a custom code generator, which is run at build time. Is there something special about your project that requires running it at configure time? Have you looked at add_custom_command()?

cxx-qt-build outputs multiple C++ files and their names are dependent on the contents of the Rust modules they are generated from, so CMake can't know at configure time what source files to add to the target without running the code generator. But that problem is already solved by compiling the generated C++ with the cc crate in build.rs.

@jschwe
Copy link
Collaborator Author

jschwe commented Aug 30, 2022

I don't understand what the issue is with the user specifying which targets use the code generator.

Users may as well add target_include_directories after corrosion_import_crates as they can already do.

Ah, then this was a misunderstanding on my side. I was under the impression that you didn't want the user to have to specify which targets get the additional include paths. So then for your use-case it would be sufficient to set an environment variable pointing to a location where the code-generator should generate the code to?

@Be-ing
Copy link

Be-ing commented Aug 30, 2022

My goal is for Corrosion to handle both setting the environment variable and adding the include path so the user doesn't even need to know about these. Adding a CXX option to corrosion_import_crates to selectively enable this feature would be okay.

@Be-ing
Copy link

Be-ing commented Aug 30, 2022

I think in general, this is not only interesting for your specific project, but also for e.g. a potential cbindgen integration, or a bindgen extension that generates rust-lang/rust-bindgen#1090.

Yes, if we could get cxx and cbindgen upstreams to agree to using the same environment variable, Corrosion would only need one option to handle them all.

@jschwe
Copy link
Collaborator Author

jschwe commented Aug 31, 2022

My goal is for Corrosion to handle both setting the environment variable and adding the include path so the user doesn't even need to know about these

Sorry, but I still don't see a good way for corrosion to correctly add include paths without the user providing additional information.

I don't understand what the issue is with the user specifying which targets use the code generator.

To clarify, the include directories should be set via target_include_directories() which requires knowledge of the target which should receive the additional include directories.
Using include_directories would only work for some project structures, where target is in the same or a subdirectory, but not if the target requiring the include directory is in a neighbor or parent directory.

I'm open to proving an environment variable which could specify a directory such as ${CMAKE_BINARY_DIR}/corrosion-generated for downstream users to place their generated output into.
I don't think cxx or cbindgen should have to care about such a variable. The user of the code generator should just specify the output directory when invoking the generator.

@Be-ing
Copy link

Be-ing commented Aug 31, 2022

I never proposed using include_directories, nor do I ever recommend using those legacy non-target-specific CMake functions.

I don't think cxx or cbindgen should have to care about such a variable.

I don't understand how cxx or cbindgen could know where to put their output without being aware of this environment variable.

@jschwe
Copy link
Collaborator Author

jschwe commented Aug 31, 2022

I never proposed using include_directories, nor do I ever recommend using those legacy non-target-specific CMake functions.

Then I just really don't understand how you expect corrosion to know which C/C++ CMake targets should get the additional include directories. Could you give an example of how the syntax of corrosion_import_crate would look like in your vision, and how corrosion would then know which C/C++ targets need the additional include directories?

I don't understand how cxx or cbindgen could know where to put their output without being aware of this environment variable.

Presumably the code generator would anyway need to be called by the user, because options need to be specified. E.g. cbindgen takes a --output <out_header.h> option, so the user could just specify that.

@Be-ing
Copy link

Be-ing commented Aug 31, 2022

corrosion_import_crates(MANIFEST_PATH rust/src/Cargo.toml CRATES rust-crate-that-uses-cxx CXX)

could set the environment variable and the include path for the rust-crate-that-uses-cxx target.

Presumably the code generator would anyway need to be called by the user, because options need to be specified. E.g. cbindgen takes a --output <out_header.h> option, so the user could just specify that.

The code generators are Rust libraries called by build.rs, not CLI tools. I tried the CLI tool approach, it was not a good solution.

@jschwe
Copy link
Collaborator Author

jschwe commented Sep 5, 2022

Ah thanks, I think I understand a bit better now what you want. Basically you want to set INTERFACE_INCLUDE_DIRECTORIES to the location of the generated headers, so that CMake can add the includes to any C/C++ targets linking in a Rust library.
Is that correct so far?

The code generators are Rust libraries called by build.rs, not CLI tools.

That shouldn't really matter though. From your linked issue, I it seems your problem is mainly that cxx-bridge does not support specifying the output location.

Currently I'm still not convinced that this is something that needs to be done on the corrosion side. Couldn't you just add a cxx-qt wrapper aroud corrosion_import_crate() that internally does

corrosion_import_crate(...)
set_env_vars(...)
add_include_directories(... INTERFACE ...)

@Be-ing
Copy link

Be-ing commented Sep 8, 2022

Basically you want to set INTERFACE_INCLUDE_DIRECTORIES to the location of the generated headers, so that CMake can add the includes to any C/C++ targets linking in a Rust library.
Is that correct so far?

Yes.

it seems your problem is mainly that cxx-bridge does not support specifying the output location.

Strictly speaking for cxx-qt's purposes, it doesn't need to because cxx-qt-build does not use cxx-build. However, it would simplify the implementation and usage of Corrosion if cxx, cxx-qt, and hopefully cbindgen as well converged on using the same environment variable to specify the directory to output generated headers.

Couldn't you just add a cxx-qt wrapper aroud corrosion_import_crate()

That is possible. cxx-qt used to have its own CMake module before using Corrosion (this was required before cxx-qt-build compiled the C++ code it generated). The problem is that requires distributing a CMake module for cxx-qt, which makes it more complicated for downstream users. It would be simpler for users of cxx-qt to only need Corrosion and have the rest handled by Cargo.

@jschwe jschwe added the enhancement New feature or request label Sep 20, 2022
@jschwe
Copy link
Collaborator Author

jschwe commented Oct 27, 2022

#244 will be a first step for cxx integration.

Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
Be-ing added a commit to Be-ing/corrosion that referenced this issue Oct 28, 2022
This sets the GENERATED_HEADER_DIR environment variable when calling
cargo and adds that path to the target's include paths. Cargo build
scripts can read this to determine where to output generated C/C++
headers.

Depends on dtolnay/cxx#1120 for cxx-build

Fixes corrosion-rs#204
@Be-ing
Copy link

Be-ing commented Jun 2, 2023

I had a discussion about this at Cargo office hours a few weeks ago. The conclusion we came to was this could be incorporated into the upcoming binary dependency feature of Cargo, where crates can specify bin crates as dependencies. If Cargo exposed a command line interface for running those binary dependencies, for example something like cargo run --package my_cxx_app --bindep cxxbridge-cmd -- cli-arguments-to-bindep, that would solve the thorny "where does the binary come from" question, the version of the code generator could be precisely controlled with Cargo.lock, and there wouldn't be any redundant compilation of Rust dependencies.

@jschwe
Copy link
Collaborator Author

jschwe commented Jun 3, 2023

That does sound useful, but it seems there are still a number of issues remaining before bindeps can be stabilized: https://github.com/rust-lang/cargo/labels/Z-bindeps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants