Bundle libclang with bindgen #918

Open
fitzgen opened this issue Aug 17, 2017 · 14 comments

@fitzgen
Member

fitzgen commented Aug 17, 2017

Goal: make it super easy for people to use bindgen (and crates to depend on bindgen without inflicting downstream pain), no installing libclang manually, no configuring LIBCLANG_PATH, no making sure your libclang is the right version.

Previously, we have speculated about getting rustup to distribute libclang for us. I talked with @alexcrichton and he pointed out that we don't even need to do that to get what we want. We can bundle libclang.{so,dll,dylib} into our crate directly and then have build.rs set up LIBCLANG_PATH as needed.

To be clear, this functionality would be behind a new feature, on by default. One could always turn this feature off to get the current behavior, and for targets for which we don't have a bundled libclang available, we will also have the current behavior.
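A minimal sketch of the selection step described above. It assumes a hypothetical `bundled/<target_os>/` directory inside the crate, and the helper names are made up for illustration; none of this exists in bindgen. A target without a bundled copy yields `None`, which corresponds to falling back to the current behavior.

```rust
use std::path::{Path, PathBuf};

/// File name of the libclang shared library for a given target OS
/// (values as Cargo reports them to build scripts in `CARGO_CFG_TARGET_OS`).
/// Returns `None` for targets without a bundled copy.
fn bundled_libclang_name(target_os: &str) -> Option<&'static str> {
    match target_os {
        "linux" => Some("libclang.so"),
        "macos" => Some("libclang.dylib"),
        "windows" => Some("libclang.dll"),
        _ => None,
    }
}

/// Hypothetical layout: `<crate root>/bundled/<target_os>/<library file>`.
/// Returns the directory that `LIBCLANG_PATH` would point at, if any.
fn bundled_libclang_dir(crate_root: &Path, target_os: &str) -> Option<PathBuf> {
    // Bail out early when there is no bundled library for this target.
    bundled_libclang_name(target_os)?;
    Some(crate_root.join("bundled").join(target_os))
}
```

A build script could call `bundled_libclang_dir` with the value of `CARGO_MANIFEST_DIR` and leave the environment untouched whenever the result is `None`.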

There are still some open questions:

  • Should we put the bundled libclang into this repository? A different bindgen-libclangs repository?

  • Should we use git lfs?

  • Should the libclangs be literally bundled in the crate, and downloaded with the rest of the crate's source, or should we fetch only the target's libclang inside build.rs?

  • What targets should we bundle libclang for? Tier-1 platforms? Start with one or two and then add more as we go?

@fitzgen
Member Author

fitzgen commented Aug 17, 2017

cc @emilio

@WiSaGaN
Contributor

WiSaGaN commented Aug 17, 2017

Are we able to put this behind a feature under clang-sys? This seems to me to be a general problem of bundling the dev/runtime dependency, though.

@fitzgen
Member Author

fitzgen commented Aug 17, 2017

> Are we able to put this behind a feature under clang-sys?

Ah, interesting idea!

cc @KyleMayes

@jamesmunns
Member

jamesmunns commented Oct 2, 2017

Another note here is the size limit for Cargo. There is generally a 10 MB crate limit, which would be a problem, as libclang is a bit larger than that (22 MB for libclang 3.9 on Linux):

root@722964a6980b:/usr/lib/x86_64-linux-gnu# ls -hal | grep clang
lrwxrwxrwx 1 root root   17 Dec  7  2016 libclang-3.9.so -> libclang-3.9.so.1
-rw-r--r-- 1 root root  22M Dec  7  2016 libclang-3.9.so.1

It appears that this can be raised on a case by case basis, but this would have to be kept in mind.

Edit: It does look like libclang compresses reasonably well:

root@722964a6980b:/usr/lib/x86_64-linux-gnu# tar cjf test.tar.bz2 libclang-3.9.so.1
root@722964a6980b:/usr/lib/x86_64-linux-gnu# ls -hal | grep test
-rw-r--r-- 1 root root 7.3M Oct  2 11:04 test.tar.bz2

If multiple versions are supported, we should also probably not download all versions, as that could be hundreds of MB just for the Tier-1 targets.
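As a rough illustration of that estimate, the arithmetic can be sketched as follows. The per-library sizes come from the measurements above (22 MB uncompressed, 7.3 MB compressed, rounded up to 8 MB here); the target and version counts are hypothetical.

```rust
/// Total size in MB if every supported target and version is shipped together.
fn total_bundle_mb(per_library_mb: u64, targets: u64, versions: u64) -> u64 {
    per_library_mb * targets * versions
}
```

With a hypothetical 5 targets and 4 versions, `total_bundle_mb(22, 5, 4)` gives 440 MB uncompressed and `total_bundle_mb(8, 5, 4)` gives 160 MB compressed.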

I do like the idea of a more "batteries included" option, especially if it is always possible to easily override the default (with feature flags, environment variables, etc.).

Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).

@fitzgen
Member Author

fitzgen commented Oct 2, 2017

> It appears that this can be raised on a case by case basis, but this would have to be kept in mind.

I talked with @alexcrichton about this in person, and he said it was no problem.

> It does look like libclang compresses reasonably well:

Neat! That's much smaller than I had anticipated.

> If multiple versions are supported,

If we start doing this, then I think we would want to bundle just one version of libclang that we can focus on. Other versions should generally still work, but we can focus most of our attention on supporting the blessed libclang version's oddities rather than all of the libclang versions' oddities at once.

> I do like the idea of a more "batteries included" option, especially if it is always possible to easily override the default (with feature flags, environment variables, etc.).

👍 yep

> Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).

Re-downloads across different projects are a problem with anything other than tight integration with rustup.

What this hosting story could introduce that putting libclang into the crate wouldn't: if a single project transitively depends on multiple different bindgen versions, we could download libclang multiple times.

... actually that's true regardless of whether libclang is inside the crate or downloaded in build.rs.

@Luthaf

Luthaf commented Nov 13, 2017

Hey! I am interested in this feature, what do you think the way forward could be? Trying to bundle libclang in this crate? Bundling it in clang-sys?

If we want to bundle libclang, does this mean compiling it ourselves (and dealing with backward compatibility with old glibc/kernels on Linux, or macOS deployment targets), or using some other pre-built libclang, like the one from conda-forge (https://anaconda.org/conda-forge/clangdev)?


> Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).

I would rather not do that, as one use case I have consists of compiling Rust code on supercomputers, which do not allow network access. If libclang is bundled inside this crate, I can just vendor the crates and scp everything to the supercomputer.

@KyleMayes
Contributor

KyleMayes commented Nov 13, 2017

I experimented with using Git LFS to download binaries as a part of clang-sys a few weeks ago, but stopped when I recognized a few important issues (sorry for not bringing this up earlier):

  1. Git LFS is not a part of Git; users would need to install it themselves so that the clang-sys build script could invoke it to download the binaries from the repository. This just replaces installing Clang (and potentially pointing clang-sys to the installation with LIBCLANG_PATH and friends) with installing Git LFS, which, while it may be easier, doesn't accomplish the original goal of not requiring users to install anything themselves.
  2. Git LFS on GitHub is only free up to 1 GiB of downloads a month, a limit which would be very quickly reached, even with compressed binaries.

I don't think Git LFS is a good solution. The second issue could be "solved" if the Rust team was willing to host the repository containing the binaries and pay for the Git LFS bundles, but I imagine this is probably considerably more expensive than more traditional file hosting methods.

I agree with @jamesmunns that putting the binaries into the repository without something like Git LFS is not a good solution either, because then everyone, even users not using this new optional bundling feature, would have to download these binaries every time they download clang-sys (unless the bundling feature lived on a separate branch, I suppose).

Perhaps the build script could download the binaries from a predictable location as suggested above, but into a shared location, something like ~/.rustup except for Clang binaries. Then, multiple versions of libclang would not all need to download the same thing.
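A minimal sketch of how such a shared location could be addressed, so that several projects reuse one download. The `.libclang-cache` directory name and both helper names are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Directory for one libclang build inside a per-user cache.
/// `.libclang-cache` is a hypothetical name, analogous to `~/.rustup`.
fn shared_libclang_dir(home: &Path, clang_version: &str, target: &str) -> PathBuf {
    home.join(".libclang-cache").join(clang_version).join(target)
}

/// A download is only needed when the cached directory does not exist yet.
fn needs_download(cache_dir: &Path) -> bool {
    !cache_dir.is_dir()
}
```

Keying the path on both the clang version and the target triple keeps different builds from overwriting each other, while any two projects asking for the same pair resolve to the same directory.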

@Luthaf

Luthaf commented Nov 13, 2017

> because then everyone, even users not using this new optional bundling feature, would have to download these binaries every time they download clang-sys

What about having a separate libclang-precompiled crate that only contains the bundled libs and is only a dependency when the feature is enabled?

@est31
Member

est31 commented Nov 27, 2017

Uploading stuff to crates.io means it has to sit there forever. Binaries are target specific, very big, and as short lived as the rest of the code. They will need to be updated very often. If you really need to mirror it, I suggest putting it into a different cloud service than crates.io and, e.g., publishing URLs to download the library if build.rs detects that the library is missing.
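The detect-and-point-to-a-URL flow could be sketched roughly like this. Both function names are hypothetical, and the download URL is passed in by the caller; no real mirror is implied.

```rust
use std::path::PathBuf;

/// Looks for `file_name` in each candidate directory and returns the first hit.
fn find_library(search_dirs: &[PathBuf], file_name: &str) -> Option<PathBuf> {
    search_dirs
        .iter()
        .map(|dir| dir.join(file_name))
        .find(|candidate| candidate.is_file())
}

/// Error text a build script could print when nothing was found.
fn missing_library_message(file_name: &str, download_url: &str) -> String {
    format!(
        "{} was not found; a prebuilt copy can be downloaded from {}",
        file_name, download_url
    )
}
```

The build script itself never touches the network in this variant; it only tells the user where a prebuilt library can be fetched.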

I am personally very interested in mirroring all crates on crates.io. Right now my clone is about 17 GB, so not very big. Most of the "big offender" crates have huge binaries inside however. I wouldn't want bindgen to become one!

At the very least have a libclang-precompiled crate that I then can blacklist in my mirroring efforts, to not make me blacklist bindgen (this would break too many crates).

@Luthaf

Luthaf commented Nov 26, 2018

Thinking more about this, I don't think that having a libclang-precompiled crate would solve my issues with libclang.

As I see it, this precompiled crate would be an optional dependency of bindgen that, once activated, would cause bindgen to use some pre-built version of libclang. And I would like the end user to be able to activate this feature when compiling some code if they don't have libclang installed. But as far as I know, it is not possible to activate a feature in a crate if you don't have a direct dependency on it. For example, in the following crate dependency tree:

my-application
     |---> foo
            |---> foo-sys
                     |---> bindgen
                              |---> libclang-precompiled

only foo-sys could activate the libclang-precompiled feature of bindgen.

An option would be to have every single crate re-export the feature using something like

[features]
libclang-precompiled = ["<dependency that depends on bindgen>/libclang-precompiled"]

but that feels like a really non-ergonomic thing to do.


Is distributing libclang through rustup really not an option? It would solve my issues with the libclang-precompiled feature activation, and should alleviate the concerns of having big, short-lived binaries taking space on crates.io.

@danobi
Contributor

danobi commented Dec 7, 2020

What about vendoring libclang source code (via submodule) and building libclang in build.rs?

@vadorovsky
Contributor

vadorovsky commented Aug 18, 2023

> What about vendoring libclang source code (via submodule) and building libclang in build.rs?

I like this idea. I mean, currently it would mean including LLVM as a submodule and making sure that we build only libclang with CMake, but I think it would still make sense.
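A rough sketch of the two CMake invocations this would involve, expressed as argument lists a build script could hand to `std::process::Command::new("cmake")`. The directory names are placeholders, the helper names are hypothetical, and the sketch assumes the llvm-project layout (top-level `llvm/` directory) and the `libclang` target name used by LLVM's CMake build.

```rust
/// Arguments for the CMake configure step: LLVM sources in `<llvm_dir>/llvm`,
/// with only the `clang` project enabled.
fn configure_args(llvm_dir: &str, build_dir: &str) -> Vec<String> {
    vec![
        "-S".to_string(),
        format!("{}/llvm", llvm_dir),
        "-B".to_string(),
        build_dir.to_string(),
        "-DLLVM_ENABLE_PROJECTS=clang".to_string(),
        "-DCMAKE_BUILD_TYPE=Release".to_string(),
    ]
}

/// Arguments for the build step, restricted to the `libclang` target.
fn build_args(build_dir: &str) -> Vec<String> {
    vec![
        "--build".to_string(),
        build_dir.to_string(),
        "--target".to_string(),
        "libclang".to_string(),
    ]
}
```

Restricting the build to one target avoids compiling the rest of the LLVM tools, although libclang's own dependencies inside LLVM still have to be built.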

I know I'm posting in an issue which is 3 years old, but is there any chance we can take this issue into consideration? I'm very happy to work on this if the idea gets acceptance, or on any other idea which would remove the requirement for a specific version of clang on the host.

My problem is: bindgen is a dependency of librocksdb-sys, which is a dependency of Solana. That results in this error on every distro which ships a recent version of LLVM:

Caused by:
  process didn't exit successfully: `/home/vadorovsky/repos/solana/target/debug/build/librocksdb-sys-abdc3fa2c4fdabfc/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'Unable to find libclang: "the `libclang` shared library at /usr/lib/llvm/16/lib/libclang.so.16.0.6+libcxx could not be opened: Dynamic loading not supported"', /home/vadorovsky/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bindgen-0.65.1/lib.rs:603:31

I see that clang-sys is being used with clang_6_0 feature:

clang-sys = { version = "1", features = ["clang_6_0"] }

My current "fix" is to build anything that depends on bindgen inside a Debian bullseye container which ships clang 11 (apparently still working with the clang_6_0 feature).

@pvdrz
Contributor

pvdrz commented Aug 26, 2023

@vadorovsky I doubt your issue has to do with the fact that the clang-sys crate uses the clang_6_0 feature (which means this crate requires clang 6.0 or later). I work on bindgen on a distro with clang 15.0.7 (which I think qualifies as recent) and it is able to link to clang without issue.
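To illustrate the "6.0 or later" reading of the feature, here is a small sketch of a minimum-version check. It only models the semantics described above and is not how clang-sys implements its features; the type alias and function name are hypothetical.

```rust
/// A `(major, minor)` clang version. Rust tuples compare lexicographically,
/// so the major number is compared first and the minor number breaks ties.
type ClangVersion = (u32, u32);

/// True when the installed clang satisfies the minimum implied by a feature.
fn meets_minimum(installed: ClangVersion, minimum: ClangVersion) -> bool {
    installed >= minimum
}
```

Under this reading, clang 11, 15, and 16 all satisfy a minimum of `(6, 0)`, which is why the feature alone should not rule out recent versions.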

I've seen this particular "Dynamic loading not supported" issue before in the issue tracker, and the last time it happened was because the clang-sys version used by bindgen didn't support the latest clang (15 at the time).

However, clang-sys 1.4.0 (the version used by bindgen) supports clang 16, which is the most recent clang version according to https://releases.llvm.org/, and I can run bindgen with clang 16 without issues on my machine as well.

@HadrienG2

See also #2767.
