CI segfaults on FreeBSD #1489

Closed
gnzlbg opened this issue Aug 24, 2019 · 27 comments · Fixed by #1507

@gnzlbg
Contributor

gnzlbg commented Aug 24, 2019

CI has started to segfault on FreeBSD since the latest nightly.

This build passed with nightly-x86_64-unknown-freebsd 1.39.0 (760226733 2019-08-22)
This build failed with nightly-x86_64-unknown-freebsd 1.39.0 (9eae1fc0e 2019-08-23)

The segfault is triggered by the build.rs of rand_pcg v0.1.2 and rand_chacha v0.1.1 (cc @dhardy), which AFAICT only call autocfg (cc @cuviper). autocfg was updated recently, so I initially suspected that, but the builds were passing with that update, so AFAICT neither autocfg nor rand is at fault here (cc @mati865 @asomers). Looking at the recently merged PRs in rust-lang/rust, I don't see any suspicious ones.

@mati865
Contributor

mati865 commented Aug 24, 2019

I'm mostly unavailable until Monday/Tuesday.
Without a backtrace it'll be hard to find the cause. Maybe you could run GDB on that CI or even locally?

@gnzlbg
Contributor Author

gnzlbg commented Aug 24, 2019

I have a minimal small-ish reproducer now: #1490

# Cargo.toml
[package]
name = "test_failure"
version = "0.1.0"
authors = ["The Rust Project Developers"]
license = "MIT OR Apache-2.0"

[build-dependencies]
autocfg = "0.1"

and build.rs:

extern crate autocfg;

fn main() {
    println!("cargo:rerun-if-changed=build.rs");
    let ac = autocfg::new();
    ac.emit_rustc_version(1, 26);
}

This fails on FreeBSD12, but not on FreeBSD11. A backtrace of the segfault would be nice.

@asomers
Contributor

asomers commented Aug 24, 2019

I'll take a look.

@asomers
Contributor

asomers commented Aug 24, 2019

Something is calling fstatat, and it's calling the FreeBSD 12 version but probably expecting the FreeBSD 11 structure definitions. It screws up the stack so the core file's stack trace is useless, but I think this is the culprit:

#0  0x00000008012df594 in stat () from /lib/libc.so.7
#1  0x0000000001065f60 in std::sys::unix::fs::stat ()
    at /checkout/src/libstd/sys/unix/fs.rs:736
#2  0x0000000001054d47 in std::fs::metadata (path=0x7fffffffe1a8)
    at /checkout/src/libstd/fs.rs:1398
#3  0x000000000104609d in autocfg::AutoCfg::with_dir (dir=...)
    at /usr/home/somers/.cargo/registry/src/github.com-1ecc6299db9ec823/autocfg-0.1.6/src/lib.rs:143
#4  0x0000000001045d8a in autocfg::AutoCfg::new ()
    at /usr/home/somers/.cargo/registry/src/github.com-1ecc6299db9ec823/autocfg-0.1.6/src/lib.rs:123
#5  0x0000000001045c92 in autocfg::new ()
    at /usr/home/somers/.cargo/registry/src/github.com-1ecc6299db9ec823/autocfg-0.1.6/src/lib.rs:109
#6  0x0000000001045696 in build_script_build::main () at build.rs:5

Why isn't libc using the right link_name?

@gnzlbg
Contributor Author

gnzlbg commented Aug 24, 2019

@asomers

> Why isn't libc using the right link_name?

Can you submit a PR fixing that?

@asomers
Contributor

asomers commented Aug 24, 2019

I'm still trying to figure out what's going on. I verified that the 8-22 nightly compiler works and the 8-23 nightly doesn't, but I don't see any obviously relevant changes in that date range. rust-lang/rust's liblibc does use the correct link_name. The problem is something more subtle. I'll keep investigating.

@cuviper
Member

cuviper commented Aug 24, 2019

The only thing in rust-lang/rust@7602267...9eae1fc that looks relevant is the libc upgrade, ~~0.2.61...0.2.62~~ 0.2.60...0.2.61, which includes the FreeBSD changes of #1467.

The new match which_freebsd() here is emitting cargo:rustc-cfg=freebsd11 even for the unknown case, but Rust CI is building for FreeBSD 10.3.

https://github.com/rust-lang/rust/blob/eeba189cfb2cfc5c5898513352d4ca8f1df06e05/src/ci/docker/scripts/freebsd-toolchain.sh#L8
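
For readers unfamiliar with this kind of build script, here is a minimal sketch of version detection with a fallback. It is not libc's actual build.rs: the helper names `parse_freebsd_major` and `detect_freebsd_major` are hypothetical, the `freebsd12` cfg name is an assumption, and the sketch assumes the host provides the `freebsd-version` command. It only shows how an unknown or unsupported host (such as FreeBSD 10.3) can still end up with `cargo:rustc-cfg=freebsd11`.

```rust
use std::process::Command;

/// Parse the major version from output such as "12.0-RELEASE-p10".
/// Hypothetical helper, not part of libc.
fn parse_freebsd_major(output: &str) -> Option<u32> {
    let digits: String = output
        .trim()
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Ask the host for its FreeBSD version. Returns None when the command is
/// missing or fails, e.g. when the build does not run on FreeBSD at all.
fn detect_freebsd_major() -> Option<u32> {
    let out = Command::new("freebsd-version").output().ok()?;
    if !out.status.success() {
        return None;
    }
    parse_freebsd_major(&String::from_utf8(out.stdout).ok()?)
}

fn main() {
    // Anything that is not recognized, including FreeBSD 10 and
    // "could not detect", falls through to the freebsd11 default.
    let cfg = match detect_freebsd_major() {
        Some(12) => "freebsd12", // assumed cfg name, for illustration only
        _ => "freebsd11",
    };
    println!("cargo:rustc-cfg={}", cfg);
}
```

With a fallback like this, the emitted cfg says nothing about whether the build environment really is FreeBSD 11, which is the concern raised above for a FreeBSD 10.3 toolchain.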

@cuviper
Member

cuviper commented Aug 24, 2019

Since fstatat was specifically mentioned earlier:

libc/src/unix/mod.rs

Lines 632 to 638 in 9af04ce

#[cfg_attr(target_os = "macos", link_name = "fstatat$INODE64")]
#[cfg_attr(
    all(target_os = "freebsd", freebsd11),
    link_name = "fstatat@FBSD_1.1"
)]
pub fn fstatat(dirfd: ::c_int, pathname: *const ::c_char,
               buf: *mut stat, flags: ::c_int) -> ::c_int;

So forcing freebsd11 also forces that link name, but what happens when this is built against FreeBSD 10 on Rust CI? Does it actually end up linking without the specific version?

I think the reason this hit autocfg in particular is just that it's used in a build script, and those are built with -Cprefer-dynamic, so it will link to libstd.so that was itself linked in Rust CI's environment. (Versus regular builds using libstd.rlib that will link locally.)

@gnzlbg
Contributor Author

gnzlbg commented Aug 24, 2019

> Does it actually end up linking without the specific version?

Isn't the C library dynamically linked here? As in, shouldn't it pick the FreeBSD12 C library and try to find the fstatat@FBSD_1.1 symbol there? I'm not sure what happens if the C library used during the Rust CI build doesn't have this symbol.

> So forcing freebsd11 also forces that link name, but what happens when this is built against FreeBSD 10 on Rust CI?

So this is the bug. We should add a freebsd10 cfg macro until Rust CI is upgraded to FreeBSD 11, and only enable that when building as part of libstd. Cirrus-CI has a freebsd-10-4-release-amd64 image, we should add that to CI to reproduce this and test that no regressions are introduced.

@asomers
Contributor

asomers commented Aug 24, 2019

> So this is the bug. We should add a freebsd10 cfg macro until Rust CI is upgraded to FreeBSD 11, and only enable that when building as part of libstd. Cirrus-CI has a freebsd-10-4-release-amd64 image, we should add that to CI to reproduce this and test that no regressions are introduced.

No, I don't think that's it. Very few symbols changed between FreeBSD 10 and 11, and fstatat was not among them. Could LIBC_CI be getting set somehow?

@gnzlbg
Contributor Author

gnzlbg commented Aug 25, 2019

I don't know how - searching rust-lang/rust for LIBC_CI doesn't reveal anything.

I set up a freebsd10 build job here (#1491) and tried to use the different link names, but that doesn't appear to make a difference.

@asomers
Contributor

asomers commented Aug 25, 2019

I can reproduce the problem without using autocfg or build.rs. But I still don't know why Rust is using the wrong symbol version. Could it be building libc without using the build.rs at all? That would do it.

use std::path::PathBuf;

#[test]
fn t() {
    let pb = PathBuf::from(".");
    println!("{:?}", std::fs::metadata(&pb).unwrap());
}

@gnzlbg
Contributor Author

gnzlbg commented Aug 25, 2019

> I can reproduce the problem without using autocfg or build.rs. But I still don't know why Rust is using the wrong symbol version.

First, what's the wrong symbol? I suppose the bug is that instead of picking the symbol that we use for FreeBSD11, the generic fstatat symbol gets used, and on FreeBSD12 that resolves to a different symbol than the FreeBSD11 one. Is that it?

If so, then #1491 should fix that. It detects FreeBSD10, and uses the same symbols as FreeBSD11. It's not a great fix, but might do.

@cuviper
Member

cuviper commented Aug 25, 2019

You can use cargo build -v to see all of the rustc commands, including whatever cfg options you expect.

@gnzlbg
Contributor Author

gnzlbg commented Aug 25, 2019

A better fix would be to either migrate rust-lang/rust to FreeBSD11, or to downgrade libc to target FreeBSD10 by default.

@asomers
Contributor

asomers commented Aug 25, 2019

I don't think #1491 will fix the problem. IIUC, build.rs should be emitting cargo:rustc-cfg=freebsd11 whenever LIBC_CI is undefined. That should've pulled in the correct link names for FreeBSD 11. And yet rust isn't using them.

@gnzlbg
Contributor Author

gnzlbg commented Aug 25, 2019

@pietroalbini @cuviper is there a way to find the build logs for that particular nightly version?

There we should be able to see if --cfg freebsd11 is passed when building libc as part of libstd.

@semarie
Contributor

semarie commented Aug 26, 2019

If I'm not mistaken, libstd on nightly is currently built using libc 0.2.51.

And this version of libc doesn't provide --cfg freebsd11 in build.rs.

@cuviper
Member

cuviper commented Aug 26, 2019

Cargo.toml versions are treated as minimum semver requirements, but it may still use something newer with semver compatibility. Cargo.lock tells you the actual versions used for the whole workspace.

https://github.com/rust-lang/rust/blob/4c58535d09d1261d21569df0036b974811544256/Cargo.lock#L1585
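
To make the "minimum semver requirement" point concrete, here is a rough sketch of how a bare version such as `0.2.51` in Cargo.toml behaves as a caret requirement. It is a simplified, standard-library-only illustration and not Cargo's resolver: the `parse` and `caret_matches` helpers are hypothetical, and pre-release and build metadata are ignored.

```rust
/// Parse "MAJOR.MINOR.PATCH" into a tuple. Hypothetical helper that
/// ignores pre-release and build metadata.
fn parse(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Caret rule: the candidate must be at least the requirement and must not
/// change the left-most non-zero component.
fn caret_matches(req: (u64, u64, u64), v: (u64, u64, u64)) -> bool {
    if v < req {
        return false; // tuples compare lexicographically
    }
    if req.0 > 0 {
        v.0 == req.0
    } else if req.1 > 0 {
        v.0 == 0 && v.1 == req.1
    } else {
        v == req
    }
}

fn main() {
    let req = parse("0.2.51").expect("valid requirement");
    for candidate in &["0.2.50", "0.2.51", "0.2.61", "0.3.0"] {
        let v = parse(candidate).expect("valid version");
        println!("{} satisfies 0.2.51: {}", candidate, caret_matches(req, v));
    }
}
```

Under this rule a manifest entry of `0.2.51` is compatible with a locked `0.2.61`, which is why Cargo.lock, not Cargo.toml, is the place to check the version actually built.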

@cuviper
Member

cuviper commented Aug 26, 2019

Oh, but I did mistake the versions earlier... that nightly change here was really 0.2.60...0.2.61 -- sorry for that confusion!

@mati865
Contributor

mati865 commented Aug 26, 2019

@gnzlbg there are no specific jobs for nightly builds. AFAIK the most recent merge that completes before 00:00 UTC is promoted as nightly.
Libc isn't built in verbose mode in Rust PRs so you cannot tell which features are enabled from the logs.

@gnzlbg
Contributor Author

gnzlbg commented Aug 26, 2019

@cuviper if libstd is using 0.2.61 then that might explain this. IIRC, the build.rs being used there had a bug for freebsd and wasn't passing --cfg freebsd11 by default. That was fixed in 0.2.62, so maybe bumping libc in rust-lang/rust to 0.2.62 fixes this issue.

@gnzlbg
Contributor Author

gnzlbg commented Aug 26, 2019

E.g. see here: 0.2.60...0.2.61#diff-a7b0a2dee0126cddf994326e705a91eaR18

If the freebsd version is 10, like for Rust CI, then no --cfg freebsdxy is passed to cargo... as opposed to 0.2.62, where this was fixed, and a default of --cfg freebsd11 is used.
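
The difference between the two behaviors described here can be sketched as two small selection functions. This is an illustration of the description above, not the code from either libc release; the function names are hypothetical and the `freebsd12` cfg name is an assumption.

```rust
/// Behavior described for 0.2.61: a cfg is chosen only for recognized
/// versions, so FreeBSD 10 (or an undetected host) gets none.
fn cfg_without_default(major: Option<u32>) -> Option<&'static str> {
    match major {
        Some(11) => Some("freebsd11"),
        Some(12) => Some("freebsd12"), // assumed cfg name
        _ => None,
    }
}

/// Behavior described for 0.2.62: fall back to freebsd11 when nothing
/// more specific applies.
fn cfg_with_default(major: Option<u32>) -> &'static str {
    cfg_without_default(major).unwrap_or("freebsd11")
}

fn main() {
    for &major in &[Some(10), Some(11), Some(12), None] {
        println!(
            "{:?}: without default = {:?}, with default = {:?}",
            major,
            cfg_without_default(major),
            cfg_with_default(major)
        );
    }
}
```

In the first variant a FreeBSD 10 build host produces no cfg, so any `cfg_attr(freebsd11, link_name = ...)` attribute stays inactive and the unversioned symbol name is used.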

@mati865
Contributor

mati865 commented Aug 27, 2019

> maybe bumping libc in rust-lang/rust to 0.2.62 fixes this issue.

Updating getrandom in rust-lang/rust#63806 pulled libc = 0.2.62 so the answer should come soon.

@asomers
Contributor

asomers commented Sep 1, 2019

I tried to build rust from scratch. 5000 seconds later it crashed with the same bug. Updating libc to 0.2.62 didn't help. Now I'm trying to build using a local checkout of libc, but it's not working. When I try to patch libc to use a local path, I get errors like the ones below. Can anybody more experienced with building rust help me?

Updating only changed submodules
Submodules updated in 0.04 seconds
   Compiling libc v0.2.64 (/usr/home/asomers/src/rust/libc)
   Compiling bootstrap v0.0.0 (/usr/home/asomers/src/rust/rust/src/bootstrap)
    Finished dev [unoptimized] target(s) in 8.29s
Building stage0 std artifacts (x86_64-unknown-freebsd -> x86_64-unknown-freebsd)
   Compiling libc v0.2.64 (/usr/home/asomers/src/rust/libc)
error: the feature `cfg_target_vendor` has been stable since 1.33.0 and no longer requires an attribute to enable
  --> /usr/home/asomers/src/rust/libc/src/lib.rs:28:13
   |
28 |     feature(cfg_target_vendor, link_cfg, no_core)
   |             ^^^^^^^^^^^^^^^^^
   |
   = note: `-D stable-features` implied by `-D warnings`

error: unused attribute
  --> /usr/home/asomers/src/rust/libc/src/lib.rs:34:1
   |
34 | #![no_std]
   | ^^^^^^^^^^
   |
   = note: `-D unused-attributes` implied by `-D warnings`

error: crate-level attribute should be in the root module
  --> /usr/home/asomers/src/rust/libc/src/lib.rs:34:1
   |
34 | #![no_std]
   | ^^^^^^^^^^

error: aborting due to 3 previous errors

error: Could not compile `libc`.

Simply commenting out the offending lines produces other, different build errors. It's as if libc were getting built with the wrong options. But removing the [patch.crates-io] and instead updating the dependency in each subcrate produces the same results.

@mati865
Contributor

mati865 commented Sep 3, 2019

@asomers
Contributor

asomers commented Sep 3, 2019

Actually, updating to 0.2.62 does fix the problem. Earlier when I said that it didn't, it's because I didn't do a full build; I took a shortcut.

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Sep 6, 2019
bors added a commit that referenced this issue Sep 11, 2019
Test FreeBSD 12 on latest nightly

~~Let's see if [libc update](rust-lang/rust#63806) for Rust fixed it.~~

Fixes #1489
@bors bors closed this as completed in b156fb3 Sep 13, 2019