Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.2.61 regression #1466

Closed
carllerche opened this issue Aug 13, 2019 · 20 comments · Fixed by #1467
Closed

0.2.61 regression #1466

carllerche opened this issue Aug 13, 2019 · 20 comments · Fixed by #1467

Comments

@carllerche
Copy link
Member

The 0.2.61 release causes the Tokio CI to fail on FreeBSD:

Failure: https://cirrus-ci.com/task/5089534616797184

I tested the CI run with 0.2.60 and it passes: tokio-rs/tokio#1434

I checked the diff, but did not see anything that would obviously cause this. I will continue to investigate, but any advise would be appreciated.

@semarie
Copy link
Contributor

semarie commented Aug 14, 2019

I suspect #1440 to have changed some "default" values when LIB_CI isn't used (the standard case, because it should be used for CI only).

Some code changed from "defined when freebsd12 is absent" to "defined when freebsd11 is present" (which assume LIB_CI to be set). So the default behaviour isn't tested anymore.

@asomers @pizzamig I think build.rs should provide cargo:rustc-cfg=freebsd11 by default now, and not only when LIB_CI is defined.

@gnzlbg gnzlbg mentioned this issue Aug 14, 2019
@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

@semarie I also suspect #1440 introduced the issue.

I haven't looked at this into great detail, but a couple of things are clear (1) libc builds for the target, (2) the libc APIs that tokio uses are part of the built library (otherwise tokio would have failed to compile), and (3) the types of these APIs did not change substantially (otherwise tokio would have failed to compile).

That CI build job is using FreeBSD 12.0, so maybe this is caused by an ABI difference that's leading to undefined behavior ?

@semarie
Copy link
Contributor

semarie commented Aug 14, 2019

build.rs defines freebsd11 only when LIBC_CI is used:

    if env::var("LIBC_CI").is_ok() {
        if let Some(11) = which_freebsd() {
            println!("cargo:rustc-cfg=freebsd11");
        }
        if let Some(12) = which_freebsd() {
            println!("cargo:rustc-cfg=freebsd12");
        }
        if let Some(13) = which_freebsd() {
            println!("cargo:rustc-cfg=freebsd13");
        }
    }

so in normal build, there is no frebsd11 defined.

Now in src/unix/bsd/freebsdlike/mod.rs :

    #[cfg_attr(
        all(target_os = "freebsd", freebsd11),
        link_name = "kevent@FBSD_1.0"
    )]
    pub fn kevent(kq: ::c_int,
                  changelist: *const ::kevent,
                  nchanges: ::c_int,
                  eventlist: *mut ::kevent,
                  nevents: ::c_int,
                  timeout: *const ::timespec) -> ::c_int;

    #[cfg_attr(
        all(target_os = "freebsd", freebsd11),
        link_name = "mknodat@FBSD_1.1"
    )]
    pub fn mknodat(dirfd: ::c_int, pathname: *const ::c_char,
                  mode: ::mode_t, dev: dev_t) -> ::c_int;

before it was all(target_os = "freebsd", not(freebsd12)), so when no freebsd12 the link_name was used. so at least kevent() and mknodat() doesn't have the right link_name on freebsd11 now.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

@semarie Thanks, I agree that this is the bug, but I still don't understand why that is causing the tokio build jobs to fail.

Tokio is using FreeBSD12 on their CI, and when building libc without freebsd11 defined, kevent gets the kevent symbol. That would be bad, if the ABI of kevent and kevent@FBSD_1.0 differ. However, we test FreeBSD12 in our CI without freebsd11 and the tests for kevent pass. This would mean that FreeBSD12 kevent symbol being picked is ABI compatible with what we define in libc. That is, AFAICT, even though libc has an error, the symbol being called is at least ABI compatible with libc.

For this to be an issue, the semantics of the kevent API must have changed from FreeBSD11 to FreeBSD12, and what the tokio error means is that tokio requires FreeBSD11 semantics.

I think it would be worth it to exactly pin-point why the tokio tests are failing. Knowing this might be useful for the discussion about how to let users choose the FreeBSD version that libc targets (e.g. changing semantics of some ABI-compatible symbols is something that they should be aware of).

@pizzamig
Copy link
Contributor

I guess that I introduced an error in the logic how things are handled.
@semarie is right, a freebsd11 or freebsd12 has to be always defined, not only when LIBC_CI is defined.

The struct kevent passed to kevent() is changed from 11 to 12. Hence:

  • in FreeBSD 11 you have kevent@FBSD_1.0
  • in FreeBSD 12 you have kevent@FBSD_1.0 and kevent@FBSD_1.5, to support the new feature, while providing retro-compatibility.

For what I can see, if freebsd12 is defined, the new definition of struct kevent is picked up (src/unix/bsd/freebsdlike/freebsd/freebsd12/mod.rs:20) while there is only one defintion for the kevent() system call (src/unix/bsd/freebsdlike/mod.rs:1156).
I've spot those kind of error with libc_test, however, I guess that the missing LIBC_CI can cause the confusion.
I'll work on a patch immediately

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

@pizzamig there is a patch for the LIBC_CI issue already: #1467

I've spot those kind of error with libc_test

If the kevent struct is wrong for freebsd12 and up, it would be worth it to investigate why CI does not fail for it, and fix that and the struct definition.

@pizzamig
Copy link
Contributor

pizzamig commented Aug 14, 2019

@gnzlbg
The struct kevent is correctly defined in libc.
In FreeBSD 12 there is the extra field pub ext: [u64; 4]

In the mio crate (sys/unix/kqueue.rs) I see a macro creates a struct libc::kevent without the extra field. The compiler should complain about the missing field in the initializer.
I wonder how mio can use the freebsd11 struct kevent while the freebsd12 fn kevent is used,

@pizzamig
Copy link
Contributor

I was able to successfully test tokio against a patched libc and a patched mio, where I added the additional field ext: [64; 4] to the mio macro kevent in sys/unix/kqueue.rs.

However, I don't know how to make the field ext: [64; 4] optional with the version of FreeBSD in the mio crate.
Any idea?

@DoumanAsh
Copy link
Contributor

DoumanAsh commented Aug 14, 2019

@pizzamig Can mio actually detect when the field present? I think without some dirty hacks it is unlikely, unless we somehow can easily spot freebsd 11 and 12?

But the fact that it compiled on 12 without problem indicates that field was missing in libc too?

Also which path will you take for kevent function on >=12?

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

In the mio crate (sys/unix/kqueue.rs) I see a macro creates a struct libc::kevent without the extra field.

The problem probably is that, because cfg(freebsd11) is not true, the kevent function from FreeBSD12 is being called. However, because cfg(freebsd12) is not true either, the kevent struct from FreeBSD11 is being used, which lacks the extra field.

The kevent function of FreeBSD12 is probably invoking undefined behavior, because it is attempting to read or write from a struct field that the struct definition being used by tokio does not have.

The fix for this is to remove this catch all else block here:

https://github.com/rust-lang/libc/blob/master/src/unix/bsd/freebsdlike/freebsd/mod.rs#L1337

and replace it with an else if cfg(freebsd11) { ...}.

That way, if the appropriate cfg(freebsdXY) is not defined, instead of getting the definitions for freebsd11 by default, you'll get a compile-time error saying that these are not available.

@pizzamig
Copy link
Contributor

@gnzlbg that's exactly what is happening. Nice catch!
Do you know how if there is a way to "export" this cfg from the libc crate, in such a way that it can be used by other crates?
In this case, mio has to know how kevent looks like (with or without the extra field).
The worst case scenario, is to duplicate the detection performed by build.rs in libc in every crate using kevent on FreeBSD, quite inconvenient..

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

Do you know how if there is a way to "export" this cfg from the libc crate, in such a way that it can be used by other crates?

The libc crate only supports freebsd11. Other freebsd versions are not supported.

What you want is probably some sort of target-platform versioning, see #570 .

@DoumanAsh
Copy link
Contributor

DoumanAsh commented Aug 14, 2019

If freebsd11 is only supported version does it mean that kevent should be linked with kevent@FBSD_1.0 always?

Right now kevent link_name is kevent@FBSD_1.0 only on 11, but what about 12?
If it is not supported, is it going to be ignored?

In this case, mio has to know how kevent looks like (with or without the extra field).

As work-around it would be good to have ctor that could allow initialize common fields, and the rest could be done by user, if necessary

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

If freebsd11 is only supported version does it mean that kevent should be linked with kevent@FBSD_1.0 always?

Yes.

but what about 12?
If it is not supported, is it going to be ignored?

Yes.

@pizzamig
Copy link
Contributor

@gnzlbg
If I have understood correctly, because it's currently too complicated to support different versions of FreeBSD, libc should compile assuming that's freebsd11, and my patch broke this assumption, moving checks from not freebsd11 to freebsd12 because normally no symbols were defined if LIBC_CI was not defined.
Removing the LIBC_CI check, freebsd11 won't be the default and only version supported.
I'd rather keep the LIBC_CI check in place and, if not defined, explicitly default to freebsd11, otherwise we have to fix all crates to also detect which version of FreeBSD they are using.
At least, until a more general solution is not accepted and adopted.

@DoumanAsh the kevent link_name you see is to force libc to use the kevent freebsd11 version even if you are on freebsd12, where both versions are available for compatibility:

$  objdump -T /lib/libc.so.7| grep kevent
000000000008fab0  w   DF .text	0000000000000011 (FBSD_1.0)   kevent
0000000000090d10  w   DF .text	000000000000000e  FBSD_1.5    kevent

@DoumanAsh
Copy link
Contributor

the kevent link_name you see is to force libc to use the kevent freebsd11 version even if you are on freebsd12, where both versions are available for compatibility:

But current implementation uses this name when freebsd11 is defined, which shouldn't be the case for freebsd12, right?

@pizzamig
Copy link
Contributor

The previous logic was: use freebsd11 definitions if freebsd12 is not detected. So, if no name is defined, it would automatically default to freebsd11 definitions.
The current implementation (my patch, my bad) adopt freebsd11 names if freebsd11 is defined. If no name is defined, you get the bug that we are discussing right now.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

If I have understood correctly, because it's currently too complicated to support different versions of FreeBSD, libc should compile assuming that's freebsd11, and my patch broke this assumption,

Correct.

--

But current implementation uses this name when freebsd11 is defined

Right.

, which shouldn't be the case for freebsd12, right?

Right, when freebsd12 is defined, a different name (kevent) is used.

To clarify: libc should always be compiled with freebsd11 defined, always uses the FreeBSD11 symbols and type definitions, and that works on FreeBSD12 and 13 because they provide FreeBSD11 backward compatible symbols.

The freebsd11,freebsd12,freebsd13 cfgs aren't user-facing options - they are private implementation details only. There is currently no supported way for a user to compile a libc that uses the symbols of FreeBSD12 or FreeBSD13 and is therefore backward incompatible with FreeBSD11.

bors added a commit that referenced this issue Aug 15, 2019
Fix FreeBSD

#1440 broke FreeBSD by changing `libc` FreeBSD targets to require a `cfg(freebsdXX)` to be defined, but not updating `build.rs` to define `freebsd11` when `LIBC_CI` is not available. Since `LIBC_CI` is always defined on CI, this issue went undetected.

This PR fixes that issue in the `build.rs` and introduces a build task that tests FreeBSD without `LIBC_CI` on FreeBSD11, although I'm  not sure this would have caught the issue in #1466 .

Closes #1466 .
@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 15, 2019

@carllerche i've published 0.2.62 with what we believe is a fix. Please do let us know and/or re-open this issue if updating to libc 0.2.62 doesn't fix this for you.

@DoumanAsh
Copy link
Contributor

I just saw CI on freebsd 12 no longer failing with new version, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants