On macOS w/ M1, Rayon tests segfault on crossbeam-deque 0.8; fixed by reverting to 0.7 #869

Open
HackerFoo opened this issue Jul 16, 2022 · 4 comments
Labels
bug crossbeam-deque crossbeam-epoch O-ARM Target: ARM processors (arm, thumb and AArch64 targets)

Comments

@HackerFoo

I found that Rayon segfaults in my app after recently updating rustc, so I ran cargo test on that project. I used git bisect to narrow the failure down to the commit that changed crossbeam-deque from 0.7.2 to 0.8.0. I suspected the compiler, but found that the tests fail on every stable rustc version back to 1.58.0 on my machine (macOS 12.4, M1/aarch64).

Changing crossbeam-deque to 0.7.4 fixes the tests. So I have evidence that it may need to be fixed here, although I haven't narrowed down the failure yet.

rayon-rs/rayon#956

@taiki-e
Member

taiki-e commented Jul 16, 2022

This seems similar to the case mentioned in #860.

I said in #860 (comment):

Reducing MAX_OBJECTS makes it more likely to trigger any potential data races. So reverting #552 may reduce the occurrence of SIGSEGV. (Of course, that does not mean that the underlying bug is fixed.)

Could you try to revert #552 and test it?
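
One way to picture the quoted reasoning is with a toy model, offered only as a sketch. It assumes that MAX_OBJECTS acts as the capacity of a per-thread bag of deferred destructions that is handed off for reclamation once full. ToyBag, flushes_for, and the capacities 64 and 4 are all hypothetical and are not taken from crossbeam-epoch. Under that assumption, a smaller capacity means more frequent hand-offs for the same workload, so a latent use-after-free has more chances to surface.

```rust
/// Toy stand-in for a thread-local bag of deferred destructions.
/// Hypothetical type: this is NOT crossbeam-epoch's real implementation.
struct ToyBag {
    capacity: usize,     // plays the role of MAX_OBJECTS (assumption)
    pending: Vec<usize>, // ids of objects waiting to be freed
    flushes: usize,      // how many times the bag was handed off
}

impl ToyBag {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        ToyBag {
            capacity,
            pending: Vec::with_capacity(capacity),
            flushes: 0,
        }
    }

    /// Record one retired object; hand the bag off once it is full.
    fn defer(&mut self, id: usize) {
        self.pending.push(id);
        if self.pending.len() == self.capacity {
            // A real collector would queue these for reclamation;
            // the toy model only counts the hand-off.
            self.pending.clear();
            self.flushes += 1;
        }
    }
}

/// Number of hand-offs caused by retiring `retired` objects.
fn flushes_for(capacity: usize, retired: usize) -> usize {
    let mut bag = ToyBag::new(capacity);
    for id in 0..retired {
        bag.defer(id);
    }
    bag.flushes
}

fn main() {
    // Illustrative capacities only; not the values used by any release.
    for capacity in [64, 4] {
        println!(
            "capacity {:>2}: {} flushes",
            capacity,
            flushes_for(capacity, 1024)
        );
    }
}
```

The model says nothing about where the actual bug lives; it only shows why changing the constant can hide or expose a race without fixing it.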

@taiki-e taiki-e added O-ARM Target: ARM processors (arm, thumb and AArch64 targets) bug crossbeam-epoch crossbeam-deque labels Jul 16, 2022
@HackerFoo
Author

Reverting #552 fixes Rayon's tests and works with my app.

Add this to Rayon's Cargo.toml to try it:

[patch.crates-io]
crossbeam-deque = { git = "https://github.com/HackerFoo/crossbeam.git", branch = "revert-552" }

https://github.com/HackerFoo/crossbeam/tree/revert-552

@taiki-e
Member

taiki-e commented Jul 22, 2022

Thanks for confirming! I've reverted #552 as part of #879.

It is difficult for me to investigate this right now because I could not reproduce the issue in my environment (Mac M1, but without many cores), even after reducing MAX_OBJECTS further. My guess is that the underlying problem is a deque bug.

bors bot added a commit that referenced this issue Jul 22, 2022
879: epoch: Adjust MAX_OBJECTS r=taiki-e a=taiki-e

- Revert #552 to mitigate the risk of segmentation faults in buggy downstream implementations (see #869)
- Reduce MAX_OBJECTS on cfg(miri)

Co-authored-by: Taiki Endo <te316e89@gmail.com>
@tatsuya6502

tatsuya6502 commented Jul 23, 2022

I cannot reproduce the issue either. I tried the following environments:

  • macOS 12.4 arm64 on Apple M1 chip (4 × P-cores + 4 × E-cores)
  • Linux x86_64 on Intel Core i7 12700F (20 × threads = 8 × P-cores + 4 × E-cores)

but all tests passed.

I ran cargo test more than 10 times in each environment, using Rust 1.62.1 and 1.62.0. To use exactly the same crate versions as the original GH issue (rayon-rs/rayon#956), I ran cargo update -p <crate> --precise <version> several times to modify Cargo.lock.

@HackerFoo: Which exact M1 chip are you using? (M1, M1 Pro, M1 Max, or M1 Ultra)

You can try sysctl -a | grep machdep.cpu:

$ sysctl -a | grep machdep.cpu
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple M1

$ sw_vers
ProductName:	macOS
ProductVersion:	12.4
BuildVersion:	21F79
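
As a rough cross-check from Rust itself, the standard library can report how many threads the process may use. This is only a sketch: available_threads is a hypothetical helper name, and whether its result matches the thread count Rayon actually picks is an assumption (the cargo tree output in this thread shows rayon-core depending on num_cpus, not on this function).

```rust
use std::thread;

/// Hypothetical helper: how many threads the OS reports as usable.
/// Falls back to 1 if the value cannot be determined.
fn available_threads() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

fn main() {
    // Requires Rust 1.59 or newer.
    println!("available parallelism: {}", available_threads());
}
```

Comparing this number between a machine that crashes and one that does not may help show whether core count matters for reproducing the segfault.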

Also, just to be sure, could you please test it again on your Mac as follows?

  1. Revert the Cargo.toml of rayon.
  2. Use this Cargo.lock:
    • Cargo.lock.zip
    • (Please unzip it. GitHub Issues does not allow attaching a .lock file directly.)

FYI, I did the following:

$ git clone git@github.com:rayon-rs/rayon.git
$ cd $_
$ git checkout a92f91b
$ git rev-parse HEAD                         
a92f91bf43aa3fd7f37f57bf603122a315255b9e
$ cargo update -p crossbeam-deque --precise 0.8.1
$ cargo update -p crossbeam-epoch --precise 0.9.8
## Continued running `cargo update` on different crates.
...

$ cargo tree
rayon v1.5.3 (/Volumes/data2/git-repos/rayon)
├── crossbeam-deque v0.8.1
│   ├── cfg-if v1.0.0
│   ├── crossbeam-epoch v0.9.8
│   │   ├── cfg-if v1.0.0
│   │   ├── crossbeam-utils v0.8.8
│   │   │   ├── cfg-if v1.0.0
│   │   │   └── lazy_static v1.4.0
│   │   ├── lazy_static v1.4.0
│   │   ├── memoffset v0.6.5
│   │   │   [build-dependencies]
│   │   │   └── autocfg v1.1.0
│   │   └── scopeguard v1.1.0
│   │   [build-dependencies]
│   │   └── autocfg v1.1.0
│   └── crossbeam-utils v0.8.8 (*)
├── either v1.6.1
└── rayon-core v1.9.3 (/Volumes/data2/git-repos/rayon/rayon-core)
    ├── crossbeam-channel v0.5.4
    │   ├── cfg-if v1.0.0
    │   └── crossbeam-utils v0.8.8 (*)
    ├── crossbeam-deque v0.8.1 (*)
    ├── crossbeam-utils v0.8.8 (*)
    └── num_cpus v1.13.1
        └── libc v0.2.126
[build-dependencies]
└── autocfg v1.1.0
[dev-dependencies]
├── lazy_static v1.4.0
├── rand v0.8.5
│   ├── libc v0.2.126
│   ├── rand_chacha v0.3.1
│   │   ├── ppv-lite86 v0.2.16
│   │   └── rand_core v0.6.3
│   │       └── getrandom v0.2.7
│   │           ├── cfg-if v1.0.0
│   │           └── libc v0.2.126
│   └── rand_core v0.6.3 (*)
└── rand_xorshift v0.3.0
    └── rand_core v0.6.3 (*)

## Disable sccache
$ unset RUSTC_WRAPPER

## Run tests
$ cargo test
$ cargo +1.62.0 test

Thanks!
