Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run NetCDF dependent tests in serial due to crashing on MacOS. #320

Closed
wants to merge 1 commit into from

Conversation

metasim
Copy link
Contributor

@metasim metasim commented Oct 18, 2022

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Attempted workaround for #299. After change, mdarray tests no longer seem to stack overflow or SIGSEGV on MacOS (aarch64).

Edit: not claiming this solves the real problem here, but running tests locally during development is really hampered by this bug.

@metasim
Copy link
Contributor Author

metasim commented Oct 18, 2022

@ChristianBeilschmidt Is it worth seeing if #299 goes away if HDF5 files were used in the tests instead of NetCDF files? I'm assuming GDAL's HDF5 support is similar/same to NetCDF here.

@lnicola
Copy link
Member

lnicola commented Oct 18, 2022

Not this again 😩...

TBH, I'm not too happy about this because of the amount of code that gets pulled in. At least I guess we're already using thiserror.

Can we make a static mutex and synchronize using that?

@metasim
Copy link
Contributor Author

metasim commented Oct 18, 2022

not too happy about this because of the amount of code that gets pulled in

@lnicola serial_test is only pulled in as a dev-dependency... is that what you're referring to?

@lnicola
Copy link
Member

lnicola commented Oct 18, 2022

Well, I'm not going to block merging this PR because of it.

is only pulled in as a dev-dependency... is that what you're referring to?

Yes, but, but I prefer to avoid what seems like gratuitous dependencies, especially proc macros:

  • they slow down the builds, including CI (which is already slower than it should be); okay, to be fair, we're already pulling in the whole proc macro stack because of thiserror. But by default it depends on futures?! It also uses dashmap (a concurrent hash map), hashbrown (an alternative to the std hash map), lazy_static (which is pseudo-deprecated in favor of once_cell, which is also used, transitively), some Unicode stuff and so on.
  • it has some cognitive overhead. I can't overstate this. If there's a mutex in the code, anyone can hit "go to definition" and/or read the doc comment on it. With a proc macro you probably have to look up the crate
  • proc macros don't always interact well with IDEs. For a long time, at least in rust-analyzer, completions didn't work inside proc macros because syn errors out on invalid code, like foo. (which you do all the time). Even when completion works, other IDE features might not (because of unfixed bugs)
  • even so, it's going to be slower, since the IDE needs to serialize the function body and pass it in another process to the proc macro, then read back and deserialize the result -- on every key press. I didn't measure it, but IDE latency is a real concern.
  • people will complain: "why are you pulling in 15 crates instead of using a plain mutex?"
  • what if one of those 15 crates gets compromised in the future?

I'll agree, each of these is a pretty minor concern in isolation. But I don't think using it is worth its weight in this case.

[dev-dependencies]
└── serial_test v0.9.0
    ├── dashmap v5.4.0
    │   ├── cfg-if v1.0.0
    │   ├── hashbrown v0.12.3
    │   ├── lock_api v0.4.9
    │   │   └── scopeguard v1.1.0
    │   │   [build-dependencies]
    │   │   └── autocfg v1.1.0
    │   ├── once_cell v1.15.0
    │   └── parking_lot_core v0.9.4
    │       ├── cfg-if v1.0.0
    │       ├── libc v0.2.135
    │       └── smallvec v1.10.0
    ├── futures v0.3.24
    │   ├── futures-channel v0.3.24
    │   │   ├── futures-core v0.3.24
    │   │   └── futures-sink v0.3.24
    │   ├── futures-core v0.3.24
    │   ├── futures-executor v0.3.24
    │   │   ├── futures-core v0.3.24
    │   │   ├── futures-task v0.3.24
    │   │   └── futures-util v0.3.24
    │   │       ├── futures-channel v0.3.24 (*)
    │   │       ├── futures-core v0.3.24
    │   │       ├── futures-io v0.3.24
    │   │       ├── futures-sink v0.3.24
    │   │       ├── futures-task v0.3.24
    │   │       ├── memchr v2.5.0
    │   │       ├── pin-project-lite v0.2.9
    │   │       ├── pin-utils v0.1.0
    │   │       └── slab v0.4.7
    │   │           [build-dependencies]
    │   │           └── autocfg v1.1.0
    │   ├── futures-io v0.3.24
    │   ├── futures-sink v0.3.24
    │   ├── futures-task v0.3.24
    │   └── futures-util v0.3.24 (*)
    ├── lazy_static v1.4.0
    ├── log v0.4.17
    │   └── cfg-if v1.0.0
    ├── parking_lot v0.12.1
    │   ├── lock_api v0.4.9 (*)
    │   └── parking_lot_core v0.9.4 (*)
    └── serial_test_derive v0.9.0 (proc-macro)
        ├── proc-macro-error v1.0.4
        │   ├── proc-macro-error-attr v1.0.4 (proc-macro)
        │   │   ├── proc-macro2 v1.0.47
        │   │   │   └── unicode-ident v1.0.5
        │   │   └── quote v1.0.21
        │   │       └── proc-macro2 v1.0.47 (*)
        │   │   [build-dependencies]
        │   │   └── version_check v0.9.4
        │   ├── proc-macro2 v1.0.47 (*)
        │   ├── quote v1.0.21 (*)
        │   └── syn v1.0.102
        │       ├── proc-macro2 v1.0.47 (*)
        │       ├── quote v1.0.21 (*)
        │       └── unicode-ident v1.0.5
        │   [build-dependencies]
        │   └── version_check v0.9.4
        ├── proc-macro2 v1.0.47 (*)
        ├── quote v1.0.21 (*)
        └── syn v1.0.102 (*)

@metasim
Copy link
Contributor Author

metasim commented Oct 18, 2022

@lnicola Yikes! Thanks for explaining. I only looked at the first layer of dependencies shown in crates.io. Lesson learned.

Would be good to not even have this problem.

Here are the options I see so far:

  1. Use a static mutex (I'd need this sketched out... where it lives, where it is acquired in the test, etc).
  2. Convert .nc files to .hdf5. Requires effort from @ChristianBeilschmidt
  3. Put these tests under some feature flag? Or target flag?

All in all, I think the goal here is to make testing as easy a possible for contributors. We already have two features (this and the trybuild error) that don't happen in CI, but can/do on other hardware.

If a feature can't be tested on all platforms, should it be behind a feature/target flag?

@lnicola
Copy link
Member

lnicola commented Oct 18, 2022

I'd rather try HDF5 (good idea) or even Zarr (surely that won't have any threading issues!). We're not testing GDAL here.

But in the meanwhile a mutex might be good enough.

@metasim
Copy link
Contributor Author

metasim commented Oct 18, 2022

@ChristianBeilschmidt: thoughts?

@lnicola
Copy link
Member

lnicola commented Oct 18, 2022

@metasim are you on MacOS? It took me quite a few tries to get the crash in #299. Linux x64.

@metasim
Copy link
Contributor Author

metasim commented Oct 19, 2022

@lnicola Yes, MacOS, aarch64.

@metasim
Copy link
Contributor Author

metasim commented Oct 21, 2022

Fixed by #321

@metasim metasim closed this Oct 21, 2022
@metasim metasim deleted the fix/299 branch November 13, 2022 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants