Run NetCDF dependent tests in serial due to crashing on MacOS. #320

metasim · 2022-10-18T14:17:35Z

I agree to follow the project's code of conduct.
I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Attempted workaround for #299. After change, mdarray tests no longer seem to stack overflow or SIGSEGV on MacOS (aarch64).

Edit: not claiming this solves the real problem here, but running tests locally during development is really hampered by this bug.

metasim · 2022-10-18T14:20:24Z

@ChristianBeilschmidt Is it worth seeing if #299 goes away if HDF5 files were used in the tests instead of NetCDF files? I'm assuming GDAL's HDF5 support is similar/same to NetCDF here.

lnicola · 2022-10-18T14:21:26Z

Not this again 😩...

TBH, I'm not too happy about this because of the amount of code that gets pulled in. At least I guess we're already using thiserror.

Can we make a static mutex and synchronize using that?

metasim · 2022-10-18T16:46:21Z

not too happy about this because of the amount of code that gets pulled in

@lnicola serial_test is only pulled in as a dev-dependency... is that what you're referring to?

lnicola · 2022-10-18T17:19:57Z

Well, I'm not going to block merging this PR because of it.

is only pulled in as a dev-dependency... is that what you're referring to?

Yes, but, but I prefer to avoid what seems like gratuitous dependencies, especially proc macros:

they slow down the builds, including CI (which is already slower than it should be); okay, to be fair, we're already pulling in the whole proc macro stack because of thiserror. But by default it depends on futures?! It also uses dashmap (a concurrent hash map), hashbrown (an alternative to the std hash map), lazy_static (which is pseudo-deprecated in favor of once_cell, which is also used, transitively), some Unicode stuff and so on.
it has some cognitive overhead. I can't overstate this. If there's a mutex in the code, anyone can hit "go to definition" and/or read the doc comment on it. With a proc macro you probably have to look up the crate
proc macros don't always interact well with IDEs. For a long time, at least in rust-analyzer, completions didn't work inside proc macros because syn errors out on invalid code, like foo. (which you do all the time). Even when completion works, other IDE features might not (because of unfixed bugs)
even so, it's going to be slower, since the IDE needs to serialize the function body and pass it in another process to the proc macro, then read back and deserialize the result -- on every key press. I didn't measure it, but IDE latency is a real concern.
people will complain: "why are you pulling in 15 crates instead of using a plain mutex?"
what if one of those 15 crates gets compromised in the future?

I'll agree, each of these is a pretty minor concern in isolation. But I don't think using it is worth its weight in this case.

[dev-dependencies]
└── serial_test v0.9.0
    ├── dashmap v5.4.0
    │   ├── cfg-if v1.0.0
    │   ├── hashbrown v0.12.3
    │   ├── lock_api v0.4.9
    │   │   └── scopeguard v1.1.0
    │   │   [build-dependencies]
    │   │   └── autocfg v1.1.0
    │   ├── once_cell v1.15.0
    │   └── parking_lot_core v0.9.4
    │       ├── cfg-if v1.0.0
    │       ├── libc v0.2.135
    │       └── smallvec v1.10.0
    ├── futures v0.3.24
    │   ├── futures-channel v0.3.24
    │   │   ├── futures-core v0.3.24
    │   │   └── futures-sink v0.3.24
    │   ├── futures-core v0.3.24
    │   ├── futures-executor v0.3.24
    │   │   ├── futures-core v0.3.24
    │   │   ├── futures-task v0.3.24
    │   │   └── futures-util v0.3.24
    │   │       ├── futures-channel v0.3.24 (*)
    │   │       ├── futures-core v0.3.24
    │   │       ├── futures-io v0.3.24
    │   │       ├── futures-sink v0.3.24
    │   │       ├── futures-task v0.3.24
    │   │       ├── memchr v2.5.0
    │   │       ├── pin-project-lite v0.2.9
    │   │       ├── pin-utils v0.1.0
    │   │       └── slab v0.4.7
    │   │           [build-dependencies]
    │   │           └── autocfg v1.1.0
    │   ├── futures-io v0.3.24
    │   ├── futures-sink v0.3.24
    │   ├── futures-task v0.3.24
    │   └── futures-util v0.3.24 (*)
    ├── lazy_static v1.4.0
    ├── log v0.4.17
    │   └── cfg-if v1.0.0
    ├── parking_lot v0.12.1
    │   ├── lock_api v0.4.9 (*)
    │   └── parking_lot_core v0.9.4 (*)
    └── serial_test_derive v0.9.0 (proc-macro)
        ├── proc-macro-error v1.0.4
        │   ├── proc-macro-error-attr v1.0.4 (proc-macro)
        │   │   ├── proc-macro2 v1.0.47
        │   │   │   └── unicode-ident v1.0.5
        │   │   └── quote v1.0.21
        │   │       └── proc-macro2 v1.0.47 (*)
        │   │   [build-dependencies]
        │   │   └── version_check v0.9.4
        │   ├── proc-macro2 v1.0.47 (*)
        │   ├── quote v1.0.21 (*)
        │   └── syn v1.0.102
        │       ├── proc-macro2 v1.0.47 (*)
        │       ├── quote v1.0.21 (*)
        │       └── unicode-ident v1.0.5
        │   [build-dependencies]
        │   └── version_check v0.9.4
        ├── proc-macro2 v1.0.47 (*)
        ├── quote v1.0.21 (*)
        └── syn v1.0.102 (*)

metasim · 2022-10-18T18:22:01Z

@lnicola Yikes! Thanks for explaining. I only looked at the first layer of dependencies shown in crates.io. Lesson learned.

Would be good to not even have this problem.

Here are the options I see so far:

Use a static mutex (I'd need this sketched out... where it lives, where it is acquired in the test, etc).
Convert .nc files to .hdf5. Requires effort from @ChristianBeilschmidt
Put these tests under some feature flag? Or target flag?

All in all, I think the goal here is to make testing as easy a possible for contributors. We already have two features (this and the trybuild error) that don't happen in CI, but can/do on other hardware.

If a feature can't be tested on all platforms, should it be behind a feature/target flag?

lnicola · 2022-10-18T18:26:48Z

I'd rather try HDF5 (good idea) or even Zarr (surely that won't have any threading issues!). We're not testing GDAL here.

But in the meanwhile a mutex might be good enough.

metasim · 2022-10-18T19:16:56Z

@ChristianBeilschmidt: thoughts?

lnicola · 2022-10-18T19:47:20Z

@metasim are you on MacOS? It took me quite a few tries to get the crash in #299. Linux x64.

metasim · 2022-10-19T01:24:46Z

@lnicola Yes, MacOS, aarch64.

metasim · 2022-10-21T17:17:10Z

Fixed by #321

Run NetCDF dependent tests in serial due to crashing on MacOS.

8cc02ab

metasim requested review from lnicola, jdroenner and ChristianBeilschmidt October 18, 2022 14:17

metasim mentioned this pull request Oct 18, 2022

Stack overflow in programs::raster::mdimtranslate::tests::test_build_tiff_from_path #299

Closed

metasim mentioned this pull request Oct 19, 2022

Convert byte_no_cf and cf_nasa_4326 to Zarr #321

Merged

2 tasks

metasim closed this Oct 21, 2022

metasim deleted the fix/299 branch November 13, 2022 21:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Run NetCDF dependent tests in serial due to crashing on MacOS. #320

Run NetCDF dependent tests in serial due to crashing on MacOS. #320

metasim commented Oct 18, 2022 •

edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 •

edited

metasim commented Oct 18, 2022 •

edited

lnicola commented Oct 18, 2022 •

edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 •

edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 •

edited

metasim commented Oct 19, 2022

metasim commented Oct 21, 2022

Run NetCDF dependent tests in serial due to crashing on MacOS. #320

Run NetCDF dependent tests in serial due to crashing on MacOS. #320

Conversation

metasim commented Oct 18, 2022 • edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 • edited

metasim commented Oct 18, 2022 • edited

lnicola commented Oct 18, 2022 • edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 • edited

metasim commented Oct 18, 2022

lnicola commented Oct 18, 2022 • edited

metasim commented Oct 19, 2022

metasim commented Oct 21, 2022

metasim commented Oct 18, 2022 •

edited

lnicola commented Oct 18, 2022 •

edited

metasim commented Oct 18, 2022 •

edited

lnicola commented Oct 18, 2022 •

edited

lnicola commented Oct 18, 2022 •

edited

lnicola commented Oct 18, 2022 •

edited