A stab at removing unsafes #109

troublescooter · 2020-05-05T22:44:11Z

This pr is a work in progress to eliminate unsafes as far as possible.

troublescooter · 2020-05-12T19:52:49Z

Any remarks @boxtown ?

boxtown · 2020-05-13T04:32:20Z

Would it be possible to do some benchmarking on the updated functions? When I originally wrote this I was purposefully trying avoiding to avoid runtime bounds checking which I don't believe leads to UB (I'm assuming this is the reason for removing unsafes?) although at this point it may be the case that the Rust compiler is smart enough to compile away the bounds checks (I haven't really kept up with compiler improvements)

Side note, are the formatting changes from running cargo fmt? There's probably a potential QoL improvement opportunity here to just have the CI run formatting

troublescooter · 2020-05-14T13:46:33Z

In my opinion the best selling point for removing unsafes is not necessarily to remove UB, but to easily convince others that there's no UB. I couldn't find anything that would appear to be UB, but if somebody has to choose a library and there's one with lots of unsafe blocks the choice is either to be negligent of them or to manually check them, and code review doesn't scale very well.

Do you have a suggestion what an acceptable trade-off between performance and safety would be? My opinion is that a general purpose library should have no unsafe blocks, but I refrained from rewriting blocks that appeared to be deliberate and where I wasn't confident could be written safely without performance regressions. I've not yet benchmarked, but I'd be happy to do so if you give an idea what an acceptable benchmark would look like.

troublescooter · 2020-05-14T13:59:21Z

Btw: In order to do my refactor I changed the function ranks such that it now has no side effects. The tests did not capture any mutation of self, is this a mistake or deliberate? If so, can I safely change &mut self in the trait to &self?

Side note, are the formatting changes from running cargo fmt? There's probably a potential QoL improvement opportunity here to just have the CI run formatting

Formatting is one click away for me, I usually make it a habit to keep my diffs clean.

vks · 2020-05-14T14:40:55Z

I think the use of unsafe should be justified by benchmarks, not the other way around. Looking at the changes, it does not look like they affect performance critical code. I think for some of the changes here the bound checks can be optimized away, but not for all of them.

Benchmarking would be nice to have (for example, employing criterion), but it seems as if it is currently not done for statrs. Personally, I think it makes sense to merge these changes and worry about benchmarking later.

Note that this PR also contains some unrelated changes that are raising the minimally supported Rust version. I think this is reasonable, and the MSRV is not documented for statrs anyway.

boxtown · 2020-05-21T01:42:28Z

Dis-regard the formatting comment I missed the part where you were updating the constructors to use the short-hand.

Btw: In order to do my refactor I changed the function ranks such that it now has no side effects. The tests did not capture any mutation of self, is this a mistake or deliberate? If so, can I safely change &mut self in the trait to &self?

The function used to do an in-place sort but it looks like it's no longer necessary from the changes in this PR. I'll try to do some benchmarking over the weekend to though so let's keep it in for now to avoid flip-flopping if there's an egregious drop in performance (probably unlikely) and we'll create a task to remove the mut from the signature.

Give me some time to do a little more thorough review of the PR (assuming you're mostly finished) otherwise should be good to merge

troublescooter · 2020-05-21T11:50:53Z

My comment might not have been clear, removing mut was of course not necessary, so I put it back in for now in my personal tree. Probably best addressed with a different pr, as somebody might rely on the undocumented behaviour.

Edit: I'd forgotten what I'd done a couple of days back and responded off the cuff. This parragraph is non-sensical. Side-effects don't exist anymore, but mut is still in.

In order to alleviate your concerns I've begun benchmarking with criterion, so if you give me a day or two I'll see if I can get everything together.

troublescooter

On nightly there's partition_at_index that could be a candidate for most of the code in select_inplace. I haven't yet tested, as to do this cleanly would probably involve newtyping [f64] to guarantee no NaNs in the slice, which might be a good idea regardless.

troublescooter · 2020-05-25T13:29:21Z

I was not able to determine whether my changes had any performance impact, because I ran into what looks like undefined behaviour.

Benchmarking order statistic/ranks: RankTieBreaker::First: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 100 but the index
is 18446744073709551590', /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14/src/libcore/slice/mod.rs:515:35
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:77
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:61
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1028
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1412
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:65
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:50
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:188
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:205
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:464
  11: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:373
  12: rust_begin_unwind
             at src/libstd/panicking.rs:302
  13: core::panicking::panic_fmt
             at src/libcore/panicking.rs:139
  14: core::panicking::panic_bounds_check
             at src/libcore/panicking.rs:96
  15: statrs::statistics::slice_statistics::quick_sort_all
[...]

Based on the size of the index being dereferenced, the only reasonable explanation seems to be that quick_sort_all is reading somewhere where it shouldn't.

fix: stylistic issues

troublescooter · 2020-05-25T15:53:24Z

There's one unsafe remaining in the computation of the factorial, which I want to refactor in the future. For now I'm finished.

troublescooter · 2020-05-30T09:03:07Z

At line :569 in src/statistics/slice_statistics.rs an overflowing subtraction occurs, after which an out of bounds read at the next Hoare loop iteration is performed. Later the overflow is caught by primary.swap(a,b) which panics.

@boxtown
Is there anything missing to have this merged?

boxtown

Looks good in general, great work thanks for doing this! Just left one comment, but it's not critical. We can squash and merge if you disagree

src/statistics/slice_statistics.rs

troublescooter · 2020-06-04T20:16:04Z

Agree and fixed.

boxtown · 2020-06-05T23:45:06Z

Thanks again for doing this work, I'll try to update the changelog and cut a release some time this weekend

troublescooter force-pushed the safer branch 2 times, most recently from 6361867 to 7bfcff6 Compare May 11, 2020 14:50

troublescooter commented May 21, 2020

View reviewed changes

troublescooter force-pushed the safer branch from 1788db1 to 9cc4e8e Compare May 25, 2020 13:35

troublescooter added 3 commits May 25, 2020 16:05

bench: add criterion bench for order_statistic

af63c91

fix: refactor unsafes safer

c8fed84

fix: stylistic issues

fix: compiler warnings

9620b53

troublescooter force-pushed the safer branch from 9cc4e8e to 8440987 Compare May 25, 2020 14:06

fix: removed more unsafes

a0bd3b8

troublescooter force-pushed the safer branch from 8440987 to a0bd3b8 Compare May 26, 2020 09:00

fix: cleanup bench

fe67e53

boxtown approved these changes Jun 2, 2020

View reviewed changes

src/statistics/slice_statistics.rs Outdated Show resolved Hide resolved

fix: make sort function clearer

37dc023

boxtown merged commit 83aa6b5 into statrs-dev:master Jun 5, 2020

troublescooter mentioned this pull request Nov 26, 2020

Test coverage for Statistic.rank for n > 10 #13

Closed

jonathanstrong mentioned this pull request Jan 28, 2022

segfault in select_inplace from v0.12 #163

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A stab at removing unsafes #109

A stab at removing unsafes #109

troublescooter commented May 5, 2020

troublescooter commented May 12, 2020

boxtown commented May 13, 2020

troublescooter commented May 14, 2020

troublescooter commented May 14, 2020

vks commented May 14, 2020 •

edited

boxtown commented May 21, 2020

troublescooter commented May 21, 2020 •

edited

troublescooter left a comment

troublescooter commented May 25, 2020

troublescooter commented May 25, 2020

troublescooter commented May 30, 2020

boxtown left a comment

troublescooter commented Jun 4, 2020

boxtown commented Jun 5, 2020

A stab at removing unsafes #109

A stab at removing unsafes #109

Conversation

troublescooter commented May 5, 2020

troublescooter commented May 12, 2020

boxtown commented May 13, 2020

troublescooter commented May 14, 2020

troublescooter commented May 14, 2020

vks commented May 14, 2020 • edited

boxtown commented May 21, 2020

troublescooter commented May 21, 2020 • edited

troublescooter left a comment

Choose a reason for hiding this comment

troublescooter commented May 25, 2020

troublescooter commented May 25, 2020

troublescooter commented May 30, 2020

boxtown left a comment

Choose a reason for hiding this comment

troublescooter commented Jun 4, 2020

boxtown commented Jun 5, 2020

vks commented May 14, 2020 •

edited

troublescooter commented May 21, 2020 •

edited