Implement Power10 trait, with first method is_power_of_ten #15

mohrezaei · 2018-11-15T20:21:50Z

Here is the first method in the new trait. Just sending it across for a quick review before I start adding more.

Do you prefer

one pull request, one commit, all methods
one pull request, one commit per method
one pull request per method?

mohrezaei · 2018-11-15T20:47:59Z

I think I need help with fixing the u128 case. I can't seem to hide the static array from the compiler with a cfg attribute.

I don't understand why it works with 1.20, which doesn't have u128, but not older...

mohrezaei · 2018-11-15T21:33:07Z

Looks like I managed to hide u128 😏 If there is a better way, let me know.

mohrezaei · 2018-11-16T18:56:44Z

FYI, this is making things a bit difficult.

cuviper · 2018-11-16T21:08:37Z

I prefer one PR with all planned methods, so we can treat it as ready-to-publish once merged, without further breaking changes. Adding more methods on a trait is breaking unless there's a default impl.

I don't care how many commits are in your PR, though I do prefer a clean working progression over seeing useless "fix it" kind of commits.

I don't understand why it works with 1.20, which doesn't have u128, but not older...

I believe the reason is that 1.20 has unstable support for u128, so it at least knows how to parse the big literals, where earlier versions didn't. Hiding it in a macro allowed those earlier versions to discard it without trying to tokenize the contents. An alternative workaround would be to use a const expression of smaller literals, like 10000000000 * 10000000000.

cuviper

Please run cargo fmt -- it's going to expand a lot of your one-liners, but I want consistent formatting.

cuviper · 2018-11-16T21:10:32Z

src/power10.rs

+    if v >= 100 { return v == 100; }
+    if v >= 10 { return v == 10; }
+    if v >= 1 { return v == 1; }
+    false


I think these last two lines could be collapsed to just v == 1.
(but LLVM might figure that out anyway...)

cuviper · 2018-11-16T21:13:51Z

src/power10.rs

+fn is_pow10_u64(v: u64) -> bool {
+    let mut hash: u32 = v as u32;
+    hash ^= hash >> 3;
+    hash = hash.rotate_right(1);


Since rust-lang/rust#56009 has attributed your performance regression to rotate_right, I wonder why do you even need that here? That 1 MSB won't contribute to the actual hash index, so can this be a plain shr instead?

Actually, after further analyzing the issue, the better code is:

let mut hash: u32 = v as u32; hash ^= hash >> 3; hash = (hash >> 24) ^ (hash >> 1);

cuviper · 2018-11-16T21:14:17Z

src/power10.rs

+        #[inline]
+        pub fn is_pow10_u128(v: $T) -> bool {
+            let mut hash: u32 = v as u32 | (((v as u64) >> 32) as u32);
+            hash = hash.wrapping_mul(1249991743).rotate_right(25);


Here too, I think those 25 MSBs won't matter, so shr should be fine.

cuviper · 2018-11-16T21:17:08Z

Hmm, actually it looks like I haven't done a crate-wide fmt here before, so that would be a lot of noise in this PR. I need to get to that later...

cuviper · 2018-11-16T21:20:33Z

src/power10.rs

+    x.is_power_of_ten()
+}
+
+// implementation note: the is_power_of_ten algorithm is based on a perfect hash setup with very simple hash functions


Did you use any particular tool to create these perfect hash functions?

I have some code that does that. Nothing clever. It just picks random values and produces a hash until it finds something.

mohrezaei · 2018-11-16T22:02:12Z

So I need a little help/sounding board here. After looking more closely at the performance, I have two different 64-bit implementations (perfect_hash, lz) that behave like so:

RUSTFLAGS=""
perfect_hash: 9000 ns/iter.
lz: 7300 ns/iter.

RUSTFLAGS="-C target-cpu=native"
perfect_hash: 4100 ns/iter
lz: 5000 ns/iter.

I think perfect_hash vectorizes better. I can't make up my mind about which one is the right one for a general purpose library ☹️ help?

cuviper · 2018-11-16T23:03:24Z

Those numbers look close enough that I'd lean toward whichever will be easier to maintain.

The vectorization of your microbenchmark may or may not represent how well it can be optimized in real use, where you're more likely to have other computations involved too. So try not to extrapolate too much from these results.

mohrezaei · 2018-11-16T23:43:41Z

I agree that vectorization is not an important consideration. When I turn off vectorization, lz benefits from native, but perfect_hash does not. I was also worried about 32bit platforms, but a quick check shows that lz with target-cpu=native is better, but worse without native at least on i686.

On balance, I think I'll go with lz for u64. Gotta run the numbers for u32.

Thanks.

mohrezaei · 2018-11-19T16:17:30Z

This should be a good basis for a full review. I did a cargo fmt just on the new files. The arrays became quite long, but I left them as fmt wanted. I'll squash again after review edits.

A few things:

There is a bit of a naming issue right now: Ten vs 10. Power10, floor_log10 vs. next_power_of_ten, is_power_of_ten
I modeled the wrapping/checked/next_power_of_ten the same way as std, but floor_log10 just returns 0 for 0 (no alternate methods).
All methods take &self. That makes a lot of sense for a trait that may be used for things other than primitives. std uses self for similar methods, which I feel is problematic for a Power2 trait down the road. I don't intend to implement Power2.

mohrezaei · 2018-11-28T14:50:07Z

@cuviper ping, just in case this got lost in the holiday shuffle 😄

cuviper · 2018-12-01T01:34:38Z

Hi! This is still in my inbox to review, but I'll at least thrown opinions at your design questions. :)

There is a bit of a naming issue right now: Ten vs 10. Power10, floor_log10 vs. next_power_of_ten, is_power_of_ten

I think the method names are OK. The floats give a precedent for log10, and integer power_of_two gives a precedent for _ten, so we're on par with std inconsistency here. I could accept either way on the trait name, Power10 or PowerTen.

I modeled the wrapping/checked/next_power_of_ten the same way as std, but floor_log10 just returns 0 for 0 (no alternate methods).

I would expect 0 to be a panic case, because log(0) in any base is undefined.

You can also consider that log(1) is exactly 0, and log is strictly increasing, so floor_log(1) should be the least input which returns 0 -- leaving us with no answer for floor_log(0).

Aside on naming - I'm inclined to just call this log10 instead of floor_log10. I feel the flooring aspect is implicit when we're talking about unsigned integers, just like division.

All methods take &self. That makes a lot of sense for a trait that may be used for things other than primitives. std uses self for similar methods, which I feel is problematic for a Power2 trait down the road. I don't intend to implement Power2.

&self is appropriate, and I don't think it's a problem for Power2. Everything we would implement through std methods are Copy types, so they can just dereference to get self by value.

fixed cargo test --release

mohrezaei · 2018-12-07T21:11:10Z

Renamed floor_log10 to log10 and added checked/unchecked variants (wrapping didn't make sense?).
Also made sure tests pass in release mode.

BTW, interesting curiosity: 0f64.log10().floor() == 0 (would've made a good excuse for the method name and zero behavior 😅 )

cuviper · 2018-12-07T21:42:37Z

BTW, interesting curiosity: 0f64.log10().floor() == 0

I get -inf: playground

And even that only makes from the positive side, lim{x->0+} log(x)

mohrezaei · 2018-12-07T23:04:39Z

Right, sorry, this got me confused:
0f64.log10().floor() as u32 == 0
playground

strake · 2018-12-12T19:11:44Z

It seems we ought to rather have a general Power trait with a method is_power_of(self, other) -> bool which checks whether self is a power of other.

cuviper · 2018-12-12T19:19:59Z

I suggested generic powers in #14 too, but I think it's still OK to specialize common bases.

cuviper · 2018-12-14T02:31:08Z

src/power10.rs

+    fn is_power_of_ten(&self) -> bool;
+
+    /// Returns the base 10 logarithm value, truncated down.
+    /// Returns zero for zero in release mode, panics in debug mode.


I'd rather make zero an unconditional panic -- it's a domain error, just like divide-by-zero.

cuviper · 2018-12-14T02:32:20Z

src/power10.rs

+    fn log10(&self) -> u32; //note: u32 return type to allow BigInt types to implement. 10^(2^32) is 4 billion digits
+
+    /// Returns the base 10 logarithm value, truncated down.
+    /// Panics for zero.


A "checked" method usually implies returning an Option, and again checked_div is a good example.

cuviper · 2018-12-14T04:06:53Z

src/power10.rs

+    fn checked_log10(&self) -> u32;
+
+    /// Returns the base 10 logarithm value, truncated down.
+    /// Returns zero for zero. (This matches f64::log10()::floor() for zero.)


I'd remove this one. It only matches f64 given that you got -inf as u32 == 0, but I think this is actually undefined behavior -- rust-lang/rust#10184

…anic on zero

mohrezaei · 2018-12-14T22:45:29Z

The last commit should address the latest review comments. Let me know if there is more to do (I'll squash the commits once the review is done).

Oh, I should clarify one thing: the f64 stuff was mostly a joke. It's so hard to get that through in writing (sigh... 😓)

cuviper · 2018-12-14T23:48:24Z

The benchmarks are failing on nightly:

failures:

---- benchl10_u32_up_to_10000 stdout ----
thread 'benchl10_u32_up_to_10000' panicked at 'undefined value for log of zero', src/power10.rs:763:1
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- benchl10_u64_up_to_10000 stdout ----
thread 'benchl10_u64_up_to_10000' panicked at 'undefined value for log of zero', src/power10.rs:769:1

mohrezaei · 2018-12-14T23:57:21Z

It was the zero panic. Pushed a fix. BTW, the always panic behavior has a 50% cost in performance for u32 (from 0.8 ns per call to 1.2 ns per call).

cuviper · 2018-12-15T00:03:48Z

BTW, the always panic behavior has a 50% cost in performance for u32

Well, microbenchmarks are hard to extrapolate into real use cases, mixed with other computations. For instance, if the optimizer has enough range information from context to know the input can't be zero, then the check will probably be removed altogether.

cuviper requested changes Nov 16, 2018

View reviewed changes

cuviper reviewed Nov 16, 2018

View reviewed changes

Implement Power10 trait

a876ba3

mohrezaei force-pushed the master branch from 403dcd6 to a876ba3 Compare November 19, 2018 16:06

mohrezaei added 3 commits November 19, 2018 11:26

fix for older rust

41b9f9a

fix for rust 1.8.0

8c1da4b

fix for rust 1.8.0

c0c5519

renamed floor_log10 to log10, added checked/unchecked variants

f10417a

fixed cargo test --release

cuviper reviewed Dec 14, 2018

View reviewed changes

removed unchecked_log10, made checked_log10 return Option and log10 p…

617a87c

…anic on zero

avoid zero in the benchmarks for log

230ff4b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement Power10 trait, with first method is_power_of_ten #15

Implement Power10 trait, with first method is_power_of_ten #15

mohrezaei commented Nov 15, 2018 •

edited

mohrezaei commented Nov 15, 2018 •

edited

mohrezaei commented Nov 15, 2018

mohrezaei commented Nov 16, 2018

cuviper commented Nov 16, 2018

cuviper left a comment

cuviper Nov 16, 2018

mohrezaei Nov 16, 2018

cuviper Nov 16, 2018

mohrezaei Nov 16, 2018

cuviper Nov 16, 2018

mohrezaei Nov 16, 2018

cuviper commented Nov 16, 2018

cuviper Nov 16, 2018

mohrezaei Nov 16, 2018

mohrezaei commented Nov 16, 2018

cuviper commented Nov 16, 2018

mohrezaei commented Nov 16, 2018

mohrezaei commented Nov 19, 2018 •

edited

mohrezaei commented Nov 28, 2018

cuviper commented Dec 1, 2018

mohrezaei commented Dec 7, 2018

cuviper commented Dec 7, 2018

mohrezaei commented Dec 7, 2018

strake commented Dec 12, 2018

cuviper commented Dec 12, 2018

cuviper Dec 14, 2018

cuviper Dec 14, 2018

cuviper Dec 14, 2018

mohrezaei commented Dec 14, 2018

cuviper commented Dec 14, 2018

mohrezaei commented Dec 14, 2018

cuviper commented Dec 15, 2018

Implement Power10 trait, with first method is_power_of_ten #15

Are you sure you want to change the base?

Implement Power10 trait, with first method is_power_of_ten #15

Conversation

mohrezaei commented Nov 15, 2018 • edited

mohrezaei commented Nov 15, 2018 • edited

mohrezaei commented Nov 15, 2018

mohrezaei commented Nov 16, 2018

cuviper commented Nov 16, 2018

cuviper left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cuviper commented Nov 16, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mohrezaei commented Nov 16, 2018

cuviper commented Nov 16, 2018

mohrezaei commented Nov 16, 2018

mohrezaei commented Nov 19, 2018 • edited

mohrezaei commented Nov 28, 2018

cuviper commented Dec 1, 2018

mohrezaei commented Dec 7, 2018

cuviper commented Dec 7, 2018

mohrezaei commented Dec 7, 2018

strake commented Dec 12, 2018

cuviper commented Dec 12, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mohrezaei commented Dec 14, 2018

cuviper commented Dec 14, 2018

mohrezaei commented Dec 14, 2018

cuviper commented Dec 15, 2018

mohrezaei commented Nov 15, 2018 •

edited

mohrezaei commented Nov 15, 2018 •

edited

mohrezaei commented Nov 19, 2018 •

edited