
Higher quality (0, 1] floats #1346

Open · vks opened this issue Oct 18, 2023 · 3 comments

@vks (Collaborator) commented Oct 18, 2023

Background

Motivation: It's possible to get higher-quality floats without having to add a loop.

Application: I don't have a concrete application, but this approach can generate floats below 2^-53, and it never generates 0 (which an ideal generator would output with probability 2^-1075). It can also produce more distinct floats than our current approach.

Feature request

Implement another (0, 1] distribution.
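For concreteness, here is a hedged sketch of the general technique (not taken from any particular article, and not rand's API): draw a 52-bit mantissa, pick the binade (2^-(z+1), 2^-z] geometrically by counting trailing zeros of the leftover random bits, and resample those bits at most once, so at most two 64-bit draws are ever needed. The `next_u64` closure stands in for an RNG.

```rust
/// Hedged sketch of a (0, 1] sampler with a geometrically chosen exponent.
/// `next_u64` stands in for an RNG; the bit layout is illustrative only.
fn open_closed_01(mut next_u64: impl FnMut() -> u64) -> f64 {
    let bits = next_u64();
    let mantissa = bits >> 12; // top 52 bits
    let geometric = bits & 0xFFF; // low 12 bits select the binade
    let zeros = if geometric != 0 {
        geometric.trailing_zeros()
    } else {
        // All 12 bits were zero: resample once (at most two steps total).
        12 + next_u64().trailing_zeros().min(63)
    };
    // Layer z = `zeros` covers (2^-(z+1), 2^-z]; within it, the 52-bit
    // mantissa selects one of 2^52 equally spaced values, top-inclusive.
    ((1u64 << 52) + mantissa + 1) as f64 * 2f64.powi(-53 - zeros as i32)
}

fn main() {
    // All-ones bits give the top of the top layer: exactly 1.0.
    assert_eq!(open_closed_01(|| u64::MAX), 1.0);
    // All-zero bits give a tiny but nonzero value, far below 2^-53.
    let tiny = open_closed_01(|| 0);
    assert!(tiny > 0.0 && tiny < 2f64.powi(-53));
    println!("tiny = {tiny:e}");
}
```

Because layer z is chosen with probability 2^-(z+1) and has width 2^-(z+1), the density stays uniform across layers while resolution improves near zero, which is where the 53-bit multiply method runs out of distinct values.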

@dhardy (Member) commented Oct 18, 2023

So it uses a maximum of two steps, a bit like Canon's method. Might be generally preferable to #531, but probably still has a significant cost overhead?

At any rate, it may be worth investigating (implementing and benchmarking at least), but not something I'm going to put on my to-do list.

@josephlr (Member) commented Oct 24, 2023

Initial benchmarks for f64 on the OpenClosed01 distribution (test distr_openclosed01_f64):

  • Existing int cast + multiply: 1,089 ns/iter (+/- 7) = 7346 MB/s
  • Implementation in the article: 1,528 ns/iter (+/- 17) = 5235 MB/s
  • Implementation w/o resampling the exponent: 1,312 ns/iter (+/- 15) = 6097 MB/s
  • Control, just casting u64 to f64: 991 ns/iter (+/- 10) = 8072 MB/s

EDIT: testing done on a Zen3 x86_64 processor, but I didn't pass -C target-cpu=native, so rep bsf was being used instead of tzcnt. Rerunning with -C target-cpu=native seemed to make all the microbenchmarks slower, even the existing implementation, which is odd.

  • Existing int cast + multiply: 1,217 ns/iter (+/- 25) = 6573 MB/s
  • Implementation in the article: 1,617 ns/iter (+/- 100) = 4947 MB/s
  • Implementation w/o resampling the exponent: 1,458 ns/iter (+/- 20) = 5486 MB/s
  • Control, just casting u64 to f64: 963 ns/iter (+/- 10) = 8307 MB/s
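For comparison, the "int cast + multiply" baseline works roughly like this (a paraphrase of the idea, not a verbatim copy of rand's OpenClosed01 code): take 53 random bits, add one so the result can never be zero, and scale by 2^-53.

```rust
/// Rough sketch of the "int cast + multiply" baseline for (0, 1]:
/// 53 random bits plus one, scaled by 2^-53. Not rand's exact code.
fn baseline_open_closed_01(bits: u64) -> f64 {
    let value = (bits >> 11) + 1; // 53 bits, shifted into [1, 2^53]
    value as f64 * 2f64.powi(-53)
}

fn main() {
    // Top of the range is exactly 1.0; bottom is 2^-53, never 0.
    assert_eq!(baseline_open_closed_01(u64::MAX), 1.0);
    assert_eq!(baseline_open_closed_01(0), 2f64.powi(-53));
}
```

Note that this baseline can never produce a value below 2^-53, which is exactly the limitation the issue describes.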

@dhardy (Member) commented Oct 24, 2023

Thanks. Overhead there is not negligible but is small enough that it could be offered as an alternative to the current implementation under a feature flag, if there is genuine interest in using it.
