Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PERT distribution for when mode is very close to (max - min) / 2 #1311

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

LukasKalbertodt
Copy link

There is already a special condition for this case, as the general formula breaks for floats. However, the check uses == which is fishy for floats. This commit instead checks if the difference is smaller than the machine epsilon.

Without this commit, this returns an error (despite being totally valid parameters for PERT):

Pert::new(0.0, 0.48258883, 0.24129441)

@@ -99,7 +99,7 @@ where

let range = max - min;
let mu = (min + max + shape * mode) / (shape + F::from(2.).unwrap());
let v = if mu == mode {
let v = if (mu - mode).abs() < F::epsilon() {
Copy link
Member

@dhardy dhardy May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epsilon is not the appropriate value to use here: e.g. if min and max are very small this test will be too permissive. I'd suggest something like 2 * ε / mu 2εμ.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've read up on floating point comparisons (in particular this) and force pushed some changes. I'm now using 2 * F::EPSILON * max(mu.abs(), mode.abs()). This is something commonly suggested, i.e. scaling the machine epsilon by the maximum of the two inputs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might still go wrong if one of the values is 0. The linked article talks a lot about how that's quite difficult to deal with. And actually, the popular approx crate treats numbers as equal if they differ less than F::epsilon, at least by default. (To the best of my understanding.) This is to deal with numbers close to 0. So even for ULP or relative epsilon comparison, this absolute epsilon comparison is done first. So actually, thinking about it, my initial commit might be fine for small numbers, but it's main problem was probably large numbers (where F::epsilon is smaller than the gap between two neighboring floats).
Uff, this is tricky :/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a new version again. This time, the epsilon is scaled by the largest input. I think with this, everything should be fine. In particular, this deals with "catastrophic cancellation": where the inputs (min, max, mode, shape) are quite large, but due to subtracting them from one another, we keep the quite large rounding error even for mu close to 0. This is also recommended in the linked article:

You’ll need to use an absolute epsilon, whose value might be some small multiple of FLT_EPSILON and the inputs to your calculation.

@LukasKalbertodt LukasKalbertodt force-pushed the fix-pert-distribution branch 3 times, most recently from 880e1ac to ebd1d98 Compare May 20, 2023 11:59
Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your previous suggestion makes the most sense to me:

let thresh = 2 * F::EPSILON * max(mu.abs(), mode.abs());

// to the whole calculation.
//
// https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
let epsilon = shape.max(min.abs()).max(max.abs()) * F::epsilon() * (F::one() + F::one());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must you call this complicated expression epsilon? And why does it involve shape anyway?

There is already a special condition for this case, as the general
formula breaks for floats. However, the check uses `==` which is fishy
for floats. This commit instead checks if the difference is smaller than
the machine epsilon.

Without this commit, this returns an error (despite being totally valid
parameters for PERT):

    Pert::new(0.0, 0.48258883, 0.24129441)
@LukasKalbertodt
Copy link
Author

Your previous suggestion makes the most sense to me:

let thresh = 2 * F::EPSILON * max(mu.abs(), mode.abs());

I think the problem with this is that mu can have an error that is not proportional to its magnitude. If the inputs of that calculation (e.g. min and max) are considerably larger than what mu ends up being, mu has an error that's proportional to those large inputs.

But... I've looked at this quite a while now and I'm not 100% sure about anything. It doesn't help that I don't know the semantic meaning of mu, shape or many other parts. So, I'm fine with merging the threshold calculation you mentioned. I pushed that now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants