ENH: Add the negative binomial distribution to rand_distr. #1296

WarrenWeckesser · 2023-03-05T07:33:50Z

No description provided.

dhardy

The negative binomial makes no sense for r=0. But we can generalise to p=0, since our output is floating-point which has a representation for infinity?

The code style looks good.

As for the implementation, all I can say is that it agrees with the mentioned reference, 10.1007/978-1-4613-8643-8. I tried comparing with 10.1002/0471715816 (Johnson 2005, Univariate Discrete Distributions), but my understanding of statistics is lacking. Perhaps @saona-raimundo would be willing to take a look?

WarrenWeckesser · 2023-04-24T16:22:41Z

> But we can generalise to p=0, since our output is floating-point which has a representation for infinity?

If p=0, the probability mass function of the distribution would be $p_k = 0$ for all $k$. This is not a valid probability distribution. I don't think returning inf in this case is a meaningful generalization.
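As a sketch of that argument, assume the usual parameterization in which $p$ is the success probability and $k$ counts the failures before the $r$-th success (the thread does not spell the formula out, so this form is an assumption):

$$
p_k = \frac{\Gamma(k + r)}{k!\,\Gamma(r)}\, p^r (1 - p)^k, \qquad k = 0, 1, 2, \ldots
$$

With $r > 0$ and $p = 0$, the factor $p^r$ vanishes, so $p_k = 0$ for every $k$ and $\sum_k p_k = 0 \neq 1$.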

saona-raimundo · 2023-04-24T21:23:27Z

Hi!
I am okay with p=0 in this case, especially considering the interpretation of the Negative Binomial:

> When `r` is an integer, the negative binomial distribution can be interpreted as the distribution of the number of failures in a sequence of Bernoulli trials that continue until `r` successes occur.

Note that Wikipedia accepts p = 0, even if the probability mass function does not make sense in that case.
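The Bernoulli-trial interpretation can be illustrated with a minimal sketch. It is not the algorithm used in the PR, and the `XorShift` generator, the function name, and the `max_trials` cap are all hypothetical helpers introduced only so the example needs no external crates.

```rust
/// Tiny xorshift64* generator (hypothetical helper, not part of rand or rand_distr).
/// The seed must be nonzero.
struct XorShift(u64);

impl XorShift {
    /// Returns a uniform f64 in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D);
        // Keep the top 53 bits so the result fits exactly in an f64 mantissa.
        (bits >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Counts the failures seen before the `r`-th success, where each trial
/// succeeds with probability `p`. Returns `None` if `max_trials` trials
/// were not enough to reach `r` successes.
fn failures_until_r_successes(
    rng: &mut XorShift,
    r: u32,
    p: f64,
    max_trials: u64,
) -> Option<u64> {
    let mut successes = 0u32;
    let mut failures = 0u64;
    let mut trials = 0u64;
    while successes < r {
        if trials == max_trials {
            return None;
        }
        trials += 1;
        if rng.next_f64() < p {
            successes += 1;
        } else {
            failures += 1;
        }
    }
    Some(failures)
}
```

With `p = 0.0` no trial ever succeeds, so the function returns `None` for any finite `max_trials`. Under this interpretation that corresponds to an infinite number of failures.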

Regarding the implementation, I confirm it corresponds to the citation.
Testing the shape of the density is hard (see #357 for a discussion), but have you checked that the implementation behaves as expected?

@dhardy, I will check your reference 10.1002/0471715816 to see if there are improved algorithms proposed.

saona-raimundo · 2023-04-25T07:25:50Z

rand_distr/src/negative_binomial.rs

+                // and saved in the NegativeBinomial instance, because it
+                // depends on just the parameters `r` and `p`.  We have to
+                // create a new Poisson instance for each variate generated.
+                Poisson::<F>::new(gamma.sample(rng)).unwrap().sample(rng)


Instead of calling `unwrap`, one should handle the case where `gamma.sample(rng)` returns a float that should not be accepted.
I suggest introducing a loop that samples until it gets a finite value.
The `Float` trait has the method `is_finite` for this.

vks · May 1, 2023

Can this happen? The Gamma distribution should return strictly positive values, so Poisson::new should never fail.

saona-raimundo · May 1, 2023

Sorry, my point was about handling infinity as the result of simulating the gamma variable.

Nothing ensures that Gamma always samples a positive and finite float: neither the signature of the sample method nor its documentation guarantees it.

At some point there was a discussion about who should handle infinity coming out of the simulation: the library or the user. I thought the decision was that the user should handle infinite floats, but maybe I am wrong. This is why I suggest handling a possible infinite value here with a loop.

WarrenWeckesser · May 3, 2023

I agree, I need to fix this. If p is extremely small (e.g. 1e-40), then the scale passed to Gamma is huge (1e+40), and with such a scale, Gamma will generate samples that are infinite.

A simple loop would not be safe if we don't have a bound on how frequently infinity is generated.
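One way the overflow can arise is sketched below, assuming the Gamma scale is computed as `(1 - p) / p` (an assumption, though it matches the 1e-40 to 1e+40 figures quoted above) and that the float type is `f32`. The function names are hypothetical.

```rust
/// Gamma scale for the gamma-Poisson mixture in single precision,
/// assuming scale = (1 - p) / p.
fn gamma_scale_f32(p: f32) -> f32 {
    (1.0 - p) / p
}

/// The same expression in double precision.
fn gamma_scale_f64(p: f64) -> f64 {
    (1.0 - p) / p
}

/// Returns (scale in f32, scale in f64) for p = 1e-40.
fn scales_for_tiny_p() -> (f32, f64) {
    let p = 1e-40_f64;
    // 1e-40 is below the smallest normal f32 but still representable as a
    // subnormal, so the cast does not round to zero.
    (gamma_scale_f32(p as f32), gamma_scale_f64(p))
}
```

In `f32` the quotient exceeds `f32::MAX` (about 3.4e38), so the scale itself is already infinite before Gamma is sampled. In `f64` the scale is finite at about 1e40.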

saona-raimundo · May 3, 2023

Yeah, I was not even thinking about extreme values, just the unwrap.
The thing is, if gamma samples infinity, then the Poisson "should" also be infinity.
So, instead of a loop, one should introduce an `if` that checks whether the gamma sample is infinite.
If it is, return infinity directly; if it is not, sample the Poisson (created with new and unwrap).
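The suggested check might look like the following minimal sketch. The helper name is hypothetical, and the `sample_poisson` closure stands in for `Poisson::new(lambda).unwrap().sample(rng)` so that the example compiles without rand_distr. It is not the code that ended up in the PR.

```rust
/// Given a value `lambda` drawn from the Gamma stage, returns infinity
/// directly when `lambda` is infinite, and otherwise defers to the
/// Poisson stage. NaN and non-positive values are not handled here.
fn poisson_given_gamma<F>(lambda: f64, sample_poisson: F) -> f64
where
    F: FnOnce(f64) -> f64,
{
    if lambda.is_infinite() {
        // A Poisson with infinite mean is treated as producing infinity.
        f64::INFINITY
    } else {
        sample_poisson(lambda)
    }
}
```

Because the branch runs once per variate, it avoids the unbounded retry loop discussed above.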

saona-raimundo · 2023-04-25T07:49:19Z

The reference 10.1002/0471715816 "Univariate Discrete Distributions" is really nice!
From pages 221-222, this is the description for the simulation of the negative binomial random variable.

> The negative binomial with an integer parameter k = N can be generated as
> the sum of N geometric rv’s. Except for low values of N (say N = 2, 3, 4), this
> method cannot be advocated as it requires many uniforms for a single output
> negative binomial rv. This argument applies a fortiori to the use of the sum of a
> Poisson number of logarithmic rv’s.
>
> The method generally recommended for generating negative binomial rv’s
> with changing parameters is to generate Poisson rv’s with random parameters
> drawn from a gamma distribution [see, e.g., algorithm NB3 in Fishman (1978)].
> For fixed parameters the use of a fast general method, such as indexed table
> look-up, alias, or frequency table, is recommended.

The reference, Fishman, G. S. (1978), Principles of Discrete Event Simulation, New York: Wiley, is not that easy to find online, but we can assume that it refers to the same method implemented in the PR.

If I am not mistaken, rand_distr does not generally implement distributions by table look-ups, so I think the PR is the way to go.
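For comparison, the first method in the quoted passage (summing N geometric variates) can be sketched as follows. This is an illustration only, not the approach taken in the PR. It assumes `0 < p < 1`, assumes each uniform lies strictly inside (0, 1), and takes the uniforms as a slice so that the cost of one uniform per geometric variate is visible. Both function names are hypothetical.

```rust
/// Maps one uniform `u` in (0, 1) to a geometric variate: the number of
/// failures before the first success when each trial succeeds with
/// probability `p`, where 0 < p < 1.
fn geometric_from_uniform(u: f64, p: f64) -> u64 {
    // Inversion: P(X >= k) = (1 - p)^k, hence X = floor(ln(u) / ln(1 - p)).
    // ln_1p(-p) computes ln(1 - p) accurately for small p.
    (u.ln() / (-p).ln_1p()).floor() as u64
}

/// Negative binomial variate with integer parameter N = uniforms.len(),
/// built as the sum of N geometric variates (one uniform each).
fn negative_binomial_from_uniforms(uniforms: &[f64], p: f64) -> u64 {
    uniforms.iter().map(|&u| geometric_from_uniform(u, p)).sum()
}
```

Each output consumes N uniforms, which is the cost the quoted passage objects to for anything but small N.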

vks · 2023-04-25T11:12:20Z

For the normal distribution, we are using tables (Ziggurat algorithm).

However, I agree that the approach here is fine.

dhardy · 2023-04-25T15:17:47Z

Great, and thanks for the review. Then are we agreed to merge this (once the above is corrected)? I didn't review in detail.

rand_distr/src/negative_binomial.rs

vks

Looks good, thanks! We just need to update the changelog.

Co-authored-by: Vinzent Steinberg <Vinzent.Steinberg@gmail.com>

dhardy · 2023-05-03T18:13:06Z

Looks ready to merge @WarrenWeckesser?

WarrenWeckesser · 2023-05-03T18:19:15Z

@dhardy, I'm still looking into the issue that @saona-raimundo raised here. I'm checking how extreme values of the parameters might break things.

dhardy · 2024-02-08T10:10:02Z

@WarrenWeckesser are you still working on this?

WarrenWeckesser · 2024-03-08T20:21:48Z

I've been away from this (and most of my other open source work) for much of last year, but I haven't forgotten about it. I have a project that I need to finish up before I can get back to this. That might take a few weeks.

ENH: Add the negative binomial distribution.

4ece871

WarrenWeckesser mentioned this pull request Mar 5, 2023

Negative binomial distribution? #1295

Closed

WarrenWeckesser added 2 commits March 5, 2023 11:36

Add value stability tests for NegativeBinomial.

93632f2

Add a comment about the generation of negative binomial variates.

c459991

dhardy reviewed Apr 17, 2023

saona-raimundo reviewed Apr 25, 2023

vks reviewed May 1, 2023

vks approved these changes May 1, 2023

WarrenWeckesser and others added 3 commits May 3, 2023 12:00

Simplify expression that checks for invalid p.

43f3d66

Co-authored-by: Vinzent Steinberg <Vinzent.Steinberg@gmail.com>

Add a comment that a validation expression also catches nan.

cedbd83

rand_distr: Update CHANGELOG.md: new NegativeBinomial distribution.

7643943

WarrenWeckesser mentioned this pull request May 3, 2023

Poisson sample() hangs when lambda is close to max of the float type. #1312

Open

dhardy approved these changes May 3, 2023
