Using logfactorial instead of loggamma in random_poisson sampler

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Using logfactorial instead of loggamma in random_poisson sampler

zoj613
This post was updated on .
Hi All,

I noticed that the transformed rejection method for generating Poisson
random variables used in numpy makes use of the `random_loggam` function
which directly calculates the log-gamma function. It appears that a
log-factorial lookup table was added a few years back which could be used in
place of random_loggam since the input is always an integer. Is there a
reason for not using this table instead? See link below for the line of
code:

https://github.com/numpy/numpy/blob/6222e283fa0b8fb9ba562dabf6ca9ea7ed65be39/numpy/random/src/distributions/distributions.c#L572




--
Sent from: http://numpy-discussion.10968.n7.nabble.com/
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Using logfactorial instead of loggamma in random_poisson sampler

Warren Weckesser-2
On 3/6/21, zoj613 <[hidden email]> wrote:

> Hi All,
>
> I noticed that the transformed rejection method for generating Poisson
> random variables used in numpy makes use of the `random_loggam` function
> which directly calculates the log-gamma function. It appears that a
> log-factorial lookup table was added a few years back which could be used
> in
> place of random_loggam since the input is always an integer. Is there a
> reason for not using this table instead? See link below for the line of
> code:
>
> https://github.com/numpy/numpy/blob/6222e283fa0b8fb9ba562dabf6ca9ea7ed65be39/numpy/random/src/distributions/distributions.c#L572
>
> Regards
> Zolisa
>

Hi Zolisa,

In the pull request where the C function logfactorial was added
(https://github.com/numpy/numpy/pull/13761), I originally modified the
Poisson code to use logfactorial as you suggest, but Kevin (@bashtage
on github) pointed out that the change could potentially alter the
random stream for the legacy version. Making the change requires
creating separate C functions, one for the legacy code that remains
unchanged, and one for the newer Generator class that would use
logfactorial.  You can see the comments here (click on "Show
resolved"):

    https://github.com/numpy/numpy/pull/13761#pullrequestreview-249973405

At the time, making that change was not a high priority, so I didn't
pursue it. It does make sense to use the logfactorial function there,
and I'd be happy to see it updated, but be aware that making the
change is more work than changing just the function call.

Warren

>
>
> --
> Sent from: http://numpy-discussion.10968.n7.nabble.com/
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Using logfactorial instead of loggamma in random_poisson sampler

zoj613
Ah, I had a suspicion that it was to preserve the random stream but wasn't
too sure. Thanks for the clarification.



--
Sent from: http://numpy-discussion.10968.n7.nabble.com/
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Using logfactorial instead of loggamma in random_poisson sampler

Robert Kern-2
In reply to this post by Warren Weckesser-2
On Sat, Mar 6, 2021 at 1:45 PM Warren Weckesser <[hidden email]> wrote:
At the time, making that change was not a high priority, so I didn't
pursue it. It does make sense to use the logfactorial function there,
and I'd be happy to see it updated, but be aware that making the
change is more work than changing just the function call.

Does it make a big difference? Per NEP 19, even in `Generator`, we do weigh the cost of changing the stream reasonably highly. Improved accuracy is likely worthwhile, but a minor performance improvement is probably not.

--
Robert Kern

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Using logfactorial instead of loggamma in random_poisson sampler

bashtage
I did a quick test and using random_loggam was about 6% faster than using logfactorial (on Windows).

Kevin


On Sun, Mar 7, 2021 at 2:40 AM Robert Kern <[hidden email]> wrote:
On Sat, Mar 6, 2021 at 1:45 PM Warren Weckesser <[hidden email]> wrote:
At the time, making that change was not a high priority, so I didn't
pursue it. It does make sense to use the logfactorial function there,
and I'd be happy to see it updated, but be aware that making the
change is more work than changing just the function call.

Does it make a big difference? Per NEP 19, even in `Generator`, we do weigh the cost of changing the stream reasonably highly. Improved accuracy is likely worthwhile, but a minor performance improvement is probably not.

--
Robert Kern
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Using logfactorial instead of loggamma in random_poisson sampler

zoj613
What do you think is the explanation for that? I had assumed that using a
lookup table would be faster considering that the loggam implementation has
loops and makes calls to elementary functions in it.



--
Sent from: http://numpy-discussion.10968.n7.nabble.com/
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion