overhauling numpy.random and randomgen

overhauling numpy.random and randomgen

mattip
Thanks to the work of Kevin Sheppard, Robert Kern and others, the branch
to merge randomgen https://github.com/bashtage/randomgen into numpy is
ready for final review.

The branch is here https://github.com/numpy/numpy/pull/13163. It is
fully backward compatible: numpy.random.mtrand,
numpy.random.RandomState, and the various stateful distributions from
RandomState available as numpy.random.* produce the same streams as the
current versions. The branch is intended to implement NEP 19
https://www.numpy.org/neps/nep-0019-rng-policy.html
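
As a rough illustration of what "produce the same streams" means for the legacy interface, here is a minimal sketch. It uses only the long-standing numpy.random.seed and numpy.random.RandomState calls. The seed value and variable names are arbitrary choices for this example, not anything taken from the branch.

```python
import numpy as np

SEED = 2019  # arbitrary seed chosen for this example

# The legacy module-level functions draw from a hidden, global RandomState.
np.random.seed(SEED)
from_module = np.random.standard_normal(5)

# An explicit RandomState seeded the same way yields the identical stream.
rs = np.random.RandomState(SEED)
from_instance = rs.standard_normal(5)

same_stream = np.array_equal(from_module, from_instance)
print(same_stream)
```

Running the same script against the current release and against the branch, then comparing the printed arrays, is one simple way to spot-check the compatibility claim for the distributions a package relies on.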


The biggest change is that now there are a variety of random number
generators available
https://6722-908607-gh.circle-artifacts.com/0/home/circleci/repo/doc/build/html/reference/random/brng/index.html,
and a class numpy.random.RandomGenerator that can produce all the
distributions from RandomState. A RandomGenerator instance is provided
for convenience as numpy.random.gen.
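
A minimal sketch of the split between a bit-producing generator and the class that turns those bits into distributions. Important assumption: the code uses the names found in released NumPy (1.17 and later), np.random.Generator and np.random.PCG64, which may not match the RandomGenerator and brng names in the branch under review. The seed is arbitrary.

```python
import numpy as np

# The low-level source of random bits (a "BRNG" in this thread's terms).
bit_source = np.random.PCG64(12345)

# The distribution layer wraps the bit source.
gen = np.random.Generator(bit_source)

draws = gen.standard_normal(4)       # floats from N(0, 1)
ints = gen.integers(0, 10, size=4)   # integers in [0, 10)

# Re-creating both layers with the same seed reproduces the stream.
repeat = np.random.Generator(np.random.PCG64(12345)).standard_normal(4)
print(np.array_equal(draws, repeat))
```

Swapping PCG64 for another bit source such as np.random.MT19937 changes the underlying stream but not the distribution methods available on the wrapper.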


Additional enhancements
https://6722-908607-gh.circle-artifacts.com/0/home/circleci/repo/doc/build/html/reference/random/new-or-different.html 
allow convenient use of the new constructs in CFFI, Numba, Ctypes, and
Cython.


There are a few things to address before merging:

- Review the new constructs and other APIs

- Decide which BRNGs to include in the first release

- Check that your packages still work with the new implementations. You
can do this by creating a new virtualenv and installing numpy via
pip install git+https://github.com/mattip/numpy.git@randomgen


We will try to have a final video call about the branch during the
upcoming meeting on May 10-11. More details will follow once we schedule
the call. The goal is to merge it for the upcoming 1.17 release.


The expectation is that this first merge will be followed by
implementation and documentation tweaks and improvements, but we hope to
get as many of the major pieces in place as possible now.


Matti


Notes:


Sorry for the long URLs; they link to the generated documentation from
CI and may not be available a few weeks from now.

There is a tracking issue for further numpy.random work related to the
PR: https://github.com/numpy/numpy/issues/13164

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: overhauling numpy.random and randomgen

Stephan Hoyer-2
Matti, Kevin and Robert -- thanks for putting this together! I am very excited about these long-awaited improvements to numpy.random.

I have a number of concerns about the user-facing API, starting with the names "Random Generator" and "Base Random Number Generator," which I suspect will be a source of confusion. In particular, the current docs seem to use the term "random number generator" interchangeably for both.

Rather than "base RNG", what about calling these classes a "random source" or "random stream"? In particular, I would suggest defining two Python classes:
- np.random.Generator as a less redundant name for what is currently called RandomGenerator
- np.random.Source or np.random.Stream as an abstract base class for what are currently called "base RNGs"

Even if we don't yet provide an API for defining sources of randomness outside of NumPy, a base class for sources of randomness is valuable because it clearly defines the shared interface.
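
To make the two-class proposal concrete, here is a minimal pure-Python sketch. Everything in it is hypothetical: the Source and Generator classes, the next_uint64 method, and the toy StdlibSource are illustration names only and are not part of NumPy or of the branch.

```python
import random
from abc import ABC, abstractmethod


class Source(ABC):
    """Hypothetical shared interface for a source of raw random bits."""

    @abstractmethod
    def next_uint64(self) -> int:
        """Return the next 64 random bits as a non-negative int."""


class StdlibSource(Source):
    """Toy source backed by Python's random.Random (illustration only)."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def next_uint64(self) -> int:
        return self._rng.getrandbits(64)


class Generator:
    """Hypothetical distribution layer that relies only on the Source interface."""

    def __init__(self, source: Source) -> None:
        self.source = source

    def uniform(self) -> float:
        # Keep the top 53 bits to build a float in [0, 1).
        return (self.source.next_uint64() >> 11) / float(1 << 53)


gen = Generator(StdlibSource(seed=7))
sample = gen.uniform()
print(0.0 <= sample < 1.0)
```

The abstract base class documents the contract in one place: instantiating Source directly raises TypeError, and any subclass that implements next_uint64 can be handed to Generator unchanged.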

There are also a couple of convenience attributes in the user-facing API that I would suggest refining:
- The "brng" attribute of RandomGenerator is not a very descriptive name. I would prefer "stream" or "source", or the more explicit "base_rng" if we stick with that term.
- I don't think we need the "generator" property on base RNG objects. It is fine to require writing np.random.Generator(base) instead. Looking at the implementation, .generator caches the RandomGenerator objects it creates on the base RNG, which creates a reference cycle. Yes, Python can garbage collect reference cycles, but this is still a muddled data model.
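
The reference cycle mentioned in the last point can be sketched with two stand-in classes. Base and Gen below are hypothetical names that mimic the caching pattern. They are not the actual implementation, and the behaviour shown assumes CPython's reference counting.

```python
import gc
import weakref


class Gen:
    """Stand-in for the distribution object; holds a reference to its base."""

    def __init__(self, base):
        self.base = base


class Base:
    """Stand-in for a base RNG that caches the generator it creates."""

    def __init__(self):
        self._generator = None

    @property
    def generator(self):
        if self._generator is None:
            self._generator = Gen(self)  # base -> gen -> base: a cycle
        return self._generator


gc.disable()  # leave only reference counting active for the demonstration

# Cached property: the two objects keep each other alive.
base = Base()
gen = base.generator
probe = weakref.ref(base)
del base, gen
alive_after_del = probe() is not None

gc.collect()  # the cycle collector is needed to reclaim them
alive_after_gc = probe() is not None

# Explicit construction: gen -> base only, so no cycle.
base2 = Base()
gen2 = Gen(base2)
probe2 = weakref.ref(base2)
del base2, gen2
freed_without_gc = probe2() is None

gc.enable()
print(alive_after_del, alive_after_gc, freed_without_gc)
```

With explicit construction the ownership runs in one direction only, which is the simpler data model argued for above.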

Finally, why do we expose the np.random.gen object? I thought part of the idea with the new API was to avoid global mutable state.
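
A short sketch of why shared global state is a concern, using only the existing numpy.random.seed and numpy.random.RandomState calls. The noisy_helper function is a made-up stand-in for any third-party code that draws from the module-level stream.

```python
import numpy as np


def noisy_helper():
    """Hypothetical library code that consumes the global stream as a side effect."""
    np.random.random_sample()


# Global state: an unrelated call shifts the values the caller sees.
np.random.seed(42)
first = np.random.random_sample()

np.random.seed(42)
noisy_helper()
second = np.random.random_sample()

# Local state: an explicit instance is unaffected by the helper.
local = np.random.RandomState(42)
a = local.random_sample()

local = np.random.RandomState(42)
noisy_helper()
b = local.random_sample()

print(first == second, a == b)
```

A module-level convenience instance such as np.random.gen would be shared in the same way as the legacy global stream, so code that needs reproducibility would still have to create its own instance.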

On Thu, Apr 18, 2019 at 7:20 AM Matti Picus <[hidden email]> wrote: