speed of random number generator compared to Julia

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

speed of random number generator compared to Julia

Pierre Haessig-2

Hello,


I have a simulation code which uses intensively random number generation (e.g. normal rng). I wanted to compare the performance of this simulation across some Numpy implementations and some Julia implementation. In the end, in both languages, the respective best performing implementations are dominated by the rng, but Julia wins by a factor of 4-5, because the rng is 4-5x faster.

Here are some timing results:

In IPython (Python 3.5.2 from Anaconda)

import numpy as np
N = 10**6

%timeit np.random.normal(size=N)
10 loops, best of 3: 37.1 ms per loop

 %timeit np.random.uniform(size=N)
100 loops, best of 3: 10.2 ms per loop

In Julia (0.4.7 x86_64 from Debian testing)

N = 10^6

@time randn(N);
  0.007802 seconds (6 allocations: 7.630 MB)

@time rand(N);
  0.002059 seconds (8 allocations: 7.630 MB) (with some variations between trials)

Results are consistent in the sense that generating Gaussian numbers is 3-4 times slower than uniform in Python and in Julia.

But how come Julia is 4-5x faster since Numpy uses C implementation for the entire process ? (Mersenne Twister -> uniform double -> Box-Muller transform to get a Gaussian https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c). Also I noticed that Julia uses a different algorithm (Ziggurat Method from Marsaglia and Tsang , https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700) but this doesn't explain the difference for uniform rng.


best,

Pierre


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Pierre Haessig-2

Hello,

Le 30/03/2017 à 13:31, Pierre Haessig a écrit :
[....]

But how come Julia is 4-5x faster since Numpy uses C implementation for the entire process ? (Mersenne Twister -> uniform double -> Box-Muller transform to get a Gaussian https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c). Also I noticed that Julia uses a different algorithm (Ziggurat Method from Marsaglia and Tsang , https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700) but this doesn't explain the difference for uniform rng.

Any ideas?

Do you think Stackoverflow would be a better place for my question?

best,

Pierre


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Jaime Fernández del Río
On Mon, Apr 3, 2017 at 3:20 PM, Pierre Haessig <[hidden email]> wrote:

Hello,

Le 30/03/2017 à 13:31, Pierre Haessig a écrit :
[....]

But how come Julia is 4-5x faster since Numpy uses C implementation for the entire process ? (Mersenne Twister -> uniform double -> Box-Muller transform to get a Gaussian https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c). Also I noticed that Julia uses a different algorithm (Ziggurat Method from Marsaglia and Tsang , https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700) but this doesn't explain the difference for uniform rng.

Any ideas?

This says that Julia uses this library, which is different from the home brewed version of the Mersenne twister in NumPy. The second link I posted claims their speed comes from generating double precision numbers directly, rather than generating random bytes that have to be converted to doubles, as is the case of NumPy through this magical incantation. They also throw the SIMD acronym around, which likely means their random number generation is parallelized.

My guess is that most of the speed-up comes from the SIMD parallelization: the Mersenne algorithm does a lot of work to produce 32 random bits, so that likely dominates over a couple of arithmetic operations, even if divisions are involved.

Jaime

Do you think Stackoverflow would be a better place for my question?

best,

Pierre


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion




--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Neal Becker

On Mon, Apr 3, 2017 at 9:45 AM Jaime Fernández del Río <[hidden email]> wrote:
On Mon, Apr 3, 2017 at 3:20 PM, Pierre Haessig <[hidden email]> wrote:

Hello,

Le 30/03/2017 à 13:31, Pierre Haessig a écrit :
[....]

But how come Julia is 4-5x faster since Numpy uses C implementation for the entire process ? (Mersenne Twister -> uniform double -> Box-Muller transform to get a Gaussian https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c). Also I noticed that Julia uses a different algorithm (Ziggurat Method from Marsaglia and Tsang , https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700) but this doesn't explain the difference for uniform rng.

Any ideas?

This says that Julia uses this library, which is different from the home brewed version of the Mersenne twister in NumPy. The second link I posted claims their speed comes from generating double precision numbers directly, rather than generating random bytes that have to be converted to doubles, as is the case of NumPy through this magical incantation. They also throw the SIMD acronym around, which likely means their random number generation is parallelized.

My guess is that most of the speed-up comes from the SIMD parallelization: the Mersenne algorithm does a lot of work to produce 32 random bits, so that likely dominates over a couple of arithmetic operations, even if divisions are involved.

Jaime

Do you think Stackoverflow would be a better place for my question?

best,

Pierre


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion




--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Pierre Haessig-2

Le 03/04/2017 à 15:52, Neal Becker a écrit :
> Take a look here:
> https://bashtage.github.io/ng-numpy-randomstate/doc/index.html
Thanks for the pointer. A very feature-full random generator package.

So it is indeed possible to have in Python/Numpy both the "advanced"
Mersenne Twister (dSFMT) at the lower level and the Ziggurat algorithm
for Gaussian transform on top. Perfect!

In an ideal world, this would be implemented by default in Numpy, but I
understand that this would break the reproducibility of existing codes.

best,
Pierre
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Neal Becker
I think the intention is that this is the next gen of numpy randomstate, and will eventually be merged in.

On Mon, Apr 3, 2017 at 11:47 AM Pierre Haessig <[hidden email]> wrote:

Le 03/04/2017 à 15:52, Neal Becker a écrit :
> Take a look here:
> https://bashtage.github.io/ng-numpy-randomstate/doc/index.html
Thanks for the pointer. A very feature-full random generator package.

So it is indeed possible to have in Python/Numpy both the "advanced"
Mersenne Twister (dSFMT) at the lower level and the Ziggurat algorithm
for Gaussian transform on top. Perfect!

In an ideal world, this would be implemented by default in Numpy, but I
understand that this would break the reproducibility of existing codes.

best,
Pierre
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Pierre Haessig-2
In reply to this post by Jaime Fernández del Río
Le 03/04/2017 à 15:44, Jaime Fernández del Río a écrit :
This says that Julia uses this library, which is different from the home brewed version of the Mersenne twister in NumPy. The second link I posted claims their speed comes from generating double precision numbers directly, rather than generating random bytes that have to be converted to doubles, as is the case of NumPy through this magical incantation. They also throw the SIMD acronym around, which likely means their random number generation is parallelized.

My guess is that most of the speed-up comes from the SIMD parallelization: the Mersenne algorithm does a lot of work to produce 32 random bits, so that likely dominates over a couple of arithmetic operations, even if divisions are involved.
Thanks for the feedback.

I'm not good in enough in reading Julia to be 100% sure, but I feel like that the random.jl (https://github.com/JuliaLang/julia/blob/master/base/random.jl) contains a Julia implementation of Mersenne Twister... but I have no idea whether it is the "fancy" SIMD version or the "old" 32bits version.

best,
Pierre

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Nathaniel Smith
On Apr 3, 2017 8:59 AM, "Pierre Haessig" <[hidden email]> wrote:
Le 03/04/2017 à 15:44, Jaime Fernández del Río a écrit :
This says that Julia uses this library, which is different from the home brewed version of the Mersenne twister in NumPy. The second link I posted claims their speed comes from generating double precision numbers directly, rather than generating random bytes that have to be converted to doubles, as is the case of NumPy through this magical incantation. They also throw the SIMD acronym around, which likely means their random number generation is parallelized.

My guess is that most of the speed-up comes from the SIMD parallelization: the Mersenne algorithm does a lot of work to produce 32 random bits, so that likely dominates over a couple of arithmetic operations, even if divisions are involved.
Thanks for the feedback.

I'm not good in enough in reading Julia to be 100% sure, but I feel like that the random.jl (https://github.com/JuliaLang/julia/blob/master/base/random.jl) contains a Julia implementation of Mersenne Twister... but I have no idea whether it is the "fancy" SIMD version or the "old" 32bits version.

That code contains many references to "dSFMT", which is the name of the "fancy" algorithm. IIUC dSFMT is related to the mersenne twister but is actually a different generator altogether -- advertising that Julia uses the mersenne twister is somewhat misleading IMHO. Of course this is really the fault of the algorithm's designers for creating multiple algorithms that have "mersenne twister" as part of their names...

-n

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: speed of random number generator compared to Julia

Pierre Haessig-2
In reply to this post by Neal Becker
Le 03/04/2017 à 17:49, Neal Becker a écrit :
> I think the intention is that this is the next gen of numpy
> randomstate, and will eventually be merged in.
Ah yes, I found the related issue in the meantime:
https://github.com/numpy/numpy/issues/6967

Thanks again for the pointers.

Pierre

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Loading...