Efficiency of Numpy wheels and simple way to benchmark Numpy installation?

Efficiency of Numpy wheels and simple way to benchmark Numpy installation?

PIERRE AUGIER
Hello,

I don't know if this is the right place to ask such questions. As advised here https://www.scipy.org/scipylib/mailing-lists.html#stackoverflow, I first posted a question on Stack Overflow:

https://stackoverflow.com/questions/50475989/efficiency-of-numpy-wheels-and-simple-benchmark-for-numpy-installations

Since I got no feedback there, I am trying here. My questions are:

- When we care about performance, is it good practice to rely on wheels (especially for Numpy)? Will a wheel be slower than (for example) a conda-built Numpy?

- Are there simple commands to benchmark Numpy installations and get a good idea of their overall performance?

I explain a little more in the Stack Overflow question.

Pierre Augier
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Efficiency of Numpy wheels and simple way to benchmark Numpy installation?

Nathaniel Smith
Performance is an incredibly multi-dimensional thing. Modern computers are incredibly complex, with layers of interacting caches, different microarchitectural features (do you have AVX2? does your CPU's branch predictor interact in a funny way with your workload?), compiler optimizations that vary from version to version, ... and different parts of numpy are affected differently by all these things.

So, the only really reliable answer to a question like this is, always, that you need to benchmark the application you actually care about in the contexts where it will actually run (or as close as you can get to that).
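As an illustration of the advice above, here is a minimal sketch of a timing harness built on the standard library's timeit module. The names time_workload, summarize and example_workload are hypothetical, and the SVD workload is only a placeholder, assuming your real application code would go in its place.

```python
import statistics
import timeit


def time_workload(workload, repeats=5, number=1):
    """Call `workload` in `repeats` batches of `number` calls each.

    Returns the per-call time of each batch, in seconds.
    """
    raw = timeit.repeat(workload, repeat=repeats, number=number)
    return [t / number for t in raw]


def summarize(timings):
    """Report the minimum (least disturbed run) and the median (typical run)."""
    return {"min_s": min(timings), "median_s": statistics.median(timings)}


def example_workload():
    # Hypothetical placeholder: replace the body with the code you
    # actually care about, on realistic input sizes.
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.standard_normal((300, 300))
    np.linalg.svd(data @ data.T)
```

Running print(summarize(time_workload(example_workload))) in each environment (for example a pip-installed and a conda-installed Numpy) on the same machine gives numbers that can be compared directly. Results from different machines, or from a machine under other load, are much harder to interpret.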

That said, as a general rule of thumb, the main difference between different numpy builds is which BLAS library they use, which primarily affects the speed of numpy's linear algebra routines. The wheels on PyPI use either OpenBLAS (on Windows and Linux), or Accelerate (on macOS). The conda packages provided as part of the Anaconda distribution normally use Intel's MKL.
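To see what a given installation reports about its build, one option is np.show_config(). The sketch below wraps it in a hypothetical helper, describe_numpy_build; the exact output format depends on the Numpy version, so treat it as something to read by eye rather than to parse.

```python
import numpy as np


def describe_numpy_build():
    """Print the Numpy version and the build configuration it reports."""
    print("Numpy version:", np.__version__)
    # show_config() prints build details, including the BLAS/LAPACK
    # libraries this build was compiled against, when it knows them.
    np.show_config()
    return np.__version__


describe_numpy_build()
```

Look for names such as openblas, mkl or accelerate in the output to tell which of the libraries mentioned above is in use.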

All three of these libraries are generally pretty good. They're all serious attempts to make a blazing fast linear algebra library, and much much faster than naive implementations. Generally MKL has a reputation for being somewhat faster than the others, when there's a difference. But again, whether this happens, or is significant, for *your* app is impossible to say without trying it.
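For a rough look at the BLAS-backed part only, a dense matrix product is a common quick check. This is a minimal sketch, not a full benchmark: matmul_gflops is a hypothetical helper, the default size is an arbitrary choice, and the rate assumes about 2*n**3 floating-point operations per n-by-n product.

```python
import timeit

import numpy as np


def matmul_gflops(n=1000, repeats=5):
    """Time an n-by-n float64 matrix product.

    Returns (best time in seconds, approximate GFLOP/s).
    """
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    a @ b  # warm-up call so one-time setup costs are not measured
    best = min(timeit.repeat(lambda: a @ b, repeat=repeats, number=1))
    flops = 2.0 * n**3  # approximate operation count for a dense product
    return best, flops / best / 1e9


if __name__ == "__main__":
    seconds, gflops = matmul_gflops()
    print(f"best: {seconds:.4f} s, about {gflops:.1f} GFLOP/s")
```

The number of threads the BLAS library uses can change the result a lot, so compare installations with the same thread settings. A figure like this says nothing about the parts of Numpy that do not go through BLAS.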

-n


Re: Efficiency of Numpy wheels and simple way to benchmark Numpy installation?

Matthew Brett
Hi,

On Sun, May 27, 2018 at 9:12 PM, Nathaniel Smith <[hidden email]> wrote:

> Performance is an incredibly multi-dimensional thing. Modern computers are
> incredibly complex, with layers of interacting caches, different
> microarchitectural features (do you have AVX2? does your cpu's branch
> predictor interact in a funny way with your workload?), compiler
> optimizations that vary from version to version, ... and different parts of
> numpy are affected differently by all these things.
>
> So, the only really reliable answer to a question like this is, always, that
> you need to benchmark the application you actually care about in the
> contexts where it will actually run (or as close as you can get to that).
>
> That said, as a general rule of thumb, the main difference between different
> numpy builds is which BLAS library they use, which primarily affects the
> speed of numpy's linear algebra routines. The wheels on PyPI use either
> OpenBLAS (on Windows and Linux), or Accelerate (on macOS). The conda packages
> provided as part of the Anaconda distribution normally use Intel's MKL.
>
> All three of these libraries are generally pretty good. They're all serious
> attempts to make a blazing fast linear algebra library, and much much faster
> than naive implementations. Generally MKL has a reputation for being
> somewhat faster than the others, when there's a difference. But again,
> whether this happens, or is significant, for *your* app is impossible to say
> without trying it.

Yes - I'd be surprised if you find a significant difference in
performance for real usage between pip / OpenBLAS and conda / MKL -
but if you do, please let us know, and we'll investigate.

Cheers,

Matthew