From Python to Numpy

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

From Python to Numpy

Nicolas P. Rougier

Dear all,

I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy


Comments/questions/fixes/ideas are of course welcome.


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Chris Barker - NOAA Federal
Nicolas,

From a quick glance, this looks really wonderful! I intend to point my students that are interested in numpy to it.

-CHB


On Thu, Dec 22, 2016 at 8:44 AM, Nicolas P. Rougier <[hidden email]> wrote:

Dear all,

I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy


Comments/questions/fixes/ideas are of course welcome.


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion



--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Kiko-6
In reply to this post by Nicolas P. Rougier


2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier <[hidden email]>:

Dear all,

I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy


Comments/questions/fixes/ideas are of course welcome.

Wow!!! Beautiful.

Thanks for sharing.
 


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Alex Rogozhnikov
Hi Nicolas, 
that's a very nice work!

Comments/questions/fixes/ideas are of course welcome.

Boids example brought my attention too, some comments on it:
- I find using complex numbers here very natural, this should speed up things and also shorten the code (rotating without einsum, etc.)
- you probably can speed up things with going to sparse arrays 
- and you can go to really large numbers of 'birds' if you combine it with preliminary splitting of space into squares, thus analyze only birds from close squares

Also I think worth adding some operations with HSV / HSL color spaces as those can be visualized easily e.g. on some photo.

Thanks,
Alex.



23 дек. 2016 г., в 12:14, Kiko <[hidden email]> написал(а):



2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier <[hidden email]>:

Dear all,

I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy


Comments/questions/fixes/ideas are of course welcome.

Wow!!! Beautiful.

Thanks for sharing.
 


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Nicolas P. Rougier

> On 30 Dec 2016, at 20:36, Alex Rogozhnikov <[hidden email]> wrote:
>
> Hi Nicolas,
> that's a very nice work!
>
>> Comments/questions/fixes/ideas are of course welcome.
>
> Boids example brought my attention too, some comments on it:
> - I find using complex numbers here very natural, this should speed up things and also shorten the code (rotating without einsum, etc.)
> - you probably can speed up things with going to sparse arrays
> - and you can go to really large numbers of 'birds' if you combine it with preliminary splitting of space into squares, thus analyze only birds from close squares
>
> Also I think worth adding some operations with HSV / HSL color spaces as those can be visualized easily e.g. on some photo.
>
> Thanks,
> Alex.


Thanks.

I'm not sure to know how to use complex with this example. Could you elaborate ?

For the preliminary splitting, a quadtree (scipy KDTree) could also help a lot but I wanted to stick to numpy only.
A simpler square splitting as you suggest could make thing faster but require some work. I'm not sure yet I see how to restrict analysis to close squares.

Nicolas


>
>
>
>> 23 дек. 2016 г., в 12:14, Kiko <[hidden email]> написал(а):
>>
>>
>>
>> 2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier <[hidden email]>:
>>
>> Dear all,
>>
>> I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.
>>
>> Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
>> Sources are available at: https://github.com/rougier/from-python-to-numpy
>>
>>
>> Comments/questions/fixes/ideas are of course welcome.
>>
>> Wow!!! Beautiful.
>>
>> Thanks for sharing.
>>  
>>
>>
>> Nicolas
>> _______________________________________________
>> NumPy-Discussion mailing list
>> [hidden email]
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> [hidden email]
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Marmaduke Woodman
In reply to this post by Nicolas P. Rougier

> On 22 Dec 2016, at 17:44, Nicolas P. Rougier <[hidden email]> wrote:
>
>
> Dear all,
>
> I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.
>
> Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
> Sources are available at: https://github.com/rougier/from-python-to-numpy
>
>
> Comments/questions/fixes/ideas are of course welcome.

I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Nicolas P. Rougier

> I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.


I agree and there is actually a warning in the introduction about readability vs speed with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for the non-experts (including myself).


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Alex Rogozhnikov
In reply to this post by Nicolas P. Rougier

31 дек. 2016 г., в 2:09, Nicolas P. Rougier <[hidden email]> написал(а):


On 30 Dec 2016, at 20:36, Alex Rogozhnikov <[hidden email]> wrote:

Hi Nicolas, 
that's a very nice work!

Comments/questions/fixes/ideas are of course welcome.

Boids example brought my attention too, some comments on it:
- I find using complex numbers here very natural, this should speed up things and also shorten the code (rotating without einsum, etc.)
- you probably can speed up things with going to sparse arrays 
- and you can go to really large numbers of 'birds' if you combine it with preliminary splitting of space into squares, thus analyze only birds from close squares

Also I think worth adding some operations with HSV / HSL color spaces as those can be visualized easily e.g. on some photo.

Thanks,
Alex.


Thanks.

I'm not sure to know how to use complex with this example. Could you elaborate ?

Position and velocity are encoded by complex numbers.
Rotation is multiplication by exp(i \phi), translating is adding a complex number.
Distance = abs(x - y). 

I think, that's all operations you need, but maybe I miss something.


For the preliminary splitting, a quadtree (scipy KDTree) could also help a lot but I wanted to stick to numpy only.
A simpler square splitting as you suggest could make thing faster but require some work. I'm not sure yet I see how to restrict analysis to close squares.

Nicolas





23 дек. 2016 г., в 12:14, Kiko <[hidden email]> написал(а):



2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier <[hidden email]>:

Dear all,

I've just put online a (kind of) book on Numpy and more specifically about vectorization methods. It's not yet finished, has not been reviewed and it's a bit rough around the edges. But I think there are some material that can be interesting. I'm specifically happy with the boids example that show a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy


Comments/questions/fixes/ideas are of course welcome.

Wow!!! Beautiful.

Thanks for sharing.



Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Matthew Harrigan
In reply to this post by Nicolas P. Rougier
I also have been stalking this email thread.  First, excellent book!

Regarding the vectorization example mentioned above, one thing to note is that it increases the order of the algorithm relative to the pure python.  The vectorized approach uses correlate, which requires ~(len(seq) * len(sub)) FLOPs.  In the case where the first element in sub is not equal to the vast majority of elements in seq, the basic approach requires ~len(seq) comparisons.  Note that is the case in the SO answer.  One fairly common thing I have seen in vectorized approaches is that the memory or operations required scales worse than strictly required.  It may or may not be an issue, largely depends on the specifics of how its used, but it usually indicates a better approach exists.  That may be worth mentioning here.

Given that, I tried to come up with an "ideal" approach.  stride_tricks can be used to convert seq to a 2D array, and then ideally each row could be compared to sub.  However I can't think of how to do that with numpy function calls other than compare each element in the 2D array, requiring O(n_sub*n_seq) operations again.  array_equal is an example of that.  Python list equality scales better, for instance if x = [0]*n and y = [1]*n, x == y is very fast and the time is independent of the value of n.

It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short circuit logic once the first non equal element is encountered would be an important performance improvement.  In the ideal case it is dramatically faster, and even if every element must be compared then its still probably meaningfully faster since the boolean intermediate array isn't created.  Even better would be to get the axis argument in place for generalized ufuncs.  Then this problem could be vectorized in one line with far better performance.  If others think this is a good idea I will post an issue and attempt a solution.


On Sat, Dec 31, 2016 at 5:23 AM, Nicolas P. Rougier <[hidden email]> wrote:

> I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.


I agree and there is actually a warning in the introduction about readability vs speed with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for the non-experts (including myself).


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Chris Barker - NOAA Federal
> It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short circuit logic once the first non equal element is encountered would be an important performance improvement.

How does array_equal() perform?

-CHB
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: From Python to Numpy

Matthew Harrigan
I coded up an all_equal gufunc here.  Benchmark results are also in that repo.  For the specific problem in the book which started this, its 40x faster than the optimized code in the book.  For large arrays which have any early non equal element, its dramatically faster (1000x) than the current alternative.  For large arrays which are all equal, its ~10% faster due to eliminating the intermediate boolean array.  For tiny arrays its much faster due to a single function call instead of at least two, but its debatable how relevant speed is for tiny problems.  Disclaimer: this is my first ufunc I have every written.

On Tue, Jan 10, 2017 at 8:27 PM, Chris Barker - NOAA Federal <[hidden email]> wrote:
> It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short circuit logic once the first non equal element is encountered would be an important performance improvement.

How does array_equal() perform?

-CHB
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Loading...