Casting to np.byte before clearing values

Nicolas P. Rougier

Hi all,


I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster.
I imagine there is some kind of special treatment for byte arrays but I've no clue.


# Native float
Z_float = np.ones(1000000, float)
Z_int   = np.ones(1000000, int)

%timeit Z_float[...] = 0
1000 loops, best of 3: 361 µs per loop

%timeit Z_int[...] = 0
1000 loops, best of 3: 366 µs per loop

%timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 267 µs per loop

%timeit Z_int.view(np.byte)[...] = 0
1000 loops, best of 3: 266 µs per loop
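
Outside IPython, the same comparison can be reproduced with the standard `timeit` module. The following is a minimal sketch and not part of the original post: the helper name `best_time_us` and the loop counts are assumptions chosen for illustration, and absolute timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

N = 1_000_000  # same array length as in the post


def best_time_us(stmt, namespace, number=100, repeat=3):
    """Best per-call time in microseconds (hypothetical helper)."""
    timings = timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat)
    return min(timings) / number * 1e6


arrays = {
    "np": np,
    "Z_float": np.ones(N, float),
    "Z_int": np.ones(N, int),
}

statements = [
    "Z_float[...] = 0",
    "Z_int[...] = 0",
    "Z_float.view(np.byte)[...] = 0",
    "Z_int.view(np.byte)[...] = 0",
]

# Time each statement against the shared arrays and report the best run.
results = {stmt: best_time_us(stmt, arrays) for stmt in statements}
for stmt, usec in results.items():
    print(f"{stmt:<34} {usec:8.1f} us per loop")
```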


Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: Casting to np.byte before clearing values

Sebastian Berg
On Mo, 2016-12-26 at 10:34 +0100, Nicolas P. Rougier wrote:
> Hi all,
>
>
> I'm trying to understand why viewing an array as bytes before
> clearing makes the whole operation faster.
> I imagine there is some kind of special treatment for byte arrays but
> I've no clue. 
>

Sure, if it's a 1-byte width type, the code will end up calling
`memset`. If it is not, it will end up calling a loop with:

while (N > 0) {
    *dst = output;
    dst += 8;  /* advance the pointer by the element size/stride, whatever it is */
    --N;
}

Now, why this gives such a difference I don't really know, but I guess
it is not too surprising and may depend on other things as well.
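
The effect of that `memset` path can be imitated from Python. The sketch below is an illustration only, not NumPy's internal code: it zeroes one float array through a byte view and another through `ctypes.memset` applied to the array's own buffer, then checks that both end up identical. It assumes C-contiguous, writable arrays.

```python
import ctypes

import numpy as np

a = np.ones(1000, float)
b = np.ones(1000, float)

# Byte view: the same memory, reinterpreted as 1-byte elements.
a_bytes = a.view(np.byte)
assert np.shares_memory(a, a_bytes)
assert a_bytes.size == a.nbytes  # one byte-view element per byte of `a`
a_bytes[...] = 0

# Direct memset on b's buffer (only valid for contiguous, writable arrays).
assert b.flags.c_contiguous and b.flags.writeable
ctypes.memset(b.ctypes.data, 0, b.nbytes)

# Both leave every float equal to 0.0, since 0.0 is the all-zero bit pattern.
assert np.array_equal(a, b)
assert not a.any()
```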

- Sebastian



Re: Casting to np.byte before clearing values

Nicolas P. Rougier

Thanks for the explanation Sebastian, makes sense.

Nicolas


> On 26 Dec 2016, at 11:48, Sebastian Berg <[hidden email]> wrote:
>
> Sure, if it's a 1-byte width type, the code will end up calling
> `memset`. If it is not, it will end up calling a loop with:
>
> while (N > 0) {
>     *dst = output;
>     dst += 8;  /* advance the pointer by the element size/stride, whatever it is */
>     --N;
> }
>
> Now, why this gives such a difference I don't really know, but I guess
> it is not too surprising and may depend on other things as well.
>
> - Sebastian

Re: Casting to np.byte before clearing values

Benjamin Root
Might be OS-specific, too. Some virtual memory management systems might special-case the zeroing out of memory. Try doing the same thing with a value other than zero.
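
One way to run the experiment suggested here is to time the same fill with zero and with a non-zero value. This is a rough sketch: the helper name `time_fill` and the value 7 are arbitrary choices for illustration. The byte view is timed as well, although with a non-zero value it writes a byte pattern rather than the number itself.

```python
import timeit

import numpy as np


def time_fill(target, value, number=100, repeat=3):
    """Best time in microseconds for `target[...] = value` (hypothetical helper)."""
    def fill():
        target[...] = value
    return min(timeit.repeat(fill, number=number, repeat=repeat)) / number * 1e6


Z_float = np.ones(1_000_000, float)

# Time the plain float assignment and the byte-view assignment for each value.
timings = {}
for value in (0, 7):
    timings[("float", value)] = time_fill(Z_float, value)
    timings[("byte view", value)] = time_fill(Z_float.view(np.byte), value)

for (kind, value), usec in timings.items():
    print(f"{kind:<10} value={value}: {usec:8.1f} us per loop")
```

If the byte view stays faster for the non-zero value too, that would suggest the gap does not come from OS-level special handling of zeroed memory.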

On Dec 26, 2016 6:15 AM, "Nicolas P. Rougier" <[hidden email]> wrote:

> Thanks for the explanation Sebastian, makes sense.
>
> Nicolas



Re: Casting to np.byte before clearing values

Chris Barker - NOAA Federal
In reply to this post by Nicolas P. Rougier
On Mon, Dec 26, 2016 at 1:34 AM, Nicolas P. Rougier <[hidden email]> wrote:

> I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster.
> I imagine there is some kind of special treatment for byte arrays but I've no clue.

I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.

So I wonder if you have a use case where the performance difference matters, in which case _maybe_ it would be worth having a ndarray.zero() method that efficiently zeros out an array.

Actually, there is ndarray.fill():

In [7]: %timeit Z_float[...] = 0
1000 loops, best of 3: 380 µs per loop

In [8]: %timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 271 µs per loop

In [9]: %timeit Z_float.fill(0)
1000 loops, best of 3: 363 µs per loop

which seems to take an insignificantly shorter time than assignment, probably because it's doing exactly the same loop,

whereas a .zero() could use a memset, like it does with bytes.

can't say I have a use-case that would justify this, though.
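
No such method exists in NumPy. As a sketch of the idea only, a hypothetical helper `zero_inplace` could take the byte-view path when that is safe and fall back to `fill(0)` otherwise. The safety conditions used here (C-contiguous, at least one dimension, plain numeric or boolean dtype) are assumptions of this sketch, and whether the byte path is actually faster depends on the platform and NumPy version.

```python
import numpy as np


def zero_inplace(arr):
    """Set every element of `arr` to zero, in place (hypothetical helper)."""
    can_use_bytes = (
        arr.ndim >= 1
        and arr.flags.c_contiguous
        and arr.dtype.kind in "biufc"  # bool, int, uint, float, complex
    )
    if can_use_bytes:
        # All-zero bytes represent False / 0 / 0.0 for these dtypes.
        arr.view(np.byte)[...] = 0
    else:
        # Strided, 0-d, or non-numeric arrays: use the regular fill.
        arr.fill(0)
    return arr


Z_float = np.ones(1_000_000, float)
zero_inplace(Z_float)
print(Z_float.any())  # False once every element is zero
```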

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]


Re: Casting to np.byte before clearing values

Nicolas P. Rougier

Yes, "clearing" is not the proper word, but the "trick" only works for 0 (I'll get the same result in both cases).
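
Why the byte view is interchangeable with a plain assignment only for 0 can be seen from the bit patterns. In the sketch below (the value 1 is an arbitrary example), writing byte 1 into every byte of a float64 or int64 element does not produce the number 1, whereas writing byte 0 does produce 0:

```python
import numpy as np

Z_float = np.ones(4, np.float64)
Z_int = np.ones(4, np.int64)

# Filling with byte 0 gives the all-zero bit pattern, which is 0.0 and 0.
Z_float.view(np.byte)[...] = 0
Z_int.view(np.byte)[...] = 0
assert (Z_float == 0.0).all() and (Z_int == 0).all()

# Filling with byte 1 sets each 8-byte element to 0x0101010101010101.
Z_float.view(np.byte)[...] = 1
Z_int.view(np.byte)[...] = 1
assert (Z_int == 0x0101010101010101).all()  # 72340172838076673, not 1
assert (Z_float != 1.0).all()               # a tiny positive float, not 1.0

print(Z_int[0], Z_float[0])
```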


Nicolas


> On 27 Dec 2016, at 20:52, Chris Barker <[hidden email]> wrote:
>
> I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.