Hi all,

I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays, but I've no clue.

# Native float
Z_float = np.ones(1000000, float)
Z_int = np.ones(1000000, int)

%timeit Z_float[...] = 0
1000 loops, best of 3: 361 µs per loop

%timeit Z_int[...] = 0
1000 loops, best of 3: 366 µs per loop

%timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 267 µs per loop

%timeit Z_int.view(np.byte)[...] = 0
1000 loops, best of 3: 266 µs per loop

Nicolas
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, 2016-12-26 at 10:34 +0100, Nicolas P. Rougier wrote:
> I'm trying to understand why viewing an array as bytes before
> clearing makes the whole operation faster.
> I imagine there is some kind of special treatment for byte arrays but
> I've no clue.

Sure, if it's a 1-byte-wide type, the code will end up calling `memset`. If it is not, it will end up calling a loop like:

    while (N > 0) {
        *dst = output;
        dst += 8; /* or whatever element size/stride is */
        --N;
    }

Now, why this gives such a difference, I don't really know, but I guess it is not too surprising and may depend on other things as well.

- Sebastian
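For anyone who wants to try the byte-view approach from Python, here is a minimal sketch. `zero_via_bytes` is a hypothetical helper name, not part of NumPy, and whether the 1-byte path really ends up in `memset` may depend on the NumPy version and build. The sketch only shows that the result matches a plain zero assignment, since an all-zero byte pattern is 0 for integers and 0.0 for IEEE 754 floats.

```python
import numpy as np


def zero_via_bytes(arr):
    """Zero a C-contiguous array in place by writing through a 1-byte view.

    Hypothetical helper for illustration; not a NumPy API.
    """
    if not arr.flags.c_contiguous:
        raise ValueError("this sketch only handles C-contiguous arrays")
    # Same memory, reinterpreted as 1-byte elements, then set to zero.
    arr.view(np.byte)[...] = 0
    return arr


Z_float = zero_via_bytes(np.ones(1000, float))
Z_int = zero_via_bytes(np.ones(1000, int))
print(Z_float[:3], Z_int[:3])
```

The dtype and shape of the original array are untouched; only the bytes underneath change.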
Thanks for the explanation Sebastian, makes sense.

Nicolas

> On 26 Dec 2016, at 11:48, Sebastian Berg <[hidden email]> wrote:
>
> Sure, if it's a 1-byte-wide type, the code will end up calling
> `memset`.
In reply to "Nicolas P. Rougier" <[hidden email]>, Dec 26, 2016 6:15 AM:

Might be OS-specific, too. Some virtual memory management systems might special-case the zeroing out of memory. Try doing the same thing with a different value than zero.
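One way to try this suggestion from a plain Python script (no IPython needed) is sketched below. `make_setter` and `best_time` are hypothetical helper names, and the repeat counts are arbitrary. Note that writing 1 through the byte view leaves meaningless float values behind, so that case is only useful for timing.

```python
import timeit

import numpy as np


def make_setter(arr, value):
    """Return a no-argument callable that assigns `value` to every element."""
    def setter():
        arr[...] = value
    return setter


def best_time(func, repeat=3, number=20):
    """Best observed time per call, in seconds."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


Z_float = np.ones(1000000, float)
Z_bytes = Z_float.view(np.byte)  # same memory, seen as 1-byte elements

results = {}
for label, arr in (("native float", Z_float), ("byte view", Z_bytes)):
    for value in (0, 1):
        results[(label, value)] = best_time(make_setter(arr, value))
        print(f"{label}, value {value}: {results[(label, value)] * 1e6:.0f} us")
```

If zero were special-cased somewhere, the value-0 rows should come out noticeably faster than the value-1 rows for the same view; the numbers will vary by machine and OS.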
On Mon, Dec 26, 2016 at 1:34 AM, Nicolas P. Rougier <[hidden email]> wrote:
> I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster.
> I imagine there is some kind of special treatment for byte arrays but I've no clue.

I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.

So I wonder if you have a use case where the performance difference matters, in which case _maybe_ it would be worth having an ndarray.zero() method that efficiently zeros out an array.

Actually, there is ndarray.fill():

In [7]: %timeit Z_float[...] = 0
1000 loops, best of 3: 380 µs per loop

In [8]: %timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 271 µs per loop

In [9]: %timeit Z_float.fill(0)
1000 loops, best of 3: 363 µs per loop

which seems to take an insignificantly shorter time than assignment. Probably because it's doing exactly the same loop.

Whereas a .zero() could use a memset, like it does with bytes.

Can't say I have a use case that would justify this, though.

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]
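The three timings in this message can be reproduced outside IPython with the standard `timeit` module. This is only a sketch: the `number` and `repeat` values are arbitrary choices, and absolute timings will differ across machines and NumPy versions.

```python
import timeit

import numpy as np

Z_float = np.ones(1000000, float)

# The three statements timed above, run against the same array.
statements = (
    "Z_float[...] = 0",
    "Z_float.view(np.byte)[...] = 0",
    "Z_float.fill(0)",
)

number = 50
best = {}
for stmt in statements:
    runs = timeit.repeat(stmt, globals={"Z_float": Z_float, "np": np},
                         repeat=3, number=number)
    best[stmt] = min(runs) / number
    print(f"{stmt:32s} {best[stmt] * 1e6:8.0f} us per loop")
```

Taking the minimum over the repeats mirrors the "best of 3" figure that `%timeit` reported.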
Yes, clearing is not the proper word, but the "trick" only works for 0 (I'll get the same result in both cases).

Nicolas

> On 27 Dec 2016, at 20:52, Chris Barker <[hidden email]> wrote:
>
> I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.
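To illustrate why the byte view is only a valid shortcut for 0, here is a small sketch. The fixed dtypes `np.int64` and `np.float64` are assumptions made so the byte patterns are predictable. Writing 1 through the byte view sets every byte to 0x01, which is not the representation of 1 in either dtype.

```python
import numpy as np

Z_int = np.zeros(4, np.int64)
Z_int.view(np.byte)[...] = 1  # every byte becomes 0x01
# Each 8-byte integer now holds 0x0101010101010101 rather than 1.
int_is_pattern = bool(Z_int[0] == 0x0101010101010101)

Z_float = np.zeros(4, np.float64)
Z_float.view(np.byte)[...] = 1  # same byte pattern, read back as a float
float_is_one = bool(Z_float[0] == 1.0)

print(int_is_pattern)  # True
print(float_is_one)    # False
```

Zero is the exception because an all-zero byte pattern means 0 for integers and 0.0 for IEEE 754 floats alike.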