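Hi All,

Currently functions like trace use the C long type as the default accumulator for integer types of lesser precision:

    dtype : dtype, optional
        Determines the data-type of the returned array and of the accumulator
        where the elements are summed. If dtype has the value None and `a` is
        of integer type of precision less than the default integer
        precision, then the default integer precision is used. Otherwise,
        the precision is the same as that of `a`.

The problem with this is that the precision of long varies with the platform so that the result varies, see gh-8433 for a complaint about this. There are two possible alternatives that seem reasonable to me:

- Use 32 bit accumulators on 32 bit platforms and 64 bit accumulators on 64 bit platforms.
- Always use 64 bit accumulators.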

Thoughts?

Chuck

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
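
To make the platform dependence concrete, here is a minimal sketch (not taken from the thread) of how the accumulator choice affects np.trace. The array `a` and its values are made up for illustration; the only NumPy features used are np.trace and its documented dtype argument. Passing dtype explicitly pins the accumulator width, so the result no longer depends on the platform.

```python
import numpy as np

# Hypothetical data: four diagonal entries of 2**30, stored as int32.
# Their exact sum is 2**32, which does not fit in a 32-bit integer.
a = np.full((4, 4), 2**30, dtype=np.int32)

# With dtype=None the accumulator is the default integer type, whose
# width depends on the platform and on the NumPy build.
default_result = np.trace(a)
print("default accumulator:", default_result.dtype)

# Forcing a 32-bit accumulator: the sum wraps around silently.
wrapped = np.trace(a, dtype=np.int32)

# Forcing a 64-bit accumulator: the sum is exact on every platform.
exact = np.trace(a, dtype=np.int64)

print("32-bit accumulator:", int(wrapped))
print("64-bit accumulator:", int(exact))
```

Code that must give the same answer everywhere can pass dtype=np.int64 today, whatever default is eventually chosen.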
On Mon, Jan 2, 2017 at 6:27 PM, Charles R Harris <[hidden email]> wrote:
> Hi All,
>
> Currently functions like trace use the C long type as the default
> accumulator for integer types of lesser precision:
>
>> dtype : dtype, optional
>>     Determines the data-type of the returned array and of the accumulator
>>     where the elements are summed. If dtype has the value None and `a` is
>>     of integer type of precision less than the default integer
>>     precision, then the default integer precision is used. Otherwise,
>>     the precision is the same as that of `a`.
>
> The problem with this is that the precision of long varies with the platform
> so that the result varies, see gh-8433 for a complaint about this. There
> are two possible alternatives that seem reasonable to me:
>
> Use 32 bit accumulators on 32 bit platforms and 64 bit accumulators on 64
> bit platforms.
> Always use 64 bit accumulators.

This is a special case of a more general question: right now we use the default integer precision (i.e., what you get from np.array([1]), or np.arange, or np.dtype(int)), and it turns out that the default integer precision itself varies in confusing ways, and this is a common source of bugs. Specifically: right now it's 32-bit on 32-bit builds, and 64-bit on 64-bit builds, except on Windows where it's always 32-bit. This matches the default precision of Python 2 'int'.

So some options include:
- make the default integer precision 64-bits everywhere
- make the default integer precision 32-bits on 32-bit systems, and 64-bits on 64-bit systems (including Windows)
- leave the default integer precision the same, but make accumulators 64-bits everywhere
- leave the default integer precision the same, but make accumulators 64-bits on 64-bit systems (including Windows)
- ...

Given the prevalence of 64-bit systems these days, and the fact that the current setup makes it very easy to write code that seems to work when tested on a 64-bit system but that silently returns incorrect results on 32-bit systems, it sure would be nice if we could switch to a 64-bit default everywhere. (You could still get 32-bit integers, of course, you'd just have to ask for them explicitly.)

Things we'd need to know more about before making a decision:
- compatibility: if we flip this switch, how much code breaks? In general correct numpy-using code has to be prepared to handle np.dtype(int) being 64-bits, and in fact there might be more code that accidentally assumes that np.dtype(int) is always 64-bits than there is code that assumes it is always 32-bits. But that's theory; to know how bad this is we would need to try actually running some projects' test suites and see whether they break or not.
- speed: there's probably some cost to using 64-bit integers on 32-bit systems; how big is the penalty in practice?

-n

--
Nathaniel J. Smith -- https://vorpus.org
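
The "default integer precision" discussed above can be inspected directly. The following is a small sketch that uses only np.dtype, np.array and np.arange; the variable names are illustrative, and the reported width depends on the platform and NumPy version, so no output is shown.

```python
import numpy as np

# The default integer type is what np.dtype(int) resolves to.
default_int = np.dtype(int)
print("default integer:", default_int.name, f"({default_int.itemsize * 8} bits)")

# Arrays built from Python ints, and np.arange, pick up that default.
from_list = np.array([1])
from_arange = np.arange(3)
print(from_list.dtype, from_arange.dtype)

# Asking for a width explicitly gives the same type on every platform.
explicit32 = np.arange(3, dtype=np.int32)
explicit64 = np.arange(3, dtype=np.int64)
print(explicit32.dtype, explicit64.dtype)
```

Running this on a 32-bit and a 64-bit build side by side is a quick way to see which of the listed options a given installation currently corresponds to.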

On Mo, 2017-01-02 at 18:46 -0800, Nathaniel Smith wrote:
> On Mon, Jan 2, 2017 at 6:27 PM, Charles R Harris <[hidden email]> wrote:
> >
> > Hi All,
> >
> > Currently functions like trace use the C long type as the default
> > accumulator for integer types of lesser precision:
> >
> <snip>
>
> Things we'd need to know more about before making a decision:
> - compatibility: if we flip this switch, how much code breaks? In
> general correct numpy-using code has to be prepared to handle
> np.dtype(int) being 64-bits, and in fact there might be more code that
> accidentally assumes that np.dtype(int) is always 64-bits than there
> is code that assumes it is always 32-bits. But that's theory; to know
> how bad this is we would need to try actually running some projects
> test suites and see whether they break or not.
> - speed: there's probably some cost to using 64-bit integers on 32-bit
> systems; how big is the penalty in practice?
>

[...] like the idea of having two different "defaults".

There are two issues: one is the change on Python 2 (the default numpy integer type would no longer inherit from Python int), and the other is any problem due to increased precision (more RAM usage, code that actually expects lower precision somehow, etc.).

Cannot say I know for sure, but I would be extremely surprised if there is a speed difference between 32bit vs. 64bit architectures, except the general slowdown you get due to bus speeds, etc. when going to higher bit width.

If the inheritance for some reason is a bigger issue, we might limit the change to Python 3. For other possible problems, I think we may have difficulties assessing how much is affected. The problem is that the most affected projects should be those only being used on Windows, or so. Bigger projects should work fine already (they are more likely to get better due to not being tested as well on 32bit long platforms, especially 64bit Windows).

Of course, limiting the change to Python 3 could have the advantage of not affecting older projects, which are possibly more likely to rely specifically on the current behaviour.

So, I would be open to trying the change. I think the idea of at least changing it in Python 3 has been brought up a couple of times, including by Julian, so maybe it is time to give it a shot...

It would be interesting to see if anyone knows of projects that may be affected (for example because they are designed to run only on Windows or on limited hardware), and whether avoiding any change in Python 2 might mitigate problems here as well (in addition to avoiding the inheritance change).

Best,

Sebastian

On Tue, Jan 3, 2017 at 10:08 AM, Sebastian Berg <[hidden email]> wrote:
> On Mo, 2017-01-02 at 18:46 -0800, Nathaniel Smith wrote:

There have been a number of reports of problems due to the inheritance stemming both from the changing precision and, IIRC, from differences in print format or some such. So I don't expect that there will be no problems, but they will probably not be difficult to fix.

Chuck

On Mon, 2 Jan 2017 18:46:08 -0800
Nathaniel Smith <[hidden email]> wrote:
>
> So some options include:
> - make the default integer precision 64-bits everywhere
> - make the default integer precision 32-bits on 32-bit systems, and
>   64-bits on 64-bit systems (including Windows)

Either of those two would be the best IMO.

Intuitively, I think people would expect 32-bit ints in 32-bit processes by default, and 64-bit ints in 64-bit processes likewise. So I would slightly favour the latter option.

> - leave the default integer precision the same, but make accumulators
>   64-bits everywhere
> - leave the default integer precision the same, but make accumulators
>   64-bits on 64-bit systems (including Windows)

Both of these options introduce a confusing discrepancy.

> - speed: there's probably some cost to using 64-bit integers on 32-bit
>   systems; how big is the penalty in practice?

Ok, I have fired up a Windows VM to compare 32-bit and 64-bit builds. Numpy version is 1.11.2, Python version is 3.5.2. Keep in mind those are Anaconda builds of Numpy, with MKL enabled for linear algebra; YMMV.

For each benchmark, the first number is the result on the 32-bit build, the second number on the 64-bit build.

Simple arithmetic
-----------------

>>> v = np.ones(1024**2, dtype='int32')
>>> %timeit v + v        # 1.73 ms per loop | 1.78 ms per loop
>>> %timeit v * v        # 1.77 ms per loop | 1.79 ms per loop
>>> %timeit v // v       # 5.89 ms per loop | 5.39 ms per loop

>>> v = np.ones(1024**2, dtype='int64')
>>> %timeit v + v        # 3.54 ms per loop | 3.54 ms per loop
>>> %timeit v * v        # 5.61 ms per loop | 3.52 ms per loop
>>> %timeit v // v       # 17.1 ms per loop | 13.9 ms per loop

Linear algebra
--------------

>>> m = np.ones((1024,1024), dtype='int32')
>>> %timeit m @ m        # 556 ms per loop | 569 ms per loop

>>> m = np.ones((1024,1024), dtype='int64')
>>> %timeit m @ m        # 3.81 s per loop | 1.01 s per loop

Sorting
-------

>>> v = np.random.RandomState(42).randint(1000, size=1024**2).astype('int32')
>>> %timeit np.sort(v)   # 43.4 ms per loop | 44 ms per loop

>>> v = np.random.RandomState(42).randint(1000, size=1024**2).astype('int64')
>>> %timeit np.sort(v)   # 61.5 ms per loop | 45.5 ms per loop

Indexing
--------

>>> v = np.ones(1024**2, dtype='int32')
>>> %timeit v[v[::-1]]   # 2.38 ms per loop | 4.63 ms per loop

>>> v = np.ones(1024**2, dtype='int64')
>>> %timeit v[v[::-1]]   # 6.9 ms per loop | 3.63 ms per loop

Quick summary:
- for very simple operations, 32b and 64b builds can have the same perf on each given bitwidth (though speed is uniformly halved on 64-bit integers when the given operation is SIMD-vectorized)
- for more sophisticated operations (such as element-wise multiplication or division, or quicksort, but much more so on the matrix product), 32b builds are competitive with 64b builds on 32-bit ints, but lag behind on 64-bit ints
- for indexing, it's desirable to use a "native" width integer, regardless of whether that means 32- or 64-bit

Of course the numbers will vary depending on the platform (read: compiler), but some aspects of this comparison will probably translate to other platforms.

Regards

Antoine.
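
For anyone who wants to repeat part of this comparison outside IPython, here is a rough sketch built on the standard-library timeit module. The helper name `bench` and its parameters are hypothetical, not something from the thread, and it covers only the arithmetic and sorting cases; the matrix product and indexing cases are left out to keep the run short. Timings depend heavily on the build and the machine, so none are shown.

```python
import timeit

import numpy as np


def bench(dtype, n=1024**2, number=5, repeat=3):
    """Return the best per-call time in milliseconds for a few operations.

    `bench` is an illustrative helper, not part of NumPy.
    """
    v = np.ones(n, dtype=dtype)
    r = np.random.RandomState(42).randint(1000, size=n).astype(dtype)
    cases = {
        "add": lambda: v + v,
        "mul": lambda: v * v,
        "floordiv": lambda: v // v,
        "sort": lambda: np.sort(r),
    }
    results = {}
    for name, func in cases.items():
        # Each entry of `times` is the total time for `number` calls.
        times = timeit.repeat(func, number=number, repeat=repeat)
        results[name] = min(times) / number * 1e3
    return results


if __name__ == "__main__":
    for dtype in ("int32", "int64"):
        for name, ms in bench(dtype).items():
            print(f"{dtype:>5} {name:<8} {ms:8.3f} ms per loop")
```

Running the same script under a 32-bit and a 64-bit interpreter gives the two columns of the tables above, at least approximately.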