float16/32: wrong number of digits?

Nico Schlömer
Hi everyone,

I wondered how to express a NumPy float exactly as a formatted string, and found `%r` quite useful: `float(repr(a)) == a` is guaranteed for Python `float`s. When trying the same thing with lower-precision NumPy floats, I found that this identity does not quite hold:
```
import numpy as np
b = np.array([1.0 / 3.0], dtype=np.float16)
float(repr(b[0])) - b[0]
Out[12]: -1.9531250000093259e-06
```
Indeed,
```
b
Out[6]: array([ 0.33325195], dtype=float16)
```
```
repr(b[0])
Out[7]: '0.33325'
```
When counting the bits, a float16 should hold 4.8 decimal digits, so `repr()` seems right. Where does the garbage tail -1.9531250000093259e-06 come from though?

Cheers,
Nico

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: float16/32: wrong number of digits?

Anne Archibald


On Thu, Mar 9, 2017, 11:27 Nico Schlömer <[hidden email]> wrote:
> When counting the bits, a float16 should hold 4.8 decimal digits, so `repr()` seems right. Where does the garbage tail -1.9531250000093259e-06 come from though?

Even more troubling, the high-precision numpy types, `np.longdouble` and its complex version, lose information when used with `repr`.

The basic problem is (roughly) that all floating-point numbers are converted to Python floats before printing. I put some effort into cleaning this up, but the code is messy (there are actually several independent code paths for converting numbers to strings), and the algorithms Python uses to make `repr` work out nicely are nontrivial.

Anne


Re: float16/32: wrong number of digits?

Eric Wieser
In reply to this post by Nico Schlömer
> `float(repr(a)) == a` is guaranteed for Python `float`

And `np.float16(repr(a)) == a` is guaranteed for `np.float16` (and the same is true up to `float128`, whose availability and precision are platform-dependent). Your code doesn't work because you're deserializing to a higher-precision format than the one you serialized from.
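
A minimal sketch of the point above, assuming a reasonably recent NumPy. It uses `str()` rather than `repr()` because newer NumPy releases may wrap the `repr` of a scalar in its type name, which the scalar constructors cannot parse; the variable names are illustrative only.

```python
import numpy as np

a = np.float16(1.0 / 3.0)

# str() gives a short decimal string for the float16 value
# (an assumption of this sketch: repr() may add a type wrapper in newer NumPy).
s = str(a)

same_precision = np.float16(s)   # parse back at the precision we printed from
higher_precision = float(s)      # parse back as a 64-bit Python float

print(s)
print(same_precision == a)            # round trip at float16 precision
print(higher_precision == float(a))   # comparison at float64 precision
print(higher_precision - float(a))    # the small difference asked about
```

The printed decimal string is only a label that singles out one float16 value among its neighbours. Read back as a 64-bit float, it names a different, nearby number, so the subtraction is nonzero.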

Re: float16/32: wrong number of digits?

Anne Archibald
On Mon, Mar 13, 2017 at 12:57 PM Eric Wieser <[hidden email]> wrote:
> `float(repr(a)) == a` is guaranteed for Python `float`
>
> And `np.float16(repr(a)) == a` is guaranteed for `np.float16`(and the same
> is true up to `float128`, which can be platform-dependent). Your code
> doesn't work because you're deserializing to a higher precision format than
> you serialized to.

I would hesitate to make this guarantee. Certainly for old versions of numpy, `np.float128(repr(x)) != x` in many cases. I submitted a patch, now accepted, that probably accomplishes this on most systems (in fact this is now in the test suite), but if you are using a version of numpy that is a couple of years old, there is no way to convert long doubles to human-readable form and back without losing precision.

To repeat: only in recent versions of numpy can long doubles be converted to human-readable form and back without passing through doubles. It is still not possible to use `%` or `format()` on them without discarding all precision beyond doubles. If you actually need long doubles (and if you don't, why use them?), make sure your application includes a test for this ability. I recommend checking `repr(1 + np.finfo(np.longdouble).eps)`.
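
A sketch of such a test, under stated assumptions: it relies on `np.format_float_positional` (available in recent NumPy) instead of `repr()`, since the exact `repr` text differs between NumPy versions, and on the `np.longdouble` constructor parsing strings at full precision, which only recent versions do. The helper name `longdouble_round_trips` is hypothetical.

```python
import numpy as np

def longdouble_round_trips(x):
    """Return True if x survives conversion to a decimal string and back."""
    x = np.longdouble(x)
    # unique=True requests the shortest digits that identify x at its own precision.
    text = np.format_float_positional(x, unique=True)
    return bool(np.longdouble(text) == x)

# The value suggested above: the smallest long double greater than 1.
x = np.longdouble(1) + np.finfo(np.longdouble).eps

print(np.format_float_positional(x, unique=True))
print(longdouble_round_trips(x))

# Passing through a Python float may discard the extra bit. The result is
# platform-dependent, because on some platforms long double is just a double.
print(float(x) == 1.0)
```

An application that depends on long doubles could run a check like this at start-up or in its test suite and refuse to proceed if it returns `False`.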

Anne

P.S. You can write (I have) a short piece of Cython code that will reliably repr long doubles and parse them back, but on old versions of numpy it's just not possible from within Python. -A