

Thanks for all the comments about this issue. Do you know if there's a ticket open for this? Is this an easy fix before the 1.0.5 release?

Thanks,
Will

On Fri, Apr 4, 2008 at 3:40 PM, Timothy Hochberg < [hidden email]> wrote:
> On Fri, Apr 4, 2008 at 12:47 PM, Robert Kern < [hidden email]> wrote:
> > On Fri, Apr 4, 2008 at 9:56 AM, Will Lee < [hidden email]> wrote:
> > > I understand the implication for the floating point comparison and the need
> > > for allclose. However, I think in a doctest context, this behavior makes
> > > the doc much harder to read.
> >
> > Tabling the issue of the fact that we changed behavior for a moment,
> > this is a fundamental problem with using doctests as unit tests for
> > numerical code. The floating point results that you get *will* be
> > different on different machines, but the code will still be correct.
> > Using allclose() and similar techniques are the best tools available
> > (although they still suck). Relying on visual representations of these
> > results is simply an untenable strategy.
>
> That is sometimes, but not always, the case. Why? Because most of the time that one ends up with simple values, one is starting with arbitrary floating point values and doing at most simple operations on them. Thus a strategy that helps many of my unit tests look better and function reliably is to choose values that can be represented exactly in floating point. If the original value here had been 0.00125 rather than .0012, there would be no problem here. Well, almost: you are still vulnerable to the rules for zero padding and whatnot getting changed, and so forth, but in general it's more reliable and prettier.
>
> Of course this isn't always a solution. But I've found it's helpful for a lot of cases.
>
> > Note that the string
> > representation of NaNs and Infs are completely different across
> > platforms.
> >
> > That said, str(float_numpy_scalar) really should have the same rules
> > as str(some_python_float).
>
> +1

Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 Umberto Eco
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion
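
Tim's suggestion of choosing test values that are exactly representable can be checked mechanically. The following is a minimal sketch using only the standard library; the helper name `is_exact` is made up for illustration. A decimal literal is stored exactly only when its value is a fraction whose denominator is a power of two. By that test 0.125 qualifies, while 0.0012 and also 0.00125 (which is 1/800) do not, so a short decimal is not automatically an exact one.

```python
from fractions import Fraction


def is_exact(literal: str) -> bool:
    """Return True if the decimal literal is stored exactly as a binary double.

    Hypothetical helper for illustration; not part of numpy or doctest.
    """
    # Fraction(str) is the exact decimal value of the literal, while
    # Fraction(float) is the exact value of the nearest binary double.
    return Fraction(literal) == Fraction(float(literal))


for text in ("0.125", "0.5", "0.0012", "0.00125"):
    print(text, is_exact(text))
```

Values built from sums of powers of two (0.5, 0.125, 0.375 and so on) pass this check and are safe to print in doctests regardless of the formatting precision.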


On Fri, Apr 4, 2008 at 1:47 PM, Robert Kern < [hidden email]> wrote:
> On Fri, Apr 4, 2008 at 9:56 AM, Will Lee < [hidden email]> wrote:
> > I understand the implication for the floating point comparison and the need
> > for allclose. However, I think in a doctest context, this behavior makes
> > the doc much harder to read.
>
> Tabling the issue of the fact that we changed behavior for a moment,
> this is a fundamental problem with using doctests as unit tests for
> numerical code. The floating point results that you get *will* be
> different on different machines, but the code will still be correct.
> Using allclose() and similar techniques are the best tools available
> (although they still suck). Relying on visual representations of these
> results is simply an untenable strategy. Note that the string
> representation of NaNs and Infs are completely different across
> platforms.
>
> That said, str(float_numpy_scalar) really should have the same rules
> as str(some_python_float).

For all different precisions? And what should the rules be? I note that numpy doesn't distinguish between repr and str; maybe we could specify different behavior for the two.
Chuck
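
Robert's point about comparing values numerically rather than by their printed form can be sketched with the standard library. This assumes `math.isclose` as a scalar stand-in for numpy's `allclose`; the tolerance `1e-12` is an arbitrary choice made for illustration.

```python
import math

# A simple computation whose result is not the "obvious" decimal value.
result = 0.1 + 0.2

# Fragile check: depends on the exact digits that get printed.
text_ok = repr(result) == "0.3"

# Robust check: passes whenever the value is within a relative tolerance.
value_ok = math.isclose(result, 0.3, rel_tol=1e-12)

print(text_ok)   # False
print(value_ok)  # True
```

A doctest written as `>>> math.isclose(result, 0.3, rel_tol=1e-12)` followed by `True` is less pleasant to read than the bare number, which is the trade-off Will raises, but it does not depend on how a given platform or library version formats floats.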


On Thu, Apr 10, 2008 at 7:31 PM, Charles R Harris
< [hidden email]> wrote:
> > That said, str(float_numpy_scalar) really should have the same rules
> > as str(some_python_float).
>
> For all different precisions?
No. I should have said str(float64_numpy_scalar). I am content to
leave the other types alone.
> And what should the rules be.
All Python does is use a lower decimal precision for __str__ than __repr__.
> I note that
> numpy doesn't distinguish between repr and str, maybe we could specify
> different behavior for the two.
Yes, precisely.

Robert Kern


On Thu, Apr 10, 2008 at 6:38 PM, Robert Kern < [hidden email]> wrote:
> On Thu, Apr 10, 2008 at 7:31 PM, Charles R Harris
> < [hidden email]> wrote:
> > > That said, str(float_numpy_scalar) really should have the same rules
> > > as str(some_python_float).
> >
> > For all different precisions?
>
> No. I should have said str(float64_numpy_scalar). I am content to
> leave the other types alone.
>
> > And what should the rules be.
>
> All Python does is use a lower decimal precision for __str__ than __repr__.
>
> > I note that
> > numpy doesn't distinguish between repr and str, maybe we could specify
> > different behavior for the two.
>
> Yes, precisely.

Well, I know where to do that and have a ticket for it. What I would also like to do is use float.h for setting the repr precision, but I am not sure I can count on its presence as it only became part of the spec in 1999. Then again, that's almost ten years ago. Anyway, python on my machine generates 12 significant digits. Is that common to everyone?
Chuck


On Thu, Apr 10, 2008 at 7:57 PM, Charles R Harris
< [hidden email]> wrote:
>
> On Thu, Apr 10, 2008 at 6:38 PM, Robert Kern < [hidden email]> wrote:
> >
> > On Thu, Apr 10, 2008 at 7:31 PM, Charles R Harris
> > < [hidden email]> wrote:
> > > > That said, str(float_numpy_scalar) really should have the same rules
> > > > as str(some_python_float).
> > >
> > > For all different precisions?
> >
> > No. I should have said str(float64_numpy_scalar). I am content to
> > leave the other types alone.
> >
> > > And what should the rules be.
> >
> > All Python does is use a lower decimal precision for __str__ than
> __repr__.
> >
> >
> > > I note that
> > > numpy doesn't distinguish between repr and str, maybe we could specify
> > > different behavior for the two.
> >
> > Yes, precisely.
>
> Well, I know where to do that and have a ticket for it. What I would also
> like to do is use float.h for setting the repr precision, but I am not sure
> I can count on its presence as it only became part of the spec in 1999. Then
> again, that's almost ten years ago. Anyway, python on my machine generates
> 12 significant digits. Is that common to everyone?
Here is the relevant portion of Objects/floatobject.c:
/* Precisions used by repr() and str(), respectively.
The repr() precision (17 significant decimal digits) is the minimal number
that is guaranteed to have enough precision so that if the number is read
back in the exact same binary value is recreated. This is true for IEEE
floating point by design, and also happens to work for all other modern
hardware.
The str() precision is chosen so that in most cases, the rounding noise
created by various operations is suppressed, while giving plenty of
precision for practical use.
*/
#define PREC_REPR 17
#define PREC_STR 12
svn blame tells me that those have been there unchanged since 1999.
You may want to steal the function format_float() that is defined in
that file, too.
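
The effect of those two precisions can be reproduced from Python with `%g` formatting. This is only a sketch of the behaviour the comment describes, not a port of `format_float()`: it ignores details such as how integral values are displayed, and the function names are made up for illustration. Note also that current Python 3 releases no longer use these constants; `str()` and `repr()` of a float both give the shortest string that round-trips.

```python
# Values quoted from Objects/floatobject.c in the message above.
PREC_REPR = 17
PREC_STR = 12


def old_style_str(x: float) -> str:
    """Format with 12 significant digits, hiding rounding noise."""
    return "%.*g" % (PREC_STR, x)


def old_style_repr(x: float) -> str:
    """Format with 17 significant digits, enough to recover the double."""
    return "%.*g" % (PREC_REPR, x)


x = 0.1
print(old_style_str(x))   # 0.1
print(old_style_repr(x))  # 0.10000000000000001

# Reading the 17-digit string back yields the identical binary value.
assert float(old_style_repr(x)) == x
```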

Robert Kern


On Thu, Apr 10, 2008 at 7:06 PM, Robert Kern < [hidden email]> wrote:
> Here is the relevant portion of Objects/floatobject.c:
>
> /* Precisions used by repr() and str(), respectively.
> The repr() precision (17 significant decimal digits) is the minimal number
> that is guaranteed to have enough precision so that if the number is read
> back in the exact same binary value is recreated. This is true for IEEE
> floating point by design, and also happens to work for all other modern
> hardware.
> The str() precision is chosen so that in most cases, the rounding noise
> created by various operations is suppressed, while giving plenty of
> precision for practical use.
> */
> #define PREC_REPR 17
> #define PREC_STR 12
>
> svn blame tells me that those have been there unchanged since 1999.

I left this note on my ticket.

> You may want to steal the function format_float() that is defined in
> that file, too.

I'll look at it. At the moment we use %.*g
Chuck


On Thu, Apr 10, 2008 at 7:06 PM, Robert Kern < [hidden email]> wrote:
> Here is the relevant portion of Objects/floatobject.c:
>
> /* Precisions used by repr() and str(), respectively.
> The repr() precision (17 significant decimal digits) is the minimal number
> that is guaranteed to have enough precision so that if the number is read
> back in the exact same binary value is recreated. This is true for IEEE
> floating point by design, and also happens to work for all other modern
> hardware.
> The str() precision is chosen so that in most cases, the rounding noise
> created by various operations is suppressed, while giving plenty of
> precision for practical use.
> */
> #define PREC_REPR 17
> #define PREC_STR 12

OK, I've committed a change that fixes the problem that started this thread, but I'm going to leave the ticket open for a while until I decide what to do about longdouble. The precisions are now
#define FLOATPREC_REPR 8
#define FLOATPREC_STR 6
#define DOUBLEPREC_REPR 17
#define DOUBLEPREC_STR 12
#if SIZEOF_LONGDOUBLE == SIZEOF_DOUBLE
#define LONGDOUBLEPREC_REPR DOUBLEPREC_REPR
#define LONGDOUBLEPREC_STR DOUBLEPREC_STR
#else /* More than probably needed on Intel FP */
#define LONGDOUBLEPREC_REPR 20
#define LONGDOUBLEPREC_STR 12
#endif
I'm open to suggestions.
Chuck
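
One way to probe whether a given REPR precision is large enough is a round-trip test: format the value, parse the text, and compare. The sketch below emulates a 32-bit float with `struct` so that it needs only the standard library (numpy's `float32` would be the natural choice in practice); the helper names are hypothetical and the sample values are arbitrary.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE single, returned as a double."""
    return struct.unpack("f", struct.pack("f", x))[0]


def round_trips(value: float, digits: int) -> bool:
    """Check that formatting with `digits` significant digits loses nothing."""
    text = "%.*g" % (digits, value)
    return to_float32(float(text)) == value


# Arbitrary sample values, first rounded to single precision.
samples = [to_float32(v) for v in (0.1, 1.0 / 3.0, 2.0 ** 0.5, 1234.5678)]

for digits in (6, 8, 9):
    print(digits, all(round_trips(v, digits) for v in samples))
```

Nine significant digits always suffice for IEEE single precision. Whether eight are enough depends on the particular value, so a test like this over a larger sample is a cheap way to evaluate a candidate setting.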


On Thu, Apr 10, 2008 at 8:58 PM, Charles R Harris
< [hidden email]> wrote:
> I'm open to suggestions.
I have nothing better to offer than what you've done. Thank you!

Robert Kern


On Thu, Apr 10, 2008 at 8:01 PM, Robert Kern < [hidden email]> wrote:
> On Thu, Apr 10, 2008 at 8:58 PM, Charles R Harris wrote:
> > I'm open to suggestions.
>
> I have nothing better to offer than what you've done. Thank you!

OK, but it looks like I need to implement our own string conversion functions to display longdouble correctly. PyOS_snprintf is what we are currently using, and it only takes doubles. Grrrr.
Chuck


These values should really be determined at compile time, not hardwired in at lines 611-621 of scalartypes.inc.src. Maybe use the values in float.h, which on my machine give
The current values we are using are 8, 17, and 22, whereas the values above are supposed to guarantee reversible conversion to and from decimal. Of course, that doesn't seem to be the case in practice; they seem to need at least one more digit. The other question is whether all the common compilers support float.h.
The numbers above were generated by
Chuck
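
For reference, the two digit counts usually associated with float.h can be derived from the width of the significand. This sketch assumes IEEE 754 style binary formats with p significand bits (24 for float, 53 for double, 64 for x87 extended precision). Under that assumption, floor((p - 1) * log10(2)) is the number of decimal digits that survive a decimal to binary to decimal conversion, which is what the `*_DIG` macros report, and ceil(1 + p * log10(2)) is the number needed for binary to decimal to binary. These are standard results and are not the figures missing from the message above; the function names are made up for illustration.

```python
import math
import sys


def dig(p: int) -> int:
    """Decimal digits guaranteed to survive decimal -> binary -> decimal."""
    return math.floor((p - 1) * math.log10(2))


def decimal_dig(p: int) -> int:
    """Decimal digits needed so that binary -> decimal -> binary is exact."""
    return math.ceil(1 + p * math.log10(2))


# Significand widths are assumptions about the platform's formats.
for name, p in (("float", 24), ("double", 53), ("x87 long double", 64)):
    print(name, dig(p), decimal_dig(p))

# Python exposes the double-precision DBL_DIG value directly.
print(sys.float_info.dig)
```

Under these assumptions the round-trip counts are 9, 17 and 21, each larger than the corresponding `*_DIG` value of 6, 15 and 18.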