Buildbot errors.

Buildbot errors.

Charles R Harris
The python 2.6 buildbots are showing 5 failures that are being hidden by valgrind.

Chuck

_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: Buildbot errors.

Charles R Harris


On Fri, May 23, 2008 at 12:10 AM, Charles R Harris <[hidden email]> wrote:
> The python 2.6 buildbots are showing 5 failures that are being hidden by valgrind.

They seem to have fixed themselves; they were probably related to the API addition I made, then moved. However, it is a bad thing that the errors were covered up by Valgrind and not reported.

Chuck




Re: Buildbot errors.

Jarrod Millman
On Thu, May 22, 2008 at 11:25 PM, Charles R Harris
<[hidden email]> wrote:
> On Fri, May 23, 2008 at 12:10 AM, Charles R Harris
> <[hidden email]> wrote:
>>
>> The python 2.6 buildbots are showing 5 failures that are being hidden by
>> valgrind.
>
> They seem to have fixed themselves,  they were probably related to the API
> addition I made, then moved. However, it is a bad thing that the errors were
> covered up by Valgrind and not reported.

I didn't understand what you meant by this yesterday, but I see what
you are talking about now:
http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio
http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio

You said they fixed themselves, but the failures are in the most
recent buildbot reports.  This is the only thing I am concerned about
before branching, so hopefully someone can look at this and let me
know whether the failures are indeed fixed.

Thanks,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

Re: Buildbot errors.

Anne Archibald
2008/5/24 Jarrod Millman <[hidden email]>:

> On Thu, May 22, 2008 at 11:25 PM, Charles R Harris
> <[hidden email]> wrote:
>> On Fri, May 23, 2008 at 12:10 AM, Charles R Harris
>> <[hidden email]> wrote:
>>>
>>> The python 2.6 buildbots are showing 5 failures that are being hidden by
>>> valgrind.
>>
>> They seem to have fixed themselves,  they were probably related to the API
>> addition I made, then moved. However, it is a bad thing that the errors were
>> covered up by Valgrind and not reported.
>
> I didn't understand what you meant by this yesterday, but I see what
> you are talking about now:
> http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio
> http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio
>
> You said they fixed themselves, but the failures are in the most
> recent buildbot reports.  This is the only thing I am concerned about
> before branching, so hopefully someone can look at this and let me
> know whether the failures are indeed fixed.

They do not appear on my machine (a pentium-M running Ubuntu). I
should point out that there are actually only three distinct errors,
because one of the test suites is being run twice. They are not exactly
subtle tests; they're checking that seterr induces the raising of
exceptions from np.arange(3)/0, np.sqrt(-np.arange(3)), and
np.array([1.])/np.array([0.]).
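
The three checks described above can be reproduced by hand with a short script. This is only a sketch, not the actual NumPy test code: it assumes a current NumPy under Python 3 (where `/` is true division) and uses `np.errstate`, the context-manager form of `seterr`; the helper name `raises_fpe` is made up for illustration.

```python
import numpy as np


def raises_fpe(operation):
    """Return True if `operation` raises FloatingPointError in 'raise' mode."""
    try:
        # errstate applies the seterr settings only inside the with-block.
        with np.errstate(all="raise"):
            operation()
    except FloatingPointError:
        return True
    return False


# The three expressions named in the message above.
checks = {
    "np.arange(3)/0": lambda: np.arange(3) / 0,
    "np.sqrt(-np.arange(3))": lambda: np.sqrt(-np.arange(3)),
    "np.array([1.])/np.array([0.])": lambda: np.array([1.0]) / np.array([0.0]),
}

results = {name: raises_fpe(op) for name, op in checks.items()}
for name, raised in results.items():
    print(f"{name}: {'raised' if raised else 'NO exception'}")
```

On a machine where floating-point exception reporting works, each line should report that an exception was raised; a "NO exception" line would match the buildbot failures.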

The tests pass on the x86_64 machine I have access to (a
multiprocessor Opteron running Knoppix of all things). That is, it
only has python 2.4 (don't ask) so the tests can't be run, but running
the same tests by hand produces the expected results. This particular
feature - seterr - is the sort of thing an overaggressive optimizer
can easily butcher, though, so it could easily be the result of the
particular configuration on the buildbot machine.

I think somebody with access to the buildbot machine needs to see
what's going on. In particular: does a manually-compiled numpy exhibit
the problem? How clean does the buildbot make its environment? Do the
functions behave correctly from an interactive session? Do other
seterr conditions have the same problem?
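
For the last question, a quick interactive probe of the remaining seterr conditions might look like the following. It is a sketch under assumptions: a current NumPy with `np.errstate`, and trigger expressions chosen here for illustration (they are not taken from the NumPy test suite).

```python
import numpy as np


def condition_raises(condition, operation):
    """Enable 'raise' for a single seterr condition and report whether it fires."""
    try:
        with np.errstate(all="ignore"):
            # Only the condition under test is switched to 'raise'.
            with np.errstate(**{condition: "raise"}):
                operation()
    except FloatingPointError:
        return True
    return False


# Hypothetical triggers for the conditions not covered by the failing tests.
other_conditions = {
    "over": lambda: np.array([1e308]) * np.array([10.0]),
    "under": lambda: np.array([1e-308]) * np.array([1e-308]),
}

report = {name: condition_raises(name, op) for name, op in other_conditions.items()}
print(report)
```

If every condition fails to raise in one environment but raises in another, the problem is more likely in how that environment reports floating-point status than in any single NumPy function.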

Anne
P.S. Please ignore the alarming-looking buildbot failure; it is due to
operator headspace. -A

Re: Buildbot errors.

cdavid
Anne Archibald wrote:
>
> They do not appear on my machine (a pentium-M running Ubuntu). I
> should point out that there are actually only three distinct errors,
> because one of the test suites is being run twice. They are not exactly
> subtle tests; they're checking that seterr induces the raising of
> exceptions from np.arange(3)/0, np.sqrt(-np.arange(3)), and
> np.array([1.])/np.array([0.]).
>  

(resending, I did not see that build logs were so big)

Here are the build logs for 3 configurations (all on r5226):

   - Ubuntu hardy on 32 bits + python2.6
   - Ubuntu hardy on 64 bits + python2.6 (inside vmware)
   - RHEL 5 64 bits + python2.5

None of them shows the error. But RHEL (which is arguably the nearest to
FC) does not have all the seterr tests, and I don't know why: test_divide is
not run with RHEL, only test_dividerr is.

Could the seterr error be linked to the CPU's FPU state flags?

cheers,

David


Attachment: logs.tbz2 (13K)

Re: Buildbot errors.

cdavid
In reply to this post by Anne Archibald
Anne Archibald wrote:
> This particular
> feature - seterr - is the sort of thing an overaggressive optimizer
> can easily butcher, though, so it could easily be the result of the
> particular configuration on the buildbot machine.

gcc claims to be IEEE compliant at all optimization levels (-O*):

"""
 -ffast-math:

           Sets -fno-math-errno, -funsafe-math-optimizations,
           -fno-trapping-math, -ffinite-math-only, -fno-rounding-math,
           -fno-signaling-nans and -fcx-limited-range.

           This option causes the preprocessor macro "__FAST_MATH__" to
           be defined.

           This option should never be turned on by any -O option since
           it can result in incorrect output for programs which depend on
           an exact implementation of IEEE or ISO rules/specifications
           for math functions.

"""

So I don't think that's the problem.

cheers,

David

Re: Buildbot errors.

Charles R Harris
In reply to this post by Jarrod Millman


On Sat, May 24, 2008 at 12:25 AM, Jarrod Millman <[hidden email]> wrote:
> On Thu, May 22, 2008 at 11:25 PM, Charles R Harris
> <[hidden email]> wrote:
>> On Fri, May 23, 2008 at 12:10 AM, Charles R Harris
>> <[hidden email]> wrote:
>>>
>>> The python 2.6 buildbots are showing 5 failures that are being hidden by
>>> valgrind.
>>
>> They seem to have fixed themselves,  they were probably related to the API
>> addition I made, then moved. However, it is a bad thing that the errors were
>> covered up by Valgrind and not reported.
>
> I didn't understand what you meant by this yesterday, but I see what
> you are talking about now:
> http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio
> http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio
>
> You said they fixed themselves, but the failures are in the most
> recent buildbot reports.  This is the only thing I am concerned about
> before branching, so hopefully someone can look at this and let me
> know whether the failures are indeed fixed.

That's actually a bit different: I saw the 5 failures after the main sequence of tests, and those are OK now. Valgrind seems to run things twice, and the new failures are in the Valgrind tests; I missed those.

Chuck




Re: Buildbot errors.

Charles R Harris


On Sat, May 24, 2008 at 8:47 AM, Charles R Harris <[hidden email]> wrote:
> On Sat, May 24, 2008 at 12:25 AM, Jarrod Millman <[hidden email]> wrote:
>> On Thu, May 22, 2008 at 11:25 PM, Charles R Harris
>> <[hidden email]> wrote:
>>> On Fri, May 23, 2008 at 12:10 AM, Charles R Harris
>>> <[hidden email]> wrote:
>>>>
>>>> The python 2.6 buildbots are showing 5 failures that are being hidden by
>>>> valgrind.
>>>
>>> They seem to have fixed themselves,  they were probably related to the API
>>> addition I made, then moved. However, it is a bad thing that the errors were
>>> covered up by Valgrind and not reported.
>>
>> I didn't understand what you meant by this yesterday, but I see what
>> you are talking about now:
>> http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio
>> http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio
>>
>> You said they fixed themselves, but the failures are in the most
>> recent buildbot reports.  This is the only thing I am concerned about
>> before branching, so hopefully someone can look at this and let me
>> know whether the failures are indeed fixed.
>
> That's  actually a bit different, I saw the 5 failures after the main sequence of tests and those are OK now. Valgrind seems to run things twice and the new failures are in the valgrind tests, I missed those.
 
I take that back; I got confused looking through the output. The errors are the same and only seem to happen when Valgrind runs the tests.

Chuck




Re: Buildbot errors.

Anne Archibald
2008/5/24 Charles R Harris <[hidden email]>:
>
> I take that back,  I got confused looking through the output. The errors are
> the same and only seem to happen when valgrind runs the tests.

Sounds like maybe valgrind is not IEEE clean:

"""
As of version 3.0.0, Valgrind has the following limitations in its
implementation of x86/AMD64 floating point relative to IEEE754.
"""
+
"""
Numeric exceptions in FP code: IEEE754 defines five types of numeric
exception that can happen: invalid operation (sqrt of negative number,
etc), division by zero, overflow, underflow, inexact (loss of
precision).

For each exception, two courses of action are defined by IEEE754:
either (1) a user-defined exception handler may be called, or (2) a
default action is defined, which "fixes things up" and allows the
computation to proceed without throwing an exception.

Currently Valgrind only supports the default fixup actions. Again,
feedback on the importance of exception support would be appreciated.

When Valgrind detects that the program is trying to exceed any of
these limitations (setting exception handlers, rounding mode, or
precision control), it can print a message giving a traceback of where
this has happened, and continue execution. This behaviour used to be
the default, but the messages are annoying and so showing them is now
disabled by default. Use --show-emwarns=yes to see them.
"""
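
The two courses of action in the quoted passage (trap to a handler versus default fix-up) can be illustrated with Python's standard `decimal` module, which models the same kind of signals with per-context traps and flags. This is only an analogy in software decimal arithmetic, not what NumPy or Valgrind does with hardware floating point; the function name `one_over_zero` is made up for the sketch.

```python
import decimal
from decimal import Decimal


def one_over_zero(trap_enabled):
    """Divide 1 by 0 in a private context, with the DivisionByZero trap on or off."""
    ctx = decimal.Context()
    ctx.traps[decimal.DivisionByZero] = trap_enabled
    ctx.clear_flags()
    try:
        result = ctx.divide(Decimal(1), Decimal(0))
    except decimal.DivisionByZero:
        # Course (1): the condition is trapped and surfaces as an exception.
        return "raised DivisionByZero"
    # Course (2): a fixed-up result comes back; only a flag records the event.
    flagged = bool(ctx.flags[decimal.DivisionByZero])
    return f"result={result}, flag set={flagged}"


print(one_over_zero(trap_enabled=True))   # raised DivisionByZero
print(one_over_zero(trap_enabled=False))  # result=Infinity, flag set=True
```

An environment that only provides the default fix-up behaves like the second call everywhere, which is consistent with tests that expect an exception seeing none.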

Anne

Re: Buildbot errors.

Charles R Harris


On Sat, May 24, 2008 at 1:33 PM, Anne Archibald <[hidden email]> wrote:
> 2008/5/24 Charles R Harris <[hidden email]>:
>>
>> I take that back,  I got confused looking through the output. The errors are
>> the same and only seem to happen when valgrind runs the tests.
>
> Sounds like maybe valgrind is not IEEE clean:
>
> """
> As of version 3.0.0, Valgrind has the following limitations in its
> implementation of x86/AMD64 floating point relative to IEEE754.
> """
> +
> """
> Numeric exceptions in FP code: IEEE754 defines five types of numeric
> exception that can happen: invalid operation (sqrt of negative number,
> etc), division by zero, overflow, underflow, inexact (loss of
> precision).
>
> For each exception, two courses of action are defined by IEEE754:
> either (1) a user-defined exception handler may be called, or (2) a
> default action is defined, which "fixes things up" and allows the
> computation to proceed without throwing an exception.
>
> Currently Valgrind only supports the default fixup actions. Again,
> feedback on the importance of exception support would be appreciated.
>
> When Valgrind detects that the program is trying to exceed any of
> these limitations (setting exception handlers, rounding mode, or
> precision control), it can print a message giving a traceback of where
> this has happened, and continue execution. This behaviour used to be
> the default, but the messages are annoying and so showing them is now
> disabled by default. Use --show-emwarns=yes to see them.
> """

Thanks for following that up, Anne.

Chuck


