proposed changes to array printing in 1.14


proposed changes to array printing in 1.14

Allan Haldane
Hello all,

There are various updates to array printing in preparation for numpy
1.14. See https://github.com/numpy/numpy/pull/9139/

Some are quite likely to break other projects' doc-tests which expect a
particular str or repr of arrays, so I'd like to warn the list in case
anyone has opinions.

The current proposed changes, from most to least painful by my
reckoning, are:

1.
For float arrays, an extra space previously used for the sign position
will now be omitted in many cases. E.g., `repr(arange(4.))` will now
return 'array([0., 1., 2., 3.])' instead of 'array([ 0.,  1.,  2.,  3.])'.

2.
The printing of 0d arrays is overhauled. This is a bit finicky to
describe; please see the release note in the PR. As an example of the
effect of this, `repr(np.array(0.))` now prints as 'array(0.)'
instead of 'array(0.0)'. Also, the repr of 0d datetime arrays is now like
"array('2005-04-04', dtype='datetime64[D]')" instead of
"array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".

3.
User-defined dtypes which did not properly implement their `repr` (and
`str`) should do so now. Otherwise printing now falls back to
`object.__repr__`, which will return something ugly like
`<mytype object at 0x7f37f1b4e918>`. (Previously you could depend on
only implementing the `item` method and the repr of that would be
printed. But no longer, because this risks infinite recursion.)

4.
Bool arrays of size 1 with a 'True' value will now omit a space, so that
`repr(array([True]))` is now 'array([True])' instead of
'array([ True])'.
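
The fallback described in point 3 can be seen in plain Python, without NumPy. This is only an analogy, not the NumPy code path: the class names `MyType` and `MyTypeWithRepr` are made up for the sketch, and a real user-defined dtype would typically be implemented in C. A type that defines no `__repr__` inherits `object.__repr__`; defining one restores a readable result.

```python
class MyType:
    """Hypothetical scalar type that only exposes item(), with no __repr__."""

    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value


class MyTypeWithRepr(MyType):
    """Same type, but with an explicit __repr__."""

    def __repr__(self):
        return "MyTypeWithRepr({!r})".format(self.value)


fallback = repr(MyType(3))          # inherited from object.__repr__
explicit = repr(MyTypeWithRepr(3))  # uses the method defined above

print(fallback)  # something like <__main__.MyType object at 0x...>
print(explicit)  # MyTypeWithRepr(3)
```

The address in the fallback string changes from run to run, which is one more reason a doctest cannot depend on it.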

Allan
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: proposed changes to array printing in 1.14

Marten van Kerkwijk
To add to Allan's message: point (2), the printing of 0-d arrays, is
the one that is the most important in the sense that it rectifies a
really strange situation, where the printing cannot be logically
controlled by the same mechanism that controls >=1-d arrays (see PR).

While point 3 can also be considered a bug fix, 1 & 4 are at some
level matters of taste; my own reason for supporting their
implementation now is that the 0-d array change already forces me (or,
specifically, astropy) to rewrite quite a few doctests, and I'd rather
have everything in one go -- in this respect, it is a pity that this
is separate from the earlier change in printing for structured arrays
(which was also much for the better, but broke a lot of doctests).

-- Marten



On Thu, Jun 29, 2017 at 3:38 PM, Allan Haldane <[hidden email]> wrote:

> Hello all,
>
> There are various updates to array printing in preparation for numpy
> 1.14. See https://github.com/numpy/numpy/pull/9139/
>
> Some are quite likely to break other projects' doc-tests which expect a
> particular str or repr of arrays, so I'd like to warn the list in case
> anyone has opinions.
>
> The current proposed changes, from most to least painful by my
> reckoning, are:
>
> 1.
> For float arrays, an extra space previously used for the sign position
> will now be omitted in many cases. E.g., `repr(arange(4.))` will now
> return 'array([0., 1., 2., 3.])' instead of 'array([ 0.,  1.,  2.,  3.])'.
>
> 2.
> The printing of 0d arrays is overhauled. This is a bit finicky to
> describe; please see the release note in the PR. As an example of the
> effect of this, `repr(np.array(0.))` now prints as 'array(0.)'
> instead of 'array(0.0)'. Also, the repr of 0d datetime arrays is now like
> "array('2005-04-04', dtype='datetime64[D]')" instead of
> "array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".
>
> 3.
> User-defined dtypes which did not properly implement their `repr` (and
> `str`) should do so now. Otherwise it now falls back to
> `object.__repr__`, which will return something ugly like
> `<mytype object at 0x7f37f1b4e918>`. (Previously you could depend on
> only implementing the `item` method and the repr of that would be
> printed. But no longer, because this risks infinite recursion.)
>
> 4.
> Bool arrays of size 1 with a 'True' value will now omit a space, so that
> `repr(array([True]))` is now 'array([True])' instead of
> 'array([ True])'.
>
> Allan

Re: proposed changes to array printing in 1.14

Juan Nunez-Iglesias
To reiterate my point on a previous thread, I don't think this should happen until NumPy 2.0. This *will* break a massive number of doctests, and what's worse, it will do so in a way that makes it difficult to support doctesting for both 1.13 and 1.14. I don't see a big enough benefit to these changes to justify breaking everyone's tests before an API-breaking version bump.

On 30 Jun 2017, 6:42 AM +1000, Marten van Kerkwijk <[hidden email]>, wrote:
> To add to Allan's message: point (2), the printing of 0-d arrays, is
> the one that is the most important in the sense that it rectifies a
> really strange situation, where the printing cannot be logically
> controlled by the same mechanism that controls >=1-d arrays (see PR).
>
> While point 3 can also be considered a bug fix, 1 & 4 are at some
> level matters of taste; my own reason for supporting their
> implementation now is that the 0-d array change already forces me (or,
> specifically, astropy) to rewrite quite a few doctests, and I'd rather
> have everything in one go -- in this respect, it is a pity that this
> is separate from the earlier change in printing for structured arrays
> (which was also much for the better, but broke a lot of doctests).
>
> -- Marten




Re: proposed changes to array printing in 1.14

Sebastian Berg
On Fri, 2017-06-30 at 17:55 +1000, Juan Nunez-Iglesias wrote:
> To reiterate my point on a previous thread, I don't think this should
> happen until NumPy 2.0. This *will* break a massive number of
> doctests, and what's worse, it will do so in a way that makes it
> difficult to support doctesting for both 1.13 and 1.14. I don't see a
> big enough benefit to these changes to justify breaking everyone's
> tests before an API-breaking version bump.
>

Just so we are on the same page, nobody is planning a NumPy 2.0, so
insisting on not changing anything until a possible NumPy 2.0 is almost
like saying it should never happen. Of course we could amass
deprecations and at some point do many at once and call it 2.0, but I
am not sure that helps anyone, compared to saying that we do
deprecations for 1-2 years at least, and longer if someone complains.

The question is, do you really see a big advantage in fixing a
gazillion tests at once over doing a small part of the fixes one after
another? The "big step" thing did not work too well for Python 3...

- Sebastian



Re: proposed changes to array printing in 1.14

Gael Varoquaux
In reply to this post by Juan Nunez-Iglesias
Indeed, for scikit-learn, this would be a major problem.

Gaël

On Fri, Jun 30, 2017 at 05:55:52PM +1000, Juan Nunez-Iglesias wrote:
> To reiterate my point on a previous thread, I don't think this should happen
> until NumPy 2.0. This *will* break a massive number of doctests, and what's
> worse, it will do so in a way that makes it difficult to support doctesting for
> both 1.13 and 1.14. I don't see a big enough benefit to these changes to
> justify breaking everyone's tests before an API-breaking version bump.



--
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

Re: proposed changes to array printing in 1.14

Allan Haldane
In reply to this post by Juan Nunez-Iglesias
On 06/30/2017 03:55 AM, Juan Nunez-Iglesias wrote:
> To reiterate my point on a previous thread, I don't think this should
> happen until NumPy 2.0. This *will* break a massive number of doctests,
> and what's worse, it will do so in a way that makes it difficult to
> support doctesting for both 1.13 and 1.14. I don't see a big enough
> benefit to these changes to justify breaking everyone's tests before an
> API-breaking version bump.

I am still on the fence about exactly how annoying this change would be,
and it is good to hear whether this affects you and how badly.

Yes, someone would have to spend an hour removing a hundred spaces in
doctests, and the 1.13 to 1.14 period is trickier (but virtualenv
helps). But none of your end users are going to have their scripts
break; there are no new warnings or exceptions.

A follow-up question is, to what degree can we compromise? Would it be
acceptable to skip the big change #1, but keep the other 3 changes? I
expect they affect far fewer doctests. Or, for instance, I could scale
back #1 so it only affects size-1 (or perhaps, only size-0) arrays. What
amount of change would be OK, and how is changing a small number of
doctests different from changing more?

Also, let me clarify the motivations for the changes. As Marten noted,
change #2 is what motivated all the other changes. Currently 0d arrays
print in their own special way which was making it very hard to
implement fixes to voidtype str/repr, and the datetime and other 0d
reprs are weird. The fix is to make 0d arrays print using the same
code-path as higher-d ndarrays, but then we ended up with reprs like
"array( 1.)" because of the space for the sign position. So I removed
the space from the sign position for all float arrays. But as I noted I
probably could remove it for only size-1 or 0d arrays and still fix my
problem, even though I think it might be pretty hacky to implement in
the numpy code.
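
The chain of reasoning above can be sketched in a few lines of plain Python. This is not the NumPy implementation: `format_element`, `repr_1d` and `repr_0d` are hypothetical helpers that only mimic simple values such as 1.0, and the `reserve_sign_column` flag is an assumption made for the sketch. The point is that once 0d arrays reuse the element formatter of higher-d arrays, the blank sign column shows up as "array( 1.)".

```python
def format_element(value, reserve_sign_column=True):
    """Format one float roughly like the examples in this thread.

    Hypothetical helper: only simple values such as 1.0 -> '1.' are handled.
    """
    text = repr(float(value))
    if text.endswith(".0"):
        text = text[:-1]   # '1.0' -> '1.'
    if reserve_sign_column and not text.startswith("-"):
        text = " " + text  # blank column where a '-' would go
    return text


def repr_1d(values, reserve_sign_column=True):
    body = ", ".join(format_element(v, reserve_sign_column) for v in values)
    return "array([" + body + "])"


def repr_0d(value, reserve_sign_column=True):
    # Same element formatter as the 1-d case, just without the brackets.
    return "array(" + format_element(value, reserve_sign_column) + ")"


print(repr_1d([0, 1, 2, 3]))                        # old style, padded
print(repr_0d(1.0))                                 # the odd-looking 0d result
print(repr_0d(1.0, reserve_sign_column=False))      # after dropping the column
```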

Allan




Re: proposed changes to array printing in 1.14

CJ Carey
Is it feasible/desirable to provide a doctest runner that ignores whitespace? That would allow downstream projects to fix their doctests on 1.14+ with a one-line change, without breaking tests on 1.13.
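
A minimal sketch of such a runner, using only the standard-library `doctest` module. The names `WhitespaceInsensitiveChecker`, `run_docstring` and `OLD_STYLE_DOC` are made up for this example, and how to plug the checker into a given project's test runner is not shown. Note that the built-in `doctest.NORMALIZE_WHITESPACE` flag is not enough here: it treats runs of whitespace as equal, but it does not treat "[ 0." and "[0." as equal.

```python
import doctest
import re


class WhitespaceInsensitiveChecker(doctest.OutputChecker):
    """Accept output that matches once all whitespace is removed (hypothetical helper)."""

    def check_output(self, want, got, optionflags):
        def strip(text):
            return re.sub(r"\s+", "", text)

        if strip(want) == strip(got):
            return True
        # Otherwise defer to the normal doctest comparison rules.
        return super().check_output(want, got, optionflags)


def run_docstring(docstring, name="example"):
    """Run the examples in a docstring; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(docstring, {}, name, None, 0)
    runner = doctest.DocTestRunner(
        checker=WhitespaceInsensitiveChecker(), verbose=False
    )
    results = runner.run(test, out=lambda text: None)  # silence failure reports
    return results.failed, results.attempted


# The expected output is written in the old padded style, while the
# statement prints the new compact style (print() stands in for NumPy here).
OLD_STYLE_DOC = """
>>> print('array([0., 1., 2., 3.])')
array([ 0.,  1.,  2.,  3.])
"""

print(run_docstring(OLD_STYLE_DOC))
```

Ignoring all whitespace is lossy (for instance, "1 2" and "12" would compare equal), so a stricter variant might strip whitespace only directly after "[" or "(".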

On Fri, Jun 30, 2017 at 11:11 AM, Allan Haldane <[hidden email]> wrote:
> On 06/30/2017 03:55 AM, Juan Nunez-Iglesias wrote:
>> To reiterate my point on a previous thread, I don't think this should happen until NumPy 2.0. This *will* break a massive number of doctests, and what's worse, it will do so in a way that makes it difficult to support doctesting for both 1.13 and 1.14. I don't see a big enough benefit to these changes to justify breaking everyone's tests before an API-breaking version bump.
>
> I am still on the fence about exactly how annoying this change would be, and it is good to hear whether this affects you and how badly.
>
> Yes, someone would have to spend an hour removing a hundred spaces in doctests, and the 1.13 to 1.14 period is trickier (but virtualenv helps). But none of your end users are going to have their scripts break; there are no new warnings or exceptions.
>
> A follow-up question is, to what degree can we compromise? Would it be acceptable to skip the big change #1, but keep the other 3 changes? I expect they affect far fewer doctests. Or, for instance, I could scale back #1 so it only affects size-1 (or perhaps, only size-0) arrays. What amount of change would be OK, and how is changing a small number of doctests different from changing more?
>
> Also, let me clarify the motivations for the changes. As Marten noted, change #2 is what motivated all the other changes. Currently 0d arrays print in their own special way which was making it very hard to implement fixes to voidtype str/repr, and the datetime and other 0d reprs are weird. The fix is to make 0d arrays print using the same code-path as higher-d ndarrays, but then we ended up with reprs like "array( 1.)" because of the space for the sign position. So I removed the space from the sign position for all float arrays. But as I noted I probably could remove it for only size-1 or 0d arrays and still fix my problem, even though I think it might be pretty hacky to implement in the numpy code.
>
> Allan






Re: proposed changes to array printing in 1.14

Allan Haldane
On 06/30/2017 03:04 PM, CJ Carey wrote:
> Is it feasible/desirable to provide a doctest runner that ignores
> whitespace? That would allow downstream projects to fix their doctests on
> 1.14+ with a one-line change, without breaking tests on 1.13.

Good idea. I have already implemented this actually, see the updated PR.
https://github.com/numpy/numpy/pull/9139/

Whether or not the sign position is padded can now be controlled by setting

    >>> np.set_printoptions(pad_sign=True)
    >>> np.set_printoptions(pad_sign=False)

When pad_sign is True, it gives the old behavior, except for size-1
arrays where it still omits the sign position. (Maybe I should limit it
even more, to 0d arrays?)

When pad_sign is False (currently default in the PR), it removes the
sign padding everywhere if possible.
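
The two settings can be imitated in plain Python to make the rule concrete. This is a rough sketch under stated assumptions, not NumPy code: `format_float_array` is a hypothetical function, it handles only simple values such as 2.0, and its `pad_sign` argument merely mirrors the option name proposed in the PR.

```python
def format_float_array(values, pad_sign):
    """Rough pure-Python imitation of the proposed pad_sign switch."""
    texts = []
    for value in values:
        text = repr(float(value))
        if text.endswith(".0"):
            text = text[:-1]  # '2.0' -> '2.'
        texts.append(text)
    # pad_sign=True keeps the blank sign column, except for size-1 arrays.
    if pad_sign and len(texts) != 1:
        texts = [t if t.startswith("-") else " " + t for t in texts]
    # Right-align all elements to a common width.
    width = max((len(t) for t in texts), default=0)
    return "array([" + ", ".join(t.rjust(width) for t in texts) + "])"


padded = format_float_array([0, 1, 2, 3], pad_sign=True)
compact = format_float_array([0, 1, 2, 3], pad_sign=False)
single = format_float_array([1], pad_sign=True)
negative = format_float_array([-1, 0, 1], pad_sign=False)

for line in (padded, compact, single, negative):
    print(line)
```

The last case illustrates "if possible": when any element is negative, the sign column is still needed for alignment, so the output keeps the extra space either way.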

Allan



> On Fri, Jun 30, 2017 at 11:11 AM, Allan Haldane <[hidden email]>
> wrote:
>
>> On 06/30/2017 03:55 AM, Juan Nunez-Iglesias wrote:
>>
>>> To reiterate my point on a previous thread, I don't think this should
>>> happen until NumPy 2.0. This *will* break a massive number of doctests, and
>>> what's worse, it will do so in a way that makes it difficult to support
>>> doctesting for both 1.13 and 1.14. I don't see a big enough benefit to
>>> these changes to justify breaking everyone's tests before an API-breaking
>>> version bump.
>>>
>>
>> I am still on the fence about exactly how annoying this change would be,
>> and it is is good to hear whether this affects you and how badly.
>>
>> Yes, someone would have to spend an hour removing a hundred spaces in
>> doctests, and the 1.13 to 1.14 period is trickier (but virtualenv helps).
>> But none of your end users are going to have their scripts break, there are
>> no new warnings or exceptions.
>>
>> A followup questions is, to what degree can we compromise? Would it be
>> acceptable to skip the big change #1, but keep the other 3 changes? I
>> expect they affect far fewer doctests. Or, for instance, I could scale back
>> #1 so it only affects size-1 (or perhaps, only size-0) arrays. What amount
>> of change would be OK, and how is changing a small number of doctests
>> different from changing more?
>>
>> Also, let me clarify the motivations for the changes. As Marten noted,
>> change #2 is what motivated all the other changes. Currently 0d arrays
>> print in their own special way which was making it very hard to implement
>> fixes to voidtype str/repr, and the datetime and other 0d reprs are weird.
>> The fix is to make 0d arrays print using the same code-path as higher-d
>> ndarrays, but then we ended up with reprs like "array( 1.)" because of the
>> space for the sign position. So I removed the space from the sign position
>> for all float arrays. But as I noted I probably could remove it for only
>> size-1 or 0d arrays and still fix my problem, even though I think it might
>> be pretty hacky to implement in the numpy code.
>>
>> Allan
>>
>>
>>
>>
>>
>>> On 30 Jun 2017, 6:42 AM +1000, Marten van Kerkwijk <
>>> [hidden email]>, wrote:
>>>
>>>> To add to Allan's message: point (2), the printing of 0-d arrays, is
>>>> the one that is the most important in the sense that it rectifies a
>>>> really strange situation, where the printing cannot be logically
>>>> controlled by the same mechanism that controls >=1-d arrays (see PR).
>>>>
>>>> While point 3 can also be considered a bug fix, 1 & 4 are at some
>>>> level matters of taste; my own reason for supporting their
>>>> implementation now is that the 0-d array change already forces me (or,
>>>> specifically, astropy) to rewrite quite a few doctests, and I'd rather
>>>> have everything in one go -- in this respect, it is a pity that this
>>>> is separate from the earlier change in printing for structured arrays
>>>> (which was also much for the better, but broke a lot of doctests).
>>>>
>>>> -- Marten
>>>>
>>>>
>>>>
>>>> On Thu, Jun 29, 2017 at 3:38 PM, Allan Haldane <[hidden email]>
>>>> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> There are various updates to array printing in preparation for numpy
>>>>> 1.14. See https://github.com/numpy/numpy/pull/9139/
>>>>>
>>>>> Some are quite likely to break other projects' doc-tests which expect a
>>>>> particular str or repr of arrays, so I'd like to warn the list in case
>>>>> anyone has opinions.
>>>>>
>>>>> The current proposed changes, from most to least painful by my
>>>>> reckoning, are:
>>>>>
>>>>> 1.
>>>>> For float arrays, an extra space previously used for the sign position
>>>>> will now be omitted in many cases. Eg, `repr(arange(4.))` will now
>>>>> return 'array([0., 1., 2., 3.])' instead of 'array([ 0.,  1.,  2.,  3.])'.
>>>>>
>>>>> 2.
>>>>> The printing of 0d arrays is overhauled. This is a bit finicky to
>>>>> describe, please see the release note in the PR. As an example of the
>>>>> effect of this, the `repr(np.array(0.))` now prints as 'array(0.)'
>>>>> instead of 'array(0.0)'. Also the repr of 0d datetime arrays is now like
>>>>> "array('2005-04-04', dtype='datetime64[D]')" instead of
>>>>> "array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".
>>>>>
>>>>> 3.
>>>>> User-defined dtypes which did not properly implement their `repr` (and
>>>>> `str`) should do so now. Otherwise it now falls back to
>>>>> `object.__repr__`, which will return something ugly like
>>>>> `<mytype object at 0x7f37f1b4e918>`. (Previously you could depend on
>>>>> only implementing the `item` method and the repr of that would be
>>>>> printed. But no longer, because this risks infinite recursions.).
>>>>>
>>>>> 4.
>>>>> Bool arrays of size 1 with a 'True' value will now omit a space, so that
>>>>> `repr(array([True]))` is now 'array([True])' instead of
>>>>> 'array([ True])'.
>>>>>
>>>>> Allan

Re: proposed changes to array printing in 1.14

Allan Haldane
In reply to this post by Gael Varoquaux
On 06/30/2017 09:17 AM, Gael Varoquaux wrote:
> Indeed, for scikit-learn, this would be a major problem.
>
> Gaël

I just ran the scikit-learn tests.

With the new behavior (removed whitespace), I do get 70 total failures:

    $ make test-doc
    Ran 39 tests in 39.503s
    FAILED (SKIP=3, failures=19)

    $ make test
    Ran 8122 tests in 387.650s
    FAILED (SKIP=58, failures=51)

After setting `np.set_printoptions(pad_sign=True)` (see other email) I
get only 1 failure in total, which is due to the presence of a 0d array
in gaussian_process.rst.

So it looks like the pad_sign option as currently implemented is good
enough to avoid almost all doctest errors.
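
As a rough illustration of what the sign-position padding amounts to, the toy formatter below reproduces the two reprs quoted in change #1. It is a pure-Python sketch, not NumPy's implementation; `format_floats` and its `pad_sign` argument are hypothetical names, borrowed from the option mentioned above.

```python
def format_floats(values, pad_sign):
    """Toy model of float-array printing (not NumPy's actual code)."""
    texts = []
    for v in values:
        text = repr(float(v))
        if text.endswith(".0"):
            text = text[:-1]  # print 1.0 as "1.", the way NumPy does
        texts.append(text)

    has_negative = any(t.startswith("-") for t in texts)
    if pad_sign or has_negative:
        # Reserve one column for the sign on non-negative entries.
        texts = [t if t.startswith("-") else " " + t for t in texts]

    width = max((len(t) for t in texts), default=0)
    return "array([" + ", ".join(t.rjust(width) for t in texts) + "])"


print(format_floats([0, 1, 2, 3], pad_sign=True))   # old style, with sign column
print(format_floats([0, 1, 2, 3], pad_sign=False))  # proposed style
```

With `pad_sign=False` the sign column only appears when some entry is actually negative, which is the behaviour change that breaks doctests written against the old output.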

Allan



> On Fri, Jun 30, 2017 at 05:55:52PM +1000, Juan Nunez-Iglesias wrote:
>> To reiterate my point on a previous thread, I don't think this should happen
>> until NumPy 2.0. This *will* break a massive number of doctests, and what's
>> worse, it will do so in a way that makes it difficult to support doctesting for
>> both 1.13 and 1.14. I don't see a big enough benefit to these changes to
>> justify breaking everyone's tests before an API-breaking version bump.
>

Re: proposed changes to array printing in 1.14

ralfgommers
In reply to this post by CJ Carey


On Sat, Jul 1, 2017 at 7:04 AM, CJ Carey <[hidden email]> wrote:
> Is it feasible/desirable to provide a doctest runner that ignores whitespace?

Yes, and yes. Due to doctest being in the stdlib that is going to take forever to have any effect though; a separate our-sane-doctest module would be the way to ship this I think.

And not only whitespace, also provide sane floating point comparison behavior (AstroPy has something for that that can be reused: https://github.com/astropy/astropy/issues/6312) as well as things a bit more specific to the needs of scientific Python projects like ignoring the hashes in returned matplotlib objects.
 
That would allow downstream projects to fix their doctests on 1.14+ with a one-line change, without breaking tests on 1.13.
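
A whitespace-insensitive runner of the kind asked about here could be sketched with the standard library alone. The stdlib `NORMALIZE_WHITESPACE` flag only collapses runs of whitespace, so it still treats `[ 0.` and `[0.` as different; the sketch below instead drops all whitespace before comparing. The names `WhitespaceInsensitiveChecker` and `run_docstring` are made up for this example, and stripping every space is an assumed policy rather than anything agreed in this thread.

```python
import doctest
import re


class WhitespaceInsensitiveChecker(doctest.OutputChecker):
    """Accept output that differs from the expected text only in whitespace."""

    def check_output(self, want, got, optionflags):
        if super().check_output(want, got, optionflags):
            return True
        # Assumed policy: drop *all* whitespace before comparing.
        return re.sub(r"\s+", "", want) == re.sub(r"\s+", "", got)


def run_docstring(source):
    """Run the examples in `source` and return doctest's TestResults."""
    test = doctest.DocTestParser().get_doctest(source, {}, "example", None, 0)
    runner = doctest.DocTestRunner(
        checker=WhitespaceInsensitiveChecker(), verbose=False
    )
    return runner.run(test, out=lambda text: None)  # silence failure reports


SAMPLE = '''
>>> print("array([0., 1., 2., 3.])")
array([ 0.,  1.,  2.,  3.])
'''

results = run_docstring(SAMPLE)
print(results.failed, "failed of", results.attempted)
```

The obvious cost is that `1 2` and `12` now compare equal, so a real tool would probably want something narrower.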

It's worth reading https://docs.python.org/2/library/doctest.html#soapbox. At least the first 2 paragraphs; the rest is mainly an illustration of why doctest default behavior is evil ("doctest also makes an excellent tool for regression testing" - eh, no). The only valid reason nowadays to use doctests is to test that doc examples run and are correct. None of {whitespace, blank lines, small floating point differences between platforms/libs, hashes} are valid reasons to get a test failure.
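
To make the floating-point and object-hash cases concrete, here is a minimal sketch of a checker that tolerates both. It is not the AstroPy code linked above; `TolerantChecker`, the two regular expressions, and the default tolerances are all assumptions chosen for illustration.

```python
import doctest
import math
import re

# Hex addresses such as the one in "<mytype object at 0x7f37f1b4e918>".
ADDRESS = re.compile(r"0x[0-9a-fA-F]+")
# Plain decimal literals, with optional sign and exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


class TolerantChecker(doctest.OutputChecker):
    """Compare numbers approximately and ignore memory addresses."""

    def __init__(self, rel_tol=1e-7, abs_tol=1e-12):
        self.rel_tol = rel_tol
        self.abs_tol = abs_tol

    def check_output(self, want, got, optionflags):
        if super().check_output(want, got, optionflags):
            return True
        want = ADDRESS.sub("<ADDR>", want)
        got = ADDRESS.sub("<ADDR>", got)
        # The text around the numbers must still match exactly.
        if NUMBER.sub("<NUM>", want) != NUMBER.sub("<NUM>", got):
            return False
        want_nums = NUMBER.findall(want)
        got_nums = NUMBER.findall(got)
        if len(want_nums) != len(got_nums):
            return False
        return all(
            math.isclose(float(w), float(g),
                         rel_tol=self.rel_tol, abs_tol=self.abs_tol)
            for w, g in zip(want_nums, got_nums)
        )


checker = TolerantChecker()
print(checker.check_output("0.3\n", "0.30000000000000004\n", 0))
print(checker.check_output("<Line2D object at 0x7f37f1b4e918>\n",
                           "<Line2D object at 0x10ab3c>\n", 0))
```

An instance can be passed as `doctest.DocTestRunner(checker=TolerantChecker())`, so the change on the calling side stays small.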

At the moment there's no polished alternative to using stdlib doctest, so I'm sympathetic to the argument of "this causes a lot of work". On the other hand, exact repr's are not part of the NumPy (or Python for that matter) backwards compatibility guarantees. So imho we should provide that alternative to doctest, and then no longer worry about these kinds of changes and just make them.

Until we have that alternative, I think https://github.com/scipy/scipy/blob/master/tools/refguide_check.py may be useful to other projects - it checks that your examples are not broken, without doing the detailed string comparisons that are so fragile.

Ralf

 

On Fri, Jun 30, 2017 at 11:11 AM, Allan Haldane <[hidden email]> wrote:
On 06/30/2017 03:55 AM, Juan Nunez-Iglesias wrote:
To reiterate my point on a previous thread, I don't think this should happen until NumPy 2.0. This *will* break a massive number of doctests, and what's worse, it will do so in a way that makes it difficult to support doctesting for both 1.13 and 1.14. I don't see a big enough benefit to these changes to justify breaking everyone's tests before an API-breaking version bump.

I am still on the fence about exactly how annoying this change would be, and it is good to hear whether this affects you and how badly.

Yes, someone would have to spend an hour removing a hundred spaces in doctests, and the 1.13 to 1.14 period is trickier (but virtualenv helps). But none of your end users are going to have their scripts break; there are no new warnings or exceptions.

A follow-up question is, to what degree can we compromise? Would it be acceptable to skip the big change #1, but keep the other 3 changes? I expect they affect far fewer doctests. Or, for instance, I could scale back #1 so it only affects size-1 (or perhaps, only size-0) arrays. What amount of change would be OK, and how is changing a small number of doctests different from changing more?

Also, let me clarify the motivations for the changes. As Marten noted, change #2 is what motivated all the other changes. Currently 0d arrays print in their own special way which was making it very hard to implement fixes to voidtype str/repr, and the datetime and other 0d reprs are weird. The fix is to make 0d arrays print using the same code-path as higher-d ndarrays, but then we ended up with reprs like "array( 1.)" because of the space for the sign position. So I removed the space from the sign position for all float arrays. But as I noted I probably could remove it for only size-1 or 0d arrays and still fix my problem, even though I think it might be pretty hacky to implement in the numpy code.

Allan






Re: proposed changes to array printing in 1.14

Gael Varoquaux
In reply to this post by Allan Haldane
One problem is that it becomes hard (impossible?) for downstream packages
such as scikit-learn to doctest under multiple versions of numpy.
Past experience has shown that it could be useful.

Gaël

On Fri, Jun 30, 2017 at 06:30:53PM -0400, Allan Haldane wrote:
> On 06/30/2017 09:17 AM, Gael Varoquaux wrote:
> > Indeed, for scikit-learn, this would be a major problem.

> > Gaël

> I just ran the scikit-learn tests.

> With the new behavior (removed whitespace), I do get 70 total failures:

>     $ make test-doc
>     Ran 39 tests in 39.503s
>     FAILED (SKIP=3, failures=19)

>     $ make test
>     Ran 8122 tests in 387.650s
>     FAILED (SKIP=58, failures=51)

> After setting `np.set_printoptions(pad_sign=True)` (see other email) I
> get only 1 failure in total, which is due to the presence of a 0d array
> in gaussian_process.rst.

> So it looks like the pad_sign option as currently implemented is good
> enough to avoid almost all doctest errors.

> Allan





--
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

Re: proposed changes to array printing in 1.14

Robert Kern-2
On Fri, Jun 30, 2017 at 4:47 PM, Gael Varoquaux <[hidden email]> wrote:
>
> One problem is that it becomes hard (impossible?) for downstream packages
> such as scikit-learn to doctest under multiple versions of numpy.
> Past experience has shown that it could be useful.

It's not that hard: wrap the new `set_printoptions(pad_sign=True)` in a `try:` block to catch the error under old versions.
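
A minimal sketch of that `try:` approach follows. It assumes the option ends up being called `pad_sign`, as in Allan's message; that name is a proposal from this thread, so treat it as a placeholder. The helper takes the setter as an argument so the example runs without NumPy installed, and the two stand-in functions are hypothetical.

```python
def enable_sign_padding(set_printoptions):
    """Try to turn on the proposed padding option; report whether it was accepted."""
    try:
        set_printoptions(pad_sign=True)
    except TypeError:
        # A version without the option rejects the unknown keyword argument.
        return False
    return True


def old_set_printoptions(precision=None):
    """Stand-in for a version that does not know the new keyword."""


def new_set_printoptions(precision=None, pad_sign=None):
    """Stand-in for a version that accepts it."""


print(enable_sign_padding(old_set_printoptions))  # False
print(enable_sign_padding(new_set_printoptions))  # True
```

In a real test setup one would pass `numpy.set_printoptions` itself, for example from a `conftest.py` or the doctest setup hook.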

--
Robert Kern


Re: proposed changes to array printing in 1.14

Juan Nunez-Iglesias
I agree that shipping a sane/sanitising doctest runner would go 95% of the way to alleviating my concerns. 

Regarding 2.0, this is the whole point of semantic versioning: downstream packages can pin their dependency as 1.x and know that they
- will continue to work with any updates
- won’t make their users choose between new NumPy 1.x features and running their software.

The Python 3.x transition was a huge fail, but the version numbering was not the problem.

I do have sympathy for Ralf’s argument that "exact repr's are not part of the NumPy (or Python for that matter) backwards compatibility guarantees". But it is such a foundational project in Scientific Python that I think extreme care is warranted, beyond any official guarantees. (Hence this thread, yes. Thank you!)

Incidentally, I don’t think "array( 1.)" is such a tragic repr fail. I actually would welcome it because I’ve tried to JSON-serialise these buggers quite a few times because I didn’t realise they were 0d arrays instead of floats. So why exactly is it so bad that there is a space there?

Anyway, all this is (mostly) moot if the next NumPy ships with this doctest++ thingy. That would be an enormously valuable contribution to the whole ecosystem.

Thanks,

Juan.

On 1 Jul 2017, 9:56 AM +1000, Robert Kern <[hidden email]>, wrote:
On Fri, Jun 30, 2017 at 4:47 PM, Gael Varoquaux <[hidden email]> wrote:
>
> One problem is that it becomes hard (impossible?) for downstream packages
> such as scikit-learn to doctest under multiple versions of numpy.
> Past experience has shown that it could be useful.

It's not that hard: wrap the new `set_printoptions(pad_sign=True)` in a `try:` block to catch the error under old versions.

--
Robert Kern

Re: proposed changes to array printing in 1.14

Nathaniel Smith
On Fri, Jun 30, 2017 at 7:23 PM, Juan Nunez-Iglesias <[hidden email]> wrote:
> I agree that shipping a sane/sanitising doctest runner would go 95% of the
> way to alleviating my concerns.
>
> Regarding 2.0, this is the whole point of semantic versioning: downstream
> packages can pin their dependency as 1.x and know that they
> - will continue to work with any updates
> - won’t make their users choose between new NumPy 1.x features and running
> their software.

Semantic versioning is somewhere between useless and harmful for
non-trivial projects. It's a lovely idea, it would be lovely if it
worked, but in practice it either means you make every release a major
release, which doesn't help anything, or else you never make a major
release until eventually everyone gets so frustrated that they fork
the project or do a python 3 style break-everything major release,
which is a cure that's worse than the original disease.

NumPy's strategy instead is to make small, controlled, rolling
breaking changes in 1.x releases. Every release breaks something for
someone somewhere, but ideally only after debate and appropriate
warning, and hopefully most releases don't break things for *you*.
Change is going to happen one way or another, and it's easier to
manage a small amount of breakage every few releases than to manage a
giant chunk all at once. (The latter just seems easier because it's in
the future, so your brain is like "eh I'm sure I'll be fine" until you
get there and realize how doomed you are.)

Plus, the reality is that every numpy release ever made has
accidentally broken something for someone somewhere, so instead of
lying to ourselves and pretending that we can keep things perfectly
backwards compatible at all times, we might as well acknowledge that
and try to manage the cost of breakages and make them worthwhile. Heck,
even bug fixes are frequently compatibility-breaking changes in
reality, and here we are debating whether tweaking whitespace in reprs
is a compatibility-breaking change. There's no line of demarcation
between breaking changes and non-breaking changes, just shades of
grey, and we can do better engineering if our processes acknowledge
that.

Another critique of semantic versioning:
  https://gist.github.com/jashkenas/cbd2b088e20279ae2c8e

The Google philosophy of "error budgets", which is somewhat analogous
to the argument I'm making for a compatibility-breakage budget:
  https://www.usenix.org/node/189332
  https://landing.google.com/sre/book/chapters/service-level-objectives.html#xref_risk-management_global-chubby-planned-outage

-n

--
Nathaniel J. Smith -- https://vorpus.org

Re: proposed changes to array printing in 1.14

Robert Kern-2
In reply to this post by Juan Nunez-Iglesias
On Fri, Jun 30, 2017 at 7:23 PM, Juan Nunez-Iglesias <[hidden email]> wrote:

> I do have sympathy for Ralf’s argument that "exact repr's are not part of the NumPy (or Python for that matter) backwards compatibility guarantees". But it is such a foundational project in Scientific Python that I think extreme care is warranted, beyond any official guarantees. (Hence this thread, yes. Thank you!)

I would also like to make another distinction here: I don't think anyone's actual *code* has broken because of this change. To my knowledge, it is only downstream projects' *doctests* that break. This might deserve *some* care on our part (beyond notification and keeping it out of a 1.x.y bugfix release), but "extreme care" is just not warranted.

> Anyway, all this is (mostly) moot if the next NumPy ships with this doctest++ thingy. That would be an enormously valuable contribution to the whole ecosystem.

I'd recommend just making an independent project on Github and posting it as its own project to PyPI when you think it's ready. We'll link to it in our documentation. I don't think that it ought to be part of numpy and stuck on our release cadence.

--
Robert Kern
