Changed behavior of np.gradient


Changed behavior of np.gradient

Ariel Rokem-2

Hi everyone, 

>>> import numpy as np
>>> np.__version__
'1.9.0'
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=np.float))
[array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]]), array([[-0.5,  2.5,  5.5],
       [ 1. ,  1. ,  1. ]])]


On the other hand: 

>>> import numpy as np
>>> np.__version__
'1.8.2'
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=np.float))
[array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]]), array([[ 1. ,  2.5,  4. ],
       [ 1. ,  1. ,  1. ]])]


For what it's worth, the 1.8 version of this function seems to be in agreement with the Matlab equivalent function ('gradient'): 

>> gradient([[1, 2, 6]; [3, 4, 5]])

ans =

    1.0000    2.5000    4.0000
    1.0000    1.0000    1.0000


This seems like a regression to me, but maybe it's an improvement? 

Cheers, 
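The boundary values in the two sessions above can be reproduced by hand. A minimal sketch of the two edge schemes in plain Python (first differences as in 1.8/Matlab, second-order one-sided differences as in 1.9), applied to the first row [1, 2, 6]:

```python
# First-order (1.8-style) edges vs. second-order (1.9-style) edges,
# checked by hand on one row; no NumPy needed.
row = [1.0, 2.0, 6.0]

first_order_left = row[1] - row[0]                          # 1.0
second_order_left = (-3*row[0] + 4*row[1] - row[2]) / 2     # -0.5

first_order_right = row[-1] - row[-2]                       # 4.0
second_order_right = (3*row[-1] - 4*row[-2] + row[-3]) / 2  # 5.5

# The interior point uses the same central difference either way:
central = (row[2] - row[0]) / 2                             # 2.5

print(first_order_left, central, first_order_right)    # 1.0 2.5 4.0  (1.8 / Matlab)
print(second_order_left, central, second_order_right)  # -0.5 2.5 5.5 (1.9)
```

Only the two boundary points differ between the versions; the interior central difference is unchanged.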

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Changed behavior of np.gradient

Derek Homeier
On 4 Oct 2014, at 08:37 pm, Ariel Rokem <[hidden email]> wrote:

> [1.8.2 vs 1.9.0 example output snipped]
> This seems like a regression to me, but maybe it's an improvement?
>
Technically yes, the function has been changed to use 2nd-order differences where possible,
as described in the docstring. Someone forgot to update the example, though, which still
quotes the 1.8 results.
And if the loss of Matlab-compliance is seen as a disadvantage, maybe there is a case for
re-enabling the old behaviour via keyword argument?

Cheers,
                                        Derek


Re: Changed behavior of np.gradient

Ariel Rokem-2
On Sat, Oct 4, 2014 at 12:29 PM, Derek Homeier <[hidden email]> wrote:
[quoted example output snipped]
Technically yes, the function has been changed to use 2nd-order differences where possible,
as is described in the docstring. Someone missed to update the example though, which still
quotes the 1.8 results.
And if the loss of Matlab-compliance is seen as a disadvantage, maybe there is a case for
re-enabling the old behaviour via keyword argument?



Thanks for clarifying - I see that now in the docstring as well. It went from: "The gradient is computed using central differences in the interior and first differences at the boundaries." to "The gradient is computed using second order accurate central differences in the interior and second order accurate one-sides (forward or backwards) differences at the boundaries.". 

I think that the docstring in 1.9 is fine (has the 1.9 result). The docs online (for all of numpy) are still on version 1.8, though. 

I think that enabling the old behavior might be useful, if only so that I can write code that behaves consistently across these two versions of numpy. For now, I might just copy over the 1.8 code into my project. 

Cheers, 

Ariel 
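One way to get behaviour-consistent code across the two versions, short of vendoring the 1.8 implementation, is to probe the installed function once rather than parsing version strings. A self-contained sketch — the two stand-in functions below mimic the 1.8 and 1.9 behaviours so it runs without NumPy; in practice one would pass np.gradient itself:

```python
# Probe, don't parse: feed a known input to the gradient function once
# and check which edge scheme it uses.
def edges_are_second_order(gradient_fn):
    g = gradient_fn([1.0, 2.0, 6.0])
    # 1.9 yields -0.5 at the left edge here; 1.8 yields 1.0
    return abs(g[0] - (-0.5)) < 1e-12

def like_18(f):   # first differences at the edges (stand-in for 1.8)
    return [f[1] - f[0], (f[2] - f[0]) / 2.0, f[-1] - f[-2]]

def like_19(f):   # second-order one-sided edges (stand-in for 1.9)
    return [(-3*f[0] + 4*f[1] - f[2]) / 2.0,
            (f[2] - f[0]) / 2.0,
            (3*f[-1] - 4*f[-2] + f[-3]) / 2.0]

print(edges_are_second_order(like_18))  # False
print(edges_are_second_order(like_19))  # True
```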


 

Re: Changed behavior of np.gradient

Derek Homeier

Hi Ariel,

> I think that the docstring in 1.9 is fine (has the 1.9 result). The docs online (for all of numpy) are still on version 1.8, though.
>
> I think that enabling the old behavior might be useful, if only so that I can write code that behaves consistently across these two versions of numpy. For now, I might just copy over the 1.8 code into my project.
>
Hmm, I got this with 1.9.0:

    Examples
    --------
    >>> x = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)
    >>> np.gradient(x)
    array([ 1. ,  1.5,  2.5,  3.5,  4.5,  5. ])
    >>> np.gradient(x, 2)
    array([ 0.5 ,  0.75,  1.25,  1.75,  2.25,  2.5 ])

    >>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=np.float))
    [array([[ 2.,  2., -1.],
           [ 2.,  2., -1.]]),
     array([[ 1. ,  2.5,  4. ],
           [ 1. ,  1. ,  1. ]])]

In [5]: x = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)

In [6]: print(np.gradient(x))
[ 0.5  1.5  2.5  3.5  4.5  5.5]

In [7]: print(np.gradient(x, 2))
[ 0.25  0.75  1.25  1.75  2.25  2.75]


I think there is a point for supporting the old behaviour besides backwards compatibility or any sort
of Matlab compliance: I'd like to be able to restrict the function to linear/1st-order differences
in cases where I know the input is not well-behaved.

+1 for an order=2 or maxorder=2 flag

Cheers,
                                                Derek
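The concern about not-well-behaved input can be made concrete without NumPy. The second-order edge formula reaches one point further into the array, so a spike two samples from the boundary contaminates the edge estimate; the data here are constructed purely for illustration:

```python
# The wider stencil of the second-order edge formula picks up a spike
# that the first-order formula never sees.
f = [0.0, 0.0, 10.0, 0.0, 0.0]   # a single spike, otherwise flat

edge_1st = f[1] - f[0]                      # 0.0  -- unaffected by the spike
edge_2nd = (-3*f[0] + 4*f[1] - f[2]) / 2    # -5.0 -- the spike leaks in

print(edge_1st, edge_2nd)  # 0.0 -5.0
```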


Re: Changed behavior of np.gradient

Stéfan van der Walt

On Oct 4, 2014 10:14 PM, "Derek Homeier" <[hidden email]> wrote:
>
> +1 for an order=2 or maxorder=2 flag

If you parameterize that flag, users will want to change its value (above two). Perhaps rather use a boolean flag such as "second_order" or "high_order", unless it seems feasible to include additional orders in the future.

Stéfan



Re: Changed behavior of np.gradient

Charles R Harris


On Sat, Oct 4, 2014 at 3:16 PM, Stéfan van der Walt <[hidden email]> wrote:

On Oct 4, 2014 10:14 PM, "Derek Homeier" <[hidden email]> wrote:
>
> +1 for an order=2 or maxorder=2 flag

If you parameterize that flag, users will want to change its value (above two). Perhaps rather use a boolean flag such as "second_order" or "high_order", unless it seems feasible to include additional orders in the future.

How about 'matlab={True, False}'? There is an open issue for this and it would be good to decide before 1.9.1 comes out.

Chuck



Re: Changed behavior of np.gradient

Nathaniel Smith

On 4 Oct 2014 22:17, "Stéfan van der Walt" <[hidden email]> wrote:
>
> On Oct 4, 2014 10:14 PM, "Derek Homeier" <[hidden email]> wrote:
> >
> > +1 for an order=2 or maxorder=2 flag
>
> If you parameterize that flag, users will want to change its value (above two). Perhaps rather use a boolean flag such as "second_order" or "high_order", unless it seems feasible to include additional orders in the future.

Predicting the future is hard :-). And in particular high_order= would create all kinds of confusion if in the future we added 3rd order approximations but high_order=True continued to mean 2nd order because of compatibility. I like maxorder (or max_order would be more pep8ish I guess) because it leaves our options open. (Similar to how it's often better to have a kwarg that can take two possible string values than to have a boolean kwarg. It makes current code more explicit and makes future enhancements easier.)

-n



Re: Changed behavior of np.gradient

Charles R Harris


On Tue, Oct 14, 2014 at 10:57 AM, Nathaniel Smith <[hidden email]> wrote:

> [earlier quotes snipped]

Predicting the future is hard :-). And in particular high_order= would create all kinds of confusion if in the future we added 3rd order approximations but high_order=True continued to mean 2nd order because of compatibility. I like maxorder (or max_order would be more pep8ish I guess) because it leaves our options open. (Similar to how it's often better to have a kwarg that can take two possible string values than to have a boolean kwarg. It makes current code more explicit and makes future enhancements easier.)


I think maxorder is a bit misleading. Both versions are second order in the interior, while at the ends the old is first order and the new is second order. Maybe edge_order?

Chuck
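For concreteness, a pure-Python sketch of what such an edge_order switch could look like in the 1-D case. The keyword name and semantics follow the proposal in this thread; at this point it is not an existing NumPy API:

```python
def gradient1d(f, edge_order=1):
    """Sketch of a 1-D gradient with a selectable edge scheme.

    edge_order=1 reproduces the 1.8-style first differences at the
    boundaries; edge_order=2 the 1.9-style second-order one-sided
    differences. Interior points always use central differences.
    """
    n = len(f)
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = (f[i + 1] - f[i - 1]) / 2.0
    if edge_order == 1:
        out[0] = f[1] - f[0]
        out[-1] = f[-1] - f[-2]
    else:
        out[0] = (-3*f[0] + 4*f[1] - f[2]) / 2.0
        out[-1] = (3*f[-1] - 4*f[-2] + f[-3]) / 2.0
    return out

row = [1.0, 2.0, 6.0]
print(gradient1d(row, edge_order=1))  # [1.0, 2.5, 4.0]  (1.8 behaviour)
print(gradient1d(row, edge_order=2))  # [-0.5, 2.5, 5.5] (1.9 behaviour)
```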


Re: Changed behavior of np.gradient

Nathaniel Smith

On 14 Oct 2014 18:29, "Charles R Harris" <[hidden email]> wrote:
> [earlier quotes snipped]
> I think maxorder is a bit misleading. The both versions are second order in the interior while at the ends the old is first order and the new is second order. Maybe edge_order?

Ah, that makes sense. edge_order makes more sense to me too then - and we can always add interior_order to complement it later, if appropriate.

The other thing to decide on is the default. Is the 2nd order version generally preferred (modulo compatibility)? If so then it might make sense to keep it the default, given that there are already NumPy releases in the wild with that behaviour, so we can't fully guarantee compatibility even if we wanted to. But what do others think?

-n



Re: Changed behavior of np.gradient

Charles R Harris


On Tue, Oct 14, 2014 at 11:50 AM, Nathaniel Smith <[hidden email]> wrote:

> [earlier quotes snipped]

Ah, that makes sense. edge_order makes more sense to me too then - and we can always add interior_order to complement it later, if appropriate.

The other thing to decide on is the default. Is the 2nd order version generally preferred (modulo compatibility)? If so then it might make sense to keep it the default, given that there are already numpy's in the wild with that version, so we can't fully guarantee compatibility even if we wanted to. But what do others think?

I'd be inclined to keep the old behaviour as the default and regard adding the keyword as a bugfix. I should have caught the incompatibility in review.

Chuck
 


Re: Changed behavior of np.gradient

Nathaniel Smith
On Tue, Oct 14, 2014 at 10:33 PM, Charles R Harris
<[hidden email]> wrote:

> [earlier quotes snipped]
> I'd be inclined to keep the old behaviour as the default and regard adding
> the keyword as a bugfix. I should have caught the incompatibility in review.

I don't have any code that uses gradient, so I don't have a great
sense of the trade-offs here.

- Usually if we have a change that produces increased accuracy, we
make the increased accuracy the default. Otherwise no-one ever uses
it, and everyone gets less accurate results than they would otherwise.
(I don't have a great sense of how much this change affects accuracy
though.)

- If the change in output per se is a serious problem for people, then
it's not one we can fix at this point -- 1.9.0 is out there and people
are using it anyway, so those who have the problem already need to
take some affirmative action to fix it. (Also, it's kinda weird to
change a function's behaviour and add a new argument in a point
release!)

So I'd like to hear from people affected by this -- would you prefer
to have the 2nd order boundary calculations by default, with just some
way to work around the immediate problems in existing code? Or do you
prefer that the old default remain the default, with 2nd order boundary
calculations something that must be requested by hand every time?

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: Changed behavior of np.gradient

Ariel Rokem-2

On Thu, Oct 16, 2014 at 10:22 AM, Nathaniel Smith <[hidden email]> wrote:
[quoted message snipped]

Since I started this discussion, I'll chime in. I don't have a strong preference for either mode that stems from a computational/scientific principle. As Nathaniel suggested, I have resorted to simply copying the 1.8 version of the function into my algorithm implementation, with the hope of removing it down the line. In that respect, I have a very weak preference for preserving the (1.8) status quo by default.

Thanks! 



Re: Changed behavior of np.gradient

Benjamin Root-2
It isn't really a question of accuracy. It breaks unit tests and reproducibility elsewhere. My vote is to revert to the old behavior in 1.9.1.

Ben Root

On Thu, Oct 16, 2014 at 6:10 PM, Ariel Rokem <[hidden email]> wrote:

[quoted messages snipped]

Re: Changed behavior of np.gradient

Nathaniel Smith
On Fri, Oct 17, 2014 at 2:23 AM, Benjamin Root <[hidden email]> wrote:
> It isn't really a question of accuracy. It breaks unit tests and
> reproducibility elsewhere. My vote is to revert to the old behavior in
> 1.9.1.

Why would one want the 2nd order differences at all, if they're not
more accurate? Should we just revert the patch entirely? I assumed the
change had some benefit...


Re: Changed behavior of np.gradient

Benjamin Root-2
That isn't what I meant. Higher order doesn't "necessarily" mean more accurate. The results simply have different properties. The user needs to choose the differentiation order that they need. One interesting effect in data assimilation/modeling is that even-order differentiation can often have detrimental effects, while higher odd-order differentiation works better, but it is highly dependent upon the model.

This change in gradient broke a unit test in matplotlib (for a new feature, so it isn't *that* critical). We didn't notice it at first because we weren't testing numpy 1.9 at the time. I want the feature (I have need for it elsewhere), but I don't want the change in default behavior.

Cheers!
Ben Root
 

Reply | Threaded
Open this post in threaded view
|

Re: Changed behavior of np.gradient

Matthew Brett
Hi,

On Thu, Oct 16, 2014 at 6:38 PM, Benjamin Root <[hidden email]> wrote:

> [quoted paragraph snipped]
> This change in gradient broke a unit test in matplotlib (for a new feature,
> so it isn't *that* critical). We didn't notice it at first because we
> weren't testing numpy 1.9 at the time. I want the feature (I have need for
> it elsewhere), but I don't want the change in default behavior.

I think it would be a bad idea to revert now.

I suspect that if you revert, a lot of other code will come to assume the
<1.9.0, >=1.9.1 behavior. In that case, the code will work as expected
most of the time, except when combined with 1.9.0, which could be
seriously surprising, and often missed. If you keep the new behavior,
then it will be clearer that other code has to adapt to this change in
>=1.9.0 - surprise, but predictable surprise, if you see what I mean...

Matthew

Re: Changed behavior of np.gradient

Charles R Harris


On Thu, Oct 16, 2014 at 8:25 PM, Matthew Brett <[hidden email]> wrote:
> Hi,
>
> On Thu, Oct 16, 2014 at 6:38 PM, Benjamin Root <[hidden email]> wrote:
>> That isn't what I meant. Higher order doesn't "necessarily" mean more
>> accurate. The results simply have different properties. The user needs to
>> choose the differentiation order that they need. One interesting effect in
>> data assimilation/modeling is that even-order differentiation can often have
>> detrimental effects while higher odd order differentiation are better, but
>> it is highly dependent upon the model.
>>
>> This change in gradient broke a unit test in matplotlib (for a new feature,
>> so it isn't *that* critical). We didn't notice it at first because we
>> weren't testing numpy 1.9 at the time. I want the feature (I have need for
>> it elsewhere), but I don't want the change in default behavior.
>
> I think it would be a bad idea to revert now.
>
> I suspect, if you revert, then a lot of other code will assume the <
> 1.9.0, >= 1.9.1 behavior. In that case, the code will work as
> expected most of the time, except when combined with 1.9.0, which
> could be seriously surprising, and often missed. If you keep the new
> behavior, then it will be clearer that other code will have to adapt
> to this change >= 1.9.0 - surprise, but predictable surprise, if you
> see what I mean...

1.9.1 will be out in a week or so. To be honest, these days I regard the 1.x.0 releases as sort of advanced release candidates: there are just a lot more changes going in between releases, and the .0 release ends up getting a lot more real-world testing than the official release candidates do.

Chuck


Re: Changed behavior of np.gradient

Benjamin Root-2
In reply to this post by Matthew Brett
I see this as a regression. We don't keep regressions around for backwards compatibility, we fix them.

Ben

On Thu, Oct 16, 2014 at 10:25 PM, Matthew Brett <[hidden email]> wrote:
> Hi,
>
> On Thu, Oct 16, 2014 at 6:38 PM, Benjamin Root <[hidden email]> wrote:
>> That isn't what I meant. Higher order doesn't "necessarily" mean more
>> accurate. The results simply have different properties. The user needs to
>> choose the differentiation order that they need. One interesting effect in
>> data assimilation/modeling is that even-order differentiation can often have
>> detrimental effects while higher odd order differentiation are better, but
>> it is highly dependent upon the model.
>>
>> This change in gradient broke a unit test in matplotlib (for a new feature,
>> so it isn't *that* critical). We didn't notice it at first because we
>> weren't testing numpy 1.9 at the time. I want the feature (I have need for
>> it elsewhere), but I don't want the change in default behavior.
>
> I think it would be a bad idea to revert now.
>
> I suspect, if you revert, then a lot of other code will assume the <
> 1.9.0, >= 1.9.1 behavior. In that case, the code will work as
> expected most of the time, except when combined with 1.9.0, which
> could be seriously surprising, and often missed. If you keep the new
> behavior, then it will be clearer that other code will have to adapt
> to this change >= 1.9.0 - surprise, but predictable surprise, if you
> see what I mean...
>
> Matthew

Re: Changed behavior of np.gradient

Nathaniel Smith
In reply to this post by Benjamin Root-2

On 17 Oct 2014 02:38, "Benjamin Root" <[hidden email]> wrote:
>
> That isn't what I meant. Higher order doesn't "necessarily" mean more accurate. The results simply have different properties. The user needs to choose the differentiation order that they need. One interesting effect in data assimilation/modeling is that even-order differentiation can often have detrimental effects while higher odd order differentiation are better, but it is highly dependent upon the model.

To be clear, we aren't talking about different degrees of differentiation, we're talking about different approximations to the first derivative. I just looked up the original pull request and it contains a pretty convincing graph in which the old code has large systematic errors and the new code doesn't:

  https://github.com/numpy/numpy/issues/3603

I think the claim is that the old code had approximation error that grows like O(1/n), and the new code has errors like O(1/n**2). (Don't ask me what n is though.)
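
That claim is easy to check numerically if one reads n as the number of samples, i.e. error O(h) for the old edge formula versus O(h**2) for the new one, where h is the grid spacing. The sketch below is ours, not from the pull request: it compares the two one-sided formulas for the left-edge derivative of exp(x) at x = 0, where the exact answer is 1.

```python
import numpy as np

def left_edge_errors(h):
    """Absolute error of the two one-sided edge formulas for
    d/dx exp(x) at x = 0 (true value: 1.0), with grid spacing h."""
    x = np.arange(0.0, 1.0, h)
    y = np.exp(x)
    first = (y[1] - y[0]) / h                   # forward difference
    second = (-3*y[0] + 4*y[1] - y[2]) / (2*h)  # three-point one-sided
    return abs(first - 1.0), abs(second - 1.0)

e1_coarse, e2_coarse = left_edge_errors(0.1)
e1_fine, e2_fine = left_edge_errors(0.01)

# Shrinking h tenfold shrinks the first-order error by roughly 10x,
# but the second-order error by roughly 100x.
```

Even at h = 0.1 the second-order formula is already an order of magnitude closer to the true derivative, and refining to h = 0.01 widens the gap to more than a factor of 100.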

> This change in gradient broke a unit test in matplotlib (for a new feature, so it isn't *that* critical). We didn't notice it at first because we weren't testing numpy 1.9 at the time. I want the feature (I have need for it elsewhere), but I don't want the change in default behavior.

You say it's bad, the original poster says it's good, how are we poor maintainers to know what to do? :-) Can you say any more about why you prefer the so-called lower-accuracy approximations here by default?

-n