ENH: Proposal to add np.neighborwise in PR#9514

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

ENH: Proposal to add np.neighborwise in PR#9514

Joseph Fox-Rabinovitz
I would like to propose the addition of a new function,
`np.neighborwise` in PR#9514. It is based on the discussion relating
to my proposal for `np.ratio` (PR#9481) and Eric Wieser's
`np.neighborwise` in PR#9428. This function accepts an array `a`, a
vectorized function of two arguments `func`, and applies the function
to all of the neighboring elements of the array across multiple
dimensions. There are options for masking out parts of the calculation
and for applying the function recursively.

The name of the function is not written in stone. The current name is
taken directly from PR#9428 because I can not think of a better one.

This function can serve as a backend for the existing `np.diff`, which
has been re-implemented in this PR, as well as for the `ratio`
function I propsed earlier. This adds the diagonal diffs feature,
which is tested and backwards compatible. `ratio` can be implemented
very simply with or without a mask. With a mask, it can be expressed
`np.neighborwise(a, np.*_divide, axis=axis, n=n, mask=lambda *args:
args[1])` (The conversion to bool is done automatically).

The one potentially non-backwards-compatible API change that this PR
introduces is that `np.diff` now returns an `ndarray` version of the
input, instead of the original array itself if `n==0`. Previously, the
exact input reference was returned for `n==0`. I very seriously doubt
that this feature was ever used outside the numpy test suite anyway.
The advantage of this change is that an invalid axis input can now be
caught before returning the unaltered array. If this change is
considered too drastic, I can remove it without removing the axis
check.

The two main differences between this PR and PR#9428 are the addition
of masks to the computation, and the interpretation of multiple axes.
PR#9428 applies `func` successively along each axis. This provides no
way of doing diagonal diffs. I chose to shift along all the axes
simultaneously before applying `func`. To clarify with an example, if
we take `a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`,
PR#9428 would take two diffs, `(4 - 2) - (3 - 1) = 0`, while the
version I propose here just takes the diagonal diff `4 - 1 = 3`.
Besides being more intuitive in my opinion, taking diagonal diffs
actually adds a new feature that can not be obtained directly by
taking successive diffs.

Please let me know your thoughts.

Regards,

    -Joe
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: ENH: Proposal to add np.neighborwise in PR#9514

Benjamin Root
So, this is a kernel mechanism?

On Fri, Aug 4, 2017 at 6:31 PM, Joseph Fox-Rabinovitz <[hidden email]> wrote:
I would like to propose the addition of a new function,
`np.neighborwise` in PR#9514. It is based on the discussion relating
to my proposal for `np.ratio` (PR#9481) and Eric Wieser's
`np.neighborwise` in PR#9428. This function accepts an array `a`, a
vectorized function of two arguments `func`, and applies the function
to all of the neighboring elements of the array across multiple
dimensions. There are options for masking out parts of the calculation
and for applying the function recursively.

The name of the function is not written in stone. The current name is
taken directly from PR#9428 because I can not think of a better one.

This function can serve as a backend for the existing `np.diff`, which
has been re-implemented in this PR, as well as for the `ratio`
function I propsed earlier. This adds the diagonal diffs feature,
which is tested and backwards compatible. `ratio` can be implemented
very simply with or without a mask. With a mask, it can be expressed
`np.neighborwise(a, np.*_divide, axis=axis, n=n, mask=lambda *args:
args[1])` (The conversion to bool is done automatically).

The one potentially non-backwards-compatible API change that this PR
introduces is that `np.diff` now returns an `ndarray` version of the
input, instead of the original array itself if `n==0`. Previously, the
exact input reference was returned for `n==0`. I very seriously doubt
that this feature was ever used outside the numpy test suite anyway.
The advantage of this change is that an invalid axis input can now be
caught before returning the unaltered array. If this change is
considered too drastic, I can remove it without removing the axis
check.

The two main differences between this PR and PR#9428 are the addition
of masks to the computation, and the interpretation of multiple axes.
PR#9428 applies `func` successively along each axis. This provides no
way of doing diagonal diffs. I chose to shift along all the axes
simultaneously before applying `func`. To clarify with an example, if
we take `a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`,
PR#9428 would take two diffs, `(4 - 2) - (3 - 1) = 0`, while the
version I propose here just takes the diagonal diff `4 - 1 = 3`.
Besides being more intuitive in my opinion, taking diagonal diffs
actually adds a new feature that can not be obtained directly by
taking successive diffs.

Please let me know your thoughts.

Regards,

    -Joe
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: ENH: Proposal to add np.neighborwise in PR#9514

Tim Cera
In reply to this post by Joseph Fox-Rabinovitz
As noted https://github.com/numpy/numpy/pull/303 a large part of this functionality has been implemented before for numpy and didn't go anywhere because it is already present in scipy.ndimage.

IMHO it is better suited in numpy with a better name so that people don't miss it.

Kindest regards,
Tim

On Fri, Aug 4, 2017 at 6:33 PM Joseph Fox-Rabinovitz <[hidden email]> wrote:
I would like to propose the addition of a new function,
`np.neighborwise` in PR#9514. It is based on the discussion relating
to my proposal for `np.ratio` (PR#9481) and Eric Wieser's
`np.neighborwise` in PR#9428. This function accepts an array `a`, a
vectorized function of two arguments `func`, and applies the function
to all of the neighboring elements of the array across multiple
dimensions. There are options for masking out parts of the calculation
and for applying the function recursively.

The name of the function is not written in stone. The current name is
taken directly from PR#9428 because I can not think of a better one.

This function can serve as a backend for the existing `np.diff`, which
has been re-implemented in this PR, as well as for the `ratio`
function I propsed earlier. This adds the diagonal diffs feature,
which is tested and backwards compatible. `ratio` can be implemented
very simply with or without a mask. With a mask, it can be expressed
`np.neighborwise(a, np.*_divide, axis=axis, n=n, mask=lambda *args:
args[1])` (The conversion to bool is done automatically).

The one potentially non-backwards-compatible API change that this PR
introduces is that `np.diff` now returns an `ndarray` version of the
input, instead of the original array itself if `n==0`. Previously, the
exact input reference was returned for `n==0`. I very seriously doubt
that this feature was ever used outside the numpy test suite anyway.
The advantage of this change is that an invalid axis input can now be
caught before returning the unaltered array. If this change is
considered too drastic, I can remove it without removing the axis
check.

The two main differences between this PR and PR#9428 are the addition
of masks to the computation, and the interpretation of multiple axes.
PR#9428 applies `func` successively along each axis. This provides no
way of doing diagonal diffs. I chose to shift along all the axes
simultaneously before applying `func`. To clarify with an example, if
we take `a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`,
PR#9428 would take two diffs, `(4 - 2) - (3 - 1) = 0`, while the
version I propose here just takes the diagonal diff `4 - 1 = 3`.
Besides being more intuitive in my opinion, taking diagonal diffs
actually adds a new feature that can not be obtained directly by
taking successive diffs.

Please let me know your thoughts.

Regards,

    -Joe
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: ENH: Proposal to add np.neighborwise in PR#9514

Stefan van der Walt
On Fri, Aug 4, 2017, at 19:54, Tim Cera wrote:
As noted https://github.com/numpy/numpy/pull/303 a large part of this functionality has been implemented before for numpy and didn't go anywhere because it is already present in scipy.ndimage.

IMHO it is better suited in numpy with a better name so that people don't miss it.

Is this essentially `scipy.ndimage.generic_filter`?

Stéfan


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: ENH: Proposal to add np.neighborwise in PR#9514

Tim Cera
It you're into reading ancient history here is the link to the discussion where Zachary Pincus makes the same observation and my response was to close the PR because I could use scipy.ndimage.generic_filter, even though at least through my eyes, my implementation was nicer.

On Sat, Aug 5, 2017 at 7:56 PM Stefan van der Walt <[hidden email]> wrote:
On Fri, Aug 4, 2017, at 19:54, Tim Cera wrote:
As noted https://github.com/numpy/numpy/pull/303 a large part of this functionality has been implemented before for numpy and didn't go anywhere because it is already present in scipy.ndimage.

IMHO it is better suited in numpy with a better name so that people don't miss it.

Is this essentially `scipy.ndimage.generic_filter`?

Stéfan

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: ENH: Proposal to add np.neighborwise in PR#9514

Juan Nunez-Iglesias
It’s nice that this is pure Python / NumPy vectorized, whereas generic_filter requires some compilation to get good performance. (Tim, although your implementation is nice and readable, it would have been very slow for any significant volumes.)

However, my feeling is that this function is too specialized for a foundational package like NumPy. As Sebastian Berg pointed out on one of the PRs, it can cause confusion when there are many ways of achieving the same outcome. imho, the One Way to do this kind of operation is using generic_filter together with LowLevelCallable. My two blog posts on the topic:

https://ilovesymposia.com/2017/03/12/scipys-new-lowlevelcallable-is-a-game-changer/
https://ilovesymposia.com/2017/03/15/prettier-lowlevelcallables-with-numba-jit-and-decorators/

This has the advantage that it’s even more general. (In fact, it avoids the repeated-applications-vs-diagonal-application argument altogether. These are simply two different kernels.)

Perhaps ndimage lacks discoverability to other fields… But I think that can be better solved with documentation, rather than duplicating functionality and cluttering the NumPy API.

Sorry!

Juan.

On 6 Aug 2017, 6:02 AM +0200, Tim Cera <[hidden email]>, wrote:
It you're into reading ancient history here is the link to the discussion where Zachary Pincus makes the same observation and my response was to close the PR because I could use scipy.ndimage.generic_filter, even though at least through my eyes, my implementation was nicer.

On Sat, Aug 5, 2017 at 7:56 PM Stefan van der Walt <[hidden email]> wrote:
On Fri, Aug 4, 2017, at 19:54, Tim Cera wrote:
As noted https://github.com/numpy/numpy/pull/303 a large part of this functionality has been implemented before for numpy and didn't go anywhere because it is already present in scipy.ndimage.

IMHO it is better suited in numpy with a better name so that people don't miss it.

Is this essentially `scipy.ndimage.generic_filter`?

Stéfan

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Loading...