Adding an nd generalization of np.ma.mask_rowcols

Eric Wieser

Today, numpy has a np.ma.mask_rowcols function, which stretches masks along
the full length of an axis. For example, given the matrix::

>>> a2d = np.zeros((3, 3), dtype=int)
>>> a2d[1, 1] = 1
>>> a2d = np.ma.masked_equal(a2d, 1)
>>> print(a2d)
[[0 0 0]
 [0 -- 0]
 [0 0 0]]

The API allows::

>>> print(np.ma.mask_rowcols(a2d, axis=0))
[[0 0 0]
 [-- -- --]
 [0 0 0]]

>>> print(np.ma.mask_rowcols(a2d, axis=1))
[[0 -- 0]
 [0 -- 0]
 [0 -- 0]]

>>> print(np.ma.mask_rowcols(a2d, axis=None))
[[0 -- 0]
 [-- -- --]
 [0 -- 0]]

However, this function only works for 2D arrays.
It would be useful to generalize this to work on ND arrays as well.

Unfortunately, the current function is messy to generalize, because axis=0 means “spread the mask along axis 1”, and vice versa. Additionally, the name is not particularly good for an ND function.
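
To make the axis convention concrete, here is a minimal sketch that rebuilds the 2D result from plain boolean operations on the mask. The helper name `rowcols_mask_2d` is hypothetical and not part of NumPy; it only mirrors the behaviour shown in the examples above.

```python
import numpy as np

def rowcols_mask_2d(mask, axis=None):
    """Hypothetical helper: the mask that np.ma.mask_rowcols yields for 2D input."""
    mask = np.asarray(mask, dtype=bool)
    rows = mask.any(axis=1, keepdims=True)  # rows that contain a masked value
    cols = mask.any(axis=0, keepdims=True)  # columns that contain a masked value
    if axis == 0:
        # axis=0 masks whole rows, i.e. it spreads the mask along axis 1
        return np.broadcast_to(rows, mask.shape).copy()
    if axis == 1:
        # axis=1 masks whole columns, i.e. it spreads the mask along axis 0
        return np.broadcast_to(cols, mask.shape).copy()
    return rows | cols  # axis=None masks both rows and columns

a2d = np.zeros((3, 3), dtype=int)
a2d[1, 1] = 1
a2d = np.ma.masked_equal(a2d, 1)
print(rowcols_mask_2d(np.ma.getmaskarray(a2d), axis=0))
```

Note that the reduction uses `axis=1` to produce the `axis=0` result, which is the inversion described above.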

My proposal in PR 14998 is to introduce a new function, mask_extend_axis, which fixes these shortcomings.
Given a 3D array::

>>> a3d = np.zeros((2, 2, 2), dtype=int)
>>> a3d[0, 0, 0] = 1
>>> a3d = np.ma.masked_equal(a3d, 1)
>>> print(a3d)
[[[-- 0]
  [0 0]]

 [[0 0]
  [0 0]]]

This, in my opinion, has clearer axis semantics:

>>> print(np.ma.mask_extend_axis(a3d, axis=0))
[[[-- 0]
  [0 0]]

 [[-- 0]
  [0 0]]]

>>> print(np.ma.mask_extend_axis(a3d, axis=1))
[[[-- 0]
  [-- 0]]

 [[0 0]
  [0 0]]]

>>> print(np.ma.mask_extend_axis(a3d, axis=2))
[[[-- --]
  [0 0]]

 [[0 0]
  [0 0]]]

Stretching over multiple axes remains possible:

>>> print(np.ma.mask_extend_axis(a3d, axis=(1, 2)))
[[[-- --]
  [-- 0]]

 [[0 0]
  [0 0]]]

# extending sequentially is not the same as extending in parallel
>>> print(np.ma.mask_extend_axis(np.ma.mask_extend_axis(a3d, axis=1), axis=2))
[[[-- --]
  [-- --]]

 [[0 0]
  [0 0]]]
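
For readers who want to experiment before the PR lands, the semantics above can be approximated with existing NumPy calls. This is only a sketch under stated assumptions: `extend_mask_sketch` is a hypothetical name, it is not the implementation from PR 14998, and it does not preserve details such as the fill value.

```python
import numpy as np

def extend_mask_sketch(a, axis):
    """Hypothetical stand-in for the proposed mask_extend_axis (not the PR's code)."""
    mask = np.ma.getmaskarray(a)
    axes = axis if isinstance(axis, tuple) else (axis,)
    new_mask = mask.copy()
    for ax in axes:
        # Each extension starts from the original mask, so several axes
        # are extended "in parallel" rather than one after another.
        new_mask |= mask.any(axis=ax, keepdims=True)
    return np.ma.array(np.ma.getdata(a), mask=new_mask, copy=True)

a3d = np.zeros((2, 2, 2), dtype=int)
a3d[0, 0, 0] = 1
a3d = np.ma.masked_equal(a3d, 1)

parallel = extend_mask_sketch(a3d, axis=(1, 2))
sequential = extend_mask_sketch(extend_mask_sketch(a3d, axis=1), axis=2)
print(parallel)
print(sequential)
```

Because every pass reads the original mask, `axis=(1, 2)` leaves element `[0, 1, 1]` unmasked, whereas the nested calls mask the whole `a3d[0]` block.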

Questions for the mailing list then:

  • Can you think of a better name than mask_extend_axis?
  • Does my proposed meaning of axis make more sense to you than the one used by mask_rowcols?

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Re: Adding an nd generalization of np.ma.mask_rowcols

Hameer Abbasi-2

IMHO, masked arrays and extending masks like that is a weird API. I would prefer a more functional approach, where we take in an input 1-D or N-D boolean array in addition to a masked array, with multiple axes over which to extend the mask.

 

From: NumPy-Discussion <numpy-discussion-bounces+hameerabbasi=[hidden email]> on behalf of Eric Wieser <[hidden email]>
Reply to: Discussion of Numerical Python <[hidden email]>
Date: Friday, 17. January 2020 at 11:41
To: Discussion of Numerical Python <[hidden email]>
Subject: [Numpy-discussion] Adding an nd generalization of np.ma.mask_rowscols

 


Re: Adding an nd generalization of np.ma.mask_rowcols

Sebastian Berg
In reply to this post by Eric Wieser
On Fri, 2020-01-17 at 10:39 +0000, Eric Wieser wrote:
> Today, numpy has a np.ma.mask_rowcols function, which stretches masks
> along
> the full length of an axis. For example, given the matrix::
>
<snip>

> Questions for the mailing list then:
>

The additional question: I think I am good with adding a new name if we
cannot reasonably reuse the old ones.

> Can you think of a better name than mask_extend_axis?

I doubt it is good, but to put it out there: extend_mask_along_axis

"along" shows up 2-3 times, although "along" really is the default for
most things in NumPy.

I tried a thesaurus for "extend"; the main alternative seemed to be
"spread" (but it is very different from the current choice).

> Does my proposed meaning of axis make more sense to you than the one
> used by mask_rowcols?

It does to me (although I hardly ever use masked arrays). The `axis`
argument usually denotes the axis being operated on/along.
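
As a small illustration of that convention, using only standard NumPy calls (the array `x` is made up for the example):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# Reductions collapse the named axis: axis=0 sums down each column.
col_sums = np.sum(x, axis=0)

# Non-reducing functions also operate along the named axis.
running = np.cumsum(x, axis=1)
flipped = np.flip(x, axis=0)

print(col_sums, running, flipped, sep="\n")
```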

- Sebastian


Re: Adding an nd generalization of np.ma.mask_rowcols

Hameer Abbasi
In reply to this post by Eric Wieser

IMHO, masked arrays and extending masks like that is a weird API. I would prefer a more functional approach, where we take in an input 1-D or N-D boolean array in addition to a masked array, with multiple axes over which to extend the mask.

 


Re: Adding an nd generalization of np.ma.mask_rowcols

Eric Wieser

> IMHO, masked arrays and extending masks like that is a weird API.

To give some context, I needed this nd generalization internally in order to fix the issue on these lines.

> I would prefer a more functional approach

Can you elaborate on that with an example of the API you’d prefer instead of np.ma.mask_extend_axis(a, axis=(0, 1))?


On Fri, 17 Jan 2020 at 15:30, Hameer Abbasi <[hidden email]> wrote:

> IMHO, masked arrays and extending masks like that is a weird API. I would prefer a more functional approach: Where we take in an input 1-D or N-D boolean array in addition to a masked array with multiple axes over which to extend the mask.

 
