ENH: Proposal to add atleast_nd function

Joseph Fox-Rabinovitz
I've created PR#18386 to add a function called atleast_nd to numpy and numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and atleast_3d functions.

I proposed a similar idea about four and a half years ago: https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html, PR#7804. The reception was ambivalent, but a couple of folks have asked me about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- I have been asked about this a couple of times
- Implementing the three existing atleast_*d functions becomes easier
- Looks nicer than the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through broadcasting and reshaping
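
For comparison, the broadcasting-and-reshaping route mentioned in the last item can be sketched as follows. This is only an illustration using existing NumPy calls; the variable names (`x`, `ndim`, `missing`) are assumptions made for the example, not anything taken from the PR.

```python
import numpy as np

x = np.arange(3.0)                  # shape (3,)
ndim = 4                            # desired minimum number of dimensions
missing = max(ndim - x.ndim, 0)     # how many unit axes are needed

# Prepend unit axes by reshaping.
prepended = x.reshape((1,) * missing + x.shape)

# Append unit axes by indexing: Ellipsis keeps the existing axes,
# and each None adds one new axis of length 1.
appended = x[(Ellipsis,) + (None,) * missing]

print(prepended.shape)  # (1, 1, 1, 3)
print(appended.shape)   # (3, 1, 1, 1)
```

Both spellings work today, but the caller has to compute `missing` by hand, which is the part a dedicated function would hide.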

If this meets with approval, there are a couple of interface issues that probably need to be hashed out:

- The consensus was that this function should accept a single array, rather than a tuple, or multiple arrays as the other atleast_*d functions do. Does that need to be revisited?
- Right now, a `pos` argument specifies where to place new axes, if any. That can be specified in different ways. Another way might be to specify the offset of the existing dimensions, or something entirely different.
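
To make the first question concrete, this is how the existing functions treat one versus several arguments, next to the comprehension that a single-array signature would call for. `np.atleast_2d` stands in for the proposed function here, since `atleast_nd` is only a proposal at this point in the thread; the inputs `a` and `b` are made up for illustration.

```python
import numpy as np

a = np.arange(3)        # 1-D input
b = np.float64(1.5)     # scalar input

# One argument: the result is a single array.
single = np.atleast_2d(a)

# Several arguments: the result is a sequence with one array per input.
several = np.atleast_2d(a, b)

# With a single-array signature, the same thing is a comprehension.
looped = [np.atleast_2d(arr) for arr in (a, b)]

print(single.shape)                    # (1, 3)
print([arr.shape for arr in several])  # [(1, 3), (1, 1)]
print([arr.shape for arr in looped])   # [(1, 3), (1, 1)]
```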

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: ENH: Proposal to add atleast_nd function

Sebastian Berg
On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
> [...]

My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed that way, at least a bit, on GitHub), I am also not opposed. It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)

    View input as array with at least ndim dimensions.

    New unit dimensions are inserted at the index given by pos if necessary.

    Parameters

    ary : array_like
        The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
    ndim : int
        The minimum number of dimensions required.
    pos : int, optional
        The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.

    Returns

    res : ndarray
        An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1) when ndim=4.

    Notes

    This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.

    Setting pos=0 is equivalent to how the array would be interpreted by numpy's broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).

    It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python's functools.partial function. An example is shown below.

    Examples

    >>> np.atleast_nd(3.0, 4)
    array([[[[3.]]]])
    >>> x = np.arange(3.0)
    >>> np.atleast_nd(x, 2).shape
    (1, 3)
    >>> x = np.arange(12.0).reshape(4, 3)
    >>> np.atleast_nd(x, 5).shape
    (1, 1, 1, 4, 3)
    >>> np.atleast_nd(x, 5).base is x.base
    True
    >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
    [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
    >>> np.atleast_nd((1, 2), 5, pos=0).shape
    (1, 1, 1, 1, 2)
    >>> np.atleast_nd((1, 2), 5, pos=-1).shape
    (2, 1, 1, 1, 1)
    >>> from functools import partial
    >>> atleast_4d = partial(np.atleast_nd, ndim=4)
    >>> atleast_4d([1, 2, 3])
    array([[[[1, 2, 3]]]])
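
The behaviour described in the pasted documentation can be approximated in a few lines. The following is a rough sketch written from that description, not the code in the PR: the name `atleast_nd_sketch`, the use of `reshape`, and the `ValueError` raised for an out-of-range `pos` are all assumptions.

```python
import numpy as np

def atleast_nd_sketch(ary, ndim, pos=0):
    """Hypothetical stand-in for the proposed function (not the PR's code)."""
    ary = np.asanyarray(ary)
    extra = ndim - ary.ndim
    if extra <= 0:
        # Already has enough dimensions: return the input unchanged.
        return ary
    if not -ary.ndim - 1 <= pos <= ary.ndim:
        raise ValueError(f"pos={pos} is out of range for a {ary.ndim}-D input")
    if pos < 0:
        # Negative positions count from the end: -1 means "after the last axis".
        pos += ary.ndim + 1
    # Splice the unit dimensions into the existing shape at index pos.
    shape = ary.shape[:pos] + (1,) * extra + ary.shape[pos:]
    return ary.reshape(shape)

x = np.arange(12.0).reshape(4, 3)
print(atleast_nd_sketch(x, 5).shape)          # (1, 1, 1, 4, 3)
print(atleast_nd_sketch(x, 5, pos=-1).shape)  # (4, 3, 1, 1, 1)
print(atleast_nd_sketch(x, 5, pos=1).shape)   # (4, 1, 1, 1, 3)
print(np.array(x, ndmin=5).shape)             # (1, 1, 1, 4, 3)
```

The last line shows the existing `ndmin` route, which only ever prepends axes; the `pos` argument is what the proposal adds on top of it.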


Re: ENH: Proposal to add atleast_nd function

Juan Nunez-Iglesias
I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `atleast_{1,2,3}d`, for which I had no idea where the new axes would end up.

So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.

Juan.
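
To illustrate the point about axis placement, here is where the existing functions put the new axes, per their documented behaviour. The inputs `v` and `m` are made up for the example.

```python
import numpy as np

v = np.arange(3)                 # shape (3,)
m = np.arange(6).reshape(2, 3)   # shape (2, 3)

print(np.atleast_2d(v).shape)  # (1, 3): the new axis goes in front
print(np.atleast_3d(v).shape)  # (1, 3, 1): one new axis on each side
print(np.atleast_3d(m).shape)  # (2, 3, 1): the new axis goes at the end
```

The placement changes with the input's dimensionality, which is hard to guess from the function name alone.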

On 11 Feb 2021, at 9:48 am, Sebastian Berg <[hidden email]> wrote:

> [...]

Re: ENH: Proposal to add atleast_nd function

Stephan Hoyer
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <[hidden email]> wrote:
> I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `atleast_{1,2,3}d`, for which I had no idea where the new axes would end up.
>
> So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.


I appreciate that `atleast_nd` feels more sensible than `atleast_{1,2,3}d`, but I don't think being "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?
 

Re: ENH: Proposal to add atleast_nd function

Benjamin Root
For me, the `atleast_{1,2,3}d` functions are useful for sanitizing inputs. Having an `atleast_nd()` function can be viewed as a step towards cleaning up the API, not cluttering it (although the deprecation period for the existing functions should probably be long, given how long they have existed).
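
As an illustration of the input-sanitizing use, here is a hypothetical helper (`row_norms` is invented for this example) that accepts either a single point or a stack of points by promoting its input with `np.atleast_2d`.

```python
import numpy as np

def row_norms(points):
    """Return the Euclidean norm of each row of `points`.

    A single point given as a 1-D sequence is treated as one row.
    """
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    return np.sqrt((pts ** 2).sum(axis=1))

print(row_norms([3.0, 4.0]).tolist())                # [5.0]
print(row_norms([[3.0, 4.0], [6.0, 8.0]]).tolist())  # [5.0, 10.0]
```

The function body only has to handle the 2-D case; the promotion step absorbs the 1-D input silently.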

On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]> wrote:
> [...]

Re: ENH: Proposal to add atleast_nd function

Joseph Fox-Rabinovitz
The original functions appear to have been written for things like *stack, which goes a long way toward explaining the inconsistent argument list.

- Joe
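
A small sketch of the *stack connection, with made-up inputs: `vstack` and `dstack` behave as if each input were first promoted to at least 2-D and 3-D respectively, and the multi-array signature of the atleast_*d functions lets a whole sequence of inputs be promoted in one call. The manual `concatenate` spellings below are written for illustration and are not copied from NumPy's source.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack treats 1-D inputs of shape (N,) as rows of shape (1, N).
stacked_rows = np.vstack([a, b])
manual_rows = np.concatenate(np.atleast_2d(a, b), axis=0)

# dstack treats 1-D inputs of shape (N,) as shape (1, N, 1).
stacked_depth = np.dstack([a, b])
manual_depth = np.concatenate(np.atleast_3d(a, b), axis=2)

print(stacked_rows.shape)   # (2, 3)
print(stacked_depth.shape)  # (1, 3, 2)
print(np.array_equal(stacked_rows, manual_rows))    # True
print(np.array_equal(stacked_depth, manual_depth))  # True
```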


On Thu, Feb 11, 2021, 12:41 Benjamin Root <[hidden email]> wrote:
> [...]

Re: ENH: Proposal to add atleast_nd function

Eric Wieser
In reply to this post by Benjamin Root
> I find that the `atleast_{1,2,3}d` functions are useful for sanitizing inputs

IMO, this type of "sanitization" goes against "In the face of ambiguity, refuse the temptation to guess".
Instead of using `atleast_{n}d`, it could be argued that `if np.ndim(x) != n: raise ValueError` is a safer bet, which forces the user to think about what's actually going on, and saves them from silent headaches.

Of course, this is just an argument for discouraging users from using these functions, and for the fact that we perhaps should not have had them in the first place.
Given we already have some of them, adding `atleast_nd` probably isn't going to make things any worse.
In principle, it could actually make things better, as we could put a "Notes" section in the new function docs that describes the XY problem that makes atleast_nd look like a better solution than it is and presents better alternatives, and the other three function docs could link there.

Eric
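
The stricter alternative suggested above can be wrapped in a small helper. This is a sketch only; `require_ndim` and its error message are hypothetical.

```python
import numpy as np

def require_ndim(x, n):
    """Return `x` as an array, refusing inputs that are not exactly n-D."""
    x = np.asarray(x)
    if x.ndim != n:
        raise ValueError(f"expected a {n}-D array, got {x.ndim}-D")
    return x

print(require_ndim([[1, 2]], 2).shape)  # (1, 2)

try:
    require_ndim([1, 2], 2)
except ValueError as exc:
    print(exc)  # expected a 2-D array, got 1-D
```

Unlike the atleast_*d functions, this never guesses where a missing axis belongs; the caller has to reshape explicitly.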

On Thu, 11 Feb 2021 at 17:41, Benjamin Root <[hidden email]> wrote:
for me, I find that the at_least{1,2,3}d functions are useful for sanitizing inputs. Having an at_leastnd() function can be viewed as a step towards cleaning up the API, not cluttering it (although, deprecations of the existing functions probably should be long given how long they have existed).

On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]> wrote:
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <[hidden email]> wrote:
I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, for which I had no idea where the new axes would end up.

So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.


I appreciate that `atleast_nd` feels more sensible than `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?
 
Juan.

On 11 Feb 2021, at 9:48 am, Sebastian Berg <[hidden email]> wrote:

On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
I've created PR#18386 to add a function called atleast_nd to numpy and
numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
atleast_3d functions.

I proposed a similar idea about four and a half years ago:
PR#7804. The reception was ambivalent, but a couple of folks have asked me
about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- Been asked about this a couple of times
- Implementation of three existing atleast_*d functions gets easier
- Looks nicer that the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through
broadcasting and reshaping


My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed a bit at least on github), I am also not opposed.  It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)

    View input as array with at least ndim dimensions.

    New unit dimensions are inserted at the index given by pos if necessary.

    Parameters
        ary : array_like
            The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
        ndim : int
            The minimum number of dimensions required.
        pos : int, optional
            The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.

    Returns
        res : ndarray
            An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1) when ndim=4.

    Notes
        This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.

        Setting pos=0 is equivalent to how the array would be interpreted by numpy’s broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).

        It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python’s functools.partial function. An example is shown below.

    Examples
        >>> np.atleast_nd(3.0, 4)
        array([[[[ 3.]]]])
        >>> x = np.arange(3.0)
        >>> np.atleast_nd(x, 2).shape
        (1, 3)
        >>> x = np.arange(12.0).reshape(4, 3)
        >>> np.atleast_nd(x, 5).shape
        (1, 1, 1, 4, 3)
        >>> np.atleast_nd(x, 5).base is x.base
        True
        >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
        [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
        >>> np.atleast_nd((1, 2), 5, pos=0).shape
        (1, 1, 1, 1, 2)
        >>> np.atleast_nd((1, 2), 5, pos=-1).shape
        (2, 1, 1, 1, 1)
        >>> from functools import partial
        >>> atleast_4d = partial(np.atleast_nd, ndim=4)
        >>> atleast_4d([1, 2, 3])
        array([[[[1, 2, 3]]]])
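
For readers who want to try the behaviour described in the pasted documentation, here is a minimal sketch of how such a function could be written on top of existing NumPy indexing. It is not the code from the PR: the name `atleast_nd_sketch` is made up for illustration, and the handling of `pos` is only my reading of the parameter description above.

```python
import numpy as np


def atleast_nd_sketch(ary, ndim, pos=0):
    """Hypothetical stand-in for the proposed function (not the PR's code)."""
    ary = np.asanyarray(ary)
    missing = ndim - ary.ndim
    if missing <= 0:
        # Already has ndim or more dimensions: return it unchanged.
        return ary
    if not -ary.ndim - 1 <= pos <= ary.ndim:
        raise ValueError(f"pos={pos} is out of range for a {ary.ndim}-D input")
    if pos < 0:
        # Negative pos counts from the end: -1 means "after the last axis".
        pos += ary.ndim + 1
    # Indexing with None (np.newaxis) inserts unit axes and always gives a view.
    index = (slice(None),) * pos + (None,) * missing
    return ary[index]


x = np.arange(12.0).reshape(4, 3)
print(atleast_nd_sketch(x, 5).shape)               # (1, 1, 1, 4, 3)
print(atleast_nd_sketch((1, 2), 5, pos=-1).shape)  # (2, 1, 1, 1, 1)
```

Indexing with `None` is used here rather than `reshape` because it is guaranteed to return a view, which matches the "A view is returned for array inputs" sentence in the documentation.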


Re: ENH: Proposal to add atleast_nd function

Stephan Hoyer-2
In reply to this post by Benjamin Root
On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root <[hidden email]> wrote:
For me, the `atleast_{1,2,3}d` functions are useful for sanitizing inputs. Having an `atleast_nd()` function can be viewed as a step towards cleaning up the API, not cluttering it (although the deprecation period for the existing functions should probably be long, given how long they have existed).

I would love to see examples of this -- perhaps in matplotlib?

My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs.
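
To make the trade-off concrete, here is a small sketch contrasting the two styles. Both function names are hypothetical and exist only for this illustration; the point is that the lenient version silently picks an interpretation for 1-D input, while the strict version makes the caller decide.

```python
import numpy as np


def column_means_strict(data):
    """Require a 2-D (rows, cols) array and fail loudly otherwise."""
    data = np.asarray(data, dtype=float)
    if data.ndim != 2:
        raise ValueError(f"expected a 2-D array, got a {data.ndim}-D array")
    return data.mean(axis=0)


def column_means_lenient(data):
    """Promote lower-dimensional input with atleast_2d before reducing."""
    data = np.atleast_2d(np.asarray(data, dtype=float))
    return data.mean(axis=0)


# A 1-D input is promoted to a single ROW of shape (1, 3), so the "column
# means" are just the input values. A caller who meant three observations
# of one variable gets no error and no hint that anything was reinterpreted.
print(column_means_lenient([1.0, 2.0, 3.0]))  # [1. 2. 3.]
```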


Re: ENH: Proposal to add atleast_nd function

Benjamin Root
My original use case for these was dealing with output data from Matlab, where users would call `squeeze()` quite liberally. In addition, there was the problem of the implicit squeeze() in NumPy's loadtxt(), for which I added the ndmin kwarg in case an input CSV file had just one row or no rows.

np.atleast_1d() is used in matplotlib in a bunch of places where inputs are allowed to be scalar or lists.
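
A short illustration of both points, using only existing NumPy calls. The one-row CSV text is made up for the example.

```python
import io

import numpy as np

one_row = "1.0,2.0,3.0\n"

# By default loadtxt drops the row axis when the file has a single row.
squeezed = np.loadtxt(io.StringIO(one_row), delimiter=",")
# ndmin=2 keeps the result two-dimensional regardless of the row count.
kept = np.loadtxt(io.StringIO(one_row), delimiter=",", ndmin=2)

print(squeezed.shape)  # (3,)
print(kept.shape)      # (1, 3)

# atleast_1d lets a function accept either a scalar or a list.
print(np.atleast_1d(5.0).shape)         # (1,)
print(np.atleast_1d([1.0, 2.0]).shape)  # (2,)
```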


Re: ENH: Proposal to add atleast_nd function

Eric Wieser
In reply to this post by Stephan Hoyer-2
I did a quick search of matplotlib, and found a few uses of all three functions:

https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
  This one isn't really NumPy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]`.
https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
  This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem.
  Matplotlib has its own `_check_1d` function for input sanitization, although GitHub says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible.
https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
  This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong.

This isn't an exhaustive list, just a handful of different situations in which the functions were used.
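
As a rough sketch of what the first two patterns look like in practice: the helpers below are hypothetical and written only for illustration, they are not copied from the matplotlib sources linked above.

```python
import numpy as np


def normalize_pair(x):
    """Pattern 1 (hypothetical helper): turn a scalar n into [n, n]."""
    x = np.atleast_1d(x)
    if len(x) == 1:
        # A single value is repeated so that callers can always unpack two.
        x = np.array([x[0], x[0]])
    return x


def as_variables_by_samples(data):
    """Pattern 2 (hypothetical helper): accept one variable or several."""
    # A 1-D input of n samples becomes shape (1, n); 2-D input is unchanged.
    return np.atleast_2d(data)


print(normalize_pair(3))                            # [3 3]
print(normalize_pair([1, 2]))                       # [1 2]
print(as_variables_by_samples(np.arange(5)).shape)  # (1, 5)
```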

Eric




Re: ENH: Proposal to add atleast_nd function

Juan Nunez-Iglesias-2
Both napari and scikit-image use the `atleast_*d` functions a few times. I don’t have many examples of where I used an n-d version because it didn’t exist, but I have the very distinct impression of needing it repeatedly. In some places, I’ve used `np.broadcast_to` to signal the same intention, where `atleast_nd` would have been the more readable solution.
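
For context, the `np.broadcast_to` workaround looks roughly like the sketch below. The helper name `promote_with_broadcast` is hypothetical, and this is one possible spelling rather than the code used in napari or scikit-image.

```python
import numpy as np


def promote_with_broadcast(x, ndim):
    """Prepend unit axes via broadcast_to (hypothetical helper)."""
    x = np.asarray(x)
    if x.ndim >= ndim:
        return x
    # The target shape has to be spelled out by hand, which is the
    # readability cost compared with a single atleast_nd(x, ndim) call.
    target = (1,) * (ndim - x.ndim) + x.shape
    return np.broadcast_to(x, target)


image = np.ones((4, 3))
print(promote_with_broadcast(image, 4).shape)  # (1, 1, 4, 3)
```

Note that `np.broadcast_to` returns a read-only view, so this workaround is not a drop-in replacement when the result needs to be written to.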

I don’t buy the argument that it’s just a way to mask errors. NumPy broadcasting also has that same potential but I hope no one would seriously consider deprecating it. Indeed, even if we accept that we (library authors) should force users to provide an array of the right dimensionality, that still argues for making it convenient for users to do that!

I don’t feel super strongly about this. But I think atleast_nd is a move in a positive direction and I’d prefer it to what’s there now:

In [1]: import numpy as np
In [2]: np.atleast_3d(np.ones(4)).shape
Out[2]: (1, 4, 1)

There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...
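
For reference, this snippet prints where the existing `atleast_2d` and `atleast_3d` functions put the new axes for inputs of different dimensionality. It uses only functions that already exist in NumPy; the `shapes` dictionary is just a container for the example.

```python
import numpy as np

shapes = {}
for shape in [(), (4,), (2, 4), (2, 4, 5)]:
    x = np.ones(shape)
    shapes[shape] = (np.atleast_2d(x).shape, np.atleast_3d(x).shape)
    print(shape, "->", *shapes[shape])

# ()        -> (1, 1)    (1, 1, 1)
# (4,)      -> (1, 4)    (1, 4, 1)
# (2, 4)    -> (2, 4)    (2, 4, 1)
# (2, 4, 5) -> (2, 4, 5) (2, 4, 5)
```

So `atleast_2d` prepends an axis, while `atleast_3d` puts one axis on each side of a 1-D input and appends one to a 2-D input.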

Juan.

On 12 Feb 2021, at 5:32 am, Eric Wieser <[hidden email]> wrote:

I did a quick search of matplotlib, and found a few uses of all three functions:

https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
  This one isn't really numpy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]`
https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
   This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem.
  Matplotlib has their own `_check_1d` function for input sanitization, although github says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible.
https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
  This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong.

This isn't an exhaustive list, just a handful of different situations the functions were used.

Eric



On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer <[hidden email]> wrote:
On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root <[hidden email]> wrote:
for me, I find that the at_least{1,2,3}d functions are useful for sanitizing inputs. Having an at_leastnd() function can be viewed as a step towards cleaning up the API, not cluttering it (although, deprecations of the existing functions probably should be long given how long they have existed).

I would love to see examples of this -- perhaps in matplotlib?

My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs.

On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]> wrote:
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <[hidden email]> wrote:
I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, for which I had no idea where the new axes would end up.

So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.


I appreciate that `atleast_nd` feels more sensible than `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?
 
Juan.

On 11 Feb 2021, at 9:48 am, Sebastian Berg <[hidden email]> wrote:

On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
I've created PR#18386 to add a function called atleast_nd to numpy and
numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
atleast_3d functions.

I proposed a similar idea about four and a half years ago:
PR#7804. The reception was ambivalent, but a couple of folks have asked me
about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- Been asked about this a couple of times
- Implementation of three existing atleast_*d functions gets easier
- Looks nicer that the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through
broadcasting and reshaping


My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed a bit at least on github), I am also not opposed.  It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)
View input as array with at least ndim dimensions.
New unit dimensions are inserted at the index given by pos if necessary.
Parameters
ary  array_like
The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
ndim  int
The minimum number of dimensions required.
pos  int, optional
The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.
Returns
res  ndarray
An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1)when ndim=4.
Notes
This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.
Setting pos=0 is equivalent to how the array would be interpreted by numpy’s broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).
It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python’s functools.partial function. An example is shown below.
Examples
>>> np.atleast_nd(3.0, 4)
array([[[[ 3.]]]])
>>> x = np.arange(3.0)
>>> np.atleast_nd(x, 2).shape
(1, 3)
>>> x = np.arange(12.0).reshape(4, 3)
>>> np.atleast_nd(x, 5).shape
(1, 1, 1, 4, 3)
>>> np.atleast_nd(x, 5).base is x.base
True
>>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
[array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>> np.atleast_nd((1, 2), 5, pos=0).shape
(1, 1, 1, 1, 2)
>>> np.atleast_nd((1, 2), 5, pos=-1).shape
(2, 1, 1, 1, 1)
>>> from functools import partial
>>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>> atleast_4d([1, 2, 3])
array([[[[1, 2, 3]]]])
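
For anyone who wants to try the documented behaviour without building the PR branch, here is a minimal sketch of the semantics described above. It is not the code from PR#18386: the name `atleast_nd_sketch` is hypothetical, and the handling of `pos` is only my reading of the docstring (new unit axes go in front of axis `pos`, with negative values counted from the end).

```python
import numpy as np


def atleast_nd_sketch(ary, ndim, pos=0):
    """Hypothetical stand-in for the proposed np.atleast_nd (not the PR code)."""
    ary = np.asanyarray(ary)          # keep subclasses, no copy for array input
    extra = ndim - ary.ndim           # number of unit axes still missing
    if extra <= 0:
        return ary                    # already has ndim or more dimensions
    if not -ary.ndim - 1 <= pos <= ary.ndim:
        raise ValueError(f"pos={pos} is out of range for a {ary.ndim}-D array")
    if pos < 0:
        pos += ary.ndim + 1           # pos=-1 means "after the last axis"
    # Keep the first `pos` axes, insert `extra` unit axes, keep the rest.
    index = (slice(None),) * pos + (None,) * extra + (Ellipsis,)
    return ary[index]                 # indexing with None returns a view


x = np.arange(12.0).reshape(4, 3)
print(atleast_nd_sketch(x, 5).shape)           # (1, 1, 1, 4, 3)
print(atleast_nd_sketch(x, 4, pos=1).shape)    # (4, 1, 1, 3)
print(atleast_nd_sketch(x, 4, pos=-1).shape)   # (4, 3, 1, 1)
```

Indexing with `None` is used instead of `reshape` so that the result is a view even for non-contiguous input.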

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: ENH: Proposal to add atleast_nd function

ralfgommers


On Fri, Feb 12, 2021 at 3:32 AM Juan Nunez-Iglesias <[hidden email]> wrote:
both napari and scikit-image use atleast_ a few times. I don’t have many examples of where I used nd because it didn’t exist. But I have the very distinct impression of needing it repeatedly. In some places, I’ve used `np.broadcast_to` to signal the same intention, where `atleast_nd` would have been the more readable solution.

I don’t buy the argument that it’s just a way to mask errors. NumPy broadcasting also has that same potential but I hope no one would seriously consider deprecating it. Indeed, even if we accept that we (library authors) should force users to provide an array of the right dimensionality, that still argues for making it convenient for users to do that!

I don’t feel super strongly about this. But I think atleast_nd is a move in a positive direction and I’d prefer  it to what’s there now:

In [1]: import numpy as np
In [2]: np.atleast_3d(np.ones(4)).shape
Out[2]: (1, 4, 1)

There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...

Yes that's pretty weird. I'm also not sure there's a reason.
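
For reference, the axis placement being discussed can be checked with the existing public functions only. This is just an illustration of current behaviour, not part of the proposal:

```python
import numpy as np

a = np.ones(4)
print(np.atleast_2d(a).shape)   # (1, 4): new axis is prepended
print(np.atleast_3d(a).shape)   # (1, 4, 1): one new axis on each side

b = np.ones((4, 5))
print(np.atleast_3d(b).shape)   # (4, 5, 1): new axis is appended
```

So `atleast_2d` follows broadcasting-style prepending, while `atleast_3d` appends for 2-D input and splits the new axes for 1-D input.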

If atleast_nd is not going to replicate this behavior, it would be good to deprecate atleast_3d (perhaps a release or two after atleast_nd is introduced).

Not having `atleast_3d(x) == atleast_nd(x, pos=3)` is unnecessarily confusing.

Ralf


Juan.

On 12 Feb 2021, at 5:32 am, Eric Wieser <[hidden email]> wrote:

I did a quick search of matplotlib, and found a few uses of all three functions:

https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
  This one isn't really numpy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]`
https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
   This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem.
  Matplotlib has their own `_check_1d` function for input sanitization, although github says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible.
https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
  This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong.

This isn't an exhaustive list, just a handful of different situations in which the functions were used.

Eric



On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer <[hidden email]> wrote:
On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root <[hidden email]> wrote:
for me, I find that the at_least{1,2,3}d functions are useful for sanitizing inputs. Having an at_leastnd() function can be viewed as a step towards cleaning up the API, not cluttering it (although, deprecations of the existing functions probably should be long given how long they have existed).

I would love to see examples of this -- perhaps in matplotlib?

My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs.
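
A rough sketch of the stricter alternative, in which a library rejects lower-dimensional input instead of silently promoting it. The helper name `require_min_ndim` and its error message are hypothetical and only meant to illustrate the idea:

```python
import numpy as np


def require_min_ndim(a, ndim, name="input"):
    """Hypothetical helper: raise instead of adding axes automatically."""
    a = np.asarray(a)
    if a.ndim < ndim:
        raise ValueError(
            f"{name} must have at least {ndim} dimensions, "
            f"got shape {a.shape}"
        )
    return a


image = require_min_ndim(np.ones((4, 5)), 2, name="image")   # passes through

try:
    require_min_ndim(np.ones(4), 2, name="image")            # 1-D is rejected
except ValueError as exc:
    print(exc)
```

The caller then has to decide explicitly where the missing axis belongs, for example with `a[np.newaxis, :]` or `a[:, np.newaxis]`.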

On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]> wrote:
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <[hidden email]> wrote:
I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, for which I had no idea where the new axes would end up.

So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.


I appreciate that `atleast_nd` feels more sensible than `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?
 
Juan.

On 11 Feb 2021, at 9:48 am, Sebastian Berg <[hidden email]> wrote:

On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
I've created PR#18386 to add a function called atleast_nd to numpy and
numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
atleast_3d functions.

I proposed a similar idea about four and a half years ago:
PR#7804. The reception was ambivalent, but a couple of folks have asked me
about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- Been asked about this a couple of times
- Implementation of three existing atleast_*d functions gets easier
- Looks nicer than the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through
broadcasting and reshaping


My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed a bit at least on github), I am also not opposed.  It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)
View input as array with at least ndim dimensions.
New unit dimensions are inserted at the index given by pos if necessary.
Parameters
ary  array_like
The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
ndim  int
The minimum number of dimensions required.
pos  int, optional
The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.
Returns
res  ndarray
An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1) when ndim=4.
Notes
This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.
Setting pos=0 is equivalent to how the array would be interpreted by numpy’s broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).
It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python’s functools.partial function. An example is shown below.
Examples
>>> np.atleast_nd(3.0, 4)
array([[[[ 3.]]]])
>>> x = np.arange(3.0)
>>> np.atleast_nd(x, 2).shape
(1, 3)
>>> x = np.arange(12.0).reshape(4, 3)
>>> np.atleast_nd(x, 5).shape
(1, 1, 1, 4, 3)
>>> np.atleast_nd(x, 5).base is x.base
True
>>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
[array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>> np.atleast_nd((1, 2), 5, pos=0).shape
(1, 1, 1, 1, 2)
>>> np.atleast_nd((1, 2), 5, pos=-1).shape
(2, 1, 1, 1, 1)
>>> from functools import partial
>>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>> atleast_4d([1, 2, 3])
array([[[[1, 2, 3]]]])


Re: ENH: Proposal to add atleast_nd function

Eric Wieser
In reply to this post by Juan Nunez-Iglesias-2
> There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...

My guess is that the historical motivation was to allow grayscale `(H, W)` images to be converted into `(H, W, 1)` images so that they can be broadcast against `(H, W, 3)` RGB images.
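
To make that guess concrete, here is a small sketch of the grayscale/RGB case. The array sizes and the blending operation are made up for illustration; only `np.atleast_3d` and ordinary broadcasting are used:

```python
import numpy as np

H, W = 4, 5
gray = np.full((H, W), 0.5)      # grayscale image, shape (H, W)
rgb = np.ones((H, W, 3))         # RGB image, shape (H, W, 3)

gray3 = np.atleast_3d(gray)      # shape (H, W, 1): trailing unit axis
blended = gray3 * rgb            # broadcasts to (H, W, 3)
print(gray3.shape, blended.shape)

# Without the trailing axis the shapes do not line up for broadcasting:
try:
    gray * rgb                   # (H, W) against (H, W, 3)
except ValueError as exc:
    print("broadcast failed:", exc)
```

The same result can be had with `gray[..., np.newaxis]`, which spells out the axis position explicitly.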

Eric

On Fri, 12 Feb 2021 at 02:32, Juan Nunez-Iglesias <[hidden email]> wrote:
both napari and scikit-image use atleast_ a few times. I don’t have many examples of where I used nd because it didn’t exist. But I have the very distinct impression of needing it repeatedly. In some places, I’ve used `np.broadcast_to` to signal the same intention, where `atleast_nd` would have been the more readable solution.

I don’t buy the argument that it’s just a way to mask errors. NumPy broadcasting also has that same potential but I hope no one would seriously consider deprecating it. Indeed, even if we accept that we (library authors) should force users to provide an array of the right dimensionality, that still argues for making it convenient for users to do that!

I don’t feel super strongly about this. But I think atleast_nd is a move in a positive direction and I’d prefer  it to what’s there now:

In [1]: import numpy as np
In [2]: np.atleast_3d(np.ones(4)).shape
Out[2]: (1, 4, 1)

There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...

Juan.

On 12 Feb 2021, at 5:32 am, Eric Wieser <[hidden email]> wrote:

I did a quick search of matplotlib, and found a few uses of all three functions:

https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
  This one isn't really numpy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]`
https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
   This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem.
  Matplotlib has their own `_check_1d` function for input sanitization, although github says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible.
https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
  This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong.

This isn't an exhaustive list, just a handful of different situations in which the functions were used.

Eric



On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer <[hidden email]> wrote:
On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root <[hidden email]> wrote:
for me, I find that the at_least{1,2,3}d functions are useful for sanitizing inputs. Having an at_leastnd() function can be viewed as a step towards cleaning up the API, not cluttering it (although, deprecations of the existing functions probably should be long given how long they have existed).

I would love to see examples of this -- perhaps in matplotlib?

My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs.

On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]> wrote:
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <[hidden email]> wrote:
I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, for which I had no idea where the new axes would end up.

So, I’m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`.


I appreciate that `atleast_nd` feels more sensible than `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?
 
Juan.

On 11 Feb 2021, at 9:48 am, Sebastian Berg <[hidden email]> wrote:

On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
I've created PR#18386 to add a function called atleast_nd to numpy and
numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
atleast_3d functions.

I proposed a similar idea about four and a half years ago:
PR#7804. The reception was ambivalent, but a couple of folks have asked me
about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- Been asked about this a couple of times
- Implementation of three existing atleast_*d functions gets easier
- Looks nicer than the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through
broadcasting and reshaping


My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed a bit at least on github), I am also not opposed.  It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)
View input as array with at least ndim dimensions.
New unit dimensions are inserted at the index given by pos if necessary.
Parameters
ary  array_like
The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
ndim  int
The minimum number of dimensions required.
pos  int, optional
The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.
Returns
res  ndarray
An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1) when ndim=4.
Notes
This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.
Setting pos=0 is equivalent to how the array would be interpreted by numpy’s broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).
It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python’s functools.partial function. An example is shown below.
Examples
>>> np.atleast_nd(3.0, 4)
array([[[[ 3.]]]])
>>> x = np.arange(3.0)
>>> np.atleast_nd(x, 2).shape
(1, 3)
>>> x = np.arange(12.0).reshape(4, 3)
>>> np.atleast_nd(x, 5).shape
(1, 1, 1, 4, 3)
>>> np.atleast_nd(x, 5).base is x.base
True
>>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
[array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>> np.atleast_nd((1, 2), 5, pos=0).shape
(1, 1, 1, 1, 2)
>>> np.atleast_nd((1, 2), 5, pos=-1).shape
(2, 1, 1, 1, 1)
>>> from functools import partial
>>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>> atleast_4d([1, 2, 3])
array([[[[1, 2, 3]]]])


Re: ENH: Proposal to add atleast_nd function

Sebastian Berg
In reply to this post by ralfgommers
On Fri, 2021-02-12 at 11:13 +0100, Ralf Gommers wrote:

> On Fri, Feb 12, 2021 at 3:32 AM Juan Nunez-Iglesias
> <[hidden email]>
> wrote:
>
> > both napari and scikit-image use atleast_ a few times. I don’t have
> > many
> > examples of where I used nd because it didn’t exist. But I have the
> > very
> > distinct impression of needing it repeatedly. In some places, I’ve
> > used
> > `np.broadcast_to` to signal the same intention, where `atleast_nd`
> > would
> > have been the more readable solution.
> >
> > I don’t buy the argument that it’s just a way to mask errors. NumPy
> > broadcasting also has that same potential but I hope no one would
> > seriously
> > consider deprecating it. Indeed, even if we accept that we (library
> > authors) should force users to provide an array of the right
> > dimensionality, that still argues for making it convenient for
> > users to do
> > that!
> >
> > I don’t feel super strongly about this. But I think atleast_nd is a
> > move
> > in a positive direction and I’d prefer  it to what’s there now:
> >
> > In [1]: import numpy as np
> > In [2]: np.atleast_3d(np.ones(4)).shape
> > Out[2]: (1, 4, 1)
> >
> > There might be some linear algebraic reason why those axis
> > positions make
> > sense, but I’m not aware of it...
> >
>
> Yes that's pretty weird. I'm also not sure there's a reason.
>
> It would be good that, if atleast_nd is not going to replicate this
> behavior, atleast_3d was deprecated (perhaps a release or two after
> introduction of atleast_nd).
>
Planning to replace `atleast_3d` (not right now, but soon) sounds like
a good way forward. "1, 2, nd" is pretty good. `atleast_3d` does not seem
to be used all that much and is the odd one out. Having the `nd` version
should make a future deprecation painless, so in the long term we will be
better off.

- Sebastian


> Not having `atleast_3d(x) == atleast_nd(x, pos=3)` is unnecessarily
> confusing.
>
> Ralf
>
>
> > Juan.
> >
> > On 12 Feb 2021, at 5:32 am, Eric Wieser <
> > [hidden email]>
> > wrote:
> >
> > I did a quick search of matplotlib, and found a few uses of all
> > three
> > functions:
> >
> > *
> > https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
> >   This one isn't really numpy at all, and is really just a
> > shorthand for
> > normalizing an argument `x=n` to `x=[n, n]`
> > *
> > https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
> >    This one is the classic "either multivariate or single-variable
> > data"
> > thing endemic to the SciPy ecosystem.
> > *
> > https://github.com/matplotlib/matplotlib/blob/1eef019109b64ee4085732544cb5e310e69451ab/lib/matplotlib/cbook/__init__.py#L1325-L1326
> >   Matplotlib has their own `_check_1d` function for input
> > sanitization,
> > although github says it's only used to parse the arguments to
> > `plot`, which
> > at this point are fairly established as being flexible.
> > *
> > https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
> >   This just looks like "defensive programming", and if the argument
> > isn't
> > already 3d then something is probably wrong.
> >
> > This isn't an exhaustive list, just a handful of different
> > situations in which the
> > functions were used.
> >
> > Eric
> >
> >
> >
> > On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer <[hidden email]>
> > wrote:
> >
> > > On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root <
> > > [hidden email]>
> > > wrote:
> > >
> > > > for me, I find that the at_least{1,2,3}d functions are useful
> > > > for
> > > > sanitizing inputs. Having an at_leastnd() function can be
> > > > viewed as a step
> > > > towards cleaning up the API, not cluttering it (although,
> > > > deprecations of
> > > > the existing functions probably should be long given how long
> > > > they have
> > > > existed).
> > > >
> > >
> > > I would love to see examples of this -- perhaps in matplotlib?
> > >
> > > My thinking is that in most cases it's probably a better idea to
> > > keep the
> > > interface simpler, and raise an error for lower-dimensional
> > > arrays.
> > > Automatic conversion is convenient (and endemic within the SciPy
> > > ecosystem), but is also a common source of bugs.
> > >
> > > On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer <[hidden email]>
> > > wrote:
> > > >
> > > > > On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias <
> > > > > [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > I totally agree with the namespace clutter concern, but
> > > > > > honestly, I
> > > > > > would use `atleast_nd` with its `pos` argument (I might
> > > > > > rename it to
> > > > > > `position`, `axis`, or `axis_position`) any day over
> > > > > > `at_least{1,2,3}d`,
> > > > > > for which I had no idea where the new axes would end up.
> > > > > >
> > > > > > So, I’m in favour of including it, and optionally
> > > > > > deprecating
> > > > > > `atleast_{1,2,3}d`.
> > > > > >
> > > > > >
> > > > > I appreciate that `atleast_nd` feels more sensible than
> > > > > `at_least{1,2,3}d`, but I don't think "better" than a pattern
> > > > > we would not
> > > > > recommend is a good enough reason for inclusion in NumPy. It
> > > > > needs to stand
> > > > > on its own.
> > > > >
> > > > > What would be the recommended use-cases for this new
> > > > > function?
> > > > > Have any libraries building on top of NumPy implemented a
> > > > > version of
> > > > > this?
> > > > >
> > > > >
> > > > > > Juan.
> > > > > >
> > > > > > On 11 Feb 2021, at 9:48 am, Sebastian Berg <
> > > > > > [hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz
> > > > > > wrote:
> > > > > >
> > > > > > I've created PR#18386 to add a function called atleast_nd
> > > > > > to numpy and
> > > > > > numpy.ma. This would generalize the existing atleast_1d,
> > > > > > atleast_2d,
> > > > > > and
> > > > > > atleast_3d functions.
> > > > > >
> > > > > > I proposed a similar idea about four and a half years ago:
> > > > > >
> > > > > > https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html
> > > > > > ,
> > > > > > PR#7804. The reception was ambivalent, but a couple of
> > > > > > folks have
> > > > > > asked me
> > > > > > about this, so I'm bringing it back.
> > > > > >
> > > > > > Some pros:
> > > > > >
> > > > > > - This closes issue #12336
> > > > > > - There are a couple of Stack Overflow questions that would
> > > > > > benefit
> > > > > > - Been asked about this a couple of times
> > > > > > - Implementation of three existing atleast_*d functions
> > > > > > gets easier
> > > > > > > - Looks nicer than the equivalent broadcasting and
> > > > > > reshaping
> > > > > >
> > > > > > Some cons:
> > > > > >
> > > > > > - Cluttering up the API
> > > > > > - Maintenance burden (but not a big one)
> > > > > > - This is just a utility function, which can be achieved
> > > > > > through
> > > > > > broadcasting and reshaping
> > > > > >
> > > > > >
> > > > > > My main concern would be the namespace cluttering. I can't
> > > > > > say I use
> > > > > > even the `atleast_2d` etc. functions personally, so I would
> > > > > > tend to be
> > > > > > slightly against the addition. But if others land on the
> > > > > > "useful" side here
> > > > > > (and it seemed a bit at least on github), I am also not
> > > > > > opposed.  It is a
> > > > > > clean name that lines up with existing ones, so it doesn't
> > > > > > seem like a big
> > > > > > "mental load" with respect to namespace cluttering.
> > > > > >
> > > > > > Bike shedding the API is probably a good idea in any case.
> > > > > >
> > > > > > I have pasted the current PR documentation (as html) below
> > > > > > for quick
> > > > > > reference. I wonder a bit about the reasoning for having
> > > > > > `pos` specify a
> > > > > > value rather than just a side?
> > > > > >
> > > > > >
> > > > > >
> > > > > > numpy.atleast_nd(*ary*, *ndim*, *pos=0*)
> > > > > > View input as array with at least ndim dimensions.
> > > > > > New unit dimensions are inserted at the index given by
> > > > > > *pos* if
> > > > > > necessary.
> > > > > > > Parameters
> > > > > > > *ary*  array_like
> > > > > > > The input array. Non-array inputs are converted to arrays.
> > > > > > > Arrays that
> > > > > > > already have ndim or more dimensions are preserved.
> > > > > > > *ndim*  int
> > > > > > > The minimum number of dimensions required.
> > > > > > > *pos*  int, optional
> > > > > > > The index to insert the new dimensions. May range from
> > > > > > > -ary.ndim - 1
> > > > > > > to +ary.ndim (inclusive). Non-negative indices indicate
> > > > > > locations
> > > > > > before the corresponding axis: pos=0 means to insert at the
> > > > > > very
> > > > > > beginning. Negative indices indicate locations after the
> > > > > > corresponding axis:
> > > > > >  pos=-1 means to insert at the very end. 0 and -1 are
> > > > > > always
> > > > > > guaranteed to work. Any other number will depend on the
> > > > > > dimensions of the
> > > > > > existing array. Default is 0.
> > > > > > > Returns
> > > > > > > *res*  ndarray
> > > > > > > An array with res.ndim >= ndim. A view is returned for
> > > > > > > array inputs.
> > > > > > > Dimensions are prepended if *pos* is 0, so for example, a
> > > > > > > 1-D array
> > > > > > > of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1,
> > > > > > > N).
> > > > > > > Dimensions are appended if *pos* is -1, so for example a
> > > > > > > 2-D array of
> > > > > > > shape (M, N) becomes a view of shape (M, N, 1, 1) when
> > > > > > > ndim=4.
> > > > > > *See also*
> > > > > > atleast_1d
> > > > > > <
> > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_1d.html#numpy.atleast_1d
> > > > > > >
> > > > > > , atleast_2d
> > > > > > <
> > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_2d.html#numpy.atleast_2d
> > > > > > >
> > > > > > , atleast_3d
> > > > > > <
> > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_3d.html#numpy.atleast_3d
> > > > > > >
> > > > > > *Notes*
> > > > > > This function does not follow the convention of the other
> > > > > > atleast_*d functions
> > > > > > in numpy in that it only accepts a single array argument.
> > > > > > To process
> > > > > > multiple arrays, use a comprehension or loop around the
> > > > > > function call. See
> > > > > > examples below.
> > > > > > Setting pos=0 is equivalent to how the array would be
> > > > > > interpreted by
> > > > > > numpy’s broadcasting rules. There is no need to call this
> > > > > > function for
> > > > > > simple broadcasting. This is also roughly (but not exactly)
> > > > > > equivalent to
> > > > > >  np.array(ary, copy=False, subok=True, ndmin=ndim).
> > > > > > It is easy to create functions for specific dimensions
> > > > > > similar to the
> > > > > > other atleast_*d functions using Python’s functools.partial
> > > > > > <
> > > > > > https://docs.python.org/dev/library/functools.html#functools.partial
> > > > > > >
> > > > > >  function. An example is shown below.
> > > > > > *Examples*
> > > > > >
> > > > > > > >>> np.atleast_nd(3.0, 4)
> > > > > > > array([[[[ 3.]]]])
> > > > > > >
> > > > > > > >>> x = np.arange(3.0)
> > > > > > > >>> np.atleast_nd(x, 2).shape
> > > > > > > (1, 3)
> > > > > > >
> > > > > > > >>> x = np.arange(12.0).reshape(4, 3)
> > > > > > > >>> np.atleast_nd(x, 5).shape
> > > > > > > (1, 1, 1, 4, 3)
> > > > > > > >>> np.atleast_nd(x, 5).base is x.base
> > > > > > > True
> > > > > > >
> > > > > > > >>> [np.atleast_nd(x) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
> > > > > > > [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
> > > > > > >
> > > > > > > >>> np.atleast_nd((1, 2), 5, pos=0).shape
> > > > > > > (1, 1, 1, 1, 2)
> > > > > > > >>> np.atleast_nd((1, 2), 5, pos=-1).shape
> > > > > > > (2, 1, 1, 1, 1)
> > > > > > >
> > > > > > > >>> from functools import partial
> > > > > > > >>> atleast_4d = partial(np.atleast_nd, ndim=4)
> > > > > > > >>> atleast_4d([1, 2, 3])
> > > > > > > [[[[1, 2, 3]]]]
> > > > > >
> > > > > >

Re: ENH: Proposal to add atleast_nd function

Robert Kern-2
In reply to this post by Eric Wieser
On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <[hidden email]> wrote:
> > There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...
>
> My guess is that the historical motivation was to allow grayscale `(H, W)` images to be converted into `(H, W, 1)` images so that they can be broadcast against `(H, W, 3)` RGB images.

Correct. If you do introduce atleast_nd(), I'm not sure why you'd deprecate and remove the one existing function that *isn't* made redundant thereby.

--
Robert Kern


Re: ENH: Proposal to add atleast_nd function

Joseph Fox-Rabinovitz


On Fri, Feb 12, 2021, 09:32 Robert Kern <[hidden email]> wrote:
> On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <[hidden email]> wrote:
> > > There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...
> >
> > My guess is that the historical motivation was to allow grayscale `(H, W)` images to be converted into `(H, W, 1)` images so that they can be broadcast against `(H, W, 3)` RGB images.
>
> Correct. If you do introduce atleast_nd(), I'm not sure why you'd deprecate and remove the one existing function that *isn't* made redundant thereby.

`atleast_nd` handles the promotion of 2D to 3D correctly. The `pos` argument lets you tell it where to put the new axes. What's unintuitive to me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x, 1)`. It takes two calls to `atleast_nd` to replicate that behavior.

One modification to `atleast_nd` I've thought about is making `pos` refer to the position of the existing axes in the new array rather than the position of the new axes, but that's likely not a useful way to go about it.
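
For reference, the shapes under discussion can be checked directly. This is only a rough sketch: `atleast_nd_sketch` is a hypothetical stand-in written for illustration, not the code from the PR, and it assumes the `pos` semantics from the proposed docstring (0 prepends the new axes, -1 appends them).

```python
import numpy as np

def atleast_nd_sketch(ary, ndim, pos=0):
    """Hypothetical stand-in for the proposed atleast_nd (not the PR's code)."""
    ary = np.asanyarray(ary)
    extra = ndim - ary.ndim
    if extra <= 0:
        return ary  # already has enough dimensions
    if pos < 0:
        # Map negative positions so that -1 means "after the last axis".
        pos = ary.ndim + pos + 1
    new_shape = ary.shape[:pos] + (1,) * extra + ary.shape[pos:]
    return ary.reshape(new_shape)

x = np.arange(5)

# Existing behavior: atleast_3d puts a 1-D array on the middle axis
# and appends an axis to a 2-D array.
print(np.atleast_3d(x).shape)                # (1, 5, 1)
print(np.atleast_3d(np.ones((4, 5))).shape)  # (4, 5, 1)

# Replicating the 1-D case takes two calls: prepend, then append.
step1 = atleast_nd_sketch(x, 2, pos=0)       # shape (1, 5)
step2 = atleast_nd_sketch(step1, 3, pos=-1)  # shape (1, 5, 1)
print(step2.shape)                           # (1, 5, 1)
```

The 2D case needs only a single call with `pos=-1`, which is why the 1D behavior of `atleast_3d` is the odd one out.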

- Joe


Re: ENH: Proposal to add atleast_nd function

Robert Kern-2
On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <[hidden email]> wrote:


> On Fri, Feb 12, 2021, 09:32 Robert Kern <[hidden email]> wrote:
> > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <[hidden email]> wrote:
> > > > There might be some linear algebraic reason why those axis positions make sense, but I’m not aware of it...
> > >
> > > My guess is that the historical motivation was to allow grayscale `(H, W)` images to be converted into `(H, W, 1)` images so that they can be broadcast against `(H, W, 3)` RGB images.
> >
> > Correct. If you do introduce atleast_nd(), I'm not sure why you'd deprecate and remove the one existing function that *isn't* made redundant thereby.
>
> `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos` argument lets you tell it where to put the new axes. What's unintuitive to me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x, 1)`. It takes two calls to `atleast_nd` to replicate that behavior.

When thinking about channeled images, the channel axis is not of the same kind as the H and W axes. Really, you tend to want to think about an RGB image as an (H, W) array of colors rather than an (H, W, 3) ndarray of intensity values. As much as possible, you want to treat RGB images similarly to (H, W)-shaped grayscale images. Let's say I want to make a separable filter to convolve with my image, that is, we have a 1D filter for each of the H and W axes, and they are repeated for each channel, if RGB. Setting up a separable filter for (H, W) grayscale is straightforward with broadcasting semantics. I can use a (ntaps,)-shaped vector for the W axis and a (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB case, I want the same thing. atleast_3d() adapts those correctly for the (H, W, nchannels) case.
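
A small shape check may make this concrete. The filter values and sizes below are made up for illustration (a box filter with `ntaps = 5`); only the shapes matter here.

```python
import numpy as np

ntaps = 5
nchannels = 3

w_filter = np.ones(ntaps) / ntaps        # (ntaps,): taps run along W
h_filter = np.ones((ntaps, 1)) / ntaps   # (ntaps, 1): taps run along H

# Grayscale case: plain broadcasting builds the 2-D kernel.
kernel_2d = h_filter * w_filter
print(kernel_2d.shape)                   # (5, 5)

# atleast_3d keeps the taps on the same axes and adds a trailing channel axis.
w3 = np.atleast_3d(w_filter)
h3 = np.atleast_3d(h_filter)
print(w3.shape, h3.shape)                # (1, 5, 1) (5, 1, 1)

kernel_3d = h3 * w3
print(kernel_3d.shape)                   # (5, 5, 1)

# The trailing length-1 axis broadcasts across the channels of an RGB patch.
patch = np.zeros((ntaps, ntaps, nchannels))
print((patch * kernel_3d).shape)         # (5, 5, 3)
```

Note that the `(ntaps,)` vector ends up on the middle axis, which is exactly the `(1, x, 1)` promotion that looks surprising without the image context.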

--
Robert Kern


Re: ENH: Proposal to add atleast_nd function

Sebastian Berg
On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:

> When thinking about channeled images, the channel axis is not of the
> same
> kind as the H and W axes. Really, you tend to want to think about an
> RGB
> image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
> intensity values. As much as possible, you want to treat RGB images
> similar
> to (H, W)-shaped grayscale images. Let's say I want to make a
> separable
> filter to convolve with my image, that is, we have a 1D filter for
> each of
> the H and W axes, and they are repeated for each channel, if RGB.
> Setting
> up a separable filter for (H, W) grayscale is straightforward with
> broadcasting semantics. I can use (ntaps,)-shaped vector for the W
> axis and
> (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
> case, I
> want the same thing. atleast_3d() adapts those correctly for the (H,
> W,
> nchannels) case.
Right, my initial feeling is that without such context `atleast_3d` is
pretty surprising.  So I wonder if we can design `atleast_nd` in a way
that it is explicit about this context.

The `pos` argument is the current solution to this, but maybe there is a
better way [2]?  Meshgrid for example defaults to `indexing='xy'` and
has `indexing='ij'` for a similar purpose [1].
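
As a quick reminder of what the `meshgrid` keyword does (the input sizes here are arbitrary, chosen only so the two conventions are easy to tell apart):

```python
import numpy as np

x = np.arange(3)   # 3 points along x
y = np.arange(2)   # 2 points along y

# Default 'xy' (Cartesian) indexing: outputs have shape (len(y), len(x)).
xx_xy, yy_xy = np.meshgrid(x, y)
print(xx_xy.shape)   # (2, 3)

# 'ij' (matrix) indexing: outputs have shape (len(x), len(y)).
xx_ij, yy_ij = np.meshgrid(x, y, indexing='ij')
print(xx_ij.shape)   # (3, 2)
```

The keyword names the convention the caller has in mind instead of asking for an axis position, which is the kind of explicitness I mean.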

Of course, if `atleast_3d` is common enough, I guess that argument
could also swing to adding a keyword-only argument to `atleast_3d`
(that way we can/will never change the default).

- Sebastian


[1] Not sure the purposes are comparable, but in both cases, they
provide information about the "context" in which meshgrid/atleast_3d
are used.

[2] It feels a bit like you may have to think about what `pos=3` will
actually do (in the sense, that we will all just end up doing trial and
error :)). At which point I am not sure there is too much gained over
the surprise of `atleast_3d`.


Re: ENH: Proposal to add atleast_nd function

ralfgommers


On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg <[hidden email]> wrote:
> Right, my initial feeling is that without such context `atleast_3d` is
> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
> that it is explicit about this context.

Agreed. I think such a use case is probably too specific to design a single function for, at least in such a hardcoded way. There's also "channels first" and "channels last" versions of RGB images as 3-D arrays, and "channels first" is the default in most deep learning frameworks - so the choice atleast_3d makes is a little outdated by now.
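
To illustrate the two layouts with existing NumPy calls only (the image sizes are arbitrary placeholders):

```python
import numpy as np

gray = np.zeros((4, 5))                  # (H, W) grayscale image

# atleast_3d hardcodes the "channels last" layout.
channels_last = np.atleast_3d(gray)
print(channels_last.shape)               # (4, 5, 1)

# "Channels first" needs an explicit axis instead.
channels_first = np.expand_dims(gray, axis=0)
print(channels_first.shape)              # (1, 4, 5)

# Converting an existing RGB image between the two layouts.
rgb_last = np.zeros((4, 5, 3))
rgb_first = np.moveaxis(rgb_last, -1, 0)
print(rgb_first.shape)                   # (3, 4, 5)
```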

Cheers,
Ralf



Re: ENH: Proposal to add atleast_nd function

Robert Kern-2
On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers <[hidden email]> wrote:

> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg <[hidden email]> wrote:
> > Right, my initial feeling is that without such context `atleast_3d` is
> > pretty surprising.  So I wonder if we can design `atleast_nd` in a way
> > that it is explicit about this context.
>
> Agreed. I think such a use case is probably too specific to design a single function for, at least in such a hardcoded way.

That might be an argument for not designing a new one (or at least not giving it such a name). Not sure it's a good argument for removing a long-standing one.

Broadcasting is a very powerful convention that makes coding with arrays tolerable. It makes some choices (namely, prepending 1s to the shape) to make some common operations with mixed-dimension arrays work "by default". But it doesn't cover all of the desired operations conveniently. atleast_3d() bridges the gap to an important convention for a major use-case of arrays.
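
A minimal sketch of that gap, with arbitrary image sizes chosen for illustration:

```python
import numpy as np

H, W = 4, 5
gray = np.ones((H, W))
rgb = np.ones((H, W, 3))
row = np.arange(W)

# Broadcasting prepends 1s: (W,) is treated as (1, W) against (H, W).
print((gray + row).shape)                 # (4, 5)

# (H, W) against (H, W, 3) does not line up, because the missing
# axis is the trailing one and broadcasting only pads on the left.
failed = False
try:
    gray * rgb
except ValueError:
    failed = True
print(failed)                             # True

# atleast_3d supplies the trailing axis, so the product works.
print((np.atleast_3d(gray) * rgb).shape)  # (4, 5, 3)
```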

> There's also "channels first" and "channels last" versions of RGB images as 3-D arrays, and "channels first" is the default in most deep learning frameworks - so the choice atleast_3d makes is a little outdated by now.

DL frameworks do not constitute the majority of image processing code, which has a very strong channels-last contingent. Nonetheless, the very popular TensorFlow defaults to channels-last.

--
Robert Kern
