Improving Complex Comparison/Ordering in Numpy

Improving Complex Comparison/Ordering in Numpy

Rakesh Vasudevan
Hi all,

As a follow-up to gh-15981, I would like to propose a change to bring the comparison operators and related functions for complex dtypes in line with their respective CPython implementations.

The current state of complex dtype comparisons/ordering, as summarised in the issue, is as follows:

# In Python

>>> cnum = 1 + 2j
>>> cnum_two = 1 + 3j

# Doing a comparison yields
>>> cnum > cnum_two

TypeError: '>' not supported between instances of 'complex' and 'complex'


# Doing the same comparison with NumPy scalars

>>> np.array(cnum) > np.array(cnum_two)

# Yields

False

NOTE: only >, <, >= and <= do not work on complex numbers in Python; equality (==) does work.

Similarly, sorting uses the comparison operators behind the scenes to sort complex values. Again, this behavior diverges from the default Python behavior.

# In native Python
>>> clist = [cnum, cnum_two]
>>> sorted(clist, key=lambda c: (c.real, c.imag))
[(1+2j), (1+3j)]

# In NumPy

>>> np.sort(clist)  # Uses the default comparison order

# Yields the same result

# To get a CPython-like sorting call, we can do the following in NumPy
# (np.lexsort treats the last key as the primary one)
>>> carr = np.array(clist)
>>> np.take_along_axis(carr, np.lexsort((carr.imag, carr.real), 0), 0)

This proposal aims to bring parity between the default Python handling of complex numbers and the handling of complex types in NumPy.
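
A minimal, self-contained sketch of the comparison above, assuming a small made-up array (the third value is only there for illustration). It shows that the Python key-based sort and the lexsort/take_along_axis combination agree, and that both match the current default NumPy ordering:

```python
import numpy as np

cnum = 1 + 2j
cnum_two = 1 + 3j
carr = np.array([cnum_two, cnum, 0 + 5j])  # illustrative data

# Python-style sort with an explicit key: real part first, then imaginary.
py_sorted = sorted(carr.tolist(), key=lambda c: (c.real, c.imag))

# NumPy equivalent: np.lexsort uses the LAST key as the primary key,
# so the imaginary part is listed first and the real part last.
order = np.lexsort((carr.imag, carr.real))
np_sorted = np.take_along_axis(carr, order, axis=0)

print(py_sorted)
print(np_sorted)
```

Note the key order: passing (carr.real, carr.imag) to np.lexsort would sort by the imaginary part first.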

This is a two-step process:

  1. Sort complex numbers in a Pythonic way, accepting key arguments, and deprecate the use of sort() on complex numbers without a key argument.
    1. Possibly extend this to max() and min(), if it makes sense to do so.
    2. Since sort() is being updated for complex numbers, searchsorted() is also a good candidate for this change.
  2. Once this is done, we can deprecate the use of the comparison operators (>, <, >=, <=) on complex dtypes.



Handling sort() for complex numbers
There are two approaches we can take for this:

  1. Update the sort() method to have a 'key' kwarg. When a key value is passed, use lexsort to get the indices and sort based on them. We could support lambda function keys like Python, but that is likely to be very slow.
  2. Create a new wrapper function sort_by() (placeholder name; name suggestions and feedback are welcome) that essentially acts as syntactic sugar for
    1. np.take_along_axis(arr, np.lexsort((arr.imag, arr.real), 0), 0), where arr is the complex array being sorted
    2. Alternatively, improve the existing sort_complex() function with the new key functionality (though the change would only take effect for complex dtypes).
We could choose either method; both have pros and cons. Approach 1 makes the sort function signature closer to its Python counterpart, while approach 2 provides a clearer distinction between the two ways of sorting. The performance of the approach 1 function would vary, since the key is an optional argument. Would love the community's thoughts on this.
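
To make approach 2 concrete, here is a rough sketch of what such a wrapper could look like. The name sort_by, its `by` parameter and the most-significant-first key order are all assumptions for illustration; nothing like this exists in NumPy today:

```python
import numpy as np

def sort_by(arr, by, axis=-1):
    """Hypothetical helper: sort `arr` along `axis` using the key arrays in `by`.

    Keys are given most-significant first (like a Python key tuple), so they
    are reversed before being handed to np.lexsort, which treats the last
    key as the primary one.
    """
    arr = np.asarray(arr)
    keys = tuple(np.asarray(k) for k in by)[::-1]
    order = np.lexsort(keys, axis=axis)
    return np.take_along_axis(arr, order, axis=axis)

carr = np.array([2 + 1j, 1 + 3j, 1 + 2j])
result = sort_by(carr, by=(carr.real, carr.imag))
print(result)
```

The same helper would also accept other keys, for example by=(np.abs(carr),) to sort by magnitude.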

Handling min() and max() for complex numbers

Since min and max are essentially a series of comparisons, in Python they are not allowed on complex numbers:
>>> clist = [cnum, cnum_two]
>>> min(clist)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'complex' and 'complex'

# But using the key argument again works
>>> min(clist, key=lambda c: (c.real, c.imag))
We could use a similar key kwarg for min() and max() in NumPy, but the question remains how to handle the keys. In this use case, the naive way would be to sort() on the keys and take the first or last element, which is likely to be slow. Requesting suggestions on how to approach this.
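
One possible way to avoid the full sort, sketched here purely as an illustration: narrow down the candidates one key at a time, keeping only the entries that tie on the current key. The helper name argmin_by is hypothetical, it handles 1-D keys only, and it assumes the keys contain no NaNs:

```python
import numpy as np

def argmin_by(keys):
    """Hypothetical helper: index of the lexicographic minimum over `keys`.

    `keys` is a sequence of 1-D arrays, most-significant first. Each pass
    keeps only the candidates that tie on the current key, so no sort is
    performed. Assumes no NaNs in the keys.
    """
    candidates = np.arange(len(keys[0]))
    for key in keys:
        values = np.asarray(key)[candidates]
        candidates = candidates[values == values.min()]
        if candidates.size == 1:
            break
    return candidates[0]

carr = np.array([1 + 3j, 2 + 0j, 1 + 2j])
idx = argmin_by((carr.real, carr.imag))
smallest = carr[idx]
print(idx, smallest)
```

Each pass is linear in the number of remaining candidates, so the total cost is at most one scan per key rather than an O(n log n) sort. A max variant would use values.max() instead.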

Comments on isclose()
Both Python and NumPy use the absolute value (magnitude) of the difference to decide whether two values are close enough. Hence I do not see this change affecting this function.
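
A quick check of that claim, using cmath.isclose on the Python side (math.isclose does not accept complex input) and np.isclose on the NumPy side, both with their default tolerances. The values are made up for illustration:

```python
import cmath
import numpy as np

a = 1 + 2j
b = a + 1e-10j

# Both checks compare the magnitude of the difference against a tolerance,
# so neither relies on an ordering of the complex values themselves.
py_close = cmath.isclose(a, b)
np_close = bool(np.isclose(a, b))
far = bool(np.isclose(1 + 2j, 1 + 3j))
print(py_close, np_close, far)
```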

Requesting feedback and suggestions on the above.

Thank you,

Rakesh

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Improving Complex Comparison/Ordering in Numpy

Brock Mendel

Corresponding pandas issue:
https://github.com/pandas-dev/pandas/issues/28050

On Thu, Jun 4, 2020 at 9:17 PM Rakesh Vasudevan <[hidden email]> wrote:
> [...]

Re: Improving Complex Comparison/Ordering in Numpy

Rakesh Vasudevan
Hi all,

Following up on this, I have created a WIP PR: https://github.com/numpy/numpy/pull/16700

As stated in the original thread, we need to start by having a sort() function for complex numbers that sorts based on keys rather than plain arithmetic ordering.

There are two broad ways to approach a sorting function that supports keys (not just for complex numbers):

  1. Add a key kwarg to sort() (function and method) to support key-based sorting on arrays.
  2. Use a new function along the lines of sortby(c_arr, key=(c_arr.real, c_arr.imag)).

In this PR I have chosen approach 1 for the following reasons:

  1. Approach 1 makes it easier to deal with both the in-place method and the function. Since we can make the change in the C sort function, the Python layer needs only minimal changes. I hope this results in minimal impact on current code that handles complex sorting. One example within NumPy is the linalg module's svd() function.

  2. With approach 2, when we deprecate complex arithmetic ordering, existing methods that use sort() for complex types would need to update their signatures.

As it stands, the PR does the following three things within the Python-C array method implementation of sort:

  1. Checks for complex types: if the array is of a complex type and no key is passed, it creates a default key that mimics the current arithmetic ordering in NumPy.
  2. Uses the keys to perform a Py_LexSort and generate indices.
  3. Performs the take_along_axis via a C callback and copies the result over to the original array (pseudo in-place).

I am requesting feedback/help on implementing the take_along_axis logic at the C level in an in-place manner, and on the approach in general.
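
For readers who want to experiment without building the PR, the three steps can be modelled in pure Python roughly as follows. This is only a sketch of the described behavior, not the PR's actual C code; the function name sort_inplace and the most-significant-first key order are assumptions, and only 1-D arrays are handled:

```python
import numpy as np

def sort_inplace(arr, keys=None):
    """Hypothetical Python model of the three steps (1-D arrays only)."""
    # Step 1: for complex input with no keys, build a default key that
    # reproduces the current ordering (real part first, then imaginary).
    if keys is None:
        if np.iscomplexobj(arr):
            keys = (arr.real, arr.imag)
        else:
            arr.sort()
            return
    # Step 2: np.lexsort treats the last key as primary, so reverse the keys.
    order = np.lexsort(tuple(keys)[::-1])
    # Step 3: gather into a temporary, then copy back (pseudo in-place).
    arr[...] = np.take_along_axis(arr, order, axis=0)

carr = np.array([1 + 3j, 0 + 5j, 1 + 2j])
expected = np.sort(carr)
sort_inplace(carr)
print(carr)
```

The temporary produced in step 3 is what makes this only "pseudo" in-place: the result lands in the original buffer, but a full copy is still allocated along the way.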

This will further feed into max() and min() as well, once we figure this out. The next step would be to deprecate arithmetic ordering for complex types (which I think will be a PR of its own).


Regards

Rakesh


On Thu, Jun 4, 2020 at 9:21 PM Brock Mendel <[hidden email]> wrote:

On Thu, Jun 4, 2020 at 9:17 PM Rakesh Vasudevan <[hidden email]> wrote:
> [...]

Re: Improving Complex Comparison/Ordering in Numpy

Sebastian Berg
On Sat, 2020-06-27 at 16:08 -0700, Rakesh Vasudevan wrote:

> Hi all,
>
>    Following up on this. Created a WIP PR
> https://github.com/numpy/numpy/pull/16700
>
> As stated in the original thread, We need to start by having a sort()
> function for complex numbers that can do it based on keys, rather than
> plain arithmetic ordering.
>
> There are two broad ways to approach a sorting function that supports
> keys (Not just for complex numbers).
>
Thanks for this. I think the idea is good in general and I would be
happy to discuss details here. It was discussed briefly here:

    https://github.com/numpy/numpy/issues/15981

This is a WIP, but it nicely allows us to try out what the new API
could/should look like, and to see the potential impact on code.  The
current choice is, for example:

    np.sort(arr, keys=(arr.real, arr.imag))

`keys` is like the `key` argument to Python's sorts, but unlike Python's
sorts it is not passed a function but rather a sequence of arrays.

Alternative spellings could be `by=...`? Or maybe someone has a
different API idea.


There are also some implementation details to figure out, since
internally it will probably do an `argsort` over all key arrays, which
is much like, but a bit faster than, `np.lexsort`+`np.take_along_axis`.
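
For reference, this is roughly what the `np.lexsort`+`np.take_along_axis` combination looks like today for an N-dimensional array sorted along its last axis. The random test data is an assumption for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 3, size=(4, 5)) + 1j * rng.integers(0, 3, size=(4, 5))

# One indirect sort over all keys along the last axis (imag is the
# secondary key, real the primary), followed by a gather along that axis.
order = np.lexsort((arr.imag, arr.real), axis=-1)
by_keys = np.take_along_axis(arr, order, axis=-1)

# With (real, imag) keys this reproduces the current default ordering.
same = np.array_equal(by_keys, np.sort(arr, axis=-1))
print(same)
```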

I like this approach in general, since I do not think complex
lexicographic sorting is "obvious", and this also allows the choice of:

    np.sort(complex_arr, keys=(abs(complex_arr),))

to get convenient (although maybe not the fastest) sorting by magnitude,
which seems like a reasonable API choice.
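
Until something like `keys` exists, sorting by magnitude can be written with the current API as below. This is just an illustrative equivalent with made-up values, not what the PR does internally:

```python
import numpy as np

complex_arr = np.array([3 + 4j, 1 + 0j, 0 - 2j])

# Equivalent of the proposed keys=(abs(complex_arr),) with today's API:
# argsort the magnitudes, then index the original array with the result.
order = np.argsort(np.abs(complex_arr), kind="stable")
by_magnitude = complex_arr[order]
print(by_magnitude)
```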

So I am happy if Rakesh pushes this forward, and if anyone has doubts
about the API choice in general or the implications for complex sorting
specifically, it would be good to discuss this.  The PR allows some
testing of the feature already.

Cheers,

Sebastian



> On Thu, Jun 4, 2020 at 9:21 PM Brock Mendel <[hidden email]>
> wrote:
>
> > Corresponding pandas issue:
> > https://github.com/pandas-dev/pandas/issues/28050

Re: Improving Complex Comparison/Ordering in Numpy

Stephan Hoyer-2
On Wed, Jul 1, 2020 at 12:23 PM Sebastian Berg <[hidden email]> wrote:
> This is a WIP, but allows nicely to try out how the new API
> could/should look like, and see the potential impact to code.  The
> current choice is for:
>
>     np.sort(arr, keys=(arr.real, arr.imag))
>
> for example.  `keys` is like the `key` argument to pythons sorts, but
> unlike python sorts is not passed a function but rather a sequence of
> arrays.
>
> Alternative spellings could be `by=...`? Or maybe someone has a
> different API idea.

I really like the look of np.sort(arr, by=(arr.real, arr.imag)).
- This avoids adding an extra function sortby to NumPy's API. The default behavior (by=None) would of course be to sort by the array being sorted, so it's backwards compatible.
- Calling the new argument "by" instead of "key" avoids confusion with the behavior of Python's sort/sorted (which take functions instead of sequences).

The combination of lexsort() and take_along_axis() makes it possible to achieve this behavior currently, but it is definitely less clear than a single function call.

Re: Improving Complex Comparison/Ordering in Numpy

Sebastian Berg
On Wed, 2020-07-01 at 12:48 -0700, Stephan Hoyer wrote:

> On Wed, Jul 1, 2020 at 12:23 PM Sebastian Berg <[hidden email]> wrote:
>
> > This is a WIP, but allows nicely to try out how the new API
> > could/should look like, and see the potential impact to code.  The
> > current choice is for:
> >
> >     np.sort(arr, keys=(arr.real, arr.imag))
> >
> > for example.  `keys` is like the `key` argument to pythons sorts, but
> > unlike python sorts is not passed a function but rather a sequence of
> > arrays.
> >
> > Alternative spellings could be `by=...`? Or maybe someone has a
> > different API idea.
> >
>
> I really like the look of np.sort(arr, by=(arr.real, arr.imag)).
> - This avoids adding an extra function sortby into NumPy's API. The default
> behavior (by=None) would of course be to sort by the arrays being sorted,
> so it's backwards compatible.
> - Calling the new argument "by" instead of "key" avoids confusion with the
> behavior of Python's sort/sorted (which take functions instead of
> sequences).

I just noticed that `DataFrame.sort_values()` uses `by=...` with a list
of column names.  However, I guess that is fairly compatible with this
usage.

- Sebastian



Re: Improving Complex Comparison/Ordering in Numpy

Rakesh Vasudevan
I agree with the idea of setting the parameter apart from Python's; "by" sounds like a good alternative.

Rakesh 



On Wed, Jul 1, 2020, 18:45 Sebastian Berg <[hidden email]> wrote:
> [...]