> 1. Add a key kwarg to sort() (function and method) to support key-based
>    sorting on arrays.
> 2. Use a new function along the lines of
>    sortby(c_arr, key=(c_arr.real, c_arr.imag)).

>

> In this PR I have chosen approach 1 for the following reasons:
>
> 1. Approach 1 makes it easier to handle both the in-place method and the
>    function. Since the change can be made in the C sort function, the
>    Python layer needs only minimal changes. I hope this results in minimal
>    impact on existing code that handles complex sorting. One example
>    within NumPy is the linalg module's svd() function.
> 2. With approach 2, once we deprecate complex arithmetic ordering,
>    existing methods that use sort() for complex types would need to update
>    their signatures.

>

> As it stands, the PR does the following three things within the Python-C
> array method implementation of sort:
>
> 1. Checks for a complex type. If the array is of a complex type and no key
>    is passed, it creates a default key that mimics the current arithmetic
>    ordering in NumPy.
> 2. Uses the keys to perform a PyArray_LexSort and generate indices.
> 3. Performs the take_along_axis via a C callback and copies the result
>    over to the original array (pseudo in-place).
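
For readers following the thread, the three steps quoted above can be sketched in Python. This is only an illustration of the logic, not the PR's C code: the function name sort_complex_inplace is hypothetical, and the default key assumes the current ordering is "real part first, then imaginary part" (NaN handling is ignored).

```python
import numpy as np


def sort_complex_inplace(arr, key=None, axis=-1):
    """Hypothetical Python analogue of the three steps (not the PR's C code)."""
    if key is None:
        if not np.iscomplexobj(arr):
            # Non-complex input keeps the ordinary in-place sort.
            arr.sort(axis=axis)
            return
        # Step 1: default key for complex input. np.lexsort treats the LAST
        # key as the primary one, so (imag, real) orders by real, then imag.
        key = (arr.imag, arr.real)
    # Step 2: indirect sort on the keys to obtain indices.
    order = np.lexsort(key, axis=axis)
    # Step 3: gather along the axis and copy back ("pseudo in-place"):
    # the right-hand side is a temporary array, not a view of arr.
    arr[...] = np.take_along_axis(arr, order, axis=axis)


a = np.array([1 + 3j, 1 + 2j, 0 + 5j])
sort_complex_inplace(a)
```

The temporary created in step 3 is the cost that a true in-place take_along_axis at the C level would avoid.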

>

> I am requesting feedback/help on implementing the take_along_axis logic at
> the C level in an in-place manner, and on the approach in general.
>
> This will further feed into max() and min() as well. Once we figure this
> out, the next step would be to deprecate arithmetic ordering for complex
> types (which I think will be a PR of its own).

>

>

> Regards

>

> Rakesh

>

> On Thu, Jun 4, 2020 at 9:21 PM Brock Mendel <[hidden email]> wrote:

>

> > Corresponding pandas issue:
> > https://github.com/pandas-dev/pandas/issues/28050
> >

> > On Thu, Jun 4, 2020 at 9:17 PM Rakesh Vasudevan <[hidden email]> wrote:

> >

> > > Hi all,

> > >

> > > As a follow-up to gh-15981
> > > <https://github.com/numpy/numpy/issues/15981>, I would like to propose
> > > a change to bring complex dtype(s) comparison operators and related
> > > functions in line with their respective CPython implementations.

> > >

> > > The current state of complex dtype comparisons/ordering, as summarised
> > > in the issue, is as follows:

> > >

> > > # In Python
> > >
> > > >>> cnum = 1 + 2j
> > > >>> cnum_two = 1 + 3j
> > >
> > > # Doing a comparison yields
> > > >>> cnum > cnum_two
> > >
> > > TypeError: '>' not supported between instances of 'complex' and 'complex'
> > >
> > > # Doing the same comparison with NumPy scalars
> > >
> > > >>> np.array(cnum) > np.array(cnum_two)
> > >
> > > # Yields
> > >
> > > False

> > >

> > >

> > > *NOTE*: only >, <, >= and <= do not work on complex numbers in Python;
> > > equality (==) does work.
> > >
> > > Similarly, sorting uses comparison operators behind the scenes to sort
> > > complex values. Again, this behavior diverges from the default Python
> > > behavior.

> > >

> > > # In native Python
> > > >>> clist = [cnum, cnum_two]
> > > >>> sorted(clist, key=lambda c: (c.real, c.imag))
> > > [(1+2j), (1+3j)]
> > >
> > > # In NumPy
> > >
> > > >>> np.sort(clist)  # Uses the default comparison order
> > >
> > > # Yields the same result
> > >
> > > # To get a CPython-like sorting call, we can do the following in NumPy
> > > # (lexsort treats the last key as the primary one)
> > > >>> c_arr = np.asarray(clist)
> > > >>> np.take_along_axis(c_arr, np.lexsort((c_arr.imag, c_arr.real), 0), 0)

> > >

> > >

> > > This proposal aims to bring parity between the default Python handling
> > > of complex numbers and the handling of complex types in NumPy.

> > >

> > > This is a two-step process:
> > >
> > > 1. Sort complex numbers in a Pythonic way, accepting key arguments,
> > >    and deprecate usage of sort() on complex numbers without a key
> > >    argument.
> > >    1. Possibly extend this to max() and min(), if it makes sense to
> > >       do so.
> > >    2. Since sort() is being updated for complex numbers,
> > >       searchsorted() is also a good candidate for implementing this
> > >       change.
> > > 2. Once this is done, we can deprecate the usage of comparison
> > >    operators (>, <, >=, <=) on complex dtypes.

> > >

> > >

> > >

> > >

> > > *Handling sort() for complex numbers*
> > > There are two approaches we can take for this:
> > >
> > > 1. Update the sort() method to have a ‘key’ kwarg. When a key value is
> > >    passed, use lexsort to get the indices and sort with them. We could
> > >    support lambda function keys like Python, but that is likely to be
> > >    very slow.
> > > 2. Create a new wrapper function sort_by() (placeholder name;
> > >    requesting name suggestions/feedback) that essentially acts as
> > >    syntactic sugar for the following, given a complex array c_arr:
> > >    1. np.take_along_axis(c_arr, np.lexsort((c_arr.imag, c_arr.real),
> > >       0), 0)
> > >
> > > 1. Improve the existing sort_complex() method with the new key search
> > >    functionality (though the change will only be reflected for complex
> > >    dtypes).
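
A minimal sketch of what the sort_by() wrapper described above might look like, purely for illustration. The name is the placeholder from the message, and the signature is an assumption: keys are passed most-significant first, as in a Python sort key tuple, and reversed internally because np.lexsort uses the last key as the primary one.

```python
import numpy as np


def sort_by(arr, key, axis=-1):
    """Hypothetical wrapper: return a copy of arr ordered by the given keys.

    key is a tuple of arrays, most significant first (an assumed convention).
    """
    arr = np.asarray(arr)
    # np.lexsort sorts by the LAST key first, so reverse the tuple.
    order = np.lexsort(tuple(key)[::-1], axis=axis)
    return np.take_along_axis(arr, order, axis=axis)


c_arr = np.array([1 + 3j, 0 + 5j, 1 + 2j])
result = sort_by(c_arr, key=(c_arr.real, c_arr.imag))
```

With this convention the call reads like sorted(clist, key=lambda c: (c.real, c.imag)) while staying vectorised.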

> > >

> > > We could choose either method; both have pros and cons. Approach 1
> > > brings the sort function signature closer to its Python counterpart,
> > > while approach 2 provides a clearer distinction between the two ways
> > > of sorting. The performance of the approach 1 function would vary,
> > > since the key is an optional argument. Would love the community’s
> > > thoughts on this.

> > >

> > >

> > > *Handling min() and max() for complex numbers*

> > >

> > > Since min and max are essentially a set of comparisons, in Python they
> > > are not allowed on complex numbers:
> > >
> > > >>> clist = [cnum, cnum_two]
> > > >>> min(clist)
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in <module>
> > > TypeError: '<' not supported between instances of 'complex' and 'complex'
> > >
> > > # But using the key argument again works
> > > >>> min(clist, key=lambda c: (c.real, c.imag))

> > >

> > > We could use a key kwarg for min() and max() similar to Python's, but
> > > the question remains how we handle the keys in this use case. The
> > > naive way would be to sort() on the keys and take the first or last
> > > element, which is likely to be slow. Requesting suggestions on
> > > approaching this.
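
One possible way to avoid the full sort for the (real, imag) case is two linear passes: find the smallest real part, then the smallest imaginary part among the ties. The sketch below is an illustration only. The helper name min_by_real_imag is hypothetical, it flattens its input, and it ignores NaN handling and empty arrays.

```python
import numpy as np


def min_by_real_imag(arr):
    """Hypothetical helper: smallest element by (real, imag), without sorting."""
    arr = np.asarray(arr).ravel()
    # Pass 1: keep only the elements whose real part equals the minimum.
    candidates = arr[arr.real == arr.real.min()]
    # Pass 2: among those ties, pick the smallest imaginary part.
    return candidates[np.argmin(candidates.imag)]


values = [1 + 3j, 1 + 2j, 2 + 0j]
smallest = min_by_real_imag(values)
```

This handles only the fixed (real, imag) key; arbitrary user-supplied keys would need a more general reduction.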

> > >

> > > *Comments on isclose()*
> > > Both Python and NumPy use the absolute value/magnitude when checking
> > > whether two values are close enough. Hence I do not see this change
> > > affecting this function.
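
A small check of the isclose() point above, assuming the documented defaults (rel_tol=1e-9 for cmath.isclose, rtol=1e-5 and atol=1e-8 for np.isclose). Both compare the magnitude of the difference against a tolerance, so neither relies on ordering complex values.

```python
import cmath

import numpy as np

a = 1 + 2j
b = 1 + 2.0000000001j

# Python: cmath.isclose compares abs(a - b) against a relative tolerance.
py_close = cmath.isclose(a, b)

# NumPy: np.isclose tests abs(a - b) <= atol + rtol * abs(b).
np_close = bool(np.isclose(a, b))

# The same test written out by hand with NumPy's default tolerances.
manual = abs(a - b) <= 1e-8 + 1e-5 * abs(b)
```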

> > >

> > > Requesting feedback and suggestions on the above.

> > >

> > > Thank you,

> > >

> > > Rakesh

> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > [hidden email]
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
