quantile() or percentile()


quantile() or percentile()

Chun-Wei Yuan
There's an ongoing effort to introduce quantile() into numpy.  You'd use it just like percentile(), but would input your q value in probability space (0.5 for 50%):

https://github.com/numpy/numpy/pull/9213

Since there's a great deal of overlap between these two functions, we'd like to solicit opinions on how to move forward on this.

The current thinking is to tolerate the redundancy and keep both, using one as the engine for the other.  I'm partial to having quantile because 1.) I prefer probability space, and 2.) I have a PR waiting on quantile().
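
To make the "one as the engine for the other" idea concrete, here is a minimal sketch, assuming quantile() would simply rescale q and delegate to the existing np.percentile(). The name quantile_sketch and its range check are hypothetical and are not taken from the PR.

```python
import numpy as np

def quantile_sketch(a, q, axis=None):
    """Hypothetical wrapper: q is given in probability space, i.e. in [0, 1]."""
    q = np.asarray(q, dtype=float)
    # Reject values outside probability space (e.g. 50 passed by habit).
    if np.any(q < 0.0) or np.any(q > 1.0):
        raise ValueError("quantiles must be in the range [0, 1]")
    # percentile() expects q in [0, 100], so rescale and delegate.
    return np.percentile(a, q * 100.0, axis=axis)

data = [1, 2, 3, 4, 5]
median = quantile_sketch(data, 0.5)            # same value as np.percentile(data, 50)
quartiles = quantile_sketch(data, [0.25, 0.75])
```

The delegation could equally run the other way, with percentile() dividing by 100 and calling quantile(); the sketch only shows that the two differ by a scale factor on q.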

Best,

C

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: quantile() or percentile()

Joseph Fox-Rabinovitz
I think that there would be a very good reason to have a separate function if we were to introduce weights to the inputs, similarly to the way that we have mean and average. This would have some (positive) repercussions like making weighted histograms with the Freedman-Diaconis binwidth estimator a possibility. I have had this change on the back-burner for a long time, mainly because I was too lazy to figure out how to include it in the C code. However, I will take a closer look.
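
For readers unfamiliar with the two points above, the following sketch shows the existing mean/average split and the unweighted Freedman-Diaconis bin width, which depends on the interquartile range. The helper fd_bin_width is a hypothetical name used only for illustration. A weighted version would need weighted quartiles, which np.percentile() did not offer when this thread was written.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w = np.array([3.0, 1.0, 0.0])

plain = np.mean(x)                   # mean() takes no weights
weighted = np.average(x, weights=w)  # average() is the weighted counterpart

def fd_bin_width(samples):
    """Freedman-Diaconis rule: width = 2 * IQR / n**(1/3), unweighted."""
    samples = np.asarray(samples, dtype=float)
    q25, q75 = np.percentile(samples, [25, 75])
    return 2.0 * (q75 - q25) / np.cbrt(samples.size)

width = fd_bin_width(np.arange(1, 9))
```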

Regards,

    -Joe




Re: quantile() or percentile()

Chun-Wei Yuan
Just to provide some context, 9213 actually spawned off of this guy:

https://github.com/numpy/numpy/pull/9211

which might address the weighted inputs issue Joe brought up.

C


Re: quantile() or percentile()

Joseph Fox-Rabinovitz
While #9211 is a good start, it is pretty inefficient because it performs an O(n log n) sort of the array. It is possible to reduce the time to O(n) by using a partitioning algorithm similar to the one in the C code of percentile. I will look into it as soon as I can.
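
The difference between sorting and partitioning can be seen from Python with np.sort() and np.partition(). This is only an illustration for the unweighted median, not the code in #9211 or in numpy's C sources; the function names are hypothetical.

```python
import numpy as np

def median_by_sort(a):
    """Full sort: O(n log n), orders every element."""
    s = np.sort(np.asarray(a, dtype=float))
    n = s.size
    return 0.5 * (s[(n - 1) // 2] + s[n // 2])

def median_by_partition(a):
    """Selection: only the middle position(s) are put in their sorted place."""
    a = np.asarray(a, dtype=float)
    n = a.size
    k_lo, k_hi = (n - 1) // 2, n // 2
    # For odd n the two indices coincide, so deduplicate before partitioning.
    p = np.partition(a, sorted({k_lo, k_hi}))
    return 0.5 * (p[k_lo] + p[k_hi])
```

Both functions return the same value; the second one avoids ordering elements that the result does not depend on.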

    -Joe


Re: quantile() or percentile()

Chun-Wei Yuan
That would be great.  I just used np.argsort because it was familiar to me.  Didn't know about the C code.


Re: quantile() or percentile()

Chun-Wei Yuan
Any way I can help expedite this?


Re: quantile() or percentile()

Joseph Fox-Rabinovitz
Not that I know of. The algorithm is very simple, requiring a
relatively small addition to the current introselect algorithm used
for `np.partition`. My biggest hurdle is figuring out how the calling
machinery really works so that I can figure out which input type
permutations I need to generate, and how to get the right backend
running for a given function call.
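
As a rough picture of what a weighted selection step might look like, here is a pure-Python sketch using plain quickselect with a running weight total. It is an assumption made for illustration: it is not introselect, not numpy's C code, and not necessarily the algorithm meant above. The name weighted_select is hypothetical.

```python
import random

def weighted_select(values, weights, target):
    """Return the smallest v whose cumulative weight (items <= v) reaches target.

    Assumes positive weights and 0 < target <= sum(weights).
    Expected O(n) time: each round discards part of the remaining items.
    """
    items = list(zip(values, weights))
    below = 0.0  # total weight of items already known to be smaller
    while True:
        pivot = random.choice(items)[0]
        lo = [(v, w) for v, w in items if v < pivot]
        hi = [(v, w) for v, w in items if v > pivot]
        lo_w = sum(w for _, w in lo)
        eq_w = sum(w for v, w in items if v == pivot)
        if below + lo_w >= target:
            items = lo                # answer lies strictly below the pivot
        elif below + lo_w + eq_w >= target:
            return pivot              # the pivot's own weight crosses the target
        else:
            below += lo_w + eq_w      # discard the pivot and everything below it
            items = hi

# Weighted median of [1, 2, 3] with weights [1, 2, 1]: target is half the total weight.
wm = weighted_select([1, 2, 3], [1, 2, 1], 2)
```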

    -Joe


Re: quantile() or percentile()

Chun-Wei Yuan
Cool.  Just as a heads up, for my algorithm to work, I actually need the indices, which is why argsort() is so important to me.  I use it to get both ap_sorted and ws_sorted variables.  If your weighted-quantile algo is faster and doesn't require those indices, please by all means change my implementation.  Thanks.
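
The role of the indices can be shown with a small argsort-based sketch. This is not the implementation in the PR: the function name, the variable names and the midpoint interpolation rule are all assumptions chosen for illustration, and other weighted-quantile definitions exist.

```python
import numpy as np

def weighted_quantile_sketch(a, q, weights):
    """Hypothetical weighted quantile: sort once, then interpolate."""
    a = np.asarray(a, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(a)      # the indices are needed to reorder the weights too
    a_sorted = a[order]
    w_sorted = w[order]
    cum = np.cumsum(w_sorted)
    # Place each sample at the midpoint of its weight interval, scaled to [0, 1].
    p = (cum - 0.5 * w_sorted) / cum[-1]
    return np.interp(q, p, a_sorted)

result = weighted_quantile_sketch([3, 1, 2], 0.5, weights=[1, 1, 2])
```

np.sort() alone would order the values but lose the pairing with their weights, which is why argsort() (or any method that returns the permutation) is used here.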


Re: quantile() or percentile()

Joseph Fox-Rabinovitz
I will go over your PR carefully to make sure we can agree on a
matching API. After that, we can swap the backend out whenever I get
around to it.

Thanks for working on this.

    -Joe


Re: quantile() or percentile()

Eric Wieser
In reply to this post by Chun-Wei Yuan

Let’s try and keep this on topic - most replies to this message have been about #9211, which is an orthogonal issue.

There are two main questions here:

  1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, 25), if they had the choice?
  2. Is this desirable enough to justify increasing the API surface?

The general consensus on the github issue answers yes to 1, but is neutral on 2. It would be good to get more opinions.

Eric


Re: quantile() or percentile()

Juan Nunez-Iglesias
I concur with the consensus.


Re: quantile() or percentile()

Charles R Harris
In reply to this post by Eric Wieser




I think a quantile function would be natural and desirable.

<snip>

Chuck



Re: quantile() or percentile()

josef.pktd



I'm in favor of adding it. (moving away from +0)
It should be an obvious code completion choice, np.q?

Josef

 


Re: quantile() or percentile()

Chris Barker - NOAA Federal
In reply to this post by Charles R Harris
+1 on quantile()

-CHB



--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]
