# Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.


## Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

Hi All,

The question is what to do when all-nan slices are encountered in the nan{max, min} and nanarg{max, min} functions. Currently in 1.8.0, the first returns nan and raises a warning, the second returns intp.min and raises a warning. It is proposed that the nanarg{max, min} functions, and possibly the nan{max, min} also, raise an error instead.

Raising errors would be consistent with the behavior of the arg{max, min} and amax/amin functions when they encounter empty arrays. OTOH, now that we no longer support Python 2.4/2.5 the catch_warnings context manager can serve the same purpose by changing the warnings into exceptions.

So, what to do? Thoughts?

Chuck

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
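To make the catch_warnings option concrete, here is a minimal sketch using only the standard library. `nanmax_1d` and `nanmax_strict` are hypothetical stand-ins, not NumPy's implementation; `nanmax_1d` only mimics the behavior described in the message (return nan and issue a warning), and the `RuntimeWarning` category is an assumption made for the illustration.

```python
import math
import warnings


def nanmax_1d(values):
    """Hypothetical stand-in for nanmax on a single slice.

    Ignores nans; if nothing is left, warns and returns nan.
    """
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        warnings.warn("All-NaN slice encountered", RuntimeWarning)
        return math.nan
    return max(kept)


def nanmax_strict(values):
    """Same computation, but a RuntimeWarning is turned into an exception."""
    with warnings.catch_warnings():
        # Inside this block, RuntimeWarning is raised instead of printed.
        warnings.simplefilter("error", RuntimeWarning)
        return nanmax_1d(values)
```

With this arrangement the caller decides, at the call site, whether an all-nan slice is a warning or an error, without any change to the function itself.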

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

+1 to making the nan functions consistent with the non-nan functions.

On 2 Oct 2013 17:03, "Charles R Harris" <[hidden email]> wrote:
> Hi All,
>
> The question is what to do when all-nan slices are encountered in the nan{max,min} and nanarg{max, min} functions. Currently in 1.8.0, the first returns nan and raises a warning, the second returns intp.min and raises a warning. It is proposed that the nanarg{max, min} functions, and possibly the nan{max, min} also, raise an error instead.
>
> Raising errors would be consistent with the behavior of the arg{max, min} and amax/amin functions when they encounter empty arrays. OTOH, now that we no longer support Python 2.4/2.5 the catch_warnings context manager can serve the same purpose by changing the warnings into exceptions.
>
> So, what to do? Thoughts?
>
> Chuck

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On 2 Oct 2013 18:04, "Charles R Harris" <[hidden email]> wrote:
> The question is what to do when all-nan slices are encountered in the nan{max,min} and nanarg{max, min} functions. Currently in 1.8.0, the first returns nan and raises a warning, the second returns intp.min and raises a warning. It is proposed that the nanarg{max, min} functions, and possibly the nan{max, min} also, raise an error instead.

I agree with Nathan; this sounds like more reasonable behaviour to me.

Stéfan

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 12:37 PM, Stéfan van der Walt <[hidden email]> wrote:
> On 2 Oct 2013 18:04, "Charles R Harris" <[hidden email]> wrote:
>> The question is what to do when all-nan slices are encountered in the nan{max,min} and nanarg{max, min} functions. Currently in 1.8.0, the first returns nan and raises a warning, the second returns intp.min and raises a warning. It is proposed that the nanarg{max, min} functions, and possibly the nan{max, min} also, raise an error instead.
>
> I agree with Nathan; this sounds like more reasonable behaviour to me.

If I understand what you are proposing:

-1 on raising an error with nan{max, min}.

An empty array is empty in all columns; an array with nans might be empty in only some columns.

As far as I understand, nan{max, min} only make sense with arrays that can hold a nan, so we can return nans.

If a user calls with ints or bool, then there are either no nans or the array is empty, and I don't care.

Aside: with nanarg{max, min} I would just return 0 in an all-nan column, since the max or min is nan, and one is at zero. (But I'm not arguing.)

Josef
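The "empty in only some columns" point can be illustrated with a small pure-Python sketch. `nanmax_columns` is a hypothetical helper written for this note, not a NumPy function; it returns nan for a column that holds only nans while still producing results for the other columns, which is the outcome being argued for here.

```python
import math


def nanmax_columns(rows):
    """Hypothetical column-wise nanmax over a list of equal-length rows.

    A column containing only nans yields nan; every other column
    still gets its ordinary maximum.
    """
    result = []
    for column in zip(*rows):
        kept = [v for v in column if not math.isnan(v)]
        result.append(max(kept) if kept else math.nan)
    return result


nan = math.nan
# The middle column is all nan; the outer columns are not.
maxima = nanmax_columns([[1.0, nan, 3.0],
                         [4.0, nan, nan]])
```

Raising an error instead would discard the two usable results along with the undefined one, which is the asymmetry with a truly empty array.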

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 10:56 AM, <[hidden email]> wrote:
> If I understand what you are proposing
>
> -1 on raising an error with nan{max, min},
>
> an empty array is empty in all columns
> an array with nans, might be empty in only some columns.
>
> as far as I understand, nan{max, min} only make sense with arrays that can hold a nan, so we can return nans.

That was my original thought.

> If a user calls with ints or bool, then there are either no nans or the array is empty, and I don't care.
>
> --- aside
> with nanarg{max, min} I would just return 0 in an all nan column, since the max or min is nan, and one is at zero. (but I'm not arguing)

That is an interesting proposal. I like it.

Chuck

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 1:05 PM, Charles R Harris wrote:
> On Wed, Oct 2, 2013 at 10:56 AM, <[hidden email]> wrote:
>> with nanarg{max, min} I would just return 0 in an all nan column, since the max or min is nan, and one is at zero. (but I'm not arguing)
>
> That is an interesting proposal. I like it.

And it is logically consistent, I think. a[nanargmax(a)] == nanmax(a) (ignoring the silly detail that you can't do an equality on nans).

Ben Root

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On 2 Oct 2013 19:14, "Benjamin Root" <[hidden email]> wrote:
> And it is logically consistent, I think. a[nanargmax(a)] == nanmax(a) (ignoring the silly detail that you can't do an equality on nans).

Why do you call this a silly detail? It seems to me a fundamental flaw to this approach.

Stéfan

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 2:05 PM, Stéfan van der Walt <[hidden email]> wrote:
> On 2 Oct 2013 19:14, "Benjamin Root" <[hidden email]> wrote:
>> And it is logically consistent, I think. a[nanargmax(a)] == nanmax(a) (ignoring the silly detail that you can't do an equality on nans).
>
> Why do you call this a silly detail? It seems to me a fundamental flaw to this approach.

a nan is a nan is a NaN

    >>> np.testing.assert_equal([0, np.nan], [0, np.nan])
    >>>

Josef

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 2:49 PM, <[hidden email]> wrote:
> On Wed, Oct 2, 2013 at 2:05 PM, Stéfan van der Walt <[hidden email]> wrote:
>> Why do you call this a silly detail? It seems to me a fundamental flaw to this approach.
>
> a nan is a nan is a NaN
>
>     >>> np.testing.assert_equal([0, np.nan], [0, np.nan])
>     >>>

and the functions have "nan" in their names:
nan in - NaN out

what about nanmean, nansum, ...?

Josef

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 2:05 PM, Stéfan van der Walt wrote:
> On 2 Oct 2013 19:14, "Benjamin Root" <[hidden email]> wrote:
>> And it is logically consistent, I think. a[nanargmax(a)] == nanmax(a) (ignoring the silly detail that you can't do an equality on nans).
>
> Why do you call this a silly detail? It seems to me a fundamental flaw to this approach.

Just saying that it conceptually makes sense, even if the exact code I used wouldn't be perfectly correct. Because these are NaN functions, it means that the users are already aware of the need to handle nans appropriately. Just because you can't actually do equality between two NaNs in the same way as one can do with numbers does not invalidate the concept.

Ben Root
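The disagreement above turns on how nan compares to itself, which can be checked with plain Python floats. In this sketch `same_value` is a hypothetical helper, introduced only to show what a nan-aware comparison could look like; it is not taken from NumPy.

```python
import math

nan = float("nan")

# IEEE 754 comparison: a nan is not equal to anything, itself included.
plain_equal = (nan == nan)


def same_value(x, y):
    """Hypothetical nan-aware equality for two floats.

    Treats two nans as the same value, otherwise falls back to ==.
    """
    if math.isnan(x) and math.isnan(y):
        return True
    return x == y


nan_aware_equal = same_value(nan, nan)
```

So the identity a[nanargmax(a)] == nanmax(a) cannot hold literally for an all-nan slice under ordinary equality; it only holds under a comparison that treats nans as equal.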

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 7:51 PM, <[hidden email]> wrote:
> and the functions have "nan" in their names
> nan in - NaN out

This makes no sense :-). The nan in the names means "pretend the nans aren't there", not "please scatter nans in the output"! These are just vectorized operations that can fail in some cases and not others; there's nothing special about the fact that the function's definition also involves nan.

> what about nanmean, nansum, ...?

They do the same thing as mean([]), sum([]), etc., which are well-defined. (nan and 0 respectively.)

-n
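Read as "drop the nans, then reduce what is left", the difference between the reductions shows up even with Python's built-ins. This is a sketch of that reading only; `drop_nans` is a hypothetical helper, and the built-in `sum` and `max` stand in for the array functions under discussion.

```python
import math


def drop_nans(values):
    """Hypothetical helper: keep only the entries that are not nan."""
    return [v for v in values if not math.isnan(v)]


all_nan = [math.nan, math.nan]
remaining = drop_nans(all_nan)  # nothing survives the filter

# A sum over nothing has a natural answer, the additive identity.
total = sum(remaining)

# A maximum over nothing has no such answer; the built-in refuses.
try:
    largest = max(remaining)
except ValueError:
    largest = None  # marker used only in this sketch
```

On this reading, an all-nan slice is to nanmax what an empty sequence is to max, which is why raising an error is described as the consistent choice.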

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 12:51 PM, <[hidden email]> wrote:
> and the functions have "nan" in their names
> nan in - NaN out
>
> what about nanmean, nansum, ...?

nanmean returns nan for empty slices while nansum returns nan in 1.8, consistent with previous behavior, and will return 0 in 1.9.

The main problem I had was deciding what arg{max, min} should return as the return value is an integer. I like your suggestion of returning 0. One further possibility is to add a keyword 'raise' to make the behavior selectable.

Chuck

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Wed, Oct 2, 2013 at 8:19 PM, Charles R Harris <[hidden email]> wrote:
> nanmean returns nan for empty slices while nansum returns nan in 1.8, consistent with previous behavior, and will return 0 in 1.9.
>
> The main problem I had was deciding what arg{max, min} should return as the return value is an integer. I like your suggestion of returning 0.

I don't understand the justification for returning 0 at all. "nan" is not the max or min of the array. Even if argmax/argmin return nan[1], it's just a special code meaning "undefined", it has no relation to the nans inside the array. So returning 0 just feels to me like pure "we have to return *something* and this is something! (that we can kind of justify if no-one looks too hard!)". This is exactly the impulse that "when in doubt refuse the temptation to guess" is written to counteract... Seriously, what user calls nanargmax and *wants* to get pointed to a random nan inside the array? Isn't the whole point of calling nanargmax to avoid exactly this situation?

-n

[1] It seems clear that we shouldn't mess about with argmax/argmin for 1.8, but my guess is that in the long run we'll come up with a more general convention for signalling partial errors, and eventually want to switch nanargmax/nanargmin to using that.

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On 2 Oct 2013 21:19, "Charles R Harris" <[hidden email]> wrote:
> The main problem I had was deciding what arg{max, min} should return as the return value is an integer. I like your suggestion of returning 0.

This doesn't allow the user to know the difference between valid and "invalid" output, does it?

> One further possibility is to add a keyword 'raise' to make the behavior selectable.

Preferably just pick a sensible default.

Stéfan

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

Hello,

sorry, I don't know exactly where to jump in in the thread, it is getting quite long and articulated...

On 02/10/2013 21:19, Charles R Harris wrote:
> The main problem I had was deciding what arg{max, min} should return as the return value is an integer. I like your suggestion of returning 0.

What about returning -1? It is still an integer (on my numpy version the return value is a signed integer), it still has the property that a[np.argmin(a)] == nan for a of only nans, but it is easily identifiable as an anomalous return value if needed.

> One further possibility is to add a keyword 'raise' to make the behavior selectable.

I don't like function arguments that change the function behavior.

Cheers,
Daniele

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Thu, Oct 3, 2013 at 4:06 AM, Daniele Nicolodi wrote:
> What about returning -1? It is still an integer (on my numpy version the return value is a signed integer), it still has the property that a[np.argmin(a)] == nan for a of only nans, but it is easily identifiable as an anomalous return value if needed.

The problem is that -1 is a valid index, whereas intp.min will always be out of range and lead to an IndexError if used.

Chuck
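The difference between the two sentinels can be seen with an ordinary Python list. In this sketch `-sys.maxsize - 1` is used as a stand-in for intp.min; that the two coincide is an assumption about the platform, and the example only relies on the value being far outside any realistic index range.

```python
import sys

data = [10.0, 20.0, 30.0]

# -1 is accepted silently: it wraps around to the last element.
last = data[-1]

# Stand-in for intp.min (assumption): the most negative machine-sized int.
sentinel = -sys.maxsize - 1

# Such a value can never be a valid index, so misuse fails loudly.
try:
    data[sentinel]
    out_of_range = False
except IndexError:
    out_of_range = True
```

A -1 that leaks into later indexing therefore produces a plausible-looking value, while the out-of-range sentinel stops the computation at the first use.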

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On 03/10/2013 13:56, Charles R Harris wrote:
> On Thu, Oct 3, 2013 at 4:06 AM, Daniele Nicolodi <[hidden email]> wrote:
>> What about returning -1? It is still an integer (on my numpy version the return value is a signed integer), it still has the property that a[np.argmin(a)] == nan for a of only nans, but it is easily identifiable as an anomalous return value if needed.
>
> The problem is that -1 is a valid index, whereas intp.min will always be out of range and lead to an IndexError if used.

If the goal is to have an error raised, just do it and do not rely on the fact that using an invalid index will sooner or later result in an error.

My proposal was a compromise to the proposal of returning 0. 0 is a valid index that cannot be distinguished from a valid return value, -1 is a valid index but can be easily distinguished as a special case.

I don't have a strong preference between

    try:
        i = np.nanargmin(a)
    except ValueError:
        something()

and

    i = np.nanargmin(a)
    if i < 0:
        something()

but definitely I don't like returning 0:

    i = np.nanargmin(a)
    if i == 0:
        # uhm, wait, is the minimum at index 0 or the array is all nans?!?
        if np.isnan(a[0]):
            something()

Cheers,
Daniele

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Thu, Oct 3, 2013 at 6:06 AM, Daniele Nicolodi wrote:
> What about returning -1? It is still an integer (on my numpy version the return value is a signed integer), it still has the property that a[np.argmin(a)] == nan for a of only nans, but it is easily identifiable as an anomalous return value if needed.

This actually makes a lot of sense. We would never return -1 for any other reason. And if it is used for indexing anywhere else, that's (mostly) ok. A problem might occur if the indexes gathered from this function are then used to define slices. But I can't really convince myself that it would be all that terrible in that case, too. Documentation will be paramount here.

Ben Root
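The slicing concern mentioned above can be made concrete with a plain list. This is only an illustration of what a -1 sentinel would do if it were passed on unchecked; the variable names are made up for the example.

```python
data = [10.0, 20.0, 30.0, 40.0]

# Hypothetical "all nans" sentinel returned in place of a real index.
i = -1

# "Everything up to and including i" becomes data[:0], an empty list.
up_to_and_including = data[:i + 1]

# "Everything from i onwards" becomes just the last element.
from_index = data[i:]

# "Everything before i" silently drops the last element.
before_index = data[:i]
```

None of these raise, so a caller who forgets to test for the sentinel gets a shortened or empty result rather than an error.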

## Re: Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

On Thu, Oct 3, 2013 at 9:10 AM, Benjamin Root <[hidden email]> wrote:
> On Thu, Oct 3, 2013 at 6:06 AM, Daniele Nicolodi <[hidden email]> wrote:
>> What about returning -1? It is still an integer (on my numpy version the return value is a signed integer), it still has the property that a[np.argmin(a)] == nan for a of only nans, but it is easily identifiable as an anomalous return value if needed.
>
> This actually makes a lot of sense. We would never return -1 for any other reason. And if it is used for indexing anywhere else, that's (mostly) ok. A problem might occur if the indexes gathered from this function are then used to define slices. But I can't really convince myself that it would be all that terrible in that case, too. Documentation will be paramount here.

Please, no. It's another thing to remember and another way to shoot yourself in the foot and introduce casual bugs.

FWIW, my vote is to raise an error or return a nan, which will likely eventually raise an error. If I have all nans, it's usually the case that something's off, and I'd like to know sooner rather than later.

Skipper