

Hello,
Currently, the builtin Python round (which is different from np.round), when called on a np.float64, returns a np.float64, due to its __round__ method. The same is true for np.float32. However, since Python 3, the default behavior of round is to return a Python int when it operates on a Python float. This is a mismatch according to the Liskov Substitution Principle, as both these types subclass Python’s float. This has been brought up in gh-15297. Here is the problem summed up in code:
>>> type(round(np.float64(5)))
<class 'numpy.float64'>
>>> type(round(np.float32(5)))
<class 'numpy.float32'>
>>> type(round(float(5)))
<class 'int'>
This problem manifests itself most prominently when trying to index into collections:
>>> np.arange(6)[round(float(5))]
5
>>> np.arange(6)[round(np.float64(5))]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
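Until the return type is settled, one way to index without depending on it is to convert explicitly. This is only a minimal sketch; the names values, position and index are illustrative, and it assumes the rounded value is a valid index for the collection.

```python
import numpy as np

values = np.arange(6)
position = np.float64(5)

# int() accepts both a Python int and a NumPy float scalar, so this works
# whichever type round() returns for np.float64.
index = int(round(position))
picked = values[index]

print(type(index), picked)
```

The explicit int() call is redundant once round() returns an integer, but it is harmless in that case.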
There still remains the question: do we return Python ints or np.int64s?
- Python ints have the advantage of not overflowing.
- If we decide to add __round__ to arrays in the future, Python ints may become inconsistent with our design, as such a method will return an int64 array.
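To make the overflow trade-off concrete, here is a small sketch comparing the two candidate return types for a large value. The variable names are illustrative, and 2e200 is simply a value far outside the int64 range.

```python
import numpy as np

big = round(2e200)                  # Python float -> arbitrary-precision Python int
int64_max = np.iinfo(np.int64).max  # largest value an np.int64 can hold

fits_in_int64 = big <= int64_max

# Converting the same value to np.int64 is expected to fail rather than wrap.
try:
    np.int64(big)
    conversion_failed = False
except OverflowError:
    conversion_failed = True

print(fits_in_int64, conversion_failed)
```

A Python int result never needs this check, which is the advantage named in the first bullet; an np.int64 result would have to raise or warn for such inputs.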
This issue was discussed in the weekly triage meeting today, and the following plan of action was proposed:
- change scalar floats to return integers for __round__ (which integer type was not discussed; I propose np.int64)
- not change anything else: not 0d arrays and not other numpy functionality
Does anyone have any thoughts on the proposal?
Best regards,
Hameer Abbasi
_______________________________________________
NumPyDiscussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpydiscussion


The only reason that float.__round__() was allowed to change to returning ints was because ints became unbounded. If we also change to returning an integer type, it should be a Python int.
 Robert Kern


Great, another object array:
>>> np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
array([1, 2, 200000000000000000000, 199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896], dtype=object)
I would rather have numpy consistent with numpy than with Python.


Does this mean that np.round(np.float32(5)) returns a 64-bit upcast int?
That would be really awkward for many reasons: for example, a pandas frame would be bloated just by rounding, or a numpy array would grow in size for no apparent reason.
I am not really sure I understand why LSP should hold in this case, to be honest. Rounding is an operation specific to the number instance and not to the generic class.


I think making numerical behavior different between arrays and numpy scalars with the same dtype will create many happy debugging hours.
(Although I don't remember having been careful about the distinction between Python scalars and numpy scalars in some time. I had some fun with integers in the scipy.stats discrete distributions, until they became floats.)
Josef


> I think making numerical behavior different between arrays and numpy scalars with the same dtype will create many happy debugging hours.
round(some_ndarray) isn't implemented, so there is no difference to worry about.
If you want the float->float rounding, use np.around(). That function should continue to behave like it currently does for both arrays and scalars.
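As an illustration of the float->float path, the following sketch applies np.around() to a float32 scalar and a float32 array and checks that the dtype is kept. The variable names are illustrative only.

```python
import numpy as np

x32 = np.float32(2.5)
arr32 = np.array([1.0, 2.5, 3.5], dtype=np.float32)

# np.around keeps the floating dtype of its input, for scalars and arrays alike.
rounded_scalar = np.around(x32)
rounded_array = np.around(arr32)

print(type(rounded_scalar), rounded_scalar)
print(rounded_array.dtype, rounded_array)
```

Note that np.around rounds halves to the nearest even value, so 2.5 becomes 2.0 and 3.5 becomes 4.0.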

Robert Kern


> Does this mean that np.round(np.float32(5)) returns a 64-bit upcast int?
No. np.round() is an alias (which would be good to deprecate) for np.around(). No one has proposed changing np.around().
> That would be really awkward for many reasons: for example, a pandas frame would be bloated just by rounding, or a numpy array would grow in size for no apparent reason.
> I am not really sure I understand why LSP should hold in this case, to be honest. Rounding is an operation specific to the number instance and not to the generic class.
The type of the return value is part of the type's interface, not the specific instance.
Robert Kern


> Great, another object array:
> >>> np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
> array([1, 2, 200000000000000000000, 199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896], dtype=object)
> I would rather have numpy consistent with numpy than with Python.
Since round() (and the __round__() interface) is part of Python and not numpy, there is nothing in numpy to be consistent with. We only implement __round__() for the scalar types.
Robert Kern


Maybe I misunderstand.
I'm using np.round a lot. So maybe it's a question whether and how it will affect np.round.
Does the following change with the proposal?
>>> np.round(np.array([1, 2.5, 2e20, 2e200]))
array([1.e+000, 2.e+000, 2.e+020, 2.e+200])
>>> np.round(np.array([1, 2.5, 2e20, 2e200])).astype(int)
array([ 1, 2, 2147483648, 2147483648])
>>> np.round(np.array([2e200])[0])
2e+200
>>> np.round(2e200)
2e+200
>>> round(2e200)
199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896
Josef
"around 100" sounds like "something all_close(100)"


I guess I'm slow.
It only affects this case, as long as we don't have __round__ in arrays:
>>> round(np.float64(2e200))
2e+200
>>> round(np.array([1, 2.5, 2e20, 2e200]))
TypeError                                 Traceback (most recent call last)
<ipython-input-177-bd4a17555729> in <module>
----> 1 round(np.array([1, 2.5, 2e20, 2e200]))
TypeError: type numpy.ndarray doesn't define __round__ method
Josef


> Maybe I misunderstand.
> I'm using np.round a lot. So maybe it's a question whether and how it will affect np.round.
Nope, not changing.
> Does the following change with the proposal?
> >>> np.round(np.array([1, 2.5, 2e20, 2e200]))
> array([1.e+000, 2.e+000, 2.e+020, 2.e+200])
> >>> np.round(np.array([1, 2.5, 2e20, 2e200])).astype(int)
> array([ 1, 2, 2147483648, 2147483648])
> >>> np.round(np.array([2e200])[0])
> 2e+200
> >>> np.round(2e200)
> 2e+200
No change.
> >>> round(2e200)
> 199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896
Obviously, not under our control, but no, that's not changing.
This is the only result that will change:
>>> round(np.float64(2e200))
2e+200
> Josef
> "around 100" sounds like "something all_close(100)"
I know. It's meant to be read as "array-round". We prefer the `around()` spelling to avoid shadowing the builtin. Early mistake that we're still living with.
Robert Kern


It's not about what I want, but this changes the output of round. In my example I didn't use any arrays but a scalar type, which it looks like will be upcast.


> It's not about what I want, but this changes the output of round. In my example I didn't use any arrays but a scalar type, which it looks like will be upcast.
Your example used np.round(), not the builtin round(). np.round() is not changing. If you want the dtype of the output to be the dtype of the input, you can certainly keep using np.round() (or its canonical spelling, np.around()).
Robert Kern


Oh sorry. That's trigger-finger np-dotting.
What I mean is that if someone was using the round method on float32 or other small-bit datatypes, they would get a silent upcast.
Maybe not a big problem, but it can have a significant impact.


There are several mixed issues here.
1. PEP 3141 <https://www.python.org/dev/peps/pep-3141/> compliance.
Numpy scalars are `numbers.Real` instances, and have to respect the
`__round__` semantics defined by PEP 3141:
    def __round__(self, ndigits:Integral=None):
        """Rounds self to ndigits decimal places, defaulting to 0.
        If ndigits is omitted or None, returns an Integral,
        otherwise returns a Real, preferably of the same type as
        self. Types may choose which direction to round half. For
        example, float rounds half toward even.
        """
This means that if Real -> Real rounding is desired one should call
`round(x, 0)` or `np.around(x)`.
These semantics only dictate that the return type should be Integral, so
for `round(x)` and `round(x, None)`
    np.float32 -> np.int32
    np.float32 -> np.int64
    np.float64 -> np.int64
    np.floatXX -> int
are all OK.
I also think that it is perfectly OK to raise an overflow error on `round(x)`.
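The PEP 3141 semantics described above can be observed on the builtin float, which already follows them. This is a small sketch with illustrative variable names, not a statement about how NumPy scalars behave.

```python
import numbers

x = 2.5

as_integral = round(x)             # ndigits omitted -> an Integral (Python int)
as_integral_none = round(x, None)  # ndigits=None is treated like omitted
as_real = round(x, 0)              # explicit ndigits -> a Real of the same type

print(type(as_integral), as_integral)
print(type(as_integral_none), as_integral_none)
print(type(as_real), as_real)
```

Because float rounds half toward even, 2.5 rounds to 2 in all three calls; only the type of the result differs.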
2. Liskov substitution principle
`np.float64` floats are also `float` instances (but `np.float32` floats are not).
This means that strictly respecting LSP requires `np.float64` to round to Python
`int`, since `round(x)` never overflows for Python `float`.
Here we have several options:
- round `np.float64` -> `int` and respect LSP.
- relax LSP, and round `np.float64` -> `np.int64`. Who cares about `round(1e300)`?
- decide that there is no reason for having `np.float64` a subclass of `float`,
  so that LSP does not apply.
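The subclass relationship that the LSP argument rests on can be checked directly. A short sketch, with illustrative variable names:

```python
import numbers
import numpy as np

f64 = np.float64(1.5)
f32 = np.float32(1.5)

# np.float64 inherits from the builtin float; np.float32 does not.
f64_is_float = isinstance(f64, float)
f32_is_float = isinstance(f32, float)

# Both are registered as numbers.Real, so PEP 3141 applies to both.
both_real = isinstance(f64, numbers.Real) and isinstance(f32, numbers.Real)

print(f64_is_float, f32_is_float, both_real)
```

So the LSP argument only constrains np.float64, while the PEP 3141 argument covers every NumPy floating scalar.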
This all said, I think that these are the two most sensible choices for `round(x)`:
    np.float32 -> np.int32
    np.float64 -> np.int64
    drop np.float64 subclassing python float
or
    np.float32 -> int
    np.float64 -> int
    keep np.float64 subclassing python float
The second one seems to me the less disruptive one.
Bye
Stefano


Hello, Ilhan,
From:
NumPyDiscussion <numpydiscussionbounces+einstein.edison=[hidden email]> on behalf of Ilhan Polat <[hidden email]>
Reply to: Discussion of Numerical Python <[hidden email]>
Date: Thursday, 27. February 2020 at 08:41
To: Discussion of Numerical Python <[hidden email]>
Subject: Re: [Numpydiscussion] Output type of round is inconsistent with python builtin
> Oh sorry. That's trigger-finger np-dotting.
> What I mean is that if someone was using the round method on float32 or other small-bit datatypes, they would get a silent upcast.
No, they won’t. The only affected types would be scalars, and even then only with the builtin Python round. Arrays don’t define the __round__ method, and so won’t be affected. np.ndarray.round won’t be affected either. Only np_scalar_types.__round__ will be affected, which is what the Python round checks for.
For illustration, in code:
>>> type(round(np_float))
<class 'numpy.float64'>
>>> type(round(np_array_0d))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type numpy.ndarray doesn't define __round__ method
>>> type(round(np_array_nd))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type numpy.ndarray doesn't define __round__ method
The second and third cases would remain unaffected. Only the first case would change: it would return a builtin Python int with what Robert Kern is suggesting, and a np.int64 with what I’m suggesting. I do agree with something posted elsewhere on this thread that we should warn on overflow, but I prefer to be self-consistent and return a np.int64; it doesn’t matter too much to me, though. Furthermore, the behavior of np.[a]round and np_arr.round(…) will not change. The only upcasting problem here is if someone does this in a loop, in which case they’re probably using Python objects and don’t care about memory anyway.
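To see which objects the builtin round can dispatch on, one can look for __round__ on the type, which is where Python looks it up. This is a rough sketch; supports_builtin_round is a hypothetical helper, not part of NumPy, and the result for arrays depends on the NumPy version (it was False at the time of this thread).

```python
import numpy as np

def supports_builtin_round(obj):
    """Hypothetical helper: True if round(obj) can dispatch on obj's type."""
    return hasattr(type(obj), "__round__")

np_float = np.float64(5)
np_array_nd = np.arange(3, dtype=np.float64)

scalar_ok = supports_builtin_round(np_float)
array_ok = supports_builtin_round(np_array_nd)  # version-dependent, not asserted here

# The array method is a separate code path and keeps the input dtype.
array_method_dtype = np_array_nd.round().dtype

print(scalar_ok, array_ok, array_method_dtype)
```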
Best regards,
Hameer Abbasi


>> What I mean is that if someone was using the round method on float32 or other small-bit datatypes, they would get a silent upcast.
> No, they won’t. The only affected types would be scalars, and even then only with the builtin Python round.
Just to be clear, his example _did_ use numpy scalars.
Robert Kern


> Oh sorry. That's trigger-finger np-dotting.
> What I mean is that if someone was using the round method on float32 or other small-bit datatypes, they would get a silent upcast.
> Maybe not a big problem, but it can have a significant impact.
np.round()/np.around() will still exist and behave as you would want it to in such cases (float32->float32, float64->float64).
Robert Kern

