Output type of round is inconsistent with python built-in

Output type of round is inconsistent with python built-in

Hameer Abbasi

Hello,

 

Currently, the built-in Python round (which is different from np.round), when called on a np.float64, returns a np.float64 because of its __round__ method. The same holds for np.float32. However, since Python 3, round returns a Python int by default when it operates on a Python float. For np.float64, which subclasses Python’s float, this mismatch violates the Liskov Substitution Principle. This has been brought up in gh-15297. Here is the problem summed up in code:

 

>>> type(round(np.float64(5)))
<class 'numpy.float64'>
>>> type(round(np.float32(5)))
<class 'numpy.float32'>
>>> type(round(float(5)))
<class 'int'>

 

This problem manifests itself most prominently when trying to index into collections:

 

>>> np.arange(6)[round(float(5))]
5
>>> np.arange(6)[round(np.float64(5))]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
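On NumPy versions where round() still returns a NumPy float, one possible stopgap is to convert the rounded value explicitly before indexing. This is only a minimal sketch and not part of the proposal; the helper name `rounded_index` is made up for illustration.

```python
import numpy as np


def rounded_index(value):
    """Round a Python or NumPy float and return a plain Python int.

    Hypothetical helper: int() accepts whatever round() hands back
    (a Python int or a NumPy float scalar), so the result can always
    be used as an index.
    """
    return int(round(value))


data = np.arange(6)
print(data[rounded_index(float(5))])
print(data[rounded_index(np.float64(5))])
```

The explicit int() call makes the code independent of which type __round__ returns, at the cost of an extra conversion.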

 

The question remains: do we return Python ints or np.int64s?

  • Python ints have the advantage of not overflowing.
  • If we decide to add __round__ to arrays in the future, Python ints may become inconsistent with our design, as such a method will return an int64 array.
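The overflow trade-off in the first bullet can be illustrated with a short sketch. It only compares the range of np.int64 with what the built-in round() returns for a large Python float; it does not show the behaviour of any proposed NumPy change.

```python
import numpy as np

big = 2e200

# Rounding a Python float gives an unbounded Python int.
as_python_int = round(big)

# np.int64 is a fixed-width type with a hard upper limit.
int64_max = int(np.iinfo(np.int64).max)

print(type(as_python_int).__name__)
print(as_python_int > int64_max)
```

Any value above the int64 limit would have to raise, wrap, or saturate if the result type were np.int64, which is the concern behind the first bullet.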

 

This issue was discussed in the weekly triage meeting today, and the following plan of action was proposed:

  • change scalar floats to return integers for __round__ (which integer type was not discussed, I propose np.int64)
  • not change anything else: not 0d arrays and not other numpy functionality

Does anyone have any thoughts on the proposal?

Best regards,
Hameer Abbasi


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Output type of round is inconsistent with python built-in

Robert Kern-2
On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi <[hidden email]> wrote:


There still remains the question, do we return Python ints or np.int64s?

  • Python ints have the advantage of not overflowing.
  • If we decide to add __round__ to arrays in the future, Python ints may become inconsistent with our design, as such a method will return an int64 array.

 

This issue was discussed in the weekly triage meeting today, and the following plan of action was proposed:

  • change scalar floats to return integers for __round__ (which integer type was not discussed, I propose np.int64)
  • not change anything else: not 0d arrays and not other numpy functionality

The only reason float.__round__() was allowed to change to returning ints is that ints became unbounded. If we also change to returning an integer type, it should be a Python int.

--
Robert Kern


Re: Output type of round is inconsistent with python built-in

josef.pktd
Great, another object array:

np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
array([1, 2, 200000000000000000000,
       199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896],
      dtype=object)


I would rather have NumPy be consistent with NumPy than with Python.



Re: Output type of round is inconsistent with python built-in

Ilhan Polat
In reply to this post by Robert Kern-2
Does this mean that np.round(np.float32(5)) returns an int upcast to 64 bits?

That would be really awkward for many reasons: a pandas frame bloating in size just from rounding, for example, or a NumPy array growing for no apparent reason.

To be honest, I am not sure I understand why LSP should hold in this case. Rounding is an operation specific to the number instance, not to the generic class.





Re: Output type of round is inconsistent with python built-in

josef.pktd


On Wed, Feb 26, 2020 at 5:30 PM Ilhan Polat <[hidden email]> wrote:
Does this mean that np.round(np.float32(5)) return a 64 bit upcasted int?

That would be really awkward for many reasons pandas frame size being bloated just by rounding for an example. Or numpy array size growing for no apparent reason

I am not really sure if I understand why LSP should hold in this case to be honest. Rounding is an operation specific for the number instance and not for the generic class. 




On Wed, Feb 26, 2020, 21:38 Robert Kern <[hidden email]> wrote:
On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi <[hidden email]> wrote:


There still remains the question, do we return Python ints or np.int64s?

  • Python ints have the advantage of not overflowing.
  • If we decide to add __round__ to arrays in the future, Python ints may become inconsistent with our design, as such a method will return an int64 array.

 

This issue was discussed in the weekly triage meeting today, and the following plan of action was proposed:

  • change scalar floats to return integers for __round__ (which integer type was not discussed, I propose np.int64)
  • not change anything else: not 0d arrays and not other numpy functionality

I think making numerical behavior differ between arrays and NumPy scalars with the same dtype will create many happy debugging hours.

(although I don't remember having been careful about the distinction between python scalars and numpy scalars in some time. 
I had some fun with integers in the scipy.stats discrete distributions, until they became floats)

Josef

 

Re: Output type of round is inconsistent with python built-in

Robert Kern-2
On Wed, Feb 26, 2020 at 5:41 PM <[hidden email]> wrote:


On Wed, Feb 26, 2020 at 5:30 PM Ilhan Polat <[hidden email]> wrote:
Does this mean that np.round(np.float32(5)) return a 64 bit upcasted int?

That would be really awkward for many reasons pandas frame size being bloated just by rounding for an example. Or numpy array size growing for no apparent reason

I am not really sure if I understand why LSP should hold in this case to be honest. Rounding is an operation specific for the number instance and not for the generic class. 




On Wed, Feb 26, 2020, 21:38 Robert Kern <[hidden email]> wrote:
On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi <[hidden email]> wrote:


There still remains the question, do we return Python ints or np.int64s?

  • Python ints have the advantage of not overflowing.
  • If we decide to add __round__ to arrays in the future, Python ints may become inconsistent with our design, as such a method will return an int64 array.

 

This issue was discussed in the weekly triage meeting today, and the following plan of action was proposed:

  • change scalar floats to return integers for __round__ (which integer type was not discussed, I propose np.int64)
  • not change anything else: not 0d arrays and not other numpy functionality

I think making numerical behavior different between arrays and numpy scalars with the same dtype, will create many happy debugging hours.

round(some_ndarray) isn't implemented, so there is no difference to worry about.

If you want the float->float rounding, use np.around(). That function should continue to behave like it currently does for both arrays and scalars.
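A short sketch of the float->float rounding described above, assuming nothing beyond np.around() itself. The variable names are illustrative.

```python
import numpy as np

scalar32 = np.float32(2.7)
array32 = np.array([1.2, 2.5, 3.7], dtype=np.float32)

# np.around() keeps the floating-point dtype of its input.
rounded_scalar = np.around(scalar32)
rounded_array = np.around(array32)

print(np.asarray(rounded_scalar).dtype, float(rounded_scalar))
print(rounded_array.dtype, rounded_array.tolist())
```

Note that np.around() rounds halves to the nearest even value, so 2.5 becomes 2.0 here.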

-- 
Robert Kern


Re: Output type of round is inconsistent with python built-in

Robert Kern-2
In reply to this post by Ilhan Polat
On Wed, Feb 26, 2020 at 5:30 PM Ilhan Polat <[hidden email]> wrote:
Does this mean that np.round(np.float32(5)) return a 64 bit upcasted int?

No. np.round() is an alias (which would be good to deprecate) for np.around(). No one has proposed changing np.around().
 
That would be really awkward for many reasons pandas frame size being bloated just by rounding for an example. Or numpy array size growing for no apparent reason

I am not really sure if I understand why LSP should hold in this case to be honest. Rounding is an operation specific for the number instance and not for the generic class. 

The type of the return value is part of the type's interface, not the specific instance.

--
Robert Kern


Re: Output type of round is inconsistent with python built-in

Robert Kern-2
In reply to this post by josef.pktd
On Wed, Feb 26, 2020 at 5:27 PM <[hidden email]> wrote:
great another object array

np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
array([1, 2, 200000000000000000000,
       199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896],
      dtype=object)


I would rather have numpy consistent with numpy than with python

Since round() (and the __round__() interface) is part of Python and not numpy, there is nothing in numpy to be consistent with. We only implement __round__() for the scalar types.

--
Robert Kern


Re: Output type of round is inconsistent with python built-in

josef.pktd


On Wed, Feb 26, 2020 at 6:09 PM Robert Kern <[hidden email]> wrote:
On Wed, Feb 26, 2020 at 5:27 PM <[hidden email]> wrote:
great another object array

np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
array([1, 2, 200000000000000000000,
       199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896],
      dtype=object)


I would rather have numpy consistent with numpy than with python

Since round() (and the __round__() interface) is part of Python and not numpy, there is nothing in numpy to be consistent with. We only implement __round__() for the scalar types.


Maybe I misunderstand.

I use np.round a lot, so for me the question is whether and how this will affect np.round.

Does the following change with the proposal?

np.round(np.array([1, 2.5, 2e20, 2e200]))
array([1.e+000, 2.e+000, 2.e+020, 2.e+200])

np.round(np.array([1, 2.5, 2e20, 2e200])).astype(int)
array([          1,           2, -2147483648, -2147483648])

np.round(np.array([2e200])[0])
2e+200

np.round(2e200)
2e+200

round(2e200)
199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896
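The large negative values in the `.astype(int)` output above come from casting floats that do not fit the target integer type. One way to guard against that is to check the range before casting. The following is a sketch only: the helper name `safe_round_to_int64` is hypothetical, and the choice of np.int64 as the target is an assumption.

```python
import numpy as np


def safe_round_to_int64(values):
    """Round floats and cast to int64, raising instead of overflowing.

    Hypothetical helper, not a NumPy function.
    """
    rounded = np.around(np.asarray(values, dtype=np.float64))
    info = np.iinfo(np.int64)
    # float(info.max) is 2**63 after conversion to float, so the upper
    # bound uses a strict comparison. NaN fails both comparisons.
    in_range = (rounded >= float(info.min)) & (rounded < float(info.max))
    if not np.all(in_range):
        raise OverflowError("value does not fit in int64")
    return rounded.astype(np.int64)


print(safe_round_to_int64([1, 2.5]))
try:
    safe_round_to_int64([1, 2.5, 2e20, 2e200])
except OverflowError as exc:
    print("refused:", exc)
```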

Josef
"around 100" sounds like "something all_close(100)"
 


Re: Output type of round is inconsistent with python built-in

josef.pktd



I guess I'm slow.

As long as arrays don't have __round__, it only affects this case:

round(np.float64(2e200))
2e+200

round(np.array([1, 2.5, 2e20, 2e200]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-177-bd4a17555729> in <module>
----> 1 round(np.array([1, 2.5, 2e20, 2e200]))

TypeError: type numpy.ndarray doesn't define __round__ method

Josef
 
 


Re: Output type of round is inconsistent with python built-in

Robert Kern-2
In reply to this post by josef.pktd
On Wed, Feb 26, 2020 at 6:59 PM <[hidden email]> wrote:


On Wed, Feb 26, 2020 at 6:09 PM Robert Kern <[hidden email]> wrote:
On Wed, Feb 26, 2020 at 5:27 PM <[hidden email]> wrote:
great another object array

np.asarray([round(x_i.item()) for x_i in np.array([1, 2.5, 2e20, 2e200])])
array([1, 2, 200000000000000000000,
       199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896],
      dtype=object)


I would rather have numpy consistent with numpy than with python

Since round() (and the __round__() interface) is part of Python and not numpy, there is nothing in numpy to be consistent with. We only implement __round__() for the scalar types.


Maybe I misunderstand

I'm using np.round a lot. So maybe it's a question whether and how it will affect np.round.

Nope, not changing.
 
Does the following change with the proposal?

np.round(np.array([1, 2.5, 2e20, 2e200]))
array([1.e+000, 2.e+000, 2.e+020, 2.e+200])

np.round(np.array([1, 2.5, 2e20, 2e200])).astype(int)
array([          1,           2, -2147483648, -2147483648])

np.round(np.array([2e200])[0])
2e+200

np.round(2e200)
2e+200

No change.
 
round(2e200)
199999999999999993946624442502072331894900655091004725296483501900693696871108151068392676809412503736055024831947764816364271468736556969278770082094479755742047182133579963622363626612334257709776896

Obviously, that's not under our control, but no, that's not changing.

This is the only result that will change:

round(np.float64(2e200))
2e+200
 
Josef
"around 100" sounds like "something all_close(100)"

I know. It's meant to be read as "array-round". We prefer the `around()` spelling to avoid shadowing the built-in. Early mistake that we're still living with.
 
--
Robert Kern


Re: Output type of round is inconsistent with python built-in

Ilhan Polat
In reply to this post by Robert Kern-2
It's not about what I want; this changes the output of round. My example didn't use any arrays, only a scalar type, which looks like it will be upcast.


Re: Output type of round is inconsistent with python built-in

Robert Kern-2
Your example used np.round(), not the builtin round(). np.round() is not changing. If you want the dtype of the output to be the dtype of the input, you can certainly keep using np.round() (or its canonical spelling, np.around()).


Re: Output type of round is inconsistent with python built-in

Ilhan Polat
Oh, sorry. That's trigger-finger np-dotting.

What I mean is that anyone using round on float32 or other small-bit datatypes would get a silent upcast.

Maybe not a big problem, but it can have a significant impact.


Re: Output type of round is inconsistent with python built-in

Stefano Miccoli
In reply to this post by Hameer Abbasi
There are several issues mixed together here.

1. PEP 3141 <https://www.python.org/dev/peps/pep-3141/> compliance.

Numpy scalars are `numbers.Real` instances, and have to respect the
`__round__` semantics defined by PEP 3141:

    def __round__(self, ndigits:Integral=None):
        """Rounds self to ndigits decimal places, defaulting to 0.

        If ndigits is omitted or None, returns an Integral,
        otherwise returns a Real, preferably of the same type as
        self. Types may choose which direction to round half. For
        example, float rounds half toward even.

        """

This means that if Real -> Real rounding is desired one should call
`round(x, 0)` or `np.around(x)`.

These semantics only dictate that the return type should be Integral, so
for `round(x)` and `round(x, None)`

np.float32 -> np.int32
np.float32 -> np.int64
np.float64 -> np.int64
np.floatXX -> int

are all OK.
I also think that it is perfectly OK to raise an overflow error on `round(x)`.
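The PEP 3141 contract quoted above can be checked against the abstract base classes in the standard `numbers` module. The sketch below demonstrates the contract on a built-in float only, then shows that NumPy scalar types are registered with the same ABCs. It deliberately does not show what round() returns for a NumPy scalar, since that depends on the NumPy version.

```python
import numbers

import numpy as np

x = 2.5

# ndigits omitted: the contract asks for an Integral result.
no_digits = round(x)
# ndigits given: the contract asks for a Real, preferably the same type.
zero_digits = round(x, 0)

print(isinstance(no_digits, numbers.Integral), type(no_digits).__name__)
print(isinstance(zero_digits, numbers.Real), type(zero_digits).__name__)

# NumPy scalars take part in the same numeric tower.
print(isinstance(np.float32(1.5), numbers.Real))
print(isinstance(np.int64(1), numbers.Integral))
```

Because np.int32, np.int64, and Python int all count as Integral, any of the mappings listed above would satisfy the contract.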


2. Liskov substitution principle

`np.float64` scalars are also `float` instances (but `np.float32` scalars are not).
Strictly respecting LSP therefore means that `np.float64` should round to Python
`int`, since `round(x)` never overflows for Python `float`.

Here we have several options.

- round `np.float64` -> `int` and respect LSP.

- relax LSP, and round  `np.float64` -> `np.int64`. Who cares about `round(1e300)`?

- decide that there is no reason for having `np.float64` a subclass of `float`,
  so that LSP does not apply.
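The subclass relationship behind these options can be verified directly. This is a small sketch using only issubclass() and isinstance(); it makes no claim about which option NumPy adopts.

```python
import numpy as np

# np.float64 inherits from the built-in float; np.float32 does not.
float64_is_float = issubclass(np.float64, float)
float32_is_float = issubclass(np.float32, float)

print(float64_is_float)
print(float32_is_float)

# Code written against float therefore accepts np.float64 instances,
# which is why LSP is relevant for np.float64 in particular.
print(isinstance(np.float64(1.5), float))
```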


This all said, I think that these are the two most sensible choices for `round(x)`:

np.float32 -> np.int32
np.float64 -> np.int64
drop np.float64 subclassing python float

or

np.float32 -> int
np.float64 -> int
keep np.float64 subclassing python float


The second one seems to me the less disruptive.

Bye

Stefano

Re: Output type of round is inconsistent with python built-in

Hameer Abbasi
In reply to this post by Ilhan Polat

Hello, Ilhan,

 

From: NumPy-Discussion <numpy-discussion-bounces+einstein.edison=[hidden email]> on behalf of Ilhan Polat <[hidden email]>
Reply to: Discussion of Numerical Python <[hidden email]>
Date: Thursday, 27. February 2020 at 08:41
To: Discussion of Numerical Python <[hidden email]>
Subject: Re: [Numpy-discussion] Output type of round is inconsistent with python built-in

 

Oh sorry. That's trigger finger np-dotting.

 

What i mean is if someone was using the round method on float32 or other small bit datatypes they would have a silent upcasting. 

 

No, they won’t. The only affected types would be scalars, and only when used with the built-in Python round. Arrays don’t define the __round__ method, so they won’t be affected. np.ndarray.round won’t be affected either. Only np_scalar_types.__round__, which is what the Python round checks for, will be affected.

 

For illustration, in code:

 

>>> type(round(np_float))
<class 'numpy.float64'>
>>> type(round(np_array_0d))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type numpy.ndarray doesn't define __round__ method
>>> type(round(np_array_nd))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type numpy.ndarray doesn't define __round__ method

 

The second and third cases would remain unaffected. Only the first case would return a builtin Python int with what Robert Kern is suggesting and a np.int64 with what I’m suggesting. I do agree with something posted elsewhere on this thread that we should warn on overflow but prefer to be self-consistent and return a np.int64, but it doesn’t matter too much to me. Furthermore, the behavior of np.[a]round and np_arr.round(…) will not change. The only upcasting problem here is if someone does this in a loop, in which case they’re probably using Python objects and don’t care about memory anyway.

 

Maybe not a big problem but can have significant impact. 

 

Best regards,

Hameer Abbasi



Re: Output type of round is inconsistent with python built-in

Robert Kern-2
On Thu, Feb 27, 2020 at 4:49 AM Hameer Abbasi <[hidden email]> wrote:

Hello, Ilhan,

 

From: NumPy-Discussion <numpy-discussion-bounces+einstein.edison=[hidden email]> on behalf of Ilhan Polat <[hidden email]>
Reply to: Discussion of Numerical Python <[hidden email]>
Date: Thursday, 27. February 2020 at 08:41
To: Discussion of Numerical Python <[hidden email]>
Subject: Re: [Numpy-discussion] Output type of round is inconsistent with python built-in

 

Oh sorry. That's trigger finger np-dotting.

 

What i mean is if someone was using the round method on float32 or other small bit datatypes they would have a silent upcasting. 

 

No they won’t. The only affected types would be scalars, and that too only with the built-in Python round.


Just to be clear, his example _did_ use numpy scalars.

--
Robert Kern


Re: Output type of round is inconsistent with python built-in

Robert Kern-2
In reply to this post by Ilhan Polat
On Thu, Feb 27, 2020 at 2:43 AM Ilhan Polat <[hidden email]> wrote:
Oh sorry. That's trigger finger np-dotting.

What i mean is if someone was using the round method on float32 or other small bit datatypes they would have a silent upcasting. 

Maybe not a big problem but can have significant impact. 

np.round()/np.around() will still exist and behave as you would want it to in such cases (float32->float32, float64->float64).

--
Robert Kern
