np.{bool,float,int} deprecation

Juan Nunez-Iglesias-2
Hi all,

At the prodding [1] of Sebastian, I’m starting a discussion on the decision to deprecate np.{bool,float,int}. This deprecation broke our prerelease testing in scikit-image (which, hooray for rcs!), and resulted in a large amount of code churn to fix [2].

To be honest, I do think *some* sort of deprecation is needed, because for the longest time I thought that np.float was what np.float_ actually is. I think it would be worthwhile to move to *that*, though it’s an even more invasive deprecation than the currently proposed one. Writing `x = np.zeros(5, dtype=int)` is somewhat magical, because someone with a strict typing mindset (there’s an increasing number!) might expect that this is an array of pointers to Python ints. This is why I’ve always preferred to write `dtype=np.int`, resulting in the current code churn.
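
A minimal sketch of the `dtype=int` point, assuming a recent NumPy on Python 3 (the variable names are illustrative only). Passing the builtin `int` selects NumPy's default fixed-width integer dtype. An array that really holds references to Python objects has to be requested with `dtype=object`.

```python
import numpy as np

# The builtin `int` is translated to NumPy's default integer dtype.
x = np.zeros(5, dtype=int)
default_int = np.dtype(np.int_)       # width depends on platform and NumPy version
same_dtype = x.dtype == default_int   # compares dtypes, not Python types

# An array of references to Python objects must be asked for explicitly.
y = np.zeros(5, dtype=object)

print(x.dtype, x.dtype.kind)   # kind "i" means signed integer
print(y.dtype, type(y[0]))     # object dtype; the elements are Python ints
```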

I don’t know what the best answer is, just sparking the discussion Sebastian wants to see. ;) For skimage we’ve already merged a fix (even if it is one of dubious quality, as Stéfan points out [3] ;), so I don’t have too much stake in the outcome.

Juan.

[1]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739334463
[2]: https://github.com/scikit-image/scikit-image/pull/5103
[3]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739368765
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: np.{bool,float,int} deprecation

Andras Deak

Hi Juan,

Let me start with a disclaimer that I'm an end user, and as such it's
very easy for me to be bold when it comes to deprecations :)

But I experienced the same thing that you describe in this comment
(https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739429373):

> [I]t was very surprising to me when I found out that np.float is float. For the longest time I thought that np.float was equivalent to "whatever the default float value is on my platform", and considered it best practice to use that instead of plain float. 😅 I think that is a common misconception.

And I'm pretty sure the vast majority of end users face this. The
proper np.float32 and other types are intuitive enough that people
don't go out of their way to read the documentation in detail, and
it's highly unexpected that some `np.*` types are mere aliases.
Now, this should probably not be a problem as long as people only
stick these aliases into `dtype` keyword arguments, because that works
as expected (based on the wrong premise). But once you extrapolate
from the `dtype=np.int` behaviour to "`np.int` must be my native numpy
int type" you can easily get subtle bugs. For instance, you might
expect `isinstance(item, np.int)` to give you True if `item` is an
element of an array with `dtype=np.int`.
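
The pitfall can be checked directly. This is a small sketch assuming Python 3 and a recent NumPy. Because `np.int` was only another name for the builtin `int`, the builtin is used here in its place, and the variable names are illustrative.

```python
import numpy as np

arr = np.zeros(3, dtype=int)   # the dtype that dtype=np.int used to select
item = arr[0]

# np.int was the builtin int, so this is the test that
# isinstance(item, np.int) actually performed.
is_python_int = isinstance(item, int)

# Checks that express what was probably intended.
is_numpy_integer = isinstance(item, np.integer)
has_integer_dtype = np.issubdtype(arr.dtype, np.integer)

print(type(item).__name__, is_python_int, is_numpy_integer, has_integer_dtype)
```

On Python 3 the NumPy integer scalar is not a subclass of `int`, so the first check is False while the other two are True.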

To be fair I'm not sure that I've ever been bitten by this
personally... but once you're aware of the pitfall it seems really
ominous. I guess one helpful question is this: among all the code
churn needed to fix the breakage did you find any bugs that were
revealed by the deprecation? If that's the case (in scikit-image or
any other large downstream library) then that would be a good argument
for going forward with the deprecation.
Cheers,

András



Re: np.{bool,float,int} deprecation

Charles R Harris
In reply to this post by Juan Nunez-Iglesias-2



I checked pandas and astropy; both have several uses of the deprecated types, but they should be easy to fix. I suppose the question is whether we want to make them fix things right now :)

Chuck 


Re: np.{bool,float,int} deprecation

hmaarrfk

I guess if the answer is to stop people from

from numpy import *

there is a good fix for that which doesn’t involve deprecating dtype=np.int.

If the answer is to deprecate

np.int(1) == int(1)

then one can add a warning to the __init__ of the np.int class, but continue to subclass the python int class.

It just doesn’t seem worthwhile to stop people from using dtype=np.int, which seems to read:

“I want this to be a numpy integer, not necessarily a python integer”.
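
The "warn on construction, keep subclassing int" idea could look roughly like the sketch below. `WarnInt` is a hypothetical stand-in, not anything NumPy ships. The warning is raised from `__new__` rather than `__init__` because `int` instances are created there. Whether NumPy would accept such a subclass as a `dtype=` argument is not shown here.

```python
import warnings

class WarnInt(int):
    """Hypothetical stand-in for a warning `np.int`; not NumPy's implementation."""

    def __new__(cls, *args, **kwargs):
        warnings.warn(
            "calling this alias directly is deprecated; use the builtin int",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__new__(cls, *args, **kwargs)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure the warning is recorded
    value = WarnInt(1)

print(value == int(1), isinstance(value, int), len(caught))
```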


Re: np.{bool,float,int} deprecation

Stephan Hoyer-2
On Sat, Dec 5, 2020 at 9:24 PM Mark Harfouche <[hidden email]> wrote:

> If the answer is to deprecate
>
> np.int(1) == int(1)
>
> then one can add a warning to the __init__ of the np.int class, but continue to subclass the python int class.
>
> It just doesn’t seem worthwhile to stop people from using dtype=np.int, which seems to read:
>
> “I want this to be a numpy integer, not necessarily a python integer”.

The problem is that there is assuredly code that inadvertently relies upon this (mis)feature.

If we change the behavior of np.int() to create np.int64() objects instead of int() objects, it is likely to result in breaking some user code. Even with a prior warning, this breakage may be surprising and very hard to track down. In contrast, it's much safer to simply remove np.int entirely, because if users ignore the deprecation they end up with an error.

This is a general feature for deprecations: it's much safer to remove functionality than it is to change behavior.

So on the whole, I think this is the right call.
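
To illustrate why a silent behavior change is riskier than a removal, here is a sketch of one place where a Python `int` and an `np.int64` behave differently. The JSON case is just one example chosen for illustration. It is not taken from the thread.

```python
import json
import numpy as np

python_value = int(3)       # what np.int(3) returned while np.int was the builtin
numpy_value = np.int64(3)   # what it would return under the changed behavior

python_text = json.dumps({"n": python_value})   # works for a Python int

try:
    json.dumps({"n": numpy_value})
    numpy_serialised = True
except TypeError:
    # The json module does not know how to encode NumPy integer scalars.
    numpy_serialised = False

print(python_text, numpy_serialised)
```

Code that worked before would start failing far from the place where the value was created, which is the hard-to-track-down breakage described above. With an outright removal the failure is an immediate error at the point of use.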



Re: np.{bool,float,int} deprecation

Robert Kern-2
On Sun, Dec 6, 2020 at 12:52 AM Stephan Hoyer <[hidden email]> wrote:
> On Sat, Dec 5, 2020 at 9:24 PM Mark Harfouche <[hidden email]> wrote:
>
> > If the answer is to deprecate
> >
> > np.int(1) == int(1)
> >
> > then one can add a warning to the __init__ of the np.int class, but continue to subclass the python int class.
> >
> > It just doesn’t seem worthwhile to stop people from using dtype=np.int, which seems to read:
> >
> > “I want this to be a numpy integer, not necessarily a python integer”.
>
> The problem is that there is assuredly code that inadvertently relies upon this (mis)feature.
>
> If we change the behavior of np.int() to create np.int64() objects instead of int() objects, it is likely to result in breaking some user code. Even with a prior warning, this breakage may be surprising and very hard to track down. In contrast, it's much safer to simply remove np.int entirely, because if users ignore the deprecation they end up with an error.

FWIW (and IIRC), this was the original misfeature. `np.int`, `np.bool`, and `np.float` were aliases for their corresponding default scalar types in the first numpy releases. However, too many people were doing `from numpy import *` and covering up the builtins. We renamed these aliases with trailing underscores to avoid that problem, but too many people (even in those early days) still had uses of `dtype=np.int`. Making `np.int is int` was the backwards-compatibility hack.

--
Robert Kern
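
The star-import problem described above can be reproduced without NumPy. In this sketch `fakelib` and `FakeInt` are hypothetical names built on the fly. It only shows the mechanism, not NumPy's actual history or code.

```python
import sys
import types

class FakeInt:
    """Placeholder for a library-specific integer scalar type."""

# Hypothetical library that exports its own `int`.
fakelib = types.ModuleType("fakelib")
fakelib.int = FakeInt
sys.modules["fakelib"] = fakelib

namespace = {}
exec("from fakelib import *", namespace)
builtin_shadowed = namespace["int"] is FakeInt   # the name `int` now means FakeInt

# Renaming the export with a trailing underscore avoids the collision.
del fakelib.int
fakelib.int_ = FakeInt

clean_namespace = {}
exec("from fakelib import *", clean_namespace)
builtin_untouched = "int" not in clean_namespace

print(builtin_shadowed, builtin_untouched)
```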


Re: np.{bool,float,int} deprecation

Sebastian Berg
In reply to this post by Charles R Harris
On Sat, 2020-12-05 at 20:12 -0700, Charles R Harris wrote:

> I checked pandas and astropy and both have several uses of the
> deprecated
> types but should be easy to fix. I suppose the question is if we want
> to
> make them fix things *right now* :)
>

The reason why I thought it might be good to bring this up again is
that I am not sure how painful the deprecation is, which should be
weighed against the benefit. And the benefit here is only moderate.

Thus, with the change now in and a few more people exposed to it, if
anyone thinks it's a bad idea or that we should delay, I am all ears.

Cheers,

Sebastian



Re: np.{bool,float,int} deprecation

ralfgommers


On Sun, Dec 6, 2020 at 4:23 PM Sebastian Berg <[hidden email]> wrote:

> The reason why I thought it might be good to bring this up again is
> that I am not sure how painful the deprecation is, which should be
> weighed against the benefit. And the benefit here is only moderate.

It will be painful as in "lots of churn", but the fixes are straightforward. And it's clear many knowledgeable users didn't know they were aliases, so there is something to gain here.

Whether or not we revert the deprecation, I'd be in favor of improving the docs to answer the most common questions and pitfalls, like:

- What happens when I use Python builtin types with the dtype keyword?
- How do I check if something is an integer array? Or a NumPy or Python integer?
- What are the default integer, float and complex precisions on all platforms?
- How do I iterate over all floating point dtypes when writing tests?
- Which of the many equivalent dtypes should I prefer? --> use float64, not float_ or double
- warning: float128 and float96 do not exist on all platforms
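
A few of the questions in this list can be answered with short snippets. The following is a sketch, assuming a recent NumPy on Python 3. The helper `is_integer_scalar` and the choice to exclude booleans are illustrative assumptions, not an official recommendation.

```python
import numbers
import numpy as np

# Builtin types passed as dtype map to NumPy's default dtypes.
float_is_float64 = np.dtype(float) == np.dtype(np.float64)
bool_is_bool_ = np.dtype(bool) == np.dtype(np.bool_)
default_int = np.dtype(int)   # width depends on platform and NumPy version

# Checking for an integer array goes through the dtype hierarchy.
a = np.arange(3)
is_integer_array = np.issubdtype(a.dtype, np.integer)

def is_integer_scalar(value):
    """True for Python ints and NumPy integer scalars, but not for booleans."""
    return isinstance(value, numbers.Integral) and not isinstance(value, bool)

# Floating point types to loop over in tests; longdouble always exists,
# while the float96/float128 names are only defined on some platforms.
float_types = [np.float16, np.float32, np.float64, np.longdouble]
extended_names = [name for name in ("float96", "float128") if hasattr(np, name)]

print(default_int, is_integer_array, is_integer_scalar(a[0]), extended_names)
```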

Related: it's still easy to have things leak into the namespace unintentionally - `np.sys` and `np.os` exist too. I think we can probably clean those up without a deprecation, but we should write some more public API tests that prevent this kind of thing.

Cheers,
Ralf




Re: np.{bool,float,int} deprecation

Eric Wieser
If the CI noise in downstream libraries is particularly painful, we could switch to `PendingDeprecationWarning` instead of `DeprecationWarning` to make it easier to add the warnings to an ignore list.
I think this might make the warning less visible to end users though, who are the users that this deprecation was really aimed at.

Eric
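
The ignore-list idea can be sketched with the standard `warnings` module alone. `use_alias` is a hypothetical stand-in for touching a deprecated alias. The filters imitate a strict CI setup that turns warnings into errors.

```python
import warnings

def use_alias(category):
    """Hypothetical stand-in for code that touches a deprecated alias."""
    warnings.warn("this alias is deprecated", category, stacklevel=2)

def fails_under_strict_ci(category):
    with warnings.catch_warnings():
        warnings.simplefilter("error")                               # CI: warnings are errors
        warnings.simplefilter("ignore", PendingDeprecationWarning)   # the ignore-list entry
        try:
            use_alias(category)
        except Warning:
            return True
        return False

pending_fails = fails_under_strict_ci(PendingDeprecationWarning)
deprecation_fails = fails_under_strict_ci(DeprecationWarning)
print(pending_fails, deprecation_fails)
```

The ignore entry does not hide other `DeprecationWarning`s because `PendingDeprecationWarning` is a separate category rather than a subclass. The visibility concern also holds at the interpreter level: by default Python ignores `PendingDeprecationWarning`, while `DeprecationWarning` is shown when triggered by code in `__main__`.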


Re: np.{bool,float,int} deprecation

Aaron Meurer
In reply to this post by Juan Nunez-Iglesias-2
Regarding np.bool specifically, if you want to deprecate this, you
might want to discuss this with us at the array API standard
https://github.com/data-apis/array-api (which is currently in RFC
stage). The spec uses bool as the name for the boolean dtype.

Would it make sense for NumPy to change np.bool to just be the boolean
dtype object? Unlike int and float, there is no ambiguity with bool,
and NumPy clearly doesn't have any issues with shadowing builtin names
in its namespace.

Aaron Meurer


Re: np.{bool,float,int} deprecation

Sebastian Berg
On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:

> Regarding np.bool specifically, if you want to deprecate this, you
> might want to discuss this with us at the array API standard
> https://github.com/data-apis/array-api (which is currently in RFC
> stage). The spec uses bool as the name for the boolean dtype.
>
> Would it make sense for NumPy to change np.bool to just be the
> boolean
> dtype object? Unlike int and float, there is no ambiguity with bool,
> and NumPy clearly doesn't have any issues with shadowing builtin
> names
> in its namespace.

We could keep the Python alias around (which for `dtype=` is the same
as `np.bool_`).

I am not sure I like the idea of immediately shadowing the builtin.
That is a switch we can avoid flipping (without warning); `np.bool_`
and `bool` are fairly different beasts? [1]
OTOH, if someone wants to entertain switching... It could be
interesting to see how (unfixed) downstream projects react to it.
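
A short sketch of how `bool` and `np.bool_` agree as dtypes but differ as scalar types, assuming a recent NumPy on Python 3. The particular differences shown are picked for illustration.

```python
import numpy as np

# As a dtype= argument the two are interchangeable.
same_as_dtype = np.dtype(bool) == np.dtype(np.bool_)

flag = np.bool_(True)

# The scalar types are unrelated: np.bool_ is not a subclass of bool.
is_python_bool = isinstance(flag, bool)

# Arithmetic differs as well: Python bools add as integers,
# NumPy booleans combine with a logical "or".
python_sum = True + True
numpy_sum = flag + flag

print(same_as_dtype, is_python_bool, python_sum, numpy_sum)
```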

One approach would be:

* Go ahead for now (deprecate)
* Add a FutureWarning at some point that we _will_ start to export
  `np.bool` again (but `from numpy import *` is a problem?)
* Aim to make `np.bool is np.bool_` at some point in the (far) future.
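
One way a library can attach a `FutureWarning` to a name it plans to change is a module-level `__getattr__` (PEP 562). The sketch below uses a hypothetical `fakelib` module and is not a description of how NumPy implements its deprecations.

```python
import types
import warnings

fakelib = types.ModuleType("fakelib")   # hypothetical library, not NumPy

class bool_:
    """Placeholder for the library's own boolean scalar type."""

fakelib.bool_ = bool_

def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    if name == "bool":
        warnings.warn(
            "fakelib.bool currently returns the builtin bool; "
            "in a future release it will be fakelib.bool_",
            FutureWarning,
            stacklevel=2,
        )
        return bool
    raise AttributeError(f"module 'fakelib' has no attribute {name!r}")

fakelib.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure the warning is recorded
    alias = fakelib.bool

print(alias, [w.category.__name__ for w in caught])
```

Because the name is resolved lazily and is not stored in the module namespace, a star import without an explicit `__all__` would not pick it up.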

It is multi-step (and I recall opinions that multi-step is bad).
Although, I think the main argument against it was to not force users
to modify code more than once.  And I do not think that happens here.

Of course we could use the `FutureWarning` right away, but I don't mind
taking it slow.

Cheers,

Sebastian
 


[1] I admit, probably almost nobody would notice. And usually using a
Python `bool` is better...



Re: np.{bool,float,int} deprecation

Aaron Meurer
On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
<[hidden email]> wrote:

>
> We could keep the Python alias around (which for `dtype=` is the same
> as `np.bool_`).
>
> I am not sure I like the idea of immediately shadowing the builtin.
> That is a switch we can avoid flipping (without warning); `np.bool_`
> and `bool` are fairly different beasts? [1]

NumPy already shadows a lot of builtins, in many cases, in ways that
are incompatible with existing ones. It's not something I would have
done personally, but it's been this way for a long time.

Aaron Meurer

> OTOH, if someone wants to entertain switching... It could be
> interesting to see how (unfixed) downstream projects react to it.
>
> One approach would be:
>
> * Go ahead for now (deprecate)
> * Add a FutureWarning at some point that we _will_ start to export
>   `np.bool` again (but `from numpy import *` is a problem?)
> * Aim to make `np.bool is np.bool_` at some point in the (far) future.
>
> It is multi-step (and I recall opinions that multi-step is bad).
> Although, I think the main argument against it was to not force users
> to modify code more than once.  And I do not think that happens here.
>
> Of course we could use the `FutureWarning` right away, but I don't mind
> taking it slow.
>
> Cheers,
>
> Sebastian
>
>
>
> [1] I admit, probably almost nobody would notice. And usually using a
> Python `bool` is better...
>
>
> >
> > Aaron Meurer
> >

Re: np.{bool,float,int} deprecation

Robert Kern-2
On Wed, Dec 9, 2020 at 4:08 PM Aaron Meurer <[hidden email]> wrote:
> On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> <[hidden email]> wrote:
> >
> > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > Regarding np.bool specifically, if you want to deprecate this, you
> > > might want to discuss this with us at the array API standard
> > > https://github.com/data-apis/array-api (which is currently in RFC
> > > stage). The spec uses bool as the name for the boolean dtype.
> > >
> > > Would it make sense for NumPy to change np.bool to just be the
> > > boolean
> > > dtype object? Unlike int and float, there is no ambiguity with bool,
> > > and NumPy clearly doesn't have any issues with shadowing builtin
> > > names
> > > in its namespace.
> >
> > We could keep the Python alias around (which for `dtype=` is the same
> > as `np.bool_`).
> >
> > I am not sure I like the idea of immediately shadowing the builtin.
> > That is a switch we can avoid flipping (without warning); `np.bool_`
> > and `bool` are fairly different beasts? [1]
>
> NumPy already shadows a lot of builtins, in many cases, in ways that
> are incompatible with existing ones. It's not something I would have
> done personally, but it's been this way for a long time.

Sometimes, we had the function first before Python added them to the builtins (e.g. sum(), any(), all(), IIRC). I think max() and min() are the main ones that we added after Python did, and we explicitly exclude them from __all__ to avoid clobbering the builtins.

Shadowing the types (bool, int, float) historically tended to be more problematic than those functions. The first releases of numpy _did_ have those as the scalar types. That empirically turned out to cause more problems for people than sum() or any(), so we renamed the scalar types to have the trailing underscore. We only left the shadowed names as aliases for the builtins because enough people still had `dtype=np.float` in their code that we didn't want to break.

All that said, "from numpy import *" is less common these days. We have been pretty successful at getting people on board with the np campaign.
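
The kind of incompatibility between a builtin and its NumPy namesake discussed above can be illustrated with `sum` and `any`. This is only a minimal sketch, assuming NumPy is imported as `np`; the variable names are made up for illustration.

```python
import builtins
import numpy as np

data = [0, 1, 2]

# Builtin sum: the second positional argument is the start value.
builtin_total = builtins.sum(data, -1)   # -1 + 0 + 1 + 2
# np.sum: the second positional argument is the axis to reduce over.
numpy_total = np.sum(data, -1)           # 0 + 1 + 2 along the last axis

# Builtin any: a non-empty inner list is truthy, whatever it contains.
builtin_any = builtins.any([[0, 0]])
# np.any: inspects the individual elements of the nested list.
numpy_any = bool(np.any([[0, 0]]))

print(builtin_total, numpy_total, builtin_any, numpy_any)
```

After `from numpy import *`, a bare `sum` or `any` would refer to the NumPy versions, so the same call can quietly change meaning.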
 
--
Robert Kern


Re: np.{bool,float,int} deprecation

Stephan Hoyer-2
In reply to this post by Aaron Meurer
On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]> wrote:
> On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> <[hidden email]> wrote:
> >
> > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > Regarding np.bool specifically, if you want to deprecate this, you
> > > might want to discuss this with us at the array API standard
> > > https://github.com/data-apis/array-api (which is currently in RFC
> > > stage). The spec uses bool as the name for the boolean dtype.
> > >
> > > Would it make sense for NumPy to change np.bool to just be the
> > > boolean
> > > dtype object? Unlike int and float, there is no ambiguity with bool,
> > > and NumPy clearly doesn't have any issues with shadowing builtin
> > > names
> > > in its namespace.
> >
> > We could keep the Python alias around (which for `dtype=` is the same
> > as `np.bool_`).
> >
> > I am not sure I like the idea of immediately shadowing the builtin.
> > That is a switch we can avoid flipping (without warning); `np.bool_`
> > and `bool` are fairly different beasts? [1]
>
> NumPy already shadows a lot of builtins, in many cases, in ways that
> are incompatible with existing ones. It's not something I would have
> done personally, but it's been this way for a long time.

It may be defensible to keep np.bool as an alias for Python's bool even when we remove the other aliases.

np.int_ and np.float_ have fixed precision, which makes them somewhat different from the builtin types. NumPy has a whole bunch of different precisions for integer and floats, so this distinction matters.

In contrast, there is only one boolean dtype in NumPy, which matches Python's bool. So we wouldn't have to worry, for example, about whether a user has requested a specific precision explicitly. This comes up in issues like type-promotion where libraries like JAX and PyTorch have special case logic for most Python types vs NumPy dtypes (but booleans are the same for both):
https://jax.readthedocs.io/en/latest/type_promotion.html

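
A small sketch of the distinction drawn above between fixed-precision NumPy numbers and the builtin types, and of why the boolean case is different. It assumes NumPy is imported as `np`; the variable names are illustrative only.

```python
import numpy as np

# Python ints grow as needed; NumPy integer dtypes have a fixed width.
big = 2**70
int64_max = np.iinfo(np.int64).max        # 2**63 - 1
fits_in_int64 = big <= int64_max

# Several integer and float precisions exist side by side...
int_sizes = [np.dtype(t).itemsize for t in (np.int8, np.int16, np.int32, np.int64)]
float_sizes = [np.dtype(t).itemsize for t in (np.float16, np.float32, np.float64)]

# ...but there is a single boolean dtype, and the builtin bool maps onto it.
same_bool = np.dtype(bool) == np.dtype(np.bool_)

print(fits_in_int64, int_sizes, float_sizes, same_bool)
```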


Re: np.{bool,float,int} deprecation

Sebastian Berg
On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:

> On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]>
> wrote:
>
> > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > <[hidden email]> wrote:
> > >
> > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > Regarding np.bool specifically, if you want to deprecate this,
> > > > you
> > > > might want to discuss this with us at the array API standard
> > > > https://github.com/data-apis/array-api (which is currently in
> > > > RFC
> > > > stage). The spec uses bool as the name for the boolean dtype.
> > > >
> > > > Would it make sense for NumPy to change np.bool to just be the
> > > > boolean
> > > > dtype object? Unlike int and float, there is no ambiguity with
> > > > bool,
> > > > and NumPy clearly doesn't have any issues with shadowing
> > > > builtin
> > > > names
> > > > in its namespace.
> > >
> > > We could keep the Python alias around (which for `dtype=` is the
> > > same
> > > as `np.bool_`).
> > >
> > > I am not sure I like the idea of immediately shadowing the
> > > builtin.
> > > That is a switch we can avoid flipping (without warning);
> > > `np.bool_`
> > > and `bool` are fairly different beasts? [1]
> >
> > NumPy already shadows a lot of builtins, in many cases, in ways
> > that
> > are incompatible with existing ones. It's not something I would
> > have
> > done personally, but it's been this way for a long time.
> >
>
> It may be defensible to keep np.bool as an alias for Python's bool
> even
> when we remove the other aliases.

That is true, `int` is probably the most confusing, since it is not at
all compatible with a Python integer, but rather the "default" integer
(which happens to be the same as C `long` currently).

So we could focus on `np.int`, `np.long`.  I am a bit unsure whether
you would prefer that or are mainly pointing out the possibility?


Right now, my main take-away from the discussion is that it would be
good to clarify the release notes a bit more.

Using `float` for a dtype seems fine to me, but I prefer mentioning
`np.float64` over `np.float_`.
For integers, I wonder if we should also suggest `np.int64`, even – or
because – if the default integer on many systems is currently
`np.int_`?
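
To make the point about the "default" integer concrete, here is a minimal sketch, assuming NumPy is imported as `np`. The exact width printed depends on the platform and NumPy version, so no specific value is claimed for the default.

```python
import numpy as np

# `dtype=int` does not create Python ints; it selects NumPy's default integer.
x = np.zeros(5, dtype=int)
default_int = x.dtype
is_default = default_int == np.dtype(np.int_)

# The width of that default depends on the platform and NumPy version
# (typically 4 or 8 bytes), whereas np.int64 is always 8 bytes.
default_bytes = default_int.itemsize
explicit_bytes = np.dtype(np.int64).itemsize

print(default_int, default_bytes, explicit_bytes)
```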

Cheers,

Sebastian



>
> np.int_ and np.float_ have fixed precision, which makes them somewhat
> different from the builtin types. NumPy has a whole bunch of
> different
> precisions for integer and floats, so this distinction matters.
>
> In contrast, there is only one boolean dtype in NumPy, which matches
> Python's bool. So we wouldn't have to worry, for example, about
> whether a
> user has requested a specific precision explicitly. This comes up in
> issues
> like type-promotion where libraries like JAX and PyTorch have special
> case
> logic for most Python types vs NumPy dtypes (but booleans are the
> same for
> both):
> https://jax.readthedocs.io/en/latest/type_promotion.html

Re: np.{bool,float,int} deprecation

ralfgommers


On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <[hidden email]> wrote:
> On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]>
> > wrote:
> >
> > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > <[hidden email]> wrote:
> > > >
> > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > Regarding np.bool specifically, if you want to deprecate this,
> > > > > you
> > > > > might want to discuss this with us at the array API standard
> > > > > https://github.com/data-apis/array-api (which is currently in
> > > > > RFC
> > > > > stage). The spec uses bool as the name for the boolean dtype.
> > > > >
> > > > > Would it make sense for NumPy to change np.bool to just be the
> > > > > boolean
> > > > > dtype object? Unlike int and float, there is no ambiguity with
> > > > > bool,
> > > > > and NumPy clearly doesn't have any issues with shadowing
> > > > > builtin
> > > > > names
> > > > > in its namespace.
> > > >
> > > > We could keep the Python alias around (which for `dtype=` is the
> > > > same
> > > > as `np.bool_`).
> > > >
> > > > I am not sure I like the idea of immediately shadowing the
> > > > builtin.
> > > > That is a switch we can avoid flipping (without warning);
> > > > `np.bool_`
> > > > and `bool` are fairly different beasts? [1]
> > >
> > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > that
> > > are incompatible with existing ones. It's not something I would
> > > have
> > > done personally, but it's been this way for a long time.
> > >
> >
> > It may be defensible to keep np.bool as an alias for Python's bool
> > even when we remove the other aliases.

I'd agree with that.

> That is true, `int` is probably the most confusing, since it is not at
> all compatible to a Python integer, but rather the "default" integer
> (which happens to be the same as C `long` currently).
>
> So we could focus on `np.int`, `np.long`.  I am a bit unsure whether
> you would prefer that or are mainly pointing out the possibility?

Not sure what you mean with focus, focus on describing in the release notes? Deprecating `np.int` seems like the most beneficial part of this whole exercise.

> Right now, my main take-away from the discussion is that it would be
> good to clarify the release notes a bit more.
>
> Using `float` for a dtype seems fine to me, but I prefer mentioning
> `np.float64` over `np.float_`.
> For integers, I wonder if we should also suggest `np.int64`, even – or
> because – if the default integer on many systems is currently
> `np.int_`?

I agree. I think we should recommend sane, descriptive names that do the right thing. So ideally we'd have people spell their dtype specifiers as
  dtype=bool  # or np.bool
  dtype=np.float64
  dtype=np.int64
  dtype=np.complex128
The names with underscores at the end make little sense from a UX perspective. And the C equivalents (single/double/etc) made sense 15 years ago, but with the user base of today - the majority of whom will not know C fluently or at all - also don't make too much sense.

The `dtype=int` or `dtype=np.int_` behaviour flopping between 32 and 64 bits is likely to be a pitfall much more often than it is what the user actually needs, so shouldn't be recommended and probably deserves a warning in the docs.
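
The recommended spellings above can be tried out directly. The following is a sketch only, assuming NumPy is imported as `np`; the array names and the sample value `2**40` are chosen purely for illustration.

```python
import numpy as np

# Explicit, width-carrying names say exactly what is being asked for.
flags = np.zeros(3, dtype=bool)
reals = np.zeros(3, dtype=np.float64)
counts = np.zeros(3, dtype=np.int64)
waves = np.zeros(3, dtype=np.complex128)

names = [a.dtype.name for a in (flags, reals, counts, waves)]
sizes = [a.dtype.itemsize for a in (flags, reals, counts, waves)]

# A value that needs more than 32 bits is fine for int64 but out of range
# for int32, which is why a default that flips between the two is a pitfall.
value = 2**40
fits_32 = value <= np.iinfo(np.int32).max
fits_64 = value <= np.iinfo(np.int64).max

print(names, sizes, fits_32, fits_64)
```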

Cheers,
Ralf
 



Re: np.{bool,float,int} deprecation

Sebastian Berg
On Thu, 2020-12-10 at 20:38 +0100, Ralf Gommers wrote:

> On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <
> [hidden email]>
> wrote:
>
> > On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]>
> > > wrote:
> > >
> > > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > > <[hidden email]> wrote:
> > > > >
> > > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > > Regarding np.bool specifically, if you want to deprecate
> > > > > > this,
> > > > > > you
> > > > > > might want to discuss this with us at the array API
> > > > > > standard
> > > > > > https://github.com/data-apis/array-api (which is currently
> > > > > > in
> > > > > > RFC
> > > > > > stage). The spec uses bool as the name for the boolean
> > > > > > dtype.
> > > > > >
> > > > > > Would it make sense for NumPy to change np.bool to just be
> > > > > > the
> > > > > > boolean
> > > > > > dtype object? Unlike int and float, there is no ambiguity
> > > > > > with
> > > > > > bool,
> > > > > > and NumPy clearly doesn't have any issues with shadowing
> > > > > > builtin
> > > > > > names
> > > > > > in its namespace.
> > > > >
> > > > > We could keep the Python alias around (which for `dtype=` is
> > > > > the
> > > > > same
> > > > > as `np.bool_`).
> > > > >
> > > > > I am not sure I like the idea of immediately shadowing the
> > > > > builtin.
> > > > > That is a switch we can avoid flipping (without warning);
> > > > > `np.bool_`
> > > > > and `bool` are fairly different beasts? [1]
> > > >
> > > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > > that
> > > > are incompatible with existing ones. It's not something I would
> > > > have
> > > > done personally, but it's been this way for a long time.
> > > >
> > >
> > > It may be defensible to keep np.bool as an alias for Python's
> > > bool
> > > even when we remove the other aliases.
> >
>
> I'd agree with that.
>
>
> > That is true, `int` is probably the most confusing, since it is not
> > at
> > all compatible to a Python integer, but rather the "default"
> > integer
> > (which happens to be the same as C `long` currently).
> >
> > So we could focus on `np.int`, `np.long`.  I am a bit unsure
> > whether
> > you would prefer that or are mainly pointing out the possibility?
> >
>
> Not sure what you mean with focus, focus on describing in the release
> notes? Deprecating `np.int` seems like the most beneficial part of
> this
> whole exercise.
>

I meant limiting the current deprecation to `np.int`, maybe `np.long`,
and a "carefully chosen" set.
To be honest, I don't mind either way, so any stronger opinion will tip
the scale for me personally (my default currently is to update the
release notes to recommend the more descriptive names).

There are probably more doc updates that would be nice, I will suggest
updating a separate issue for that.


> > Right now, my main take-away from the discussion is that it would be
> > good to clarify the release notes a bit more.
> >
> > Using `float` for a dtype seems fine to me, but I prefer mentioning
> > `np.float64` over `np.float_`.
> > For integers, I wonder if we should also suggest `np.int64`, even –
> > or
> > because – if the default integer on many systems is currently
> > `np.int_`?
> >
>
> I agree. I think we should recommend sane, descriptive names that do
> the
> right thing. So ideally we'd have people spell their dtype specifiers
> as
>   dtype=bool  # or np.bool
>   dtype=np.float64
>   dtype=np.int64
>   dtype=np.complex128
> The names with underscores at the end make little sense from a UX
> perspective. And the C equivalents (single/double/etc) made sense 15
> years
> ago, but with the user base of today - the majority of whom will not
> know C
> fluently or at all - also don't make too much sense.
>
> The `dtype=int` or `dtype=np.int_` behaviour flopping between 32 and
> 64
> bits is likely to be a pitfall much more often than it is what the
> user
> actually needs, so shouldn't be recommended and probably deserves a
> warning
> in the docs.

Right, there is one slight complication, because `np.intp` is often a great
integer dtype to use: it is the integer that NumPy uses for all
things related to indexing and array sizes.
(I would be happy to dig out my PR making `np.intp` the default NumPy
integer.)
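
As a rough illustration of `np.intp` being the indexing integer, the sketch below checks the dtype returned by two index-producing functions. It assumes NumPy is imported as `np`; the input array is arbitrary.

```python
import numpy as np

a = np.array([30, 10, 20], dtype=np.int8)

# Index-producing functions return np.intp regardless of the input dtype.
order = np.argsort(a)
hits = np.nonzero(a > 15)[0]

# np.intp is sized for indexing: 4 bytes on 32-bit builds, 8 on 64-bit ones.
intp_bytes = np.dtype(np.intp).itemsize

print(order, order.dtype, hits, hits.dtype, intp_bytes)
```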

Cheers,

Sebastian



Re: np.{bool,float,int} deprecation

Eric Wieser
> you might want to discuss this with us at the array API standard
> https://github.com/data-apis/array-api (which is currently in RFC
> stage). The spec uses bool as the name for the boolean dtype.

I don't fully understand this argument - `np.bool` is already not the boolean dtype. Either:

* The spec is suggesting that `pkg.bool` be some arbitrary object that can be passed into a dtype argument and will produce a boolean array.
  If this is the case, the spec could also just require that `dtype=builtins.bool` have this behavior.
* The spec is suggesting that `pkg.bool` is some rich dtype object.
  Ignoring the question of whether this should be `np.bool_` or `np.dtype(np.bool_)`, it's currently neither, and changing it will break users relying on `np.bool(True) is True`.
  That's not to say this isn't a sensible thing for the specification to have, it's just something that numpy can't conform to without breaking code.

While it would be great if `np.bool_` could be spelt `np.bool`, I really don't think we can make that change without a long deprecation first (if at all).
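
The behavioural difference behind the `np.bool(True) is True` concern can be sketched with the builtin `bool` (which is what the alias pointed to) and `np.bool_`. This assumes NumPy is imported as `np`; the variable names are illustrative.

```python
import numpy as np

# The builtin bool returns the singleton True itself.
py_value = bool(True)
py_identity = py_value is True

# NumPy's boolean scalar type returns a NumPy scalar: equal to True, but a
# different object of a different type.
np_value = np.bool_(True)
np_identity = np_value is True
np_equal = bool(np_value == True)
np_is_builtin_bool = isinstance(np_value, bool)

# As dtype specifiers, however, the two are interchangeable.
same_dtype = np.zeros(2, dtype=bool).dtype == np.zeros(2, dtype=np.bool_).dtype

print(py_identity, np_identity, np_equal, np_is_builtin_bool, same_dtype)
```

Code that relies on identity checks or `isinstance(..., bool)` is therefore what would break if the name were silently rebound to the NumPy scalar type.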

Eric

On Thu, 10 Dec 2020 at 20:00, Sebastian Berg <[hidden email]> wrote:
On Thu, 2020-12-10 at 20:38 +0100, Ralf Gommers wrote:
> On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <
> [hidden email]>
> wrote:
>
> > On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]>
> > > wrote:
> > >
> > > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > > <[hidden email]> wrote:
> > > > >
> > > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > > Regarding np.bool specifically, if you want to deprecate
> > > > > > this,
> > > > > > you
> > > > > > might want to discuss this with us at the array API
> > > > > > standard
> > > > > > https://github.com/data-apis/array-api (which is currently
> > > > > > in
> > > > > > RFC
> > > > > > stage). The spec uses bool as the name for the boolean
> > > > > > dtype.
> > > > > >
> > > > > > Would it make sense for NumPy to change np.bool to just be
> > > > > > the
> > > > > > boolean
> > > > > > dtype object? Unlike int and float, there is no ambiguity
> > > > > > with
> > > > > > bool,
> > > > > > and NumPy clearly doesn't have any issues with shadowing
> > > > > > builtin
> > > > > > names
> > > > > > in its namespace.
> > > > >
> > > > > We could keep the Python alias around (which for `dtype=` is
> > > > > the
> > > > > same
> > > > > as `np.bool_`).
> > > > >
> > > > > I am not sure I like the idea of immediately shadowing the
> > > > > builtin.
> > > > > That is a switch we can avoid flipping (without warning);
> > > > > `np.bool_`
> > > > > and `bool` are fairly different beasts? [1]
> > > >
> > > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > > that
> > > > are incompatible with existing ones. It's not something I would
> > > > have
> > > > done personally, but it's been this way for a long time.
> > > >
> > >
> > > It may be defensible to keep np.bool as an alias for Python's
> > > bool
> > > even when we remove the other aliases.
> >
>
> I'd agree with that.
>
>
> > That is true, `int` is probably the most confusing, since it is not
> > at
> > all compatible to a Python integer, but rather the "default"
> > integer
> > (which happens to be the same as C `long` currently).
> >
> > So we could focus on `np.int`, `np.long`.  I am a bit unsure
> > whether
> > you would prefer that or are mainly pointing out the possibility?
> >
>
> Not sure what you mean with focus, focus on describing in the release
> notes? Deprecating `np.int` seems like the most beneficial part of
> this
> whole exercise.
>

I meant limiting the current deprecation to `np.int`, maybe `np.long`,
and a "carefully chosen" set.
To be honest, I don't mind either way, so any stronger opinion will tip
the scale for me personally (my default currently is to update the
release notes to recommend the more descriptive names).

There are probably more doc updates that would be nice, I will suggest
updating a separate issue for that.


> > Right now, my main take-away from the discussion is that it would be
> > good to clarify the release notes a bit more.
> >
> > Using `float` for a dtype seems fine to me, but I prefer mentioning
> > `np.float64` over `np.float_`.
> > For integers, I wonder if we should also suggest `np.int64`, even –
> > or
> > because – if the default integer on many systems is currently
> > `np.int_`?
> >
>
> I agree. I think we should recommend sane, descriptive names that do
> the
> right thing. So ideally we'd have people spell their dtype specifiers
> as
>   dtype=bool  # or np.bool
>   dtype=np.float64
>   dtype=np.int64
>   dtype=np.complex128
> The names with underscores at the end make little sense from a UX
> perspective. And the C equivalents (single/double/etc) made sense 15
> years
> ago, but with the user base of today - the majority of whom will not
> know C
> fluently or at all - also don't make too much sense.
>
> The `dtype=int` or `dtype=np.int_` behaviour flopping between 32 and
> 64
> bits is likely to be a pitfall much more often than it is what the
> user
> actually needs, so shouldn't be recommended and probably deserves a
> warning
> in the docs.

Right, there is one slight trickery because `np.intp` is often a great
integer dtype to use, because it is the integer that NumPy uses for all
things related to indexing and array sizes.
(I would be happy to dig out my PR making `np.intp` the default NumPy
integer.)
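
A minimal sketch of the point above, assuming only that NumPy is installed. The helper name `integer_widths` is made up for illustration. It reports how wide each integer spelling is on the current platform and looks at the dtype an index-producing function returns:

```python
import numpy as np


def integer_widths():
    """Return the bit width of several integer dtype spellings on this platform."""
    specs = {
        "int": int,            # builtin used as a dtype specifier
        "np.int_": np.int_,    # NumPy's "default" integer
        "np.intp": np.intp,    # integer used for indexing and array sizes
        "np.int64": np.int64,  # explicit fixed width
    }
    return {name: np.dtype(spec).itemsize * 8 for name, spec in specs.items()}


widths = integer_widths()
for name, bits in widths.items():
    print(f"{name:9s} -> {bits} bits")

# Functions that produce indices hand back np.intp.
index_dtype = np.argsort(np.array([3, 1, 2])).dtype
print("argsort returns:", index_dtype)
```

`int` and `np.int_` always resolve to the same dtype as each other, but the width they resolve to depends on the platform and NumPy version, whereas `np.int64` does not.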

Cheers,

Sebastian


>
> Cheers,
> Ralf
>
>
> >
> > >
> > > np.int_ and np.float_ have fixed precision, which makes them
> > > somewhat
> > > different from the builtin types. NumPy has a whole bunch of
> > > different
> > > precisions for integer and floats, so this distinction matters.
> > >
> > > In contrast, there is only one boolean dtype in NumPy, which
> > > matches
> > > Python's bool. So we wouldn't have to worry, for example, about
> > > whether a
> > > user has requested a specific precision explicitly. This comes up
> > > in
> > > issues
> > > like type-promotion where libraries like JAX and PyTorch have
> > > special
> > > case
> > > logic for most Python types vs NumPy dtypes (but booleans are the
> > > same for
> > > both):
> > > https://jax.readthedocs.io/en/latest/type_promotion.html
> >
> >
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/numpy-discussion


Re: np.{bool,float,int} deprecation

ralfgommers


On Fri, Dec 11, 2020 at 9:47 AM Eric Wieser <[hidden email]> wrote:
> > you might want to discuss this with us at the array API standard
> > https://github.com/data-apis/array-api (which is currently in RFC
> > stage). The spec uses bool as the name for the boolean dtype.
>
> I don't fully understand this argument - `np.bool` is already not the boolean dtype. Either:
>
> * The spec is suggesting that `pkg.bool` be some arbitrary object that can be passed into a dtype argument and will produce a boolean array.
>   If this is the case, the spec could also just require that `dtype=builtins.bool` have this behavior.

Yes, this.

> * The spec is suggesting that `pkg.bool` is some rich dtype object.
>   Ignoring the question of whether this should be `np.bool_` or `np.dtype(np.bool_)`, it's currently neither, and changing it will break users relying on `np.bool(True) is True`.
>   That's not to say this isn't a sensible thing for the specification to have, it's just something that numpy can't conform to without breaking code.

It can have richer behaviour, there's no constraints there - but it's not necessary.

> While it would be great if `np.bool_` could be spelt `np.bool`, I really don't think we can make that change without a long deprecation first (if at all).

Given that that standard API would be in a new namespace (given backwards compat we can't possibly introduce it in the main namespace), there `bool` can be the numpy boolean dtype (if desired).

The key point is that `bool_` is a terrible name, and keeping `np.bool` that you can use as a dtype specifier is desirable.

Cheers,
Ralf

> Eric
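
To illustrate the two readings discussed above, here is a small sketch, assuming only that NumPy is installed. It uses the builtin `bool` and `np.bool_` directly rather than `np.bool`, since what `np.bool` refers to depends on the NumPy version:

```python
import numpy as np

# Reading 1: as a dtype specifier, the builtin and np.bool_ give the same array dtype.
a = np.zeros(3, dtype=bool)
b = np.zeros(3, dtype=np.bool_)
same_dtype = a.dtype == b.dtype

# Reading 2: as scalar types they differ. Calling the builtin returns the
# singleton True, while np.bool_ returns a NumPy scalar object.
python_identity = bool(True) is True
numpy_identity = np.bool_(True) is True
numpy_is_python_bool = isinstance(np.bool_(True), bool)

print(same_dtype, python_identity, numpy_identity, numpy_is_python_bool)
```

This is why code relying on `np.bool(True) is True` would break if the name were rebound to the NumPy scalar type, even though `dtype=` usage would be unaffected.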


Re: np.{bool,float,int} deprecation

ralfgommers
In reply to this post by Sebastian Berg


On Thu, Dec 10, 2020 at 9:00 PM Sebastian Berg <[hidden email]> wrote:
On Thu, 2020-12-10 at 20:38 +0100, Ralf Gommers wrote:
> On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <
> [hidden email]>
> wrote:
>
> > On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <[hidden email]>
> > > wrote:
> > >
> > > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > > <[hidden email]> wrote:
> > > > >
> > > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > > Regarding np.bool specifically, if you want to deprecate
> > > > > > this,
> > > > > > you
> > > > > > might want to discuss this with us at the array API
> > > > > > standard
> > > > > > https://github.com/data-apis/array-api (which is currently
> > > > > > in
> > > > > > RFC
> > > > > > stage). The spec uses bool as the name for the boolean
> > > > > > dtype.
> > > > > >
> > > > > > Would it make sense for NumPy to change np.bool to just be
> > > > > > the
> > > > > > boolean
> > > > > > dtype object? Unlike int and float, there is no ambiguity
> > > > > > with
> > > > > > bool,
> > > > > > and NumPy clearly doesn't have any issues with shadowing
> > > > > > builtin
> > > > > > names
> > > > > > in its namespace.
> > > > >
> > > > > We could keep the Python alias around (which for `dtype=` is
> > > > > the
> > > > > same
> > > > > as `np.bool_`).
> > > > >
> > > > > I am not sure I like the idea of immediately shadowing the
> > > > > builtin.
> > > > > That is a switch we can avoid flipping (without warning);
> > > > > `np.bool_`
> > > > > and `bool` are fairly different beasts? [1]
> > > >
> > > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > > that
> > > > are incompatible with existing ones. It's not something I would
> > > > have
> > > > done personally, but it's been this way for a long time.
> > > >
> > >
> > > It may be defensible to keep np.bool as an alias for Python's
> > > bool
> > > even when we remove the other aliases.
> >
>
> I'd agree with that.
>
>
> > That is true, `int` is probably the most confusing, since it is not
> > at
> > all compatible to a Python integer, but rather the "default"
> > integer
> > (which happens to be the same as C `long` currently).
> >
> > So we could focus on `np.int`, `np.long`.  I am a bit unsure
> > whether
> > you would prefer that or are mainly pointing out the possibility?
> >
>
> Not sure what you mean with focus, focus on describing in the release
> notes? Deprecating `np.int` seems like the most beneficial part of
> this
> whole exercise.
>

I meant limiting the current deprecation to `np.int`, maybe `np.long`,
and a "carefully chosen" set.

Just deprecating `np.int` may make sense. That will already raise awareness, and leaving `np.float` as-is may prevent a lot of churn. And we could then still deprecate `np.float` later. I also don't feel strongly about `float` either way though.

I'm not sure why you'd specifically touch `long`, it's not really relevant and it's not a builtin.

Cheers,
Ralf
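
For readers updating their own code, a rough sketch of what the aliases under discussion resolve to when passed as a dtype, assuming only that NumPy is installed. The `REPLACEMENTS` table is a hypothetical illustration, not something NumPy provides. The deprecated names were plain aliases of the Python builtins, so the builtin can be substituted without changing the resulting dtype:

```python
import numpy as np

# Hypothetical lookup table: deprecated spelling -> builtin it was an alias for.
REPLACEMENTS = {
    "np.bool": bool,
    "np.int": int,
    "np.float": float,
    "np.complex": complex,
}

# Resolve each builtin to the NumPy dtype it produces on this platform.
resolved = {old: np.dtype(new) for old, new in REPLACEMENTS.items()}

for old, dtype in resolved.items():
    print(f"dtype={old:10s} behaved like dtype={REPLACEMENTS[old].__name__:7s} -> {dtype}")
```

Only the `int` entry varies by platform, which fits the suggestion that `np.int` is the alias most worth deprecating.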

