Type resolver related deprecations and changes

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Type resolver related deprecations and changes

Sebastian Berg
Hi all,

I have to do some changes to the type resolution, and I started these
here: https://github.com/numpy/numpy/pull/18718

There are four changes:

* Deprecate `signature="l"` and `signature=("l",)`, these are confusing
  since the signature should include all inputs and outputs.  To only
  provide the output use `dtype="l"`.

* Using `dtype=` for comparisons (e.g. `np.equal`) used to be weird:

      np.equal(1, 2, dtype=object)  -> returns boolean
      np.equal(None, 2, dtype=object)  -> returns object array

  The first one will now give a FutureWarning. Comparisons that provide
  a dtype other than object or bool give a DeprecationWarning (or
  fail).
  I hope the warning can be preserved when more refactoring happens.

* NumPy *almost* always ignores any metadata, byte-order, time unit
  information from the `dtype` or `signature` arguments to ufuncs.
  Practically, the dtypes passed actually denote the DType types
  rather than the specific instance (which could be byte swapped).

  NumPy will now do this always and give a warning if byte-order or
  time unit is ignored!

* It is THEORETICALLY possible to call `ufunc->type_resolver` in the C
  API (as opposed to providing it, which is somewhat OK).
  If someone does that they have to normalize the type tuple now, I
  don't really see a reason for keeping support, when NumPy will stop
  calling it itself almost always and anyone using it would probably be
  in trouble soon.
  To be clear: I have NOT found a single instance of such code in a
  code search. Even *providing* it – which is much more reasonable –
  is probably only done by astropy/pyerfa.



** Long example for the "time unit" dropping change **

For the third point, which is in theory the largest impact. Both pandas
and astropy do not notice it (I also grepped scipy, its clean).
These are the biggest changes:

    # The following will now warn on most systems (unchanged result):
    np.add(3, 5, dtype=">i32")

    # The biggest impact is for timedelta or datetimes:
    arr = np.arange(10, dtype="m8[s]")
    # The examples always ignored the time unit "ns" (using the
    # unit of `arr`.  They now issue a warning:
    np.add(arr, arr, dtype="m8[ns]")
    np.maximum.reduce(arr, dtype="m8[ns]")

    # The following issue a warning but previously did return
    # a "ns" result.
    np.add(3, 5, dtype="m8[ns]")  # Now return generic time units
    np.maximum(arr, arr, dtype="m8[ns]")  # Now returns "s" (from `arr`)


I doubt there is a good way to just keep the old behaviour.  It is
hopelessly inconsistent.  (The result even depends on how you pass
things, as the paths I deprecate align with `dtype=` but not with the
`signature=(None, None, dtype)` equivalent call.)
One thought I had just raising a hard error instead of a UserWarning
right now.

If you say that why not have `dtype=...` always be honored as the
correct output dtype, I don't disagree.  But it seems to me we probably
would have wade through a FutureWarning first, so a warning is the
"right direction".  (Or just add an `output_dtypes` keyword argument.)

Cheers,

Sebastian


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Type resolver related deprecations and changes

Sebastian Berg
On Fri, 2021-04-02 at 15:46 -0500, Sebastian Berg wrote:
> Hi all,
>
> I have to do some changes to the type resolution, and I started these
> here: https://github.com/numpy/numpy/pull/18718
>

Just to keep everyone updated.  We discussed this yesterdays meeting,
and settled with going with these changes and going directly to an
error (instead of warning).
This may mean that some users have to update their code, but in that
case the code is likely to do something unexpected already.  I expect
very few users to be affected, so please let us know if you are.

If anyone notices issues with any of these changes I am happy to look
whether we can mitigate the issue or avoid/delay a specific change!

In general these changes are necessary for refactors with regards to
new DTypes NEP 41, 42, and specifically the draft NEP 43.  (In large
parts this is because NumPy is currently inconsistent.)

Cheers,

Sebastian


> There are four changes:
>
> * Deprecate `signature="l"` and `signature=("l",)`, these are
> confusing
>   since the signature should include all inputs and outputs.  To only
>   provide the output use `dtype="l"`.
>
> * Using `dtype=` for comparisons (e.g. `np.equal`) used to be weird:
>
>       np.equal(1, 2, dtype=object)  -> returns boolean
>       np.equal(None, 2, dtype=object)  -> returns object array
>
>   The first one will now give a FutureWarning. Comparisons that
> provide
>   a dtype other than object or bool give a DeprecationWarning (or
>   fail).
>   I hope the warning can be preserved when more refactoring happens.
>
> * NumPy *almost* always ignores any metadata, byte-order, time unit
>   information from the `dtype` or `signature` arguments to ufuncs.
>   Practically, the dtypes passed actually denote the DType types
>   rather than the specific instance (which could be byte swapped).
>
>   NumPy will now do this always and give a warning if byte-order or
>   time unit is ignored!
>
> * It is THEORETICALLY possible to call `ufunc->type_resolver` in the
> C
>   API (as opposed to providing it, which is somewhat OK).
>   If someone does that they have to normalize the type tuple now, I
>   don't really see a reason for keeping support, when NumPy will stop
>   calling it itself almost always and anyone using it would probably
> be
>   in trouble soon.
>   To be clear: I have NOT found a single instance of such code in a
>   code search. Even *providing* it – which is much more reasonable –
>   is probably only done by astropy/pyerfa.
>
>
>
> ** Long example for the "time unit" dropping change **
>
> For the third point, which is in theory the largest impact. Both
> pandas
> and astropy do not notice it (I also grepped scipy, its clean).
> These are the biggest changes:
>
>     # The following will now warn on most systems (unchanged result):
>     np.add(3, 5, dtype=">i32")
>
>     # The biggest impact is for timedelta or datetimes:
>     arr = np.arange(10, dtype="m8[s]")
>     # The examples always ignored the time unit "ns" (using the
>     # unit of `arr`.  They now issue a warning:
>     np.add(arr, arr, dtype="m8[ns]")
>     np.maximum.reduce(arr, dtype="m8[ns]")
>
>     # The following issue a warning but previously did return
>     # a "ns" result.
>     np.add(3, 5, dtype="m8[ns]")  # Now return generic time units
>     np.maximum(arr, arr, dtype="m8[ns]")  # Now returns "s" (from
> `arr`)
>
>
> I doubt there is a good way to just keep the old behaviour.  It is
> hopelessly inconsistent.  (The result even depends on how you pass
> things, as the paths I deprecate align with `dtype=` but not with the
> `signature=(None, None, dtype)` equivalent call.)
> One thought I had just raising a hard error instead of a UserWarning
> right now.
>
> If you say that why not have `dtype=...` always be honored as the
> correct output dtype, I don't disagree.  But it seems to me we
> probably
> would have wade through a FutureWarning first, so a warning is the
> "right direction".  (Or just add an `output_dtypes` keyword
> argument.)
>
> Cheers,
>
> Sebastian
>
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

signature.asc (849 bytes) Download Attachment