__skip_array_function__ discussion summary


Sebastian Berg
Hi all,

This is an attempt from me to wrap up the discussion a bit so that
others can chime in if they want to.

NumPy 1.17 will ship with `__array_function__`, a way for array-like
projects (dask, cupy) to override almost all numpy functions [0]. This
addition is uncontroversial.
NumPy 1.17 will _not_ ship with `__skip_array_function__`, following
a longer discussion. For those interested, I tried to give a very short
overview of the topic below.
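
For readers who have not used the protocol yet, the following is a minimal sketch of an `__array_function__` override as it behaves in NumPy 1.17 and later. The class `MyArray` and its `data` attribute are made up for illustration; only `np.ones_like`, `np.zeros_like`, `np.ones` and `np.asarray` are real NumPy functions.

```python
import numpy as np


class MyArray:
    """Hypothetical array-like that wraps a plain ndarray."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like:
            # Our own implementation: keep the result wrapped in MyArray.
            return MyArray(np.ones(self.data.shape, dtype=self.data.dtype))
        # Anything else is unsupported; NumPy then raises TypeError.
        return NotImplemented


x = MyArray([1.0, 2.0, 3.0])
result = np.ones_like(x)      # dispatched to MyArray.__array_function__

try:
    np.zeros_like(x)          # not handled above
except TypeError:
    unsupported = True
else:
    unsupported = False
```

Note that every function the array-like wants to support needs its own branch here, which is the burden `__skip_array_function__` was meant to reduce.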


The discussion here is around the addition of `__skip_array_function__`
which would allow code to use:

np.ones_like.__skip_array_function__(*args)

to reuse the current implementation in numpy (i.e. directly call the
current code). This can simplify things drastically for some array-
likes, since they do not have to provide an alternative implementation.
However, PR-13585 [1] sparked a more detailed discussion, since it was
going to add the use of `__skip_array_function__` internally in numpy
[2].

The issue is exposure of implementation details. If we do not use it
internally, a user may implement their own `np.empty_like` and rely on
`np.ones_like` to use `np.empty_like` [3] internally. Thus,
`np.ones_like(my_array_like)` can work without `my_array_like` having
any special code for `np.ones_like`.

The PR exposes the issue that if `np.ones_like` is changed to call
`np.empty_like.__skip_array_function__` internally, this will break the
user's `my_array_like` (it will no longer call their own `np.empty_like`
implementation).
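
To make this concrete, here is a toy model of the situation in pure Python. It does not use NumPy: `overridable`, `ones_like_changed` and `MyArray` are made-up names, plain lists stand in for arrays, and the dispatch logic is a heavily simplified assumption about how such a protocol behaves, not NumPy's actual machinery.

```python
import functools


def overridable(impl):
    """Toy stand-in for a dispatch wrapper (not NumPy's real machinery)."""
    @functools.wraps(impl)
    def public(x):
        handler = getattr(type(x), "__array_function__", None)
        if handler is not None:
            return handler(x, public, (type(x),), (x,), {})
        return impl(x)
    # The proposed attribute: direct access to the undispatched implementation.
    public.__skip_array_function__ = impl
    return public


@overridable
def empty_like(x):
    return [None] * len(x)


@overridable
def ones_like(x):
    res = empty_like(x)                          # public call: dispatches on type(x)
    res[:] = [1] * len(res)
    return res


@overridable
def ones_like_changed(x):
    res = empty_like.__skip_array_function__(x)  # internal change: no dispatch
    res[:] = [1] * len(res)
    return res


class MyArray:
    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __setitem__(self, key, value):
        self.data[key] = value

    def __array_function__(self, func, types, args, kwargs):
        if func is empty_like:
            return MyArray([None] * len(self))
        # No special code for anything else: reuse the library implementation.
        return func.__skip_array_function__(*args, **kwargs)


works = ones_like(MyArray([5, 6, 7]))
broken = ones_like_changed(MyArray([5, 6, 7]))
```

Here `works` is a `MyArray`, while `broken` comes back as a plain list, because the user's `empty_like` override was bypassed by the internal change.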

We could expect users to fix up such breaking changes, but it exposes
how fragile the interaction of user types using
`__skip_array_function__` and changes in the specific implementation
used by numpy can be in some cases.

The second option would be to make sure we use
`__skip_array_function__` internally, so that users cannot expect
`np.ones_like` to work merely because they made `np.empty_like` work in
the above example (this way the "API surface" of NumPy does not grow).

The downside is that the numpy code itself becomes less readable
if we use `__skip_array_function__` internally in many/all places.

Those two options further have very different goals in mind for the
final usage of the protocol. So right now the solution is to step
back, not include the addition, and rather gain experience with the
NumPy 1.17 release, which includes `__array_function__` but not
`__skip_array_function__`.


I hope this helps those interested who did not follow the full
discussion; I can't say I feel I am very good at summarizing. For details
I encourage you to have a look at the PR discussion and the recent
mails to the list.

Best,

Sebastian


[0] http://www.numpy.org/neps/nep-0018-array-function-protocol.html#implementations-in-terms-of-a-limited-core-api
[1] https://github.com/numpy/numpy/pull/13585
[2] Mostly for slight optimization.
[3] It also uses `np.copyto` which can be overridden as well.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: __skip_array_function__ discussion summary

Stefan van der Walt
On Thu, 23 May 2019 14:33:17 -0700, Sebastian Berg wrote:
> Those two options further have very different goals in mind for the
> final usage of the protocol. So that right now the solution is to step
> back, not include the addition and rather gain experience with the
> NumPy 1.17 release that includes `__array_function__` but not
> `__skip_array_function__`.

To emphasize how this solves the API exposure problem:

If `__skip_array_function__` is being made available, the user can
implement `ones_like` for their custom class as:

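class MyArray:
    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like:
            # Reuse NumPy's own implementation via the proposed attribute.
            return np.ones_like.__skip_array_function__(*args, **kwargs)
        return NotImplemented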

Without it, they are forced to reimplement `ones_like` from scratch.
This ensures that they never rely on any internal behavior of
`np.ones_like`, which may change at any time to break for their custom
array class.

Here's a concrete example:

The user wants to override `ones_like` and `zeros_like` for their custom
array.  They implement it as follows:

class MyArray:
    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like:
            return np.ones_like.__skip_array_function__(*args, **kwargs)
        elif func is np.zeros_like:
            return MyArray(...)
        return NotImplemented

Would this work?  Well, it depends on how NumPy implements `ones_like`
internally.  If NumPy used `__skip_array_function__` consistently
throughout, it would not work:

def ones_like(x):  # sketch of np.ones_like inside NumPy
    y = np.zeros_like.__skip_array_function__(x)
    y.fill(1)
    return y

If, instead, the implementation was

def ones_like(x):  # sketch of np.ones_like inside NumPy
    y = np.zeros_like(x)
    y.fill(1)
    return y

it would work.  *BUT*, it would be brittle, because our internal
implementation may easily change to:

def ones_like(x):  # sketch of np.ones_like inside NumPy
    y = np.empty_like(x)
    y.fill(1)
    return y

And if `empty_like` isn't implemented by MyArray, this would break.


The workaround that Stephan Hoyer mentioned (and that will have to be
used in 1.17) is that you can still use the NumPy machinery to operate
on pure arrays:

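class MyArray:
    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like:
            # Convert to a plain ndarray first, so NumPy does not dispatch back to us.
            x_arr = np.asarray(args[0])
            ones = np.ones_like(x_arr, *args[1:], **kwargs)
            return MyArray.from_array(ones)
        return NotImplemented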
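
A fuller, runnable sketch of this workaround might look as follows. It assumes a hypothetical wrapper class that stores an ndarray in a `data` attribute and exposes it through `__array__`; `from_array` mirrors the name used above, and none of this is meant as the one correct way to structure such a class.

```python
import numpy as np


class MyArray:
    """Hypothetical wrapper class around a plain ndarray."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @classmethod
    def from_array(cls, arr):
        return cls(arr)

    def __array__(self, dtype=None, copy=None):
        # Lets np.asarray(my_array) hand back the underlying ndarray.
        return self.data if dtype is None else self.data.astype(dtype)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like:
            x_arr = np.asarray(args[0])      # plain ndarray from here on
            ones = np.ones_like(x_arr, *args[1:], **kwargs)
            return MyArray.from_array(ones)
        return NotImplemented


x = MyArray([[1, 2], [3, 4]])
y = np.ones_like(x)                 # MyArray holding [[1, 1], [1, 1]]
z = np.ones_like(x, dtype=float)    # keyword arguments are passed through
```

Because the NumPy function only ever sees a plain ndarray, this does not depend on how `np.ones_like` is implemented internally.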

Stéfan


Re: __skip_array_function__ discussion summary

Marten van Kerkwijk
Hi Sebastian, Stéfan,

Thanks for the very good summaries!

An additional item worth mentioning is that by using `__skip_array_function__` everywhere inside, one minimizes the performance penalty of checking for `__array_function__`. It would obviously be worth trying to do that, but ideally in a way that is much less intrusive.

Furthermore, it became clear that there were different pictures of the final goal, with quite a bit of discussion about the relative benefits of trying to limit exposure of the internal API and of, conversely, trying to (incrementally) move to implementations that are maximally re-usable (using duck-typing), which are themselves based around a smaller core (more in line with Nathaniel's NEP-22).

In the latter respect, Stéfan's example is instructive. The real implementation of `ones_like` is:
```
def ones_like(a, dtype=None, order='K', subok=True, shape=None):
    res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape)
    multiarray.copyto(res, 1, casting='unsafe')
    return res
```

The first step here seems obvious: an "empty_like" function would seem to belong in the core.
The second step less so: Stéfan's `res.fill(1)` seems more logical, as surely a class's method is the optimal way to do something. Though I do feel `.fill` itself breaks "There should be one-- and preferably only one --obvious way to do it." So, I'd want to replace it with `res[...] = 1`, so that one relies on the more obvious `__setitem__`. (Note that all are equally fast even now.)
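
The three ways of filling mentioned here can be compared directly on a plain ndarray. This is only a small sketch using public NumPy calls (`np.empty_like`, `np.copyto`, `ndarray.fill` and item assignment); the variable names are arbitrary.

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

via_copyto = np.empty_like(a)
np.copyto(via_copyto, 1, casting="unsafe")   # as in the implementation quoted above

via_fill = np.empty_like(a)
via_fill.fill(1)                             # ndarray method

via_setitem = np.empty_like(a)
via_setitem[...] = 1                         # relies only on __setitem__
```

All three arrays end up equal to `np.ones_like(a)`; the difference is only in which hook a duck array would have to provide.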

Of course, in this idealized future, there would be little reason to even allow `ones_like` to be overridden with __array_function__...

All the best,

Marten


Re: __skip_array_function__ discussion summary

Stephan Hoyer-2
Sebastian, Stefan and Marten -- thanks for the excellent summaries of the discussion.

In line with this consensus, I have drafted a revision of the NEP without __skip_array_function__: https://github.com/numpy/numpy/pull/13624


On Thu, May 23, 2019 at 5:28 PM Marten van Kerkwijk <[hidden email]> wrote: