Keep __array_function__ unexposed by default for 1.17?

Marten van Kerkwijk
Hi All,

For 1.17, there has been a big effort, especially by Stephan, to make __array_function__ sufficiently usable that it can be exposed. I think this is great, and still like the idea very much, but its impact on the numpy code base has gotten so big in the most recent PR (gh-13585) that I wonder if we shouldn't reconsider the approach, and at least for 1.17 stick with the status quo. Since that seems to be a bigger question than can be usefully addressed in the PR, I thought I would raise it here.

Specifically, now not only does every numpy function have its dispatcher function, but also internally all numpy function calls are being done via the new `__skip_array_function__` attribute, to avoid further overrides. I think both changes make the code significantly less readable, thus, e.g., making it even harder than it is already to attract new contributors.
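To make the readability concern concrete, here is a toy sketch of the pattern being described. It is not NumPy's actual implementation: the decorator body, `total`, `_total_dispatcher` and the `Logged` class are all made up for illustration, and the real protocol collects every relevant argument type rather than just the first one that carries an override.

```python
import functools


def array_function_dispatch(dispatcher):
    """Toy decorator: NOT NumPy's code, only the shape of the idea."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # The dispatcher returns the arguments that may carry an override.
            for arg in dispatcher(*args, **kwargs):
                override = getattr(type(arg), "__array_function__", None)
                if override is not None:
                    result = override(arg, public_api, (type(arg),), args, kwargs)
                    if result is not NotImplemented:
                        return result
            return implementation(*args, **kwargs)

        # Escape hatch that bypasses any override, as proposed in the thread.
        public_api.__skip_array_function__ = implementation
        return public_api
    return decorator


def _total_dispatcher(values):
    # Every public function needs a companion like this one.
    return (values,)


@array_function_dispatch(_total_dispatcher)
def total(values):
    return sum(values)


class Logged(list):
    """Hypothetical duck array that records the call and then defers."""
    calls = []

    def __array_function__(self, func, types, args, kwargs):
        Logged.calls.append(func.__name__)
        # Defer to the plain implementation without re-entering dispatch.
        return func.__skip_array_function__(*args, **kwargs)
```

Even in this reduced form, each function needs a second dispatcher function, and any internal call that must not be overridden has to be spelled `func.__skip_array_function__(...)`.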

I think with this it is probably time to step back and check whether the implementation is in fact the right one. For instance, among the alternatives we originally considered was one that had the overridable versions of functions in the regular `numpy` namespace, and the ones that would not themselves check in a different one. Alternatively, for some of the benefits provided by `__skip_array_function__`, there was a different suggestion to have a special return value, `NotImplementedButCoercible`. Might these be better after all?

More generally, I think we're suffering from the fact that several of us seem to have rather different final goals in mind. In particular, I'd like to move to a state where as much of the code as possible makes use of the simplest possible implementation, with only a few true base functions, so that all but those simplest functions will generally work on any type of array. Others, however, worry much more about making implementations (even more) part of the API.

All the best,

Marten

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Keep __array_function__ unexposed by default for 1.17?

Juan Nunez-Iglesias
I just want to express my general support for Marten's concerns. As an "interested observer", I've been meaning to give `__array_function__` a try but haven't had the chance yet. So from my anecdotal experience I expect that more people need to play with this before setting the API in stone.

At scikit-image we place a very strong emphasis on code simplicity and readability, so I also share Marten's concerns about code getting too complex. My impression reading the NEP was "whoa, this is hard, I'm glad smarter people than me are working on this, I'm sure it'll get simpler in time". But I haven't seen the simplicity materialise...

Re: Keep __array_function__ unexposed by default for 1.17?

Stephan Hoyer-2
Thanks for raising these concerns.

The full implications of my recent __skip_array_function__ proposal are only now becoming evident to me, looking at its use in GH-13585. Guaranteeing that it does not expand NumPy's API surface seems hard to achieve without pervasive use of __skip_array_function__ internally.

Taking a step back, the sort of minor hacks [1] that motivated __skip_array_function__ for me are annoying, but really not too bad -- they are a small amount of additional code duplication in a proposal that already requires a large amount of code duplication.

So let's roll back the recent NEP change adding __skip_array_function__ to the public interface [2]. Inside the few NumPy functions where __array_function__ causes a measurable performance impact due to repeated calls (most notably np.block, for which some benchmarks are 25% slower), we can make use of the private __wrapped__ attribute.
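As a rough illustration of the `__wrapped__` idea: the standard library's `functools.wraps` stores the undecorated function on the wrapper as `__wrapped__`, so a hot loop can call it directly and skip the per-call check. The sketch below assumes the dispatch decorator is built with `functools.wraps`; `counting_dispatch`, `stack_pair` and `build_block` are hypothetical stand-ins, not NumPy functions.

```python
import functools


def counting_dispatch(implementation):
    """Hypothetical decorator; the counter stands in for the override check."""
    @functools.wraps(implementation)  # this is what sets wrapper.__wrapped__
    def wrapper(*args, **kwargs):
        wrapper.dispatch_checks += 1
        return implementation(*args, **kwargs)

    wrapper.dispatch_checks = 0
    return wrapper


@counting_dispatch
def stack_pair(a, b):
    return [a, b]


def build_block(items):
    # Hot loop: fetch the undecorated function once and call it directly,
    # so no override check runs per iteration.
    raw = stack_pair.__wrapped__
    return [raw(a, b) for a, b in zip(items[::2], items[1::2])]
```

Because `__wrapped__` is a by-product of the decorator rather than a documented interface, code relying on it is tied to that implementation detail.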

I would still like to turn on __array_function__ in NumPy 1.17. At least, let's try that for the release candidate and see how it goes. The "all in" nature of __array_function__ without __skip_array_function__ will both limit its use to cases where it is strongly motivated and limit the API implications for NumPy. There is still plenty of room for expanding the protocol, but it's really hard to see what is necessary (and prudent!) without actual use.





Re: Keep __array_function__ unexposed by default for 1.17?

Marten van Kerkwijk
Hi Stephan,

I'm quite happy with the idea of turning on __array_function__ but postponing any formal solution to getting into the wrapped routines (i.e., one can use __wrapped__, but it is an implementation detail that is not documented and comes with absolutely no guarantees).

That way, 1.17 will be a release where we can think of how to address two different things:
1. Reduce the overhead costs for pure ndarray cases (i.e., mostly within numpy itself);
2. Simplify implementation in outside packages.

On the performance front, I'm not quite sure what the state of the environment variable check is, but is it possible to just flip the default, i.e., for 1.17 one gets __array_function__ support turned on by default, but can turn it off if wanted?
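A minimal sketch of what "flipping the default" could look like, assuming the switch is read from the `NUMPY_EXPERIMENTAL_ARRAY_FUNCTION` environment variable. The helper name and its parameters are made up for illustration; this is not the code in NumPy.

```python
import os


def array_function_enabled(environ=None, default="1"):
    """Return True unless the environment variable turns the protocol off.

    `default` is the value assumed when the variable is unset; changing it
    from "0" to "1" is the "flip" discussed above.
    """
    env = os.environ if environ is None else environ
    raw = env.get("NUMPY_EXPERIMENTAL_ARRAY_FUNCTION", default)
    # int() raises ValueError on anything that is not an integer string.
    return bool(int(raw))
```

With this shape, users who want the old behaviour would set the variable to `0` before importing the library.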

All the best,

Marten

Re: Keep __array_function__ unexposed by default for 1.17?

ralfgommers


On Wed, May 22, 2019 at 9:46 PM Marten van Kerkwijk <[hidden email]> wrote:
Hi Stephan,

I'm quite happy with the idea of turning on __array_function__ but postponing any formal solution to getting into the wrapped routines (i.e., one can use __wrapped__, but it is an implementation detail that is not documented and comes with absolutely no guarantees).

That way, 1.17 will be a release where we can think of how to address two different things:
1. Reduce the overhead costs for pure ndarray cases (i.e., mostly within numpy itself);
2. Simplify implementation in outside packages.

On the performance front, I'm not quite sure what the state of the environment variable check is, but is it possible to just flip the default, i.e., for 1.17 one gets __array_function__ support turned on by default, but can turn it off if wanted?

This would be useful as a safety measure.


All the best,

Marten

On Wed, May 22, 2019 at 11:53 AM Stephan Hoyer <[hidden email]> wrote:
Thanks for raising these concerns.

The full implications of my recent __skip_array_function__ proposal are only now becoming evident to me, looking at its use in GH-13585. Guaranteeing that it does not expand NumPy's API surface seems hard to achieve without pervasive use of __skip_array_function__ internally.

Taking a step back, the sort of minor hacks [1] that motivated __skip_array_function__ for me are annoying, but really not too bad -- they are a small amount of additional code duplication in a proposal that already requires a large amount of code duplication.

So let's roll back the recent NEP change adding __skip_array_function__ to the public interface [2]. Inside the few NumPy functions where __array_function__ causes a measurable performance impact due to repeated calls (most notably np.block, for which some benchmarks are 25% slower), we can make use of the private __wrapped__ attribute.

Thanks Stephan, this sounds good.


I would still like to turn on __array_function__ in NumPy 1.17. At least, let's try that for the release candidate and see how it goes.

I agree. I'd actually suggest flipping the switch asap and see if it causes any issues for projects that test against numpy master in their CI, and the people that like to live on the bleeding edge by installing master into their environment.

Cheers,
Ralf


Re: Keep __array_function__ unexposed by default for 1.17?

Stephan Hoyer-2
On Wed, May 22, 2019 at 2:36 PM Ralf Gommers <[hidden email]> wrote:
I would still like to turn on __array_function__ in NumPy 1.17. At least, let's try that for the release candidate and see how it goes.

I agree. I'd actually suggest flipping the switch asap and see if it causes any issues for projects that test against numpy master in their CI, and the people that like to live on the bleeding edge by installing master into their environment.

The switch has actually already been flipped on master for several months now, as it was for a period in the 1.16 release cycle before we added the off switch. Doing so did turn up a few bugs, e.g., https://github.com/numpy/numpy/issues/12263

We will actually need to re-add the code that checks the environment variable to allow for turning it off, but this isn't a big deal. My main concern is that this adds some complexity for third-party projects in detecting whether __array_function__ is enabled or not. They can't just use the NumPy version and will need to check the environment variable as well, or actually try using it on an example object.

If we want to keep an "off" switch we might want to add some sort of API for exposing whether NumPy is using __array_function__ or not. Maybe numpy.__experimental_array_function_enabled__ = True, so you can just test `hasattr(numpy, '__experimental_array_function_enabled__')`? This is assuming that we are OK with adding an underscore attribute to NumPy's namespace semi-indefinitely.
 

Re: Keep __array_function__ unexposed by default for 1.17?

Marten van Kerkwijk

If we want to keep an "off" switch we might want to add some sort of API for exposing whether NumPy is using __array_function__ or not. Maybe numpy.__experimental_array_function_enabled__ = True, so you can just test `hasattr(numpy, '__experimental_array_function_enabled__')`? This is assuming that we are OK with adding an underscore attribute to NumPy's namespace semi-indefinitely.

Might this be overthinking it? I might use this myself on supercomputer runs where I know that I'm using arrays only. Though one should not extrapolate from oneself!

That said, it is not difficult as is. For instance, we could explain in the docs that one can tell from:
```
enabled = hasattr(np.core, 'overrides') and np.core.overrides.ENABLE_ARRAY_FUNCTION
```
One could even allow for eventual removal by explaining it should be,
```
enabled = hasattr(np.core, 'overrides') and getattr(np.core.overrides, 'ENABLE_ARRAY_FUNCTION', True)
```
(If I understand correctly, one cannot tell from the presence of `ndarray.__array_function__`, correct?)
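The second expression can be wrapped in a small helper to make its three outcomes explicit. This is only a sketch: `detect_array_function` is a hypothetical name, and it takes the module as an argument so that it can be tried on stand-in objects without importing NumPy.

```python
from types import SimpleNamespace


def detect_array_function(np_module):
    """Mirror the check above for any object shaped like the numpy module."""
    core = getattr(np_module, "core", None)
    overrides = getattr(core, "overrides", None)
    if overrides is None:
        # No overrides module at all: the protocol predates this version.
        return False
    # A missing flag is read as "always on", which allows for later removal.
    return bool(getattr(overrides, "ENABLE_ARRAY_FUNCTION", True))


# Stand-ins for three situations (not real NumPy modules).
no_protocol = SimpleNamespace(core=SimpleNamespace())
switched_off = SimpleNamespace(
    core=SimpleNamespace(overrides=SimpleNamespace(ENABLE_ARRAY_FUNCTION=False))
)
flag_removed = SimpleNamespace(core=SimpleNamespace(overrides=SimpleNamespace()))
```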

-- Marten


Re: Keep __array_function__ unexposed by default for 1.17?

ralfgommers


On Thu, May 23, 2019 at 3:02 AM Marten van Kerkwijk <[hidden email]> wrote:

If we want to keep an "off" switch we might want to add some sort of API for exposing whether NumPy is using __array_function__ or not. Maybe numpy.__experimental_array_function_enabled__ = True, so you can just test `hasattr(numpy, '__experimental_array_function_enabled__')`? This is assuming that we are OK with adding an underscore attribute to NumPy's namespace semi-indefinitely.

I don't think we want to add or document anything publicly. That only adds to the configuration problem, and indeed makes it harder to rely on the issue. All I was suggesting was keeping some (private) safety switch in the code base for a while in case of real issues as a workaround.



Might this be overthinking it? I might use this myself on supercomputer runs where I know that I'm using arrays only. Though one should not extrapolate from oneself!

That said, it is not difficult as is. For instance, we could explain in the docs that one can tell from:
```
enabled = hasattr(np.core, 'overrides') and np.core.overrides.ENABLE_ARRAY_FUNCTION
```
One could even allow for eventual removal by explaining it should be,
```
enabled = hasattr(np.core, 'overrides') and getattr(np.core.overrides, 'ENABLE_ARRAY_FUNCTION', True)
```
(If I understand correctly, one cannot tell from the presence of `ndarray.__array_function__`, correct?)

I think a hasattr check for __array_function__ is right.

Ralf


Re: Keep __array_function__ unexposed by default for 1.17?

Stephan Hoyer-2
On Thu, May 23, 2019 at 2:43 AM Ralf Gommers <[hidden email]> wrote:


On Thu, May 23, 2019 at 3:02 AM Marten van Kerkwijk <[hidden email]> wrote:

If we want to keep an "off" switch we might want to add some sort of API for exposing whether NumPy is using __array_function__ or not. Maybe numpy.__experimental_array_function_enabled__ = True, so you can just test `hasattr(numpy, '__experimental_array_function_enabled__')`? This is assuming that we are OK with adding an underscore attribute to NumPy's namespace semi-indefinitely.

I don't think we want to add or document anything publicly. That only adds to the configuration problem, and indeed makes it harder to rely on the issue. All I was suggesting was keeping some (private) safety switch in the code base for a while in case of real issues as a workaround.

I was concerned that libraries like dask might have different behavior internally depending upon whether or not __array_function__ is enabled, but looking more carefully, dask only does this detection for tests. So maybe this is not needed.

Still, I'm concerned about the potential broader implications of making it possible to turn this off. In general, I don't think NumPy should have configurable global state -- it opens up the possibility of a whole class of issues. Stefan van der Walt raised this point when this "off switch" was suggested a few months ago:

That said, I'd be OK with keeping around an environment variable as an emergency opt-out for now, especially to support benchmarking the impact of __array_function__ checks.

But I would definitely be opposed to keeping this switch around long term, for more than a major version or two. If there will be an outcry when we remove checks for NUMPY_EXPERIMENTAL_ARRAY_FUNCTION, then we should reconsider the entire __array_function__ approach.
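A minimal sketch of what such an emergency opt-out check could look like. The variable name NUMPY_EXPERIMENTAL_ARRAY_FUNCTION is taken from this thread; the helper `array_function_requested` is hypothetical, and the parsing rule (an integer flag where 0 means "off", enabled when unset) is an assumption rather than something stated here.

```python
import os

# Variable name as mentioned in the thread.
ENV_NAME = "NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"


def array_function_requested(environ=None, default=True):
    """Return whether the environment asks for the protocol to be enabled.

    Hypothetical helper. Assumption: the value is an integer flag,
    with 0 meaning "off" and any other integer meaning "on".
    """
    if environ is None:
        environ = os.environ
    raw = environ.get(ENV_NAME)
    if raw is None:
        # Variable not set: fall back to the default behavior.
        return default
    return bool(int(raw))


if __name__ == "__main__":
    print(array_function_requested())
```

Because the switch lives in the environment rather than in a setter function, it cannot be flipped halfway through a program, which is the main difference from configurable global state inside the library.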

Might this be overthinking it? I might use this myself on supercomputer runs where I know that I'm using arrays only. Though one should not extrapolate from oneself!

That said, it is not difficult as is. For instance, we could explain in the docs that one can tell from:
```
enabled = hasattr(np.core, 'overrides') and np.core.overrides.ENABLE_ARRAY_FUNCTION
```
One could even allow for eventual removal by explaining it should be,
```
enabled = hasattr(np.core, 'overrides') and getattr(np.core.overrides, 'ENABLE_ARRAY_FUNCTION', True)
```
(If I understand correctly, one cannot tell from the presence of `ndarray.__array_function__`, correct?)

I think a hasattr check for __array_function__ is right.

We define ndarray.__array_function__ (even on NumPy 1.16) regardless of whether __array_function__ is enabled or not.

In principle we could have checked the environment variable from C before defining the method, but it's too late for that now.


Re: Keep __array_function__ unexposed by default for 1.17?

Hameer Abbasi
On Thu, 2019-05-23 at 10:19 -0700, Stephan Hoyer wrote:
On Thu, May 23, 2019 at 2:43 AM Ralf Gommers <[hidden email]> wrote:


On Thu, May 23, 2019 at 3:02 AM Marten van Kerkwijk <[hidden email]> wrote:

If we want to keep an "off" switch we might want to add some sort of API for exposing whether NumPy is using __array_function__ or not. Maybe numpy.__experimental_array_function_enabled__ = True, so you can just test `hasattr(numpy, '__experimental_array_function_enabled__')`? This is assuming that we are OK with adding an underscore attribute to NumPy's namespace semi-indefinitely.


I don't think we want to add or document anything publicly. That only adds to the configuration problem, and indeed makes it harder to rely on the issue. All I was suggesting was keeping some (private) safety switch in the code base for a while in case of real issues as a workaround.


I was concerned that libraries like dask might have different behavior internally depending upon whether or not __array_function__ is enabled, but looking more carefully, dask only does this detection for tests. So maybe this is not needed.

Still, I'm concerned about the potential broader implications of making it possible to turn this off. In general, I don't think NumPy should have configurable global state -- it opens up the possibility of a whole class of issues. Stefan van der Walt raised this point when this "off switch" was suggested a few months ago:

I agree -- Global mutable state is bad in general, but keeping around the environment variable is okay.


That said, I'd be OK with keeping around an environment variable as an emergency opt-out for now, especially to support benchmarking the impact of __array_function__ checks.

+1 for keeping the env var for now.


But I would definitely be opposed to keeping this switch around long term, for more than a major version or two. If there will be an outcry when we remove checks for NUMPY_EXPERIMENTAL_ARRAY_FUNCTION, then we should reconsider the entire __array_function__ approach.

Might this be overthinking it? I might use this myself on supercomputer runs where I know that I'm using arrays only. Though one should not extrapolate from oneself!

That said, it is not difficult as is. For instance, we could explain in the docs that one can tell from:
```
enabled = hasattr(np.core, 'overrides') and np.core.overrides.ENABLE_ARRAY_FUNCTION
```
One could even allow for eventual removal by explaining it should be,
```
enabled = hasattr(np.core, 'overrides') and getattr(np.core.overrides, 'ENABLE_ARRAY_FUNCTION', True)
```
(If I understand correctly, one cannot tell from the presence of `ndarray.__array_function__`, correct?)


I think a hasattr check for __array_function__ is right.


We define ndarray.__array_function__ (even on NumPy 1.16) regardless of whether __array_function__ is enabled or not.

In principle we could have checked the environment variable from C before defining the method, but it's too late for that now.

I disagree here: In principle the only people relying on this would be the same ones relying on the functionality of this protocol, so this would be an easy change to undo, if at all needed. I do not know of any libraries that actually use/call the __array_function__ attribute other than NumPy, when it isn't enabled.


Re: Keep __array_function__ unexposed by default for 1.17?

Hameer Abbasi
In reply to this post by Stephan Hoyer-2
On Wed, 2019-05-22 at 08:52 -0700, Stephan Hoyer wrote:
Thanks for raising these concerns.

The full implications of my recent __skip_array_function__ proposal are only now becoming evident to me, looking at its use in GH-13585. Guaranteeing that it does not expand NumPy's API surface seems hard to achieve without pervasive use of __skip_array_function__ internally.

Taking a step back, the sort of minor hacks [1] that motivated __skip_array_function__ for me are annoying, but really not too bad -- they are a small amount of additional code duplication in a proposal that already requires a large amount of code duplication.

So let's roll back the recent NEP change adding __skip_array_function__ to the public interface [2]. Inside the few NumPy functions where __array_function__ causes a measurable performance impact due to repeated calls (most notably np.block, for which some benchmarks are 25% slower), we can make use of the private __wrapped__ attribute.
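To illustrate the idea of bypassing override checks through `__wrapped__`, here is a toy sketch. It is not NumPy's dispatch machinery: the decorator `with_override_check`, the `__toy_override__` hook, and the `Logged` class are all hypothetical names. The only fact relied on is that `functools.wraps` stores the undecorated function on the wrapper as `__wrapped__`.

```python
import functools


def with_override_check(func):
    """Toy decorator: let an argument's type intercept the call."""

    @functools.wraps(func)  # also sets public_api.__wrapped__ = func
    def public_api(*args, **kwargs):
        for arg in args:
            # Hypothetical hook name, standing in for an override protocol.
            override = getattr(type(arg), "__toy_override__", None)
            if override is not None:
                return override(arg, public_api, args, kwargs)
        return func(*args, **kwargs)

    return public_api


@with_override_check
def total(values):
    return sum(values)


class Logged(list):
    """List subclass that intercepts calls made through the public API."""

    def __toy_override__(self, func, args, kwargs):
        # Calling func(...) here would re-enter the check and recurse;
        # func.__wrapped__ reaches the undecorated implementation instead.
        return ("overridden", func.__wrapped__(*args, **kwargs))


if __name__ == "__main__":
    print(total([1, 2, 3]))
    print(total(Logged([1, 2, 3])))
    print(total.__wrapped__(Logged([1, 2, 3])))
```

Internal callers that already know they hold plain inputs can call `total.__wrapped__` directly and skip the per-call check, which is the kind of saving that matters for functions that call other wrapped functions many times.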

I would still like to turn on __array_function__ in NumPy 1.17. At least, let's try that for the release candidate and see how it goes. The "all in" nature of __array_function__ without __skip_array_function__ will both limit its use to cases where it is strongly motivated, and also limit the API implications for NumPy. There is still plenty of room for expanding the protocol, but it's really hard to see what is necessary (and prudent!) without actual use.

Agreed that we should turn it on for 1.17 RC, and see if there are any complaints.


On Tue, May 21, 2019 at 11:44 PM Juan Nunez-Iglesias <[hidden email]> wrote:
I just want to express my general support for Marten's concerns. As an "interested observer", I've been meaning to give `__array_function__` a try but haven't had the chance yet. So from my anecdotal experience I expect that more people need to play with this before setting the API in stone.

At scikit-image we place a very strong emphasis on code simplicity and readability, so I also share Marten's concerns about code getting too complex. My impression reading the NEP was "whoa, this is hard, I'm glad smarter people than me are working on this, I'm sure it'll get simpler in time". But I haven't seen the simplicity materialise...

On Wed, 22 May 2019, at 11:31 AM, Marten van Kerkwijk wrote:
Hi All,

For 1.17, there has been a big effort, especially by Stephan, to make __array_function__ sufficiently usable that it can be exposed. I think this is great, and still like the idea very much, but its impact on the numpy code base has gotten so big in the most recent PR (gh-13585) that I wonder if we shouldn't reconsider the approach, and at least for 1.17 stick with the status quo. Since that seems to be a bigger question than can be usefully addressed in the PR, I thought I would raise it here.

Specifically, now not only does every numpy function have its dispatcher function, but also internally all numpy function calls are being done via the new `__skip_array_function__` attribute, to avoid further overrides. I think both changes make the code significantly less readable, thus, e.g., making it even harder than it is already to attract new contributors.

I think with this it is probably time to step back and check whether the implementation is in fact the right one. For instance, among the alternatives we originally considered was one that had the overridable versions of functions in the regular `numpy` namespace, and the ones that would not themselves check in a different one. Alternatively, for some of the benefits provided by `__skip_array_function__`, there was a different suggestion to have a special return value, of `NotImplementedButCoercible`. Might these be better after all?

More generally, I think we're suffering from the fact that several of us seem to have rather different final goals in mind. In particular, I'd like to move to a state where as much of the code as possible makes use of the simplest possible implementation, with only a few true base functions, so that all but those simplest functions will generally work on any type of array. Others, however, worry much more about making implementations (even more) part of the API.

All the best,

Marten