NumPy 1.17.0rc1 released


NumPy 1.17.0rc1 released

Charles R Harris
Hi All,

On behalf of the NumPy team I am pleased to announce the release of NumPy 1.17.0rc1. The 1.17 release contains a number of new features that should substantially improve its performance and usefulness. The supported Python versions are 3.5-3.7; note that Python 2.7 has been dropped. Python 3.8b1 should work with the released source packages, but there are no guarantees about future releases. Highlights of this release are:
  • A new extensible random module, with four selectable random number generators and improved seeding designed for use in parallel processes, has been added. The currently available bit generators are MT19937, PCG64, Philox, and SFC64.
  • NumPy's FFT implementation was changed from fftpack to pocketfft, resulting in faster, more accurate transforms and better handling of datasets of prime length.
  • New radix sort and timsort sorting methods. It is currently not possible to choose which will be used; they are hardwired to the datatype and used when either ``stable`` or ``mergesort`` is passed as the method.
  • Overriding NumPy functions is now possible by default.
Downstream developers should use Cython >= 0.29.10 for Python 3.8 support and OpenBLAS >= 0.3.7 (not yet released) to avoid problems on the Skylake architecture. The NumPy wheels on PyPI are built from the OpenBLAS development branch in order to avoid those problems. Wheels for this release can be downloaded from PyPI; source archives and release notes are available from GitHub.
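As a quick, hedged illustration of the first and third highlights (assuming NumPy >= 1.17; the seed value below is arbitrary):

```python
import numpy as np
from numpy.random import Generator, PCG64

# New-style random API: a Generator wrapping one of the four bit generators.
rng = Generator(PCG64(12345))
sample = rng.standard_normal(3)
print(sample.shape)                  # (3,)

# 'stable' now maps to radix sort for integer dtypes (per the release notes).
arr = np.array([3, 1, 2], dtype=np.int64)
print(np.sort(arr, kind='stable'))   # [1 2 3]
```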

Contributors

A total of 142 people contributed to this release.  People with a "+" by their
names contributed a patch for the first time.
  • Aaron Voelker +
  • Abdur Rehman +
  • Abdur-Rahmaan Janhangeer +
  • Abhinav Sagar +
  • Adam J. Stewart +
  • Adam Orr +
  • Albert Thomas +
  • Alex Watt +
  • Alexander Blinne +
  • Alexander Shadchin
  • Allan Haldane
  • Ander Ustarroz +
  • Andras Deak
  • Andreas Schwab
  • Andrew Naguib +
  • Andy Scholand +
  • Ankit Shukla +
  • Anthony Sottile
  • Antoine Pitrou
  • Antony Lee
  • Arcesio Castaneda Medina +
  • Assem +
  • Bernardt Duvenhage +
  • Bharat Raghunathan +
  • Bharat123rox +
  • Bran +
  • Bruce Merry +
  • Charles Harris
  • Chirag Nighut +
  • Christoph Gohlke
  • Christopher Whelan +
  • Chuanzhu Xu +
  • Daniel Hrisca
  • Daniel Lawrence +
  • Debsankha Manik +
  • Dennis Zollo +
  • Dieter Werthmüller +
  • Dominic Jack +
  • EelcoPeacs +
  • Eric Larson
  • Eric Wieser
  • Fabrice Fontaine +
  • Gary Gurlaskie +
  • Gregory Lee +
  • Gregory R. Lee
  • Hameer Abbasi
  • Haoyu Sun +
  • He Jia +
  • Hunter Damron +
  • Ian Sanders +
  • Ilja +
  • Isaac Virshup +
  • Isaiah Norton +
  • Jaime Fernandez
  • Jakub Wilk
  • Jan S. (Milania1) +
  • Jarrod Millman
  • Javier Dehesa +
  • Jeremy Lay +
  • Jim Turner +
  • Jingbei Li +
  • Joachim Hereth +
  • John Belmonte +
  • John Kirkham
  • John Law +
  • Jonas Jensen
  • Joseph Fox-Rabinovitz
  • Joseph Martinot-Lagarde
  • Josh Wilson
  • Juan Luis Cano Rodríguez
  • Julian Taylor
  • Jérémie du Boisberranger +
  • Kai Striega +
  • Katharine Hyatt +
  • Kevin Sheppard
  • Kexuan Sun
  • Kiko Correoso +
  • Kriti Singh +
  • Lars Grueter +
  • Maksim Shabunin +
  • Manvi07 +
  • Mark Harfouche
  • Marten van Kerkwijk
  • Martin Reinecke +
  • Matthew Brett
  • Matthias Bussonnier
  • Matti Picus
  • Michel Fruchart +
  • Mike Lui +
  • Mike Taves +
  • Min ho Kim +
  • Mircea Akos Bruma
  • Nick Minkyu Lee
  • Nick Papior
  • Nick R. Papior +
  • Nicola Soranzo +
  • Nimish Telang +
  • OBATA Akio +
  • Oleksandr Pavlyk
  • Ori Broda +
  • Paul Ivanov
  • Pauli Virtanen
  • Peter Andreas Entschev +
  • Peter Bell +
  • Pierre de Buyl
  • Piyush Jaipuriayar +
  • Prithvi MK +
  • Raghuveer Devulapalli +
  • Ralf Gommers
  • Richard Harris +
  • Rishabh Chakrabarti +
  • Riya Sharma +
  • Robert Kern
  • Roman Yurchak
  • Ryan Levy +
  • Sebastian Berg
  • Sergei Lebedev +
  • Shekhar Prasad Rajak +
  • Stefan van der Walt
  • Stephan Hoyer
  • SuryaChand P +
  • Søren Rasmussen +
  • Thibault Hallouin +
  • Thomas A Caswell
  • Tobias Uelwer +
  • Tony LaTorre +
  • Toshiki Kataoka
  • Tyler Moncur +
  • Tyler Reddy
  • Valentin Haenel
  • Vrinda Narayan +
  • Warren Weckesser
  • Weitang Li
  • Wojtek Ruszczewski
  • Yu Feng
  • Yu Kobayashi +
  • Yury Kirienko +
  • @aashuli +
  • @euronion +
  • @luzpaz
  • @parul +
  • @spacescientist +
Cheers,

Charles Harris


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: NumPy 1.17.0rc1 released

Juan Nunez-Iglesias-2
Hi Chuck, and thanks for putting this together!

It seems the release has broken existing uses of dask arrays with `np.min` (I presume among other functions).


Perhaps `__array_function__` should be switched off for one more release cycle? I imagine that scikit-image [1] is not the only project using this construct, which worked fine before `__array_function__`.
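For context, a toy illustration of the dispatch behaviour at issue (the `MyArray` class is hypothetical): under 1.17, a duck array that implements `__array_function__` for only some functions gets a TypeError for the rest, where 1.16 would have coerced with `asarray`:

```python
import numpy as np

class MyArray:
    """Toy duck array (hypothetical): implements np.sum only."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.sum:
            return np.sum(self.data)
        # NumPy converts NotImplemented into a TypeError for the caller.
        return NotImplemented

a = MyArray([1, 2, 3])
print(np.sum(a))   # 6 -- dispatched to the override
try:
    np.min(a)      # no override -> TypeError under 1.17, coercion under 1.16
except TypeError as e:
    print('TypeError:', e)
```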

Thank goodness (and you!) for pre-releases! ;)

Juan.


On Mon, 1 Jul 2019, at 8:48 AM, Charles R Harris wrote:
[...]




__array_function related regression for 1.17.0rc1

ralfgommers


On Mon, Jul 1, 2019 at 6:08 AM Juan Nunez-Iglesias <[hidden email]> wrote:
Hi Chuck, and thanks for putting this together!

It seems the release has broken existing uses of dask arrays with `np.min` (I presume among other functions).


Perhaps `__array_function__` should be switched off for one more release cycle? I imagine that scikit-image [1] is not the only project using this construct, which worked fine before `__array_function__`.

Hmm, I'd really like to avoid that, that would be another 6 months where we get close to zero testing.

This issue is not very surprising - __array_function__ is going to have a fair bit of backwards compat impact for people who were relying on feeding all sorts of stuff into numpy functions that previously got converted with asarray. At this point Dask is the main worry, followed by CuPy and pydata/sparse. All those libraries have very responsive maintainers. Perhaps we should just try to get these issues fixed asap in those libraries instead?

Cheers,
Ralf



Re: __array_function related regression for 1.17.0rc1

Juan Nunez-Iglesias-2


On Mon, 1 Jul 2019, at 2:34 PM, Ralf Gommers wrote:
This issue is not very surprising - __array_function__ is going to have a fair bit of backwards compat impact for people who were relying on feeding all sorts of stuff into numpy functions that previously got converted with asarray. At this point Dask is the main worry, followed by CuPy and pydata/sparse. All those libraries have very responsive maintainers. Perhaps we should just try to get these issues fixed asap in those libraries instead?

Fixing them is not sufficient, because many people are still going to end up with broken code unless they are bleeding-edge with everything. It's best to minimise the number of forbidden version combinations.

Your suggestion on the issue to switch from a TypeError to a warning is, imho, much better, as long as the warning contains a link to an issue/webpage explaining what needs to happen. It's only because I've been vaguely aware of the `__array_function__` discussions that I was able to diagnose this relatively quickly. The average user would be very confused by this code breakage or by a warning, and be unsure of what they need to do to get rid of the warning.


Re: __array_function related regression for 1.17.0rc1

ralfgommers


On Mon, Jul 1, 2019 at 7:37 AM Juan Nunez-Iglesias <[hidden email]> wrote:


On Mon, 1 Jul 2019, at 2:34 PM, Ralf Gommers wrote:
This issue is not very surprising - __array_function__ is going to have a fair bit of backwards compat impact for people who were relying on feeding all sorts of stuff into numpy functions that previously got converted with asarray. At this point Dask is the main worry, followed by CuPy and pydata/sparse. All those libraries have very responsive maintainers. Perhaps we should just try to get these issues fixed asap in those libraries instead?

Fixing them is not sufficient, because many people are still going to end up with broken code unless they are bleeding-edge with everything. It's best to minimise the number of forbidden version combinations.

Yes, fair enough.


Your suggestion on the issue to switch from a TypeError to a warning is, imho, much better, as long as the warning contains a link to an issue/webpage explaining what needs to happen. It's only because I've been vaguely aware of the `__array_function__` discussions that I was able to diagnose this relatively quickly. The average user would be very confused by this code breakage or by a warning, and be unsure of what they need to do to get rid of the warning.

 This would work I think. It's not even a band-aid, it's probably the better design option because any sane library that implements __array_function__ will have a much smaller API surface than NumPy - and why forbid users from feeding array-like input to the rest of the NumPy functions?

Cheers,
Ralf




Re: __array_function related regression for 1.17.0rc1

Stephan Hoyer-2
Your suggestion on the issue to switch from a TypeError to a warning is, imho, much better, as long as the warning contains a link to an issue/webpage explaining what needs to happen. It's only because I've been vaguely aware of the `__array_function__` discussions that I was able to diagnose this relatively quickly. The average user would be very confused by this code breakage or by a warning, and be unsure of what they need to do to get rid of the warning.

 This would work I think. It's not even a band-aid, it's probably the better design option because any sane library that implements __array_function__ will have a much smaller API surface than NumPy - and why forbid users from feeding array-like input to the rest of the NumPy functions?

This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

In contrast to putting this into NumPy, if a library like dask prefers to issue warnings or even keep around fallback coercion indefinitely (not that I would recommend it), they can do that by putting it in their __array_function__ implementation.
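As a sketch of the option Stephan describes, a library could keep fallback coercion alive, with a warning, inside its own `__array_function__` implementation (the `DuckArray` class and its warning text are invented for illustration):

```python
import warnings
import numpy as np

class DuckArray:
    """Hypothetical duck array: warns and coerces instead of raising."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.sum:                       # pretend only np.sum is native
            return np.sum(self.data)
        warnings.warn('{} not implemented for DuckArray; coercing to '
                      'ndarray'.format(func.__name__), UserWarning)
        # Fallback coercion, done by the library rather than by NumPy.
        args = tuple(x.data if isinstance(x, DuckArray) else x for x in args)
        return func(*args, **kwargs)

d = DuckArray([3, 1, 2])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    m = np.min(d)          # warns, then falls back to ndarray behaviour
print(m, len(caught))      # 1 1
```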



Re: __array_function related regression for 1.17.0rc1

Juan Nunez-Iglesias-2
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

In contrast to putting this into NumPy, if a library like dask prefers to issue warnings or even keep around fallback coercion indefinitely (not that I would recommend it), they can do that by putting it in their __array_function__ implementation.

I get the above concerns, and thanks for bringing them up, Stephan, as I'd only skimmed the NEP the first time around and missed them. Nevertheless, the fact is that the current behaviour breaks user code that was perfectly valid until NumPy 1.16, which seems, well, insane. So, warning for a few versions followed by raising seems like the only way forward to me. The NEP explicitly states “We would like to gain experience with how __array_function__ is actually used before making decisions that would be difficult to roll back.” I think that this breakage *is* that experience, and the decision right now should be not to break user code with no warning period.

I'm also wondering where the list of functions that must be implemented can be found, so that libraries like dask and CuPy can be sure that they have a complete implementation, and further TypeErrors won't be raised with their arrays.


Re: __array_function related regression for 1.17.0rc1

ralfgommers


On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias <[hidden email]> wrote:
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

Do you mean "fallback coercion in NumPy itself", or "at all"? Right now there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users will keep wanting to do that. Forcing everyone to write  `np.median(np.array(some_dask_array))` serves no purpose. So the coercion has to be somewhere. You're arguing that that's up to Dask et al I think?

Putting it in Dask right now still doesn't address Juan's backwards compat concern, but perhaps that could be bridged with a Dask bugfix release and some short-lived pain. 

I'm not convinced that this shouldn't be fixed in NumPy though. Your concern "reliably implement "strict" overrides of NumPy's API" is a bit abstract. Overriding the _whole_ NumPy API is definitely undesirable. If we had a reference list somewhere of every function that is handled with __array_function__, would that address your concern? Such a list could be auto-generated fairly easily.
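A sketch of such auto-generation: functions wrapped by `array_function_dispatch` carry an undocumented `_implementation` attribute, so a candidate list can be scraped from the top-level namespace. The attribute name is an internal detail and may change between releases.

```python
import numpy as np

# Collect public names whose objects look like __array_function__ dispatchers.
overridable = sorted(
    name for name, obj in list(vars(np).items())
    if callable(obj) and hasattr(obj, '_implementation')
)
print(len(overridable), overridable[:3])
```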

In contrast to putting this into NumPy, if a library like dask prefers to issue warnings or even keep around fallback coercion indefinitely (not that I would recommend it), they can do that by putting it in their __array_function__ implementation.

I get the above concerns, and thanks for bringing them up, Stephan, as I'd only skimmed the NEP the first time around and missed them. Nevertheless, the fact is that the current behaviour breaks user code that was perfectly valid until NumPy 1.16, which seems, well, insane. So, warning for a few versions followed by raising seems like the only way forward to me. The NEP explicitly states “We would like to gain experience with how __array_function__ is actually used before making decisions that would be difficult to roll back.” I think that this breakage *is* that experience, and the decision right now should be not to break user code with no warning period.

I'm also wondering where the list of functions that must be implemented can be found, so that libraries like dask and CuPy can be sure that they have a complete implementation, and further TypeErrors won't be raised with their arrays.

This is one of the reasons I'm working on https://github.com/Quansight-Labs/rnumpy. It doesn't make sense for any library to copy the whole NumPy API, it's way too large with lots of stuff in there that's only there for backwards compat and has a better alternative or shouldn't be in NumPy in the first place.

Cheers,
Ralf




Re: __array_function related regression for 1.17.0rc1

Stephan Hoyer-2
In reply to this post by Juan Nunez-Iglesias-2
On Tue, Jul 2, 2019 at 1:46 AM Juan Nunez-Iglesias <[hidden email]> wrote:
I'm also wondering where the list of functions that must be implemented can be found, so that libraries like dask and CuPy can be sure that they have a complete implementation, and further TypeErrors won't be raised with their arrays.

This is a good question. We don't have a master list currently.

In practice, I would be surprised if there is ever more than one complete implementation of NumPy's full API. We added dispatch with __array_function__ even to really obscure corners of NumPy's API, e.g., np.lib.scimath.

The short answer right now is "Any publicly exposed function that says it takes array-like arguments, aside from functions specifically for coercing to NumPy arrays and the functions in numpy.testing." 


Re: __array_function related regression for 1.17.0rc1

Stephan Hoyer-2
In reply to this post by ralfgommers
On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers <[hidden email]> wrote:


On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias <[hidden email]> wrote:
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

Do you mean "fallback coercion in NumPy itself", or "at all"? Right now there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users will keep wanting to do that. Forcing everyone to write  `np.median(np.array(some_dask_array))` serves no purpose. So the coercion has to be somewhere. You're arguing that that's up to Dask et al I think?

Yes, I'm arguing this is up to dask to maintain backwards compatibility -- or not, as the maintainers see fit.

NumPy adding dispatching with __array_function__ did not break any existing code, until the maintainers of other libraries started adding __array_function__ methods. I hope that the risks of implementing such experimental methods were self-evident.
 
Putting it in Dask right now still doesn't address Juan's backwards compat concern, but perhaps that could be bridged with a Dask bugfix release and some short-lived pain. 

I really think this is the best (only?) path forward.

I'm not convinced that this shouldn't be fixed in NumPy though. Your concern "reliably implement "strict" overrides of NumPy's API" is a bit abstract. Overriding the _whole_ NumPy API is definitely undesirable. If we had a reference list somewhere of every function that is handled with __array_function__, would that address your concern? Such a list could be auto-generated fairly easily.

By "reliably implement strict overrides" I mean the ability to ensure that every operation either uses an override or raises an informative error -- making it very clear which operation needs to be implemented or avoided.

It's true that we didn't really consider "always issuing warnings" as a long-term solution in the NEP. I can see how this would simplify the backwards compatibility story for libraries like dask, but in general, I really don't like warnings: using them like exceptions can easily result in code that is partially broken or that fails later for non-obvious reasons. There's a reason why Python's errors stop execution flow, unlike errors in languages like PHP or JavaScript.



Re: __array_function related regression for 1.17.0rc1

ralfgommers


On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer <[hidden email]> wrote:
On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers <[hidden email]> wrote:


On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias <[hidden email]> wrote:
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

Do you mean "fallback coercion in NumPy itself", or "at all"? Right now there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users will keep wanting to do that. Forcing everyone to write  `np.median(np.array(some_dask_array))` serves no purpose. So the coercion has to be somewhere. You're arguing that that's up to Dask et al I think?

Yes, I'm arguing this is up to dask to maintain backwards compatibility -- or not, as the maintainers see fit.

NumPy adding dispatching with __array_function__ did not break any existing code, until the maintainers of other libraries started adding __array_function__ methods. I hope that the risks of implementing such experimental methods were self-evident.

Yeah, that's a bit of a chicken-and-egg story though. We add something and try to be "strict". Dask adds something because they like the idea and generally are quick to adopt these types of things. If we make it too hard to be backwards compatible, then neither NumPy nor Dask may try, and it ends up breaking scikit-image & co. I for one don't care where the fix lands, but it's pretty clear to me that breaking scikit-image is the worst of all options.

 
Putting it in Dask right now still doesn't address Juan's backwards compat concern, but perhaps that could be bridged with a Dask bugfix release and some short-lived pain. 

I really think this is the best (only?) path forward.

I think I agree (depending on how easy it is to get the Dask fix landed).

I'm not convinced that this shouldn't be fixed in NumPy though. Your concern "reliably implement "strict" overrides of NumPy's API" is a bit abstract. Overriding the _whole_ NumPy API is definitely undesirable. If we had a reference list somewhere of every function that is handled with __array_function__, would that address your concern? Such a list could be auto-generated fairly easily.

By "reliably implement strict overrides" I mean the ability to ensure that every operation either uses an override or raises an informative error -- making it very clear which operation needs to be implemented or avoided.

That isn't necessarily a good goal in itself though. In many cases, an `asarray` call still needs to go *somewhere*. If the point of "reliably implement strict overrides" is to help library authors, then there may be other ways to do that. For end users it can only hurt; those TypeErrors aren't exactly easy to understand.


It's true that we didn't really consider "always issuing warnings" as a long-term solution in the NEP. I can see how this would simplify the backwards compatibility story for libraries like dask, but in general, I really don't like warnings:

I agree.

Cheers,
Ralf

Using them like exceptions can easily result in code that is partially broken or that fails later for non-obvious reasons. There's a reason why Python's errors stop execution flow, unlike errors in languages like PHP or JavaScript.


Re: __array_function related regression for 1.17.0rc1

ralfgommers


On Tue, Jul 2, 2019 at 1:15 PM Ralf Gommers <[hidden email]> wrote:


On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer <[hidden email]> wrote:
On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers <[hidden email]> wrote:


On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias <[hidden email]> wrote:
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

Do you mean "fallback coercion in NumPy itself", or "at all"? Right now there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users will keep wanting to do that. Forcing everyone to write  `np.median(np.array(some_dask_array))` serves no purpose. So the coercion has to be somewhere. You're arguing that that's up to Dask et al I think?

Yes, I'm arguing this is up to dask to maintain backwards compatibility -- or not, as the maintainers see fit.

NumPy adding dispatching with __array_function__ did not break any existing code, until the maintainers of other libraries started adding __array_function__ methods. I hope that the risks of implementing such experimental methods were self-evident.

Yeah, that's a bit of a chicken-and-egg story though. We add something and try to be "strict". Dask adds something because they like the idea and generally are quick to adopt these types of things. If we make it too hard to be backwards compatible, then neither NumPy nor Dask may try, and it ends up breaking scikit-image & co. I for one don't care where the fix lands, but it's pretty clear to me that breaking scikit-image is the worst of all options.

 
Putting it in Dask right now still doesn't address Juan's backwards compat concern, but perhaps that could be bridged with a Dask bugfix release and some short-lived pain. 

I really think this is the best (only?) path forward.

I think I agree (depending on how easy it is to get the Dask fix landed).

That's landed, and Dask is planning a bugfix release in 2 days, before the NumPy 1.17.0 release, so I think this is no longer a release blocker for us.

Cheers,
Ralf

