New draft of NEP 31 — Context-local and global overrides of the NumPy API


Hameer Abbasi

Hello everyone, I've improved the content of NEP 31 to make it simpler. In accordance with the new NEP template, only part of the NEP is being sent to the mailing list; for the full NEP, please see PR 14793.

 

============================================================

NEP 31 — Context-local and global overrides of the NumPy API

============================================================

 

:Author: Hameer Abbasi <[hidden email]>

:Author: Ralf Gommers <[hidden email]>

:Author: Peter Bell <[hidden email]>

:Status: Draft

:Type: Standards Track

:Created: 2019-08-22

 

 

Abstract

--------

 

This NEP proposes to make all of NumPy's public API overridable via an

extensible backend mechanism.

 

Acceptance of this NEP means NumPy would provide global and context-local

overrides in a separate namespace, as well as a dispatch mechanism similar

to NEP-18 [2]_. First experiences with ``__array_function__`` show that it

is necessary to be able to override NumPy functions that *do not take an

array-like argument*, and hence aren't overridable via

``__array_function__``. The most pressing need is array creation and coercion

functions, such as ``numpy.zeros`` or ``numpy.asarray``; see e.g. NEP-30 [9]_.

 

This NEP proposes to allow, in an opt-in fashion, overriding any part of the

NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and

obviates the need to add an ever-growing list of new protocols for each new

type of function or object that needs to become overridable.

 

Motivation and Scope

--------------------

 

The primary end-goal of this NEP is to make the following possible:

 

.. code:: python

 

    # On the library side

    import numpy.overridable as unp

 

    def library_function(array):

        array = unp.asarray(array)

        # Code using unumpy as usual

        return array

 

    # On the user side:

    import numpy.overridable as unp

    import uarray as ua

    import dask.array as da

 

    ua.register_backend(da)  # Can be done within Dask itself

 

    dask_array = da.ones((100,))
    library_function(dask_array)  # works and returns a Dask array

 

    with unp.set_backend(da):

        library_function([1, 2, 3, 4])  # actually returns a Dask array.

 

Here, the backend passed to ``set_backend`` can be any compatible object defined either by NumPy or an

external library, such as Dask or CuPy. Ideally, it should be the module

``dask.array`` or ``cupy`` itself.
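
To illustrate why a module can serve directly as a backend, here is a minimal sketch. This is an illustration only, not the actual ``uarray`` protocol: a module is simply a namespace exposing the API's functions, so a dispatcher can look the override up by attribute name. ``types.SimpleNamespace`` stands in for a real module such as ``dask.array``, and the names ``fake_dask_array`` and ``dispatch_asarray`` are hypothetical.

```python
from types import SimpleNamespace

# Stand-in for a module-level backend such as dask.array; these names
# are illustrative, not the real uarray/unumpy API.
fake_dask_array = SimpleNamespace(
    asarray=lambda obj: ("dask", list(obj)),
)

def dispatch_asarray(backend, obj):
    # Dispatch by attribute lookup: any namespace exposing `asarray`
    # (a module, a class, a SimpleNamespace) can act as a backend.
    return backend.asarray(obj)
```

Because the lookup is purely by name, nothing special is required of the backend beyond exposing functions with the expected signatures, which is exactly what an array library's top-level module already does.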

 

These kinds of overrides are useful for both the end-user as well as library

authors. End-users may have written or wish to write code that they then later

speed up or move to a different implementation, say PyData/Sparse. They can do

this simply by setting a backend. Library authors may also wish to write code

that is portable across array implementations, for example ``sklearn`` may wish

to write code for a machine learning algorithm that is portable across array

implementations while also using array creation functions.

 

This NEP takes a holistic approach: it assumes that there are parts of
the API that need to be overridable, and that these will grow over time. It
provides a general framework and mechanism so that a new protocol need not
be designed each time this happens. This was the goal of ``uarray``: to
allow overrides in an API without designing a new protocol.

 

This NEP proposes the following: That ``unumpy`` [8]_  becomes the

recommended override mechanism for the parts of the NumPy API not yet covered

by ``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is

vendored into a new namespace within NumPy to give users and downstream

dependencies access to these overrides.  This vendoring mechanism is similar

to what SciPy decided to do for making ``scipy.fft`` overridable (see [10]_).

 

The motivation behind ``uarray`` is manifold: first, there have been several
attempts to allow dispatch of parts of the NumPy API, most prominently the
``__array_ufunc__`` protocol in NEP-13 [4]_ and the ``__array_function__``
protocol in NEP-18 [2]_, but these have shown the need for further protocols,
including a protocol for coercion (see [5]_, [9]_). The reasons these
overrides are needed have been discussed extensively in the references, so
this NEP will not repeat them; in short, library authors must be able to
coerce arbitrary objects into arrays of their own types (CuPy, for example,
needs to coerce to a CuPy array rather than a NumPy array). Put simply,
``np.asarray(...)`` or an alternative needs to "just work" and return
duck-arrays.

 

Usage and Impact

----------------

 

This NEP allows for global and context-local overrides, as well as

automatic overrides à la ``__array_function__``.
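
To make the two override modes concrete, here is a minimal, self-contained sketch of how a global default and a context-local override can coexist, using ``contextvars`` from the standard library. It assumes nothing about the real ``uarray`` internals; all names (``set_global_backend``, ``set_backend``, ``active_backend``) are illustrative.

```python
import contextvars
from contextlib import contextmanager

# Global default backend; in real NumPy this would be NumPy itself.
_global_backend = "numpy"
# Context-local override; None means "fall back to the global default".
_local_backend = contextvars.ContextVar("backend", default=None)

def set_global_backend(backend):
    """Install a process-wide default backend."""
    global _global_backend
    _global_backend = backend

@contextmanager
def set_backend(backend):
    """Override the backend for the duration of a with-block only."""
    token = _local_backend.set(backend)
    try:
        yield
    finally:
        _local_backend.reset(token)

def active_backend():
    """The context-local override wins over the global default."""
    local = _local_backend.get()
    return local if local is not None else _global_backend
```

``active_backend()`` returns ``"numpy"`` by default; inside ``with set_backend("dask")`` it returns ``"dask"`` and reverts on exit, while ``set_global_backend`` changes the fallback for the whole process.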

 

Here are some use-cases this NEP would enable, besides the

first one stated in the motivation section:

 

The first is allowing alternate dtypes to return their

respective arrays.

 

.. code:: python

 

    # Returns an XND array

    x = unp.ones((5, 5), dtype=xnd_dtype) # Or torch dtype

 

The second is allowing overrides for parts of the API.

This is to allow alternate and/or optimised implementations

for ``np.linalg``, BLAS, and ``np.random``.

 

.. code:: python

 

    import numpy as np

    import pyfftw # Or mkl_fft

 

    # Makes pyfftw the default for FFT

    np.set_global_backend(pyfftw)

 

    # Uses pyfftw without monkeypatching

    np.fft.fft(numpy_array)

 

    with np.set_backend(pyfftw):  # Or mkl_fft, or numpy

        # Uses the backend you specified

        np.fft.fft(numpy_array)

 

This will provide an official way for overrides to work with NumPy without

monkeypatching or distributing a modified version of NumPy.

 

Here are a few other use-cases, implied but not already stated. The first
relies on ``library_function`` as defined in the motivation section:

 

.. code:: python

 

    from dask import array as da

    data = da.from_zarr('myfile.zarr')

    # result should still be dask, all things being equal

    result = library_function(data)

    result.to_zarr('output.zarr')

 

This second one would work if ``magic_library`` were built

on top of ``unumpy``.

 

.. code:: python

 

    from dask import array as da

    from magic_library import pytorch_predict

 

    data = da.from_zarr('myfile.zarr')

    # normally here one would use e.g. data.map_overlap

    result = pytorch_predict(data)

    result.to_zarr('output.zarr')

 

Backward compatibility

----------------------

 

There are no backward incompatible changes proposed in this NEP.

 


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion