Type annotation for Numpy arrays, accelerators and numpy.typing

PIERRE AUGIER
Hi,

When NumPy 1.20 was released, I discovered numpy.typing and its documentation: https://numpy.org/doc/stable/reference/typing.html

I know that it is very new, but I'm a bit lost. A good API for describing array types would be useful not only for type checkers but also for Python accelerators that use ndarrays (in particular Pythran, Numba, Cython and Transonic).

For Transonic, I'd like to use numpy.typing internally to get a better implementation of what we need in transonic.typing (in particular, one that is compatible with type checkers like mypy).

However, it seems that I can't do anything with what I see today in numpy.typing.

For Python-Numpy accelerators, we need to be able to define precise array types to limit the compilation time and give useful hints for optimizations (ndim, partial or full shape). We also need fused types.

What can be done with Transonic is described in these pages: https://transonic.readthedocs.io/en/latest/examples/type_hints.html and https://transonic.readthedocs.io/en/latest/generated/transonic.typing.html

I think it would be good to be able to do things like that with numpy.typing. It may already be possible, but I can't find how in the documentation.

I can give a few examples here. First, a very simple one:

from transonic import Array

Af3d = Array[float, "3d"]

# Note that this can also be written without Array just as
Af3d = "float[:,:,:]"

# same thing but only contiguous C ordered
Af3d = Array[float, "3d", "C"]

Note: being able to restrict compilation to C-contiguous arrays is very important. It can drastically decrease compilation time and memory use, and some numerical kernels are in any case written to be efficient only with C-ordered (or Fortran-ordered) arrays.

import numpy as np

# 2D color image (last axis has fixed size 3)
A_im = Array[np.int16, "[:,:,3]"]
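
To make the intent of these annotations concrete, here is a minimal sketch in plain NumPy of the constraints that something like Array[float, "3d", "C"] expresses. The helper name `matches` is hypothetical: it is not part of Transonic or numpy.typing, and it only checks at runtime what an accelerator would like to know at compile time.

```python
import numpy as np


def matches(arr, dtype, ndim, order=None):
    """Hypothetical helper: check dtype, number of dimensions and memory order."""
    if arr.dtype != np.dtype(dtype) or arr.ndim != ndim:
        return False
    if order == "C":
        return bool(arr.flags.c_contiguous)
    if order == "F":
        return bool(arr.flags.f_contiguous)
    return True  # no constraint on the memory layout


a = np.zeros((4, 5, 6))    # float64, 3 dimensions, C-contiguous
b = a.transpose(2, 1, 0)   # same data viewed in Fortran order

print(matches(a, float, 3, "C"))  # True
print(matches(b, float, 3, "C"))  # False: not C-contiguous
print(matches(b, float, 3))       # True: layout not constrained
```

An accelerator that knows only the first case has to be compiled can skip generating code for strided or Fortran-ordered inputs.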

Now, fused types. This example is taken from a real-life case (https://foss.heptapod.net/fluiddyn/fluidsim/-/blob/branch/default/fluidsim/base/time_stepping/pseudo_spect.py), so it's really useful in practice.

import numpy as np
from transonic import Type, NDim, Array, Union

N = NDim(2, 3, 4)
A = Array[np.complex128, N, "C"]
Am1 = Array[np.complex128, N - 1, "C"]

N123 = NDim(1, 2, 3)
A123c = Array[np.complex128, N123, "C"]
A123f = Array[np.float64, N123, "C"]

T = Type(np.float64, np.complex128)
A1 = Array[T, N, "C"]
A2 = Array[T, N - 1, "C"]
ArrayDiss = Union[A1, A2]
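
As a rough illustration of why fused types matter for compilation cost, the sketch below enumerates the concrete specializations implied by T and N for A1 and A2. It is plain Python and does not use Transonic. It assumes that T and N take the same value everywhere within one signature (as with fused types in Cython); the tuple representation is purely illustrative.

```python
from itertools import product

dtypes = ("float64", "complex128")   # T = Type(np.float64, np.complex128)
ndims = (2, 3, 4)                    # N = NDim(2, 3, 4)

# Assumption: T and N are shared within a signature, so
# A1 = Array[T, N, "C"] and A2 = Array[T, N - 1, "C"] vary together.
signatures = [
    {"A1": (dtype, ndim, "C"), "A2": (dtype, ndim - 1, "C")}
    for dtype, ndim in product(dtypes, ndims)
]

for sig in signatures:
    print(sig)
print(len(signatures))  # 6 specializations: 2 dtypes x 3 values of N
```

If the dtype and the number of dimensions of each argument were independent instead, the count would grow as a product over arguments, which is why tying them together keeps compilation manageable.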

To summarize, type annotations are already used by Python-NumPy accelerators, and this use will continue. It would be good to consider this application as well when designing numpy.typing.

Cheers,
Pierre
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Type annotation for Numpy arrays, accelerators and numpy.typing

ralfgommers


On Tue, Feb 16, 2021 at 10:20 AM PIERRE AUGIER <[hidden email]> wrote:
> Hi,
>
> When NumPy 1.20 was released, I discovered numpy.typing and its documentation: https://numpy.org/doc/stable/reference/typing.html
>
> I know that it is very new, but I'm a bit lost. A good API for describing array types would be useful not only for type checkers but also for Python accelerators that use ndarrays (in particular Pythran, Numba, Cython and Transonic).
>
> For Transonic, I'd like to use numpy.typing internally to get a better implementation of what we need in transonic.typing (in particular, one that is compatible with type checkers like mypy).
>
> However, it seems that I can't do anything with what I see today in numpy.typing.
>
> For Python-Numpy accelerators, we need to be able to define precise array types to limit the compilation time and give useful hints for optimizations (ndim, partial or full shape). We also need fused types.

Hi Pierre, I think what you are getting at is that ArrayLike isn't useful for accelerators, right? ArrayLike is needed to annotate functions that use np.asarray to coerce their inputs, which may be scalars, lists, etc. That's indeed never what you want for an accelerator, and it'd be great if people stopped writing that kind of code, but we're stuck with a lot of it in SciPy and many other downstream libraries.

For your purposes, I think you want one of two things:
1. functions that only take `ndarray`, or maybe at most `Union[float, ndarray]`
2. perhaps in the future, a well-defined array Protocol, to support multiple array types (this is hinted at in https://data-apis.github.io/array-api/latest/design_topics/static_typing.html)

You don't need numpy.typing for (1); you can directly annotate with `x: np.ndarray`.
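
A minimal sketch of option (1), with made-up function names (`rms` and `shift` are illustrative only): the signatures accept an `ndarray`, or at most a float or an `ndarray`, and the bodies never coerce their inputs with `np.asarray`.

```python
from typing import Union

import numpy as np


def rms(x: np.ndarray) -> float:
    """Root mean square of an array; the input is not coerced."""
    return float(np.sqrt(np.mean(np.abs(x) ** 2)))


def shift(x: Union[float, np.ndarray], offset: float) -> Union[float, np.ndarray]:
    """Accept a scalar or an array, but not lists or other array-likes."""
    return x + offset


print(rms(np.array([3.0, 4.0])))
print(shift(np.array([1.0, 2.0]), 1.0))
```

Note that these annotations say nothing yet about dtype, number of dimensions or shape.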


> What can be done with Transonic is described in these pages: https://transonic.readthedocs.io/en/latest/examples/type_hints.html and https://transonic.readthedocs.io/en/latest/generated/transonic.typing.html
>
> I think it would be good to be able to do things like that with numpy.typing. It may already be possible, but I can't find how in the documentation.

Two things that are still work-in-progress are annotating arrays with dtypes and with shapes. Your examples already have that, so that's useful input. For C/F-contiguity, I believe that's useful but normally shouldn't show up in user-facing APIs (only in internal helper routines) so probably less urgent.

For dtype annotations, a lot of work is being done at the moment by Bas van Beek. Example: https://github.com/numpy/numpy/pull/18128. That all turns out to be quite complex, because there are so many valid ways of specifying a dtype. It's the same kind of flexibility problem as with `asarray`: the complexity is needed to correctly type current code in NumPy, SciPy et al., but it's not what you want for an accelerator. For that you'd want to accept only one way of spelling this, `dtype=<one of a fixed set of dtype literals>`.
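
One possible reading of "a fixed set of dtype literals", sketched with `typing.Literal` (Python 3.8+). The alias `DTypeName`, the function `zeros_strict` and the choice of string spellings are all assumptions made for illustration; nothing like this is provided by numpy.typing.

```python
from typing import Literal, Tuple, get_args

import numpy as np

# Hypothetical fixed set of dtype spellings an accelerator might accept.
DTypeName = Literal["float64", "complex128"]


def zeros_strict(shape: Tuple[int, ...], dtype: DTypeName) -> np.ndarray:
    """Allocate zeros, accepting only the dtype spellings listed above."""
    if dtype not in get_args(DTypeName):
        raise TypeError(f"unsupported dtype spelling: {dtype!r}")
    return np.zeros(shape, dtype=dtype)


a = zeros_strict((2, 3), "complex128")
print(a.dtype)

try:
    zeros_strict((2, 3), "f8")  # valid for NumPy, rejected here on purpose
except TypeError as exc:
    print(exc)
```

A type checker can reject `"f8"` statically from the `Literal` annotation; the runtime check only mirrors that for unannotated callers.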


> I can give a few examples here. First, a very simple one:
>
> from transonic import Array
>
> Af3d = Array[float, "3d"]
>
> # Note that this can also be written without Array just as
> Af3d = "float[:,:,:]"
>
> # same thing but only contiguous C ordered
> Af3d = Array[float, "3d", "C"]
>
> Note: being able to restrict compilation to C-contiguous arrays is very important. It can drastically decrease compilation time and memory use, and some numerical kernels are in any case written to be efficient only with C-ordered (or Fortran-ordered) arrays.
>
> import numpy as np
>
> # 2D color image (last axis has fixed size 3)
> A_im = Array[np.int16, "[:,:,3]"]
>
> Now, fused types. This example is taken from a real-life case (https://foss.heptapod.net/fluiddyn/fluidsim/-/blob/branch/default/fluidsim/base/time_stepping/pseudo_spect.py), so it's really useful in practice.

Yes definitely useful, there's also a lot of Cython code in downstream libraries that shows this.

Annotations for fused types, when dtypes are just type literals, should hopefully work out of the box with TypeVar without us having to do anything special in numpy.
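
A sketch of what this could look like, assuming `numpy.typing.NDArray`, which was only added in NumPy 1.21 and so was not yet available when this thread was written. The function name `scaled` is made up for illustration.

```python
from typing import TypeVar

import numpy as np
import numpy.typing as npt

# Constrained TypeVar: T is either np.float64 or np.complex128, and it is
# the same type in the argument and in the return value.
T = TypeVar("T", np.float64, np.complex128)


def scaled(a: npt.NDArray[T], factor: float) -> npt.NDArray[T]:
    """Multiply by a real factor, keeping the dtype of the input."""
    return (a * factor).astype(a.dtype)


x = scaled(np.ones(3, dtype=np.float64), 2.0)
z = scaled(np.ones(3, dtype=np.complex128), 2.0)
print(x.dtype, z.dtype)
```

This covers the dtype part of a fused type only; the number of dimensions and the memory order are not expressed by these annotations.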

Cheers,
Ralf

