matmul as a ufunc

matmul as a ufunc

mattip
I have made progress on resolving the issue that matmul, the operation
which implements `a @ b`, is not a ufunc [2]. This prevents the
__array_ufunc__ mechanism from overriding matmul on subclasses of
ndarray. Discussion on the issue yielded two approaches:

- create a wrapper that can convince the ufunc mechanism to call
__array_ufunc__ even on functions that are not true ufuncs

- expand the semantics of core signatures so that a single matmul ufunc
can implement matrix-matrix, vector-matrix, matrix-vector, and
vector-vector multiplication.

I have put up prototypes of both approaches as PRs 11061 [0] and 11133
[1]. They are works in progress meant to prove the concept and are still
a bit rough, but either approach can be made to work.
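
For concreteness, here is a minimal sketch of the problem as it stands
today, before either PR (`Logged` is a made-up subclass used only for
illustration):

import numpy as np

class Logged(np.ndarray):
    # ndarray subclass that reports which ufuncs reach __array_ufunc__
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        print("dispatched:", ufunc.__name__)
        inputs = tuple(np.asarray(x) for x in inputs)
        return getattr(ufunc, method)(*inputs, **kwargs)

a = np.ones((2, 2)).view(Logged)
a + a   # prints "dispatched: add" -- np.add is a true ufunc
a @ a   # prints nothing: matmul bypasses __array_ufunc__ entirely [2]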

Which is preferable?

What are the criteria we should use to judge the relative merits
(backward compatibility, efficiency, code clarity, enabling further
enhancements, ...) of the approaches?

Matti

[0] https://github.com/numpy/numpy/pull/11061
[1] https://github.com/numpy/numpy/pull/11133
[2] https://github.com/numpy/numpy/issues/9028

Re: matmul as a ufunc

Stephan Hoyer-2
On Mon, May 21, 2018 at 5:42 PM Matti Picus <[hidden email]> wrote:
> - create a wrapper that can convince the ufunc mechanism to call
> __array_ufunc__ even on functions that are not true ufuncs

I am somewhat opposed to this approach, because __array_ufunc__ is about overloading ufuncs, and as soon as we relax this guarantee the set of invariants __array_ufunc__ implementors rely on becomes much more limited.

We really should have another mechanism for arbitrary function overloading in NumPy (NEP to follow shortly!).
 
> - expand the semantics of core signatures so that a single matmul ufunc
> can implement matrix-matrix, vector-matrix, matrix-vector, and
> vector-vector multiplication.

I was initially concerned that adding optional dimensions for gufuncs would introduce additional complexity for the benefit of only a single function (matmul), but I'm now convinced that it makes sense:
1. All other arithmetic overloads use __array_ufunc__, and it would be nice to keep @/matmul in the same place.
2. There's a common family of gufuncs for which optional dimensions of the kind np.matmul needs make sense: matrix operations where 1D arrays should be treated as 2D row or column vectors.

One example of this class of behavior would be np.linalg.solve, which could support both vector right-hand sides (Ax=b) and matrix right-hand sides (AX=B) with the signature (m,m),(m,n?)->(m,n?). We couldn't immediately make np.linalg.solve a gufunc, since it currently uses a subtly different dispatching rule, but it's the same use-case.
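
A quick illustration of the optional n? dimension, using solve's current (unstacked) behavior:

import numpy as np

A = np.eye(3)
np.linalg.solve(A, np.ones(3)).shape       # (3,)   -- the n? dim is absent
np.linalg.solve(A, np.ones((3, 2))).shape  # (3, 2) -- the n? dim is present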

Another example would be the "matrix transpose" function that has been occasionally proposed, to swap the last two dimensions of an array. It could have the signature (m?,n)->(n,m?), which ensures that it is still well defined (as the identity) on 1d arrays.


Re: matmul as a ufunc

Nathaniel Smith
On Mon, May 28, 2018 at 4:26 PM, Stephan Hoyer <[hidden email]> wrote:

> One example of this class of behavior would be np.linalg.solve, which
> could support both vector right-hand sides (Ax=b) and matrix right-hand
> sides (AX=B) with the signature (m,m),(m,n?)->(m,n?). We couldn't
> immediately make np.linalg.solve a gufunc, since it currently uses a
> subtly different dispatching rule, but it's the same use-case.

Specifically, np.linalg.solve uses a unique rule where

   solve(a, b)

assumes that b is a stack of vectors if (a.ndim - 1 == b.ndim), and
otherwise assumes that it's a stack of matrices. This is pretty
confusing. You'd think that solve(a, b) should be equivalent to
(inv(a) @ b), but it isn't.

Say a.shape == (10, 3, 3) and b.shape == (3,). Then inv(a) @ b works,
and does what you'd expect: for each of the ten 3x3 matrices in a, it
computes the inverse and multiplies it by the 1-d vector in b (treated
as a column vector). But solve(a, b) is an error, because the
dimensions aren't lined up to trigger the special handling for 1-d
vectors.

Or, say a.shape == (10, 3, 3) and b.shape == (3, 3). Then again inv(a)
@ b works, and does what you'd expect: for each of the ten 3x3
matrices in a, it computes the inverse and multiplies it by the 3x3
matrix in b. But again solve(a, b) is an error -- this time because
the special handling for 1-d vectors *does* kick in, even though it
doesn't make sense: it tries to match up the ten 3x3 matrices in a
against the three one-dimensional vectors in b, and 10 != 3 so the
broadcasting fails.
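
For concreteness, both failure modes with the shapes above (current
behavior; identity matrices just to keep inv() happy):

import numpy as np

a = np.eye(3) * np.ones((10, 1, 1))          # shape (10, 3, 3)

(np.linalg.inv(a) @ np.ones(3)).shape        # (10, 3) -- works
np.linalg.solve(a, np.ones(3))               # error: the 1-d special
                                             # handling doesn't trigger

(np.linalg.inv(a) @ np.ones((3, 3))).shape   # (10, 3, 3) -- works
np.linalg.solve(a, np.ones((3, 3)))          # error: the 1-d handling
                                             # *does* trigger, and the
                                             # stacks (10,) vs (3,)
                                             # don't broadcast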

This also points to even more confusing possibilities: if a.shape ==
(3, 3) or (3, 3, 3, 3) and b.shape == (3, 3), then inv(a) @ b and
solve(a, b) both work and do the same thing. But if a.shape == (3, 3,
3), then inv(a) @ b and solve(a, b) both work, and do totally
*different* things.

I wonder if we should deprecate these corner cases, and eventually
migrate to making inv(a) @ b and solve(a, b) the same in all
situations. If we did, then solve(a, b) would actually be a gufunc
with signature (m,m),(m,n?)->(m,n?).

I think the cases that would need to be changed are those where
(a.ndim - 1 == b.ndim and b.ndim > 1). My guess is that this happens
very rarely in existing code, especially since (IIRC) this behavior
was only added a few years ago, when we gufunc-ified numpy.linalg.
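
As a sketch of where that could end up (solve_consistent is a
hypothetical helper, not a proposed API; np.broadcast_shapes is a newer
convenience used here just to keep the sketch short):

import numpy as np

def solve_consistent(a, b):
    # Hypothetical solve with gufunc semantics (m,m),(m,n?)->(m,n?):
    # b is a stack of vectors iff b.ndim == 1, so the result always
    # matches inv(a) @ b.
    a, b = np.asarray(a), np.asarray(b)
    vector = b.ndim == 1
    if vector:
        b = b[:, np.newaxis]            # (m,) -> (m, 1) column vector
    # Broadcast the stack dimensions explicitly, so that solve's
    # current "stack of vectors" special case can never kick in.
    stack = np.broadcast_shapes(a.shape[:-2], b.shape[:-2])
    a = np.broadcast_to(a, stack + a.shape[-2:])
    b = np.broadcast_to(b, stack + b.shape[-2:])
    out = np.linalg.solve(a, b)
    return out[..., 0] if vector else out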

> Another example would be the "matrix transpose" function that has been
> occasionally proposed, to swap the last two dimensions of an array. It
> could have the signature (m?,n)->(n,m?), which ensures that it is still
> well defined (as the identity) on 1d arrays.

Unfortunately I don't think we could make "broadcasting matrix
transpose" be literally a gufunc, since it should return a view. But I
guess there'd still be some value in having the notation available
just when talking about it, so we could say "this operation is *like*
a gufunc with signature (m?,n)->(n,m?), except that it returns a
view".

-n

--
Nathaniel J. Smith -- https://vorpus.org

Re: matmul as a ufunc

Eric Wieser
In reply to this post by Stephan Hoyer-2

> which ensures that it is still well defined (as the identity) on 1d arrays.

This strikes me as a bad idea. There’s already enough confusion among beginners about array_1d.T being a no-op. If we introduce a matrix transpose, it should either error on <2d inputs with a useful message, or insert the extra dimension. I’d favor the former.

Eric


Re: matmul as a ufunc

Stephan Hoyer-2
On Mon, May 28, 2018 at 7:36 PM Eric Wieser <[hidden email]> wrote:

>> which ensures that it is still well defined (as the identity) on 1d arrays.
>
> This strikes me as a bad idea. There’s already enough confusion among beginners about array_1d.T being a no-op. If we introduce a matrix transpose, it should either error on <2d inputs with a useful message, or insert the extra dimension. I’d favor the former.

To be clear: matrix transpose is an example use-case rather than a serious proposal in this discussion.

But given that idiomatic NumPy code uses 1D arrays rather than explicit row/column vectors with shapes (1,n) and (n,1), I do think it makes sense for matrix transpose on 1D arrays to be the identity, because matrix transpose should convert back and forth between the row- and column-vector representations.

Certainly, matrix transpose should error on 0d arrays, because it doesn't make sense to transpose a scalar.


Re: matmul as a ufunc

Andras Deak
On Tue, May 29, 2018 at 5:40 AM, Stephan Hoyer <[hidden email]> wrote:
> But given that idiomatic NumPy code uses 1D arrays rather than explicit
> row/column vectors with shapes (1,n) and (n,1), I do think it makes
> sense for matrix transpose on 1D arrays to be the identity, because
> matrix transpose should convert back and forth between the row- and
> column-vector representations.
>
> Certainly, matrix transpose should error on 0d arrays, because it
> doesn't make sense to transpose a scalar.

Apologies for the probably academic nitpick, but if idiomatic code
uses 1d arrays as vectors, then shouldn't scalars be compatible with
matrices of dimension 1 (in the mathematical sense)? Since the matrix
product of shapes (1,n) and (n,1) is (1,1), while the same product for
shapes (n,) and (n,) is (), it might make sense after all for the
matrix transpose to be the identity for scalars.
I'm aware that this is tangential to the primary discussion, but I'm
also wondering if I'm being confused about the subject (wouldn't be
the first time that I got confused about numpy scalars).

András

Re: matmul as a ufunc

Daπid

On 29 May 2018 at 05:40, Stephan Hoyer <[hidden email]> wrote:
> But given that idiomatic NumPy code uses 1D arrays rather than explicit row/column vectors with shapes (1,n) and (n,1), I do think it makes sense for matrix transpose on 1D arrays to be the identity, because matrix transpose should convert back and forth between the row- and column-vector representations.

When doing algebra on paper, I like the bra-ket notation. It makes abundantly clear the shape of the outputs, without having to remember on which side the transpose goes: <u|v> is a scalar, |u><v| is a matrix. I don't have a good way of translating this to numpy, but maybe someone else has an idea.


> Certainly, matrix transpose should error on 0d arrays, because it doesn't make sense to transpose a scalar.

Unless the scalar is 8, in which case the transpose is np.inf...

Right now, np.int(8).T throws an error, but np.transpose(np.int(8)) gives a 0-d array. On one hand, it is nice to be able to use the same code for scalars as for vectors, but on the other, you may be making a mistake.


/David.


Re: matmul as a ufunc

Andras Deak
On Tue, May 29, 2018 at 12:16 PM, Daπid <[hidden email]> wrote:
> Right now, np.int(8).T throws an error, but np.transpose(np.int(8)) gives a
> 0-d array. On one hand, it is nice to be able to use the same code for

`np.int` is just Python `int`! What you mean is `np.int64(8).T`, which
works fine, as does `np.array(8).T`.

Re: matmul as a ufunc

Ilhan Polat
Apart from the question of mathematical validity, in my experience errors are used a bit too generously for disallowed operations. No-ops are fine once you learn more about them, such as transpose on 1D arrays (whether that is good or bad is another discussion), but raising errors bloats numerical code too much. Coding in the style of "is it a scalar? then do this; is it 1D? then make it 2D; is it 2D? then do something else" really makes life difficult.

Most of my time in numerical code is spent trying to catch scalars and 1D arrays and writing exceptions, because I can't predict what the user will pass in or what the result should be after certain operations. Quite unwillingly, I've started making everything 2D whether it is required or not, just to avoid the following:

np.eye(4)[:, 1]                        # 1D
np.eye(4)[:, 1:2]                      # 2D
np.eye(4)[:, [1]]                      # 2D
np.eye(4)[:, [1]] @ 5                  # error
np.eye(4)[:, [1]] @ np.array(5)        # error
np.eye(4)[:, [1]] @ np.array([5])      # result is 1D
np.eye(4)[:, [1]] @ np.array([[5]])    # result is 2D

So imagine I'm trying to write a simple multiply_these function: I already have quite a few cases to consider for the function to be "Pythonic". If the second argument is an int or float, do *-multiplication; if it is a NumPy array with no dimensions, again *-multiplication; if it is 1D, do @-multiplication but keep the dimensions; and if it is 2D, do @-multiplication. Add broadcasting rules on top of this and it gets to be a pretty wordy function, as the sketch below shows. Hence, I would suggest also taking such use cases into account when deciding the behavior of a single piece of functionality.
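
Roughly (a sketch of that case analysis; the name multiply_these and the exact dispatch rules are hypothetical):

import numpy as np

def multiply_these(a, b):
    if isinstance(b, (int, float)):
        return a * b                      # plain Python scalar: *-mult
    b = np.asarray(b)
    if b.ndim == 0:
        return a * b                      # 0-d array: still *-mult
    if b.ndim == 1:
        return (a @ b)[..., np.newaxis]   # 1-d: @-mult, keeping dims
    return a @ b                          # 2-d: @-mult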

So indeed it doesn't make sense to transpose a 0d array, but as an array object it would start to have a lot of "Wat!" moments: https://www.destroyallsoftware.com/talks/wat

Re: matmul as a ufunc

Nathaniel Smith
On Mon, May 28, 2018, 20:41 Stephan Hoyer <[hidden email]> wrote:
On Mon, May 28, 2018 at 7:36 PM Eric Wieser <[hidden email]> wrote:

>>> which ensures that it is still well defined (as the identity) on 1d arrays.
>>
>> This strikes me as a bad idea. There’s already enough confusion among beginners about array_1d.T being a no-op. If we introduce a matrix transpose, it should either error on <2d inputs with a useful message, or insert the extra dimension. I’d favor the former.
>
> To be clear: matrix transpose is an example use-case rather than a serious proposal in this discussion.
>
> But given that idiomatic NumPy code uses 1D arrays rather than explicit row/column vectors with shapes (1,n) and (n,1), I do think it makes sense for matrix transpose on 1D arrays to be the identity, because matrix transpose should convert back and forth between the row- and column-vector representations.

More concretely, I think the idea is that if you write code like

  a.T @ a

then it's nice if that automatically works for both 2d and 1d arrays. Especially, say, if this is embedded inside a larger function, you have some responsibility to your users to handle different inputs appropriately, and your users expect that to include both 2d matrices and 1d vectors. It reduces special cases.

But, on the other hand, if you write

a @ a.T

then you'll be in for a surprise... So maybe it's not a great idea after all.
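
Concretely, with a stand-in mt for that hypothetical transpose:

import numpy as np

def mt(a):  # hypothetical broadcasting transpose: identity on 1-d input
    return a if a.ndim < 2 else np.swapaxes(a, -1, -2)

v, M = np.ones(3), np.ones((4, 3))
(mt(M) @ M).shape   # (3, 3), and mt(v) @ v is a scalar: both sensible
(M @ mt(M)).shape   # (4, 4) as expected...
v @ mt(v)           # ...but this is a scalar, not the (3, 3) outer
                    # product a matrix-minded reader might expect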

(Note that here I'm using .T as a placeholder for a hypothetical "broadcasting matrix transpose". I don't think anyone proposes that .T itself should be changed to do this; I just needed some notation.)

-n
