On Mon, May 28, 2018 at 4:26 PM, Stephan Hoyer <

[hidden email]> wrote:

> On Mon, May 21, 2018 at 5:42 PM Matti Picus <

[hidden email]> wrote:

>>

>> - create a wrapper that can convince the ufunc mechanism to call

>> __array_ufunc__ even on functions that are not true ufuncs

>

>

> I am somewhat opposed to this approach, because __array_ufunc__ is about

> overloading ufuncs, and as soon as we relax this guarantee the set of

> invariants __array_ufunc__ implementors rely on becomes much more limited.

>

> We really should have another mechanism for arbitrary function overloading

> in NumPy (NEP to follow shortly!).

>

>>

>> - expand the semantics of core signatures so that a single matmul ufunc

>> can implement matrix-matrix, vector-matrix, matrix-vector, and

>> vector-vector multiplication.

>

>

> I was initially concerned that adding optional dimensions for gufuncs would

> introduce additional complexity for only the benefit of a single function

> (matmul), but I'm now convinced that it makes sense:

> 1. All other arithmetic overloads use __array_ufunc__, and it would be nice

> to keep @/matmul in the same place.

> 2. There's a common family of gufuncs for which optional dimensions like

> np.matmul make sense: matrix functions where 1D arrays should be treated as

> 2D row- or column-vectors.

>

> One example of this class of behavior would be np.linalg.solve, which could

> support vectors like Ax=b and matrices like Ax=B with the signature

> (m,m),(m,n?)->(m,n?). We couldn't immediately make np.linalg.solve a gufunc

> since it uses a subtly different dispatching rule, but it's the same

> use-case.

Specifically, np.linalg.solve uses a unique rule where

solve(a, b)

assumes that b is a stack of vectors if (a.ndim - 1 == b.ndim), and

otherwise assumes that it's a stack of matrices. This is pretty

confusing. You'd think that solve(a, b) should be equivalent to

(inv(a) @ b), but it isn't.

Say a.shape == (10, 3, 3) and b.shape == (3,). Then inv(a) @ b works,

and does what you'd expect: for each of the ten 3x3 matrices in a, it

computes the inverse and multiplies it by the 1-d vector in b (treated

as a column vector). But solve(a, b) is an error, because the

dimensions aren't lined up to trigger the special handling for 1-d

vectors.
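To make that first case concrete, here's a minimal sketch (shapes and values are illustrative; note that solve's dispatch rule described above was later changed in newer NumPy releases, so solve's exact behavior here is version-dependent and only the matmul result is asserted):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((10, 3, 3))   # a stack of ten 3x3 matrices
b = rng.standard_normal(3)            # a single 1-d vector

# inv(a) @ b broadcasts b as a column vector against each matrix:
x = np.linalg.inv(a) @ b
assert x.shape == (10, 3)             # one solution vector per matrix

# Under the rule described above, a.ndim - 1 == 2 but b.ndim == 1, so
# the 1-d special case does NOT kick in and solve(a, b) fails.  (Newer
# NumPy releases changed this rule, so we report rather than assert.)
try:
    np.linalg.solve(a, b)
    print("solve(a, b) succeeded on this NumPy version")
except Exception as exc:
    print("solve(a, b) raised", type(exc).__name__)
```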

Or, say a.shape == (10, 3, 3) and b.shape == (3, 3). Then again inv(a)

@ b works, and does what you'd expect: for each of the ten 3x3

matrices in a, it computes the inverse and multiplies it by the 3x3

matrix in b. But again solve(a, b) is an error -- this time because

the special handling for 1-d vectors *does* kick in, even though it

doesn't make sense: it tries to match up the ten 3x3 matrices in a

against the three one-dimensional vectors in b, and 10 != 3 so the

broadcasting fails.
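The second case, sketched the same way (again, solve's behavior here depends on the NumPy version, so only the stable matmul result is asserted):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((10, 3, 3))   # ten 3x3 matrices
b = rng.standard_normal((3, 3))       # one 3x3 matrix

# inv(a) @ b broadcasts b against each of the ten matrices:
x = np.linalg.inv(a) @ b
assert x.shape == (10, 3, 3)

# Here a.ndim - 1 == b.ndim == 2, so the 1-d-vector special case *does*
# kick in: b is treated as three vectors, 10 != 3, and broadcasting
# failed on the NumPy of this era (newer releases changed the rule).
try:
    np.linalg.solve(a, b)
    print("solve(a, b) succeeded on this NumPy version")
except Exception as exc:
    print("solve(a, b) raised", type(exc).__name__)
```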

This also points to even more confusing possibilities: if a.shape ==

(3, 3) or (3, 3, 3, 3) and b.shape == (3, 3), then inv(a) @ b and

solve(a, b) both work and do the same thing. But if a.shape == (3, 3,

3), then inv(a) @ b and solve(a, b) both work, and do totally

*different* things.
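A sketch of the divergence (since the historical solve rule was later changed, the solve shapes are printed rather than asserted; under the rule described above, only the (3, 3, 3) case disagreed with inv(a) @ b):

```python
import numpy as np

rng = np.random.default_rng(2)
b = rng.standard_normal((3, 3))

for a_shape in [(3, 3), (3, 3, 3, 3), (3, 3, 3)]:
    a = rng.standard_normal(a_shape)
    x = np.linalg.inv(a) @ b
    y = np.linalg.solve(a, b)
    # For (3, 3) and (3, 3, 3, 3) the two agree; for (3, 3, 3) the
    # historical vector rule made solve return shape (3, 3) while
    # inv(a) @ b returns (3, 3, 3).
    print(a_shape, x.shape, y.shape, x.shape == y.shape)
```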

I wonder if we should deprecate these corner cases, and eventually

migrate to making inv(a) @ b and solve(a, b) the same in all

situations. If we did, then solve(a, b) would actually be a gufunc

with signature (m,m),(m,n?)->(m,n?).

I think the cases that would need to be changed are those where

(a.ndim - 1 == b.ndim and b.ndim > 1). My guess is that this happens

very rarely in existing code, especially since (IIRC) this behavior

was only added a few years ago, when we gufunc-ified numpy.linalg.
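A shape-level sketch of the proposed (m,m),(m,n?)->(m,n?) rule (the helper name is hypothetical, and it only resolves output shapes rather than solving anything):

```python
import numpy as np

def proposed_solve_shape(a_shape, b_shape):
    """Output shape of solve under the proposed gufunc signature
    (m,m),(m,n?)->(m,n?): the optional core dimension n is absent
    exactly when b is 1-d, matching inv(a) @ b semantics."""
    m = a_shape[-1]
    if a_shape[-2] != m:
        raise ValueError("a must be (..., m, m)")
    if len(b_shape) == 1:              # n? absent: b is one vector
        if b_shape[0] != m:
            raise ValueError("b must have length m")
        return a_shape[:-2] + (m,)
    if b_shape[-2] != m:               # n? present: stack of matrices
        raise ValueError("b must be (..., m, n)")
    batch = np.broadcast_shapes(a_shape[:-2], b_shape[:-2])
    return batch + (m, b_shape[-1])

# The confusing cases above become consistent with inv(a) @ b:
assert proposed_solve_shape((10, 3, 3), (3,)) == (10, 3)
assert proposed_solve_shape((10, 3, 3), (3, 3)) == (10, 3, 3)
assert proposed_solve_shape((3, 3, 3), (3, 3)) == (3, 3, 3)
```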

> Another example would be the "matrix transpose" function that has been

> occasionally proposed, to swap the last two dimensions of an array. It could

> have the signature (m?,n)->(n,m?), which ensures that it is still well

> defined (as the identity) on 1d arrays.

Unfortunately I don't think we could make "broadcasting matrix

transpose" be literally a gufunc, since it should return a view. But I

guess there'd still be some value in having the notation available

just when talking about it, so we could say "this operation is *like*

a gufunc with signature (m?,n)->(n,m?), except that it returns a

view".
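For illustration, today's building blocks already give the >=2-d part of that behavior as a view; a tiny wrapper (hypothetical name) sketches the proposed (m?,n)->(n,m?) semantics, with identity on 1-d arrays:

```python
import numpy as np

def mtranspose(x):
    """Sketch of 'broadcasting matrix transpose' with signature
    (m?,n)->(n,m?): swap the last two axes, identity on 1-d input."""
    return x if x.ndim < 2 else np.swapaxes(x, -2, -1)

a = np.arange(24).reshape(2, 3, 4)
t = mtranspose(a)
assert t.shape == (2, 4, 3)
assert np.shares_memory(t, a)        # a view, as desired -- not a copy

v = np.arange(3)
assert mtranspose(v).shape == (3,)   # identity on 1-d arrays
```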

-n

--

Nathaniel J. Smith --

https://vorpus.org

_______________________________________________

NumPy-Discussion mailing list

[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion