Looking for description/insight/documentation on matmul

jeff saremi
Is there any resource available, or anyone who is able to describe the matmul operation on arrays with N > 2 dimensions?

The only description I can find is: "If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly." which is very cryptic to me.
Could someone break this down please?
When a [2 3 5 6] is multiplied by a [7 8 9], what are the resulting dimensions? Is there one answer to that? Is it deterministic?
What does "residing in the last two indexes" mean? What is broadcast and where?
Thanks,
jeff

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Looking for description/insight/documentation on matmul

Stephan Hoyer-2
Hi Jeff,

I think PEP 465 would be the definitive reference here. See the section on "Intended usage details" in https://www.python.org/dev/peps/pep-0465/

Cheers,
Stephan

Re: Looking for description/insight/documentation on matmul

mattip
In reply to this post by jeff saremi
On 09/07/18 09:48, jeff saremi wrote:

> Is there any resource available or anyone who's able to describe
> matmul operation of matrices when n > 2?
>
> The only description i can find is: "If either argument is N-D, N > 2,
> it is treated as a stack of matrices residing in the last two indexes
> and broadcast accordingly." which is very cryptic to me.
> Could someone break this down please?
> when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting
> dimensions? is there one answer to that? Is it deterministic?
> What does "residing in the last two indices" mean? What is broadcast
> and where?
> thanks
> jeff
>

You could do

np.matmul(np.ones((2, 3, 4, 5, 6)), np.ones((2, 3, 4, 6, 7))).shape

which yields (2, 3, 4, 5, 7).

When ndim >= 2 in both operands, matmul uses the last two dimensions as
(..., n, m) @ (..., m, p) -> (..., n, p). Note the repeating "m", so
your example would not work: n1=5, m1=6 in the first operand and m2=8,
p2=9 in the second, so m1 != m2.
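
A minimal sketch of this rule, assuming NumPy is imported as np (the names a, b, stacked and question_shapes_ok are made up for illustration). It checks that the leading dimensions only index a stack of ordinary 2-D products, and that the shapes from the question are rejected:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 3, 4, 5, 6))   # a stack of 2*3*4 matrices, each 5x6
b = rng.random((2, 3, 4, 6, 7))   # a stack of 2*3*4 matrices, each 6x7

stacked = np.matmul(a, b)
print(stacked.shape)              # (2, 3, 4, 5, 7)

# Each slice of the result is the plain 2-D product of the matching slices.
for idx in np.ndindex(2, 3, 4):
    assert np.allclose(stacked[idx], a[idx] @ b[idx])

# The shapes from the question: the inner dimensions 6 and 8 do not match.
try:
    np.matmul(np.ones((2, 3, 5, 6)), np.ones((7, 8, 9)))
    question_shapes_ok = True
except ValueError:
    question_shapes_ok = False
print(question_shapes_ok)         # False
```

So "residing in the last two indexes" means that only the last two axes take part in the matrix product; everything in front of them just selects which pair of matrices gets multiplied.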

The "broadcast" refers only to the "..." dimensions. If you replace the
2, 3, or 4 with 1 in either operand, that operand will broadcast (repeat
itself) across that dimension to fit the other operand. Leading
dimensions that are missing from one of the operands are broadcast in
the same way.
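
A short sketch of the broadcasting cases, again assuming NumPy as np; the variable names are only illustrative. A batch dimension of size 1, or leading batch dimensions that are absent, are stretched to match the other operand, while batch dimensions that differ and are not 1 are an error:

```python
import numpy as np

a = np.ones((2, 3, 4, 5, 6))

# A size-1 batch dimension is repeated to match the other operand.
size_one = np.matmul(a, np.ones((2, 1, 4, 6, 7))).shape
# Missing leading batch dimensions are treated as if they were size 1.
missing = np.matmul(a, np.ones((4, 6, 7))).shape
# A plain 2-D matrix is applied to every matrix in the stack.
plain_2d = np.matmul(a, np.ones((6, 7))).shape
print(size_one, missing, plain_2d)   # (2, 3, 4, 5, 7) three times

# Batch dimensions (2, 3, 4) and (5, 4) cannot be broadcast: 3 != 5.
try:
    np.matmul(a, np.ones((5, 4, 6, 7)))
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)                   # False
```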

When only one of the operands is 1-D, its single dimension is
interpreted as "m", and the other dimension "n" or "p" will not appear
in the output, so the signature is
(..., n, m),(m) -> (..., n) or (m),(..., m, p) -> (..., p)
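
For illustration, a small sketch of both signatures with a 1-D operand, assuming NumPy as np (mats, mats2 and vec are hypothetical names):

```python
import numpy as np

vec = np.ones(6)                    # (m,) with m=6

mats = np.ones((2, 3, 4, 5, 6))     # (..., n, m) with n=5, m=6
right_shape = np.matmul(mats, vec).shape
print(right_shape)                  # (2, 3, 4, 5): the "m" axis is summed away

mats2 = np.ones((2, 3, 4, 6, 7))    # (..., m, p) with m=6, p=7
left_shape = np.matmul(vec, mats2).shape
print(left_shape)                   # (2, 3, 4, 7)
```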

When both operands are 1-D, it is the same as a dot product and will
produce a scalar.
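
A quick check of the 1-D case, assuming NumPy as np; u and v are made-up vectors. The sketch also shows that 0-d (scalar) operands are not accepted by matmul at all:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

result = np.matmul(u, v)
print(result)            # 32.0, i.e. 1*4 + 2*5 + 3*6
print(np.ndim(result))   # 0: a scalar, not an array

# Scalars are rejected; matmul needs at least one dimension per operand.
try:
    np.matmul(u, 2.0)
    scalar_ok = True
except ValueError:
    scalar_ok = False
print(scalar_ok)         # False
```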

You didn't ask, but I will complete the picture: np.dot is different
when either operand has more than two dimensions. Instead of
broadcasting, the result combines both sets of "..." dimensions, so

np.dot(np.ones((2,3,4,5,6)), np.ones((8, 9, 6, 7))).shape

which yields (2, 3, 4, 5, 8, 9, 7). The (2, 3, 4) dimensions and n=5 of
the first operand are followed by the (8, 9) dimensions and p=7 of the
second.
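
A sketch that runs the np.dot call and compares it with matmul on the same operands, assuming NumPy as np (the names dot_shape, expected and matmul_ok are illustrative). np.dot sums over the last axis of the first operand and the second-to-last axis of the second, and simply concatenates the remaining axes:

```python
import numpy as np

a = np.ones((2, 3, 4, 5, 6))
b = np.ones((8, 9, 6, 7))

dot_shape = np.dot(a, b).shape
print(dot_shape)                  # (2, 3, 4, 5, 8, 9, 7)

# All axes of a except the last, then all axes of b except the second-to-last.
expected = a.shape[:-1] + b.shape[:-2] + b.shape[-1:]
print(dot_shape == expected)      # True

# matmul refuses these operands: batch dims (2, 3, 4) and (8, 9) do not broadcast.
try:
    np.matmul(a, b)
    matmul_ok = True
except ValueError:
    matmul_ok = False
print(matmul_ok)                  # False
```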

Matti

Re: Looking for description/insight/documentation on matmul

jeff saremi
Thanks a lot Matti. It makes a lot more sense now.

From: NumPy-Discussion <numpy-discussion-bounces+jeffsaremi=[hidden email]> on behalf of Matti Picus <[hidden email]>
Sent: Monday, July 9, 2018 10:54 AM
To: [hidden email]
Subject: Re: [Numpy-discussion] Looking for description/insight/documentation on matmul
 
Re: Looking for description/insight/documentation on matmul

jeff saremi
In reply to this post by Stephan Hoyer-2
Looks great. thanks a lot

From: NumPy-Discussion <numpy-discussion-bounces+jeffsaremi=[hidden email]> on behalf of Stephan Hoyer <[hidden email]>
Sent: Monday, July 9, 2018 10:50 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Looking for description/insight/documentation on matmul
 