Hi Marten,

On Mon, Jun 18, 2018 at 12:34:03PM -0400, Marten van Kerkwijk wrote:

> That looks quite nice and expressive. In the context of a discussion we

> have been having about describing `matmul/@` and possibly broadcastable

> dimensions, I think from your description it sounds like one would describe

> `@` with multiple functions (the multiple dispatch we have been (are?)

> considering as well):

>

>

> "... * N * M * T, ... * M * P * T -> ... * N * P * T"

> "M * T, ... * M * P * T -> ... * P * T"

> "... * N * M * T, M * T -> ... * N * T"

> "M * T, M * T -> T"

Yes, that's the way, and the outer dimensions (the part matched by the

ellipsis) are always broadcast like in NumPy.
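
To make the dispatch concrete, here is a rough sketch in plain Python of how
the four signatures above could be selected by ndim, with the outer
dimensions broadcast as in NumPy. The names broadcast_outer and matmul_shape
are made up for this illustration; this is not the ndtypes matcher, and it
only works on shapes given as tuples.

```python
from itertools import zip_longest


def broadcast_outer(a, b):
    """Broadcast two tuples of outer dimensions with NumPy's rules."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def matmul_shape(a, b):
    """Select one of the four signatures by ndim and return the result shape.

    Hypothetical helper for illustration only.
    """
    if not a or not b:
        raise TypeError("no signature accepts an ndim == 0 argument")
    if len(a) >= 2 and len(b) >= 2:
        # "... * N * M * T, ... * M * P * T -> ... * N * P * T"
        outer_a, (n, m) = a[:-2], a[-2:]
        outer_b, (m2, p) = b[:-2], b[-2:]
        core = (n, p)
    elif len(a) == 1 and len(b) >= 2:
        # "M * T, ... * M * P * T -> ... * P * T"
        outer_a, m = (), a[0]
        outer_b, (m2, p) = b[:-2], b[-2:]
        core = (p,)
    elif len(a) >= 2 and len(b) == 1:
        # "... * N * M * T, M * T -> ... * N * T"
        outer_a, (n, m) = a[:-2], a[-2:]
        outer_b, m2 = (), b[0]
        core = (n,)
    else:
        # "M * T, M * T -> T"
        outer_a, m = (), a[0]
        outer_b, m2 = (), b[0]
        core = ()
    if m != m2:
        raise ValueError(f"M does not match: {m} != {m2}")
    return broadcast_outer(outer_a, outer_b) + core
```

For example, matmul_shape((2, 1, 3, 4), (5, 4, 6)) picks the first signature
and broadcasts the outer dimensions (2, 1) against (5,).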

> Is there a way to describe broadcasting? The sample case we've come up

> with is a function that calculates a weighted mean. This might take

> (values, sigmas) and return (mean, sigma_mean), which would imply a

> signature like:

>

> "... * N * T, ... * N * T -> ... * T, ... * T"

>

> But would your signature allow indicating that one could pass in a single

> sigma? I.e., broadcast the second 1 to N if needed?

Actually I came across this today when implementing optimized matching

for binary functions.

I wanted the faster kernel

"... * N * int64, ... * N * int64 -> ... * N * int64"

to also match e.g. the input

"int64, 10 * int64".

The generic datashape spec would forbid this, but perhaps the '?' that

you propose in nep-0020 would offer a way out of this for ndtypes.

It's a bit confusing for datashape, since there is already a question mark

for missing variable dimensions (that have shape==0 in the data).

>>> ndt("var * ?var * int64")

ndt("var * ?var * int64")

This would be the type for e.g. [[0], None, [1,2,3]].

But for symbolic dimensions (which only match fixed dimensions) perhaps this

"... * ?N * int64, ... * ?N * int64 -> ... * ?N * int64"

or, as in the NEP,

"... * N? * int64, ... * N? * int64 -> ... * N? * int64"

should mean "At least one input has ndim >= 1, broadcast as necessary".

This still means that for the "all ndim==0" case one would need an

additional kernel "int64, int64 -> int64".
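
A minimal sketch of that reading of '?', again in plain Python on shape
tuples. The function name match_optional_dim is hypothetical, and the sketch
assumes that N must be equal wherever it is present (a dimension of 1 is not
stretched to N) and that outer dimensions broadcast as in NumPy.

```python
from itertools import zip_longest


def broadcast_outer(a, b):
    """Broadcast two tuples of outer dimensions with NumPy's rules."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def match_optional_dim(a, b):
    """Result shape for "... * N? * T, ... * N? * T -> ... * N? * T".

    Returns None if the kernel does not match (hypothetical helper).
    """
    if not a and not b:
        # All ndim == 0: left to the separate "int64, int64 -> int64" kernel.
        return None
    if a and b:
        if a[-1] != b[-1]:
            return None  # assumption: N must agree where both inputs have it
        n, outer_a, outer_b = a[-1], a[:-1], b[:-1]
    elif a:
        n, outer_a, outer_b = a[-1], a[:-1], ()  # second input omits N
    else:
        n, outer_a, outer_b = b[-1], (), b[:-1]  # first input omits N
    return broadcast_outer(outer_a, outer_b) + (n,)
```

With this, the input "int64, 10 * int64" from above, i.e.
match_optional_dim((), (10,)), resolves to the shape (10,).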

> I realize that this is no longer about describing precisely what the

> function doing the calculation expects, but rather what an upper level is

> allowed to do before calling the function (i.e., take a dimension of 1 and

> broadcast it).

Yes, for datashape the problem is that it also allows non-broadcastable

signatures like "N * float64", really the same as "double x[]" in C.

But the '?' with occasionally one additional kernel for ndim==0 could

solve this.

Stefan Krah
