Allowing broadcasting of code dimensions in generalized ufuncs

classic Classic list List threaded Threaded
43 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
Hi All,

Following on a PR combining the ability to provide fixed and flexible
dimensions [1] (useful for, e.g., 3-vector input with a signature like
`(3),(3)->(3)`, and for `matmul`, resp.; based on earlier PRs by Jaime
[2] and Matt (Picus) [3]), I've now made a PR with a further
enhancement, which allows one can indicate that a core dimension can
be broadcast [4].

A particular use case is `all_equal`, a new function suggested in a
stalled PR by Matt (Harrigan) [5], which compares two arrays
axis-by-axis, but short-circuits if a non-equality is found (unlike
what is the case if one does `(a==b).all(axis)`). One thing that would
be obviously useful for a routine like `all_equal` is to be able to
provide an array as one argument and a constant as another, i.e., if
the core dimensions can be broadcast if needed, just like they are in
`(a==b).all(axis)`. This is currently not possible: with its signature
of `(n),(n)->()`, the two arrays have to have the same trailing size.

My PR provides the ability to indicate in the signature that a core
dimension can be broadcast, by using a suffix of "|1". Thus, the
signature of `all_equal` would become:

```
(n|1),(n|1)->()
```

Comments most welcome (yes, even on the notation - though I think it
is fairly self-explanatory)!

Marten

p.s. There are some similarities to the new "flexible" dimensions
implemented for `matmul` [1], but also differences. In particular, for
a signature of `(n?),(n?)->()`, one could also pass in an array of
trailing size n and a constant, but it would not be possible to pass
in an array with trailing size 1: the dimensions with the same name
have to be either present and the same or absent. In contrast, for
broadcasting, dimensions with the same name can have trailing size n,
size 1, or be absent (in which case they count as 1). For
broadcasting, any output dimensions with the same name are never
affected, while for flexible dimensions those are removed.

[1] https://github.com/numpy/numpy/pull/11175
[2] https://github.com/numpy/numpy/pull/5015
[3] https://github.com/numpy/numpy/pull/11132
[4] https://github.com/numpy/numpy/pull/11179
[5] https://github.com/numpy/numpy/pull/8528
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Stephan Hoyer-2
On Wed, May 30, 2018 at 11:15 AM Marten van Kerkwijk <[hidden email]> wrote:
My PR provides the ability to indicate in the signature that a core
dimension can be broadcast, by using a suffix of "|1". Thus, the
signature of `all_equal` would become:

```
(n|1),(n|1)->()
```

I read this as "dimensions may have size n or 1", which would exclude the possibility of scalars.

For all_equal, I think you could also use a signature like "(m?),(n?)->()", with a short-cut to automatically return False if m != n.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Matthew Harrigan
"short-cut to automatically return False if m != n", that seems like a silent bug

AFAICT there are 3 possibilities:
1) current behavior
2) a scalar or size 1 array may be substituted, ie a constant
3) a scalar or array with shape[-1] == 1 may be substituted and broadcasted

I am fond of using "n^" to signify #3 since I think of broadcasting as increasing the size of the array.  Although a stretch, "n#" might work for #2, as it reminds me of #define'ing constants in C.

To generalize a bit, most elementwise operations have obvious broadcasting cases and reduce operations have a core dimension.  Fusing any two, ie sumproduct, would result in a gufunc which would benefit from this ability.

On Wed, May 30, 2018 at 7:22 PM, Stephan Hoyer <[hidden email]> wrote:
On Wed, May 30, 2018 at 11:15 AM Marten van Kerkwijk <[hidden email]> wrote:
My PR provides the ability to indicate in the signature that a core
dimension can be broadcast, by using a suffix of "|1". Thus, the
signature of `all_equal` would become:

```
(n|1),(n|1)->()
```

I read this as "dimensions may have size n or 1", which would exclude the possibility of scalars.

For all_equal, I think you could also use a signature like "(m?),(n?)->()", with a short-cut to automatically return False if m != n.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion



_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
Hi Stephan, Matt,

My `n|1` was indeed meant to be read as `n or 1`, but with the
(implicit) understanding that any array can have as many ones
pre-pended as needed.

The signature `(n?),(n?)->()` is now set aside for flexible
dimensions: this would allow the constant, but not the trailing shape
of 1 (at least as we currently have implemented it). I do think that
is more consistent with the visual suggestion, that the think may be
absent. It also has implications for output: `(m?,n),(n,p?)->(m?,p?)`
is meant to indicate that if a dimension is absent, it should be
absent in the output as well. In contrast, for broadcasting, I'd
envisage `(n|1),(n|1)->(n)` to indicate that the output dimension will
always be present and be of length n.

I'm not sure I'm sold on `n^` - I don't think it gives an immediate
hint of what it would do...

All the best,

Marten
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Nathaniel Smith
In reply to this post by Marten van Kerkwijk
On Wed, May 30, 2018 at 11:14 AM, Marten van Kerkwijk
<[hidden email]> wrote:

> Hi All,
>
> Following on a PR combining the ability to provide fixed and flexible
> dimensions [1] (useful for, e.g., 3-vector input with a signature like
> `(3),(3)->(3)`, and for `matmul`, resp.; based on earlier PRs by Jaime
> [2] and Matt (Picus) [3]), I've now made a PR with a further
> enhancement, which allows one can indicate that a core dimension can
> be broadcast [4].
>
> A particular use case is `all_equal`, a new function suggested in a
> stalled PR by Matt (Harrigan) [5], which compares two arrays
> axis-by-axis, but short-circuits if a non-equality is found (unlike
> what is the case if one does `(a==b).all(axis)`). One thing that would
> be obviously useful for a routine like `all_equal` is to be able to
> provide an array as one argument and a constant as another, i.e., if
> the core dimensions can be broadcast if needed, just like they are in
> `(a==b).all(axis)`. This is currently not possible: with its signature
> of `(n),(n)->()`, the two arrays have to have the same trailing size.
>
> My PR provides the ability to indicate in the signature that a core
> dimension can be broadcast, by using a suffix of "|1". Thus, the
> signature of `all_equal` would become:
>
> ```
> (n|1),(n|1)->()
> ```
>
> Comments most welcome (yes, even on the notation - though I think it
> is fairly self-explanatory)!

I'm currently -0.5 on both fixed dimensions and this broadcasting
dimension idea. My reasoning is:

- The use cases seem fairly esoteric. For fixed dimensions, I guess
the motivating example is cross-product (are there any others?). But
would it be so bad for a cross-product gufunc to raise an error if it
receives the wrong number of dimensions? For this broadcasting case...
well, obviously we've survived this long without all_equal :-). And
there's something funny about all_equal, since it's really smushing
together two conceptually separate gufuncs for efficiency. Should we
also have all_less_than, sum_square, ...? If this is a big problem,
then wouldn't it be better to solve it in a general way, like dask or
Numba or numexpr do? To be clear, I'm not saying these features are
necessarily *bad* ideas, in isolation -- just that the benefits aren't
very convincing, and there are trade-offs, like:

- When it comes to the core ufunc machinery, we have a limited
complexity budget. I'm nervous that if we add too many bells and
whistles, we'll end up writing ourselves into a corner where we have
trouble maintaining it, where it becomes difficult to predict how
different features interact, it becomes increasingly difficult for
third-parties to handle all the different features in their
__array_ufunc__ methods...

- And, we have a lot of other demands on the core ufunc machinery,
that might be better places to spend our limited complexity budget.
For example, can we come up with an extension to make np.sort a
gufunc? That seems like a much higher priority than figuring out how
to make all_equal a gufunc. What about refactoring the ufunc machinery
to support user-defined dtypes? That'll need some serious work, and
again, it's probably higher priority than supporting cross-product or
all_equal directly (or at least it seems that way to me).

Maybe there are more compelling use cases that I'm missing, but as it
is, I feel like trying to add too many features to the current ufunc
machinery is pretty risky for future maintainability, and we shouldn't
do it without really solid use cases.

-n

--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
Hi Nathaniel,

I think the case for frozen dimensions is much more solid that just
`cross1d` - there are many operations that work on size-3 vectors.
Indeed, as I noted in the PR, I have just been wrapping a
Standards-of-Astronomy library in gufuncs, and many of its functions
require size-3 vectors or 3x3 matrices [1]. Of course, I can put
checks on the sizes, and I've now done that in a custom type resolver
(which I needed anyway since, as you say, user dtypes is currently not
easy), but there is a real problem for functions that take scalars and
produce vectors: with a signature like `(),()->(n)`, I am forced to
pass in an output with size 3, which is very inconvenient (especially
if I then also want to override with `__array_ufunc__` - now my
Quantity implementation also has to start changing an output already
put in. So, having frozen dimensions is definitely helpful for
developers of new gufuncs.

Furthermore, with frozen dimensions, the signature is not just
immediately clear, `(),()->(3)` for the example above, it is also
better in telling users about what a function does.

Indeed, I think this addition has much more justification than the `?`
which is much more complex than the fixed size, yet neither
particularly clear nor useful beyond the single purpose of matmul. (It
is just that that single purpose has fairly high weight...)

As for broadcasting, well, with the flexible dimensions defined, the
*additional* complexity is very small. I have no ready example other
than all_equal, though will say that I currently have code that does
`if a[0] == x && a[1] == x && a[2] ==x && np.all(a[3:] == x):` just
because the short-circuiting is well worth the time (it is unlikely in
this context that all a equals x). So, there is at least one example
of an actual need for this function.

All the best,

Marten

[1] https://github.com/astropy/astropy/pull/7502
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Allan Haldane
In reply to this post by Nathaniel Smith
On 05/31/2018 12:10 AM, Nathaniel Smith wrote:

> On Wed, May 30, 2018 at 11:14 AM, Marten van Kerkwijk
> <[hidden email]> wrote:
>> Hi All,
>>
>> Following on a PR combining the ability to provide fixed and flexible
>> dimensions [1] (useful for, e.g., 3-vector input with a signature like
>> `(3),(3)->(3)`, and for `matmul`, resp.; based on earlier PRs by Jaime
>> [2] and Matt (Picus) [3]), I've now made a PR with a further
>> enhancement, which allows one can indicate that a core dimension can
>> be broadcast [4].
>>
>> A particular use case is `all_equal`, a new function suggested in a
>> stalled PR by Matt (Harrigan) [5], which compares two arrays
>> axis-by-axis, but short-circuits if a non-equality is found (unlike
>> what is the case if one does `(a==b).all(axis)`). One thing that would
>> be obviously useful for a routine like `all_equal` is to be able to
>> provide an array as one argument and a constant as another, i.e., if
>> the core dimensions can be broadcast if needed, just like they are in
>> `(a==b).all(axis)`. This is currently not possible: with its signature
>> of `(n),(n)->()`, the two arrays have to have the same trailing size.
>>
>> My PR provides the ability to indicate in the signature that a core
>> dimension can be broadcast, by using a suffix of "|1". Thus, the
>> signature of `all_equal` would become:
>>
>> ```
>> (n|1),(n|1)->()
>> ```
>>
>> Comments most welcome (yes, even on the notation - though I think it
>> is fairly self-explanatory)!
>
> I'm currently -0.5 on both fixed dimensions and this broadcasting
> dimension idea. My reasoning is:
>
> - The use cases seem fairly esoteric. For fixed dimensions, I guess
> the motivating example is cross-product (are there any others?). But
> would it be so bad for a cross-product gufunc to raise an error if it
> receives the wrong number of dimensions? For this broadcasting case...
> well, obviously we've survived this long without all_equal :-). And
> there's something funny about all_equal, since it's really smushing
> together two conceptually separate gufuncs for efficiency. Should we
> also have all_less_than, sum_square, ...? If this is a big problem,
> then wouldn't it be better to solve it in a general way, like dask or
> Numba or numexpr do? To be clear, I'm not saying these features are
> necessarily *bad* ideas, in isolation -- just that the benefits aren't
> very convincing, and there are trade-offs, like:

I have often wished numpy had these short-circuiting gufuncs, for a very
long time. I specifically remember my fruitless searches for how to do
it back to 2007.

While "on average" short-circuiting only gives a speedup of 2x, in many
situations you can arrange your algorithm so short circuiting will
happen early, eg usually in the first 10 elements of a 10^6 element
array, giving enormous speedups.

Also, I do not imagine these as free-floating ufuncs, I think we can
arrange them in a logical way in a gufunc ecosystem. There would be some
"core ufuncs", with "associated gufuncs" accessible as attributes. For
instance, any_less_than will be accessible as less.any

binary "comparison" ufuncs would have attributes

less.any
less.all
less.first  # returns first matching index
less.count  # counts matches without intermediate bool array

This adds on to the existing attributes, for instance
ufuncs already have:

add.reduce
add.accumulate
add.reduceat
add.outer
add.at

It is unfortunate that all ufuncs currently have these attributes even
if they are unimplemented/inappropriate (eg, np.sin.reduce), I would
like to  remove the inappropriate ones, so each core ufunc will only
have the appropriate attribute "associated gufuncs".

Incidentally, once we make reduce/accumuate/... into "associated
gufuncs", I propose completely removing the "method" argument of
__array_ufunc__, since it is no longer needed and adds a lot
of complexity which implementors of an __array_ufunc__ are forced to
account for.

Cheers,
Allan






>
> - When it comes to the core ufunc machinery, we have a limited
> complexity budget. I'm nervous that if we add too many bells and
> whistles, we'll end up writing ourselves into a corner where we have
> trouble maintaining it, where it becomes difficult to predict how
> different features interact, it becomes increasingly difficult for
> third-parties to handle all the different features in their
> __array_ufunc__ methods...
>
> - And, we have a lot of other demands on the core ufunc machinery,
> that might be better places to spend our limited complexity budget.
> For example, can we come up with an extension to make np.sort a
> gufunc? That seems like a much higher priority than figuring out how
> to make all_equal a gufunc. What about refactoring the ufunc machinery
> to support user-defined dtypes? That'll need some serious work, and
> again, it's probably higher priority than supporting cross-product or
> all_equal directly (or at least it seems that way to me).
>
> Maybe there are more compelling use cases that I'm missing, but as it
> is, I feel like trying to add too many features to the current ufunc
> machinery is pretty risky for future maintainability, and we shouldn't
> do it without really solid use cases.
>
> -n
>

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Sebastian Berg
<snip>


> >
> > I'm currently -0.5 on both fixed dimensions and this broadcasting
> > dimension idea. My reasoning is:
> >
> > - The use cases seem fairly esoteric. For fixed dimensions, I guess
> > the motivating example is cross-product (are there any others?).
> > But
> > would it be so bad for a cross-product gufunc to raise an error if
> > it
> > receives the wrong number of dimensions? For this broadcasting
> > case...
> > well, obviously we've survived this long without all_equal :-). And
> > there's something funny about all_equal, since it's really smushing
> > together two conceptually separate gufuncs for efficiency. Should
> > we
> > also have all_less_than, sum_square, ...? If this is a big problem,
> > then wouldn't it be better to solve it in a general way, like dask
> > or
> > Numba or numexpr do? To be clear, I'm not saying these features are
> > necessarily *bad* ideas, in isolation -- just that the benefits
> > aren't
> > very convincing, and there are trade-offs, like:
>
> I have often wished numpy had these short-circuiting gufuncs, for a
> very
> long time. I specifically remember my fruitless searches for how to
> do
> it back to 2007.
>
> While "on average" short-circuiting only gives a speedup of 2x, in
> many
> situations you can arrange your algorithm so short circuiting will
> happen early, eg usually in the first 10 elements of a 10^6 element
> array, giving enormous speedups.

> Also, I do not imagine these as free-floating ufuncs, I think we can
> arrange them in a logical way in a gufunc ecosystem. There would be
> some
> "core ufuncs", with "associated gufuncs" accessible as attributes.
> For
> instance, any_less_than will be accessible as less.any
>

So then, why is it a gufunc and not an attribute using a ufunc with
binary output? I have asked this before, and even got arguments as to
why it fits gufuncs better, but frankly I still do not really
understand.

If it is an associated gufunc, why gufunc at all? We need any() and
all() here, so that is not that many methods, right? And when it comes
to buffering you have much more flexibility.

Say I have the operation:

(float_arr > int_arr).all(axis=(1, 2))

With int_arr being shaped (2, 1000, 1000) (i.e. large along the
interesting axes). A normal gufunc IIRC will get the whole inner
dimension as a float buffer. In other words, you gain practically
nothing, because the whole int_arr will be cast to float anyway.

If, however, you actually implement np.greater_than.all(float_arr,
int_arr, axis=(1, 2)) as a separate ufunc method, you would have the
freedom to work in the typical cache friendly buffersize chunk size for
each of the outer dimensions one at a time. A gufunc would require to
say: please do not buffer for me, or implement all possible type
combinations to do this.
(of course there are memory layout subtleties, since you would have to
optimize always for the "fast exit" case, potentially making the worst
case scenario much worse -- unless you do seriously fancy stuff
anyway).

A more general question is actually whether we should rather focus on
solving the same problem more generally.
For example if `numexpr` would implement all/any reductions, it may be
able to pretty simply get the identical tradeoffs with even more
flexibility! (I have to admit, it may get tricky with multiple
reduction dimensions, etc.)

- Sebastian


> binary "comparison" ufuncs would have attributes
>
> less.any
> less.all
> less.first  # returns first matching index
> less.count  # counts matches without intermediate bool array
>
> This adds on to the existing attributes, for instance
> ufuncs already have:
>
> add.reduce
> add.accumulate
> add.reduceat
> add.outer
> add.at
>
> It is unfortunate that all ufuncs currently have these attributes
> even
> if they are unimplemented/inappropriate (eg, np.sin.reduce), I would
> like to  remove the inappropriate ones, so each core ufunc will only
> have the appropriate attribute "associated gufuncs".
>
> Incidentally, once we make reduce/accumuate/... into "associated
> gufuncs", I propose completely removing the "method" argument of
> __array_ufunc__, since it is no longer needed and adds a lot
> of complexity which implementors of an __array_ufunc__ are forced to
> account for.
>
> Cheers,
> Allan
>
>
>
>
>
>
> >
> > - When it comes to the core ufunc machinery, we have a limited
> > complexity budget. I'm nervous that if we add too many bells and
> > whistles, we'll end up writing ourselves into a corner where we
> > have
> > trouble maintaining it, where it becomes difficult to predict how
> > different features interact, it becomes increasingly difficult for
> > third-parties to handle all the different features in their
> > __array_ufunc__ methods...
> >
> > - And, we have a lot of other demands on the core ufunc machinery,
> > that might be better places to spend our limited complexity budget.
> > For example, can we come up with an extension to make np.sort a
> > gufunc? That seems like a much higher priority than figuring out
> > how
> > to make all_equal a gufunc. What about refactoring the ufunc
> > machinery
> > to support user-defined dtypes? That'll need some serious work, and
> > again, it's probably higher priority than supporting cross-product
> > or
> > all_equal directly (or at least it seems that way to me).
> >
> > Maybe there are more compelling use cases that I'm missing, but as
> > it
> > is, I feel like trying to add too many features to the current
> > ufunc
> > machinery is pretty risky for future maintainability, and we
> > shouldn't
> > do it without really solid use cases.
> >
> > -n
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
In reply to this post by Allan Haldane
<snip>
> Incidentally, once we make reduce/accumuate/... into "associated gufuncs", I
> propose completely removing the "method" argument of __array_ufunc__, since
> it is no longer needed and adds a lot
> of complexity which implementors of an __array_ufunc__ are forced to
> account for.

For Quantity at least I found it somewhat helpful to have the method
separate from the ufunc, as how one deals with it still has
similarities. Though it would probably have been similarly easy if
`__array_ufunc__` had been passed `np.add.reduce` directly. Aside: am
not sure we can so easily remove `method` any more, but I guess we can
at least start getting to a state where it always is `__call__`.

Anyway, this does argue we should be rather careful with signatures..
In particular, for `__array_function__` we really need to think
carefully about `types` - an `__array_function__` specific dict at the
end may be more useful.

All the best,

Marten
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

einstein.edison
In reply to this post by Allan Haldane
While "on average" short-circuiting only gives a speedup of 2x, in many 
situations you can arrange your algorithm so short circuiting will
happen early, eg usually in the first 10 elements of a 10^6 element
array, giving enormous speedups.

Also, I do not imagine these as free-floating ufuncs, I think we can
arrange them in a logical way in a gufunc ecosystem. There would be some
"core ufuncs", with "associated gufuncs" accessible as attributes. For
instance, any_less_than will be accessible as less.any

binary "comparison" ufuncs would have attributes

less.any
less.all
less.first # returns first matching index
less.count # counts matches without intermediate bool array

This adds on to the existing attributes, for instance
ufuncs already have:

add.reduce
add.accumulate
add.reduceat
add.outer
add.at

It is unfortunate that all ufuncs currently have these attributes even
if they are unimplemented/inappropriate (eg, np.sin.reduce), I would
like to remove the inappropriate ones, so each core ufunc will only
have the appropriate attribute "associated gufuncs".

I’m definitely in favour of all this. It’d be great to have this, and it’d be an excellent ecosystem. I’ll add that composing ufuncs is something I’ve wanted, and that has come up from time to time.

Incidentally, once we make reduce/accumuate/... into "associated
gufuncs", I propose completely removing the "method" argument of
__array_ufunc__, since it is no longer needed and adds a lot
of complexity which implementors of an __array_ufunc__ are forced to
account for.

While removing ‘method’ is okay in my book, there should at least be a way to detect if something is e.g., a reduction, or an element-wise ufunc (this one is obvious, all shapes involved will be ()). We, for example, use this in pydata/sparse. As you can imagine, for sparse arrays, element-wise operations behave a certain way and there turns out to be a general way to do reductions that have certain properties as well. See my paper’s draft[1] for details. I don’t mind the __array_ufunc__ api changing, but I’d like it if there was a way to still access the information that was previously available.


Regards,
Hameer Abbasi
Sent from Astro for Mac

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
In reply to this post by Marten van Kerkwijk
Hi Sebastian,

This is getting a bit far off-topic (which is whether it is a good
idea to allow the ability to set frozen dimensions and broadcasting),
but on `all_equal`, I definitely see the point that a method might be
better, but that needs work: to expand the normal ufunc mechanism to
allow the inner loop to carry state (for which there is an issue [1])
and for it to tell the iterator to stop feeding it buffers. It is not
quite clear to me that this is better/easier than expanding gufuncs to
be able to deal with buffers...

All the best,

Marten

[1] https://github.com/numpy/numpy/issues/8773
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Allan Haldane
In reply to this post by Sebastian Berg
On 05/31/2018 09:53 AM, Sebastian Berg wrote:
> <snip>
 >

>> Also, I do not imagine these as free-floating ufuncs, I think we can
>> arrange them in a logical way in a gufunc ecosystem. There would be
>> some
>> "core ufuncs", with "associated gufuncs" accessible as attributes.
>> For
>> instance, any_less_than will be accessible as less.any
>>
>
> So then, why is it a gufunc and not an attribute using a ufunc with
> binary output? I have asked this before, and even got arguments as to
> why it fits gufuncs better, but frankly I still do not really
> understand.
>
> If it is an associated gufunc, why gufunc at all? We need any() and
> all() here, so that is not that many methods, right? And when it comes
> to buffering you have much more flexibility.
>
> Say I have the operation:
>
> (float_arr > int_arr).all(axis=(1, 2))
>
> With int_arr being shaped (2, 1000, 1000) (i.e. large along the
> interesting axes). A normal gufunc IIRC will get the whole inner
> dimension as a float buffer. In other words, you gain practically
> nothing, because the whole int_arr will be cast to float anyway.
>
> If, however, you actually implement np.greater_than.all(float_arr,
> int_arr, axis=(1, 2)) as a separate ufunc method, you would have the
> freedom to work in the typical cache friendly buffersize chunk size for
> each of the outer dimensions one at a time. A gufunc would require to
> say: please do not buffer for me, or implement all possible type
> combinations to do this.
> (of course there are memory layout subtleties, since you would have to
> optimize always for the "fast exit" case, potentially making the worst
> case scenario much worse -- unless you do seriously fancy stuff
> anyway).
>
> A more general question is actually whether we should rather focus on
> solving the same problem more generally.
> For example if `numexpr` would implement all/any reductions, it may be
> able to pretty simply get the identical tradeoffs with even more
> flexibility! (I have to admit, it may get tricky with multiple
> reduction dimensions, etc.)
>
> - Sebastian

Hmm, I hadn't known/considered the limitations of gufunc buffer sizes. I
was just thinking of them as a standardized interface which handles the
where/out/broadcasting for you.

I'll have to read about it.

One thing I don't like about the ufunc-method strategy is that it esily
pollutes all the ufuncs namespaces and their implementations, so many
ufuncs have to account for a new "all" method even if innapropriate, for
example.

Cheers,
Allan
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Allan Haldane
In reply to this post by Nathaniel Smith
On 05/31/2018 12:10 AM, Nathaniel Smith wrote:
> On Wed, May 30, 2018 at 11:14 AM, Marten van Kerkwijk
> - When it comes to the core ufunc machinery, we have a limited
> complexity budget. I'm nervous that if we add too many bells and
> whistles, we'll end up writing ourselves into a corner where we have
> trouble maintaining it, where it becomes difficult to predict how
> different features interact, it becomes increasingly difficult for
> third-parties to handle all the different features in their
> __array_ufunc__ methods...

Re: implenetation complexity, I just want to bring up multiple-dispatch
signatures again, where the new signature syntax would just be to join
some signatures together with "|", and try them in order until one works.

I'm not convinced it's better myself, I just wanted to make sure we are
aware of it. The translation from the current proposed syntax would be:

   Current Syntax            Multiple-dispatch syntax

(n|1),(n|1)->()  <===>   (n),(n)->() | (n),()->() | (),(n)->()


(m?,n),(n,p?)->(m?,p?)  <===> (m,n),(n,p)->(m,p) |
                               (n),(n,p)->(p) |
                               (m,n),(n)->(m) |
                               (n),(n)->()

Conceivably, multiple-dispatch could reduce code complexity because we
don't need all the special flags like UFUNC_CORE_DIM_CAN_BROADCAST, and
instead of handling special syntax for ? and | and any future syntax
separately, we just need a split("|") and then loop with the old
signature handling code.

On the other hand the m-d signatures are much less concise and the
intention is perhaps harder to read. Yet they more explicitly state
which combinations are allowed, while with '?' syntax you might have to
puzzle it out.

Cheers,
Allan
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
Hi Allan,

Seeing it written out like that, I quite like the multiple dispatch signature: perhaps verbose, but clear.

It does mean a different way of changing the ufunc structure, but I actually think it has the advantage of being possible without extending the structure (though that might still be needed for frozen dimensions...): currently, the relevant parts of _tagPyUFuncObject are:
```
/* 0 for scalar ufunc; 1 for generalized ufunc */
int core_enabled;
/* number of distinct dimension names in signature */
int core_num_dim_ix;
/* numbers of core dimensions of each argument */
int *core_num_dims;
/* dimension indices in a flatted form */
int *core_dim_ixs;
/* positions of 1st core dimensions of each argument in core_dim_ixs */
int *core_offsets;
```
I think this could be changed without making any serious change if we slightly repurposed `core_enabled` as meaning `number of signatures`. Then, the pointers above could become
```
int core_xxx[num_sigs][max_core_num_dim_ixs];
```
Less easy is `core_num_dim_ix`, but that number is not really needed anyway (nor used much in the code) as it is implicitly encoded in the indices. It could quite logically become the `max_core_num_dim_ixs` above.

All the best,

Marten

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
p.s. While my test case of `cube_equal` in the PR is perhaps not super-realistic, I don't know if one really wants to do multiple dispatch on something like "(o|1,n|1,m|1),(o|1,n|1,m|1)->()"... -- Marten

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk
p.p.s. Your multiple dispatch signature for broadcasted dimensions is actually not quite right: should be
(n|1),(n|1)->() ===>

(n),(n)->() | (n),(1)->() | (1),(n)->() | (n),() -> () | (),(n)->()

This is becoming quite verbose... (and perhaps become somewhat slow). Though arguably one could always allow missing dimensions to be 1 for this case, so that one could drop the last two.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Charles R Harris


On Thu, May 31, 2018 at 10:39 AM, Marten van Kerkwijk <[hidden email]> wrote:
p.p.s. Your multiple dispatch signature for broadcasted dimensions is actually not quite right: should be
(n|1),(n|1)->() ===>

(n),(n)->() | (n),(1)->() | (1),(n)->() | (n),() -> () | (),(n)->()

This is becoming quite verbose... (and perhaps become somewhat slow). Though arguably one could always allow missing dimensions to be 1 for this case, so that one could drop the last two.


At some point there should be a formal syntax description for these things.

Chuck 

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

mattip
In reply to this post by Allan Haldane
On 31/05/18 08:23, Allan Haldane wrote:

> Re: implenetation complexity, I just want to bring up multiple-dispatch
> signatures again, where the new signature syntax would just be to join
> some signatures together with "|", and try them in order until one works.
>
> I'm not convinced it's better myself, I just wanted to make sure we
> are aware of it. The translation from the current proposed syntax
> would be:
>
>   Current Syntax            Multiple-dispatch syntax
>
> (n|1),(n|1)->()  <===>   (n),(n)->() | (n),()->() | (),(n)->()
>
>
> (m?,n),(n,p?)->(m?,p?)  <===> (m,n),(n,p)->(m,p) |
>                               (n),(n,p)->(p) |
>                               (m,n),(n)->(m) |
>                               (n),(n)->()
>
> ...
> Cheers,
> Allan

I am -1 on multiple signatures. We may revisit this in time, but for now
I find the minimal intrusiveness of the current changes appealing,
especially as it requires few to no changes whatsoever to the inner loop
function. Multiple dispatch could easily break that model by allowing
very different signatures to be aggregated into a single ufunc, leading
to unhandled edge cases and strange segfaults. It also seems to me that
looping over all signatures might slow down ufunc calling, leading to
endless variations of strategies of optimizing signature ordering.

Matti
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Stephan Hoyer-2
In reply to this post by Marten van Kerkwijk
On Thu, May 31, 2018 at 4:21 AM Marten van Kerkwijk <[hidden email]> wrote:
I think the case for frozen dimensions is much more solid that just
`cross1d` - there are many operations that work on size-3 vectors.
Indeed, as I noted in the PR, I have just been wrapping a
Standards-of-Astronomy library in gufuncs, and many of its functions
require size-3 vectors or 3x3 matrices [1]. Of course, I can put
checks on the sizes, and I've now done that in a custom type resolver
(which I needed anyway since, as you say, user dtypes is currently not
easy), but there is a real problem for functions that take scalars and
produce vectors: with a signature like `(),()->(n)`, I am forced to
pass in an output with size 3, which is very inconvenient (especially
if I then also want to override with `__array_ufunc__` - now my
Quantity implementation also has to start changing an output already
put in. So, having frozen dimensions is definitely helpful for
developers of new gufuncs.
 
I agree that the use-cases for frozen dimensions are well motivated. It's not as common as writing code that supports arbitrary dimensions, but given that the real world is three dimensional it comes up with some regularity. Certainly for these use-cases it would add significant values (not requiring pre-allocation of output arrays).

Furthermore, with frozen dimensions, the signature is not just
immediately clear, `(),()->(3)` for the example above, it is also
better in telling users about what a function does.

Yes, frozen dimensions really do feel like a natural fit. There is no ambiguity about what an integer means in a gufunc signature, so the complexity of the gufunc model (for users and __array_ufunc__ implementors) would remain roughly fixed.

In contrast, broadcasting would certainly increase the complexity of the model, as evidenced by the new syntax we would need. This may or may not be justified. Currently I am at -0.5 along with Nathaniel here.
 
Indeed, I think this addition has much more justification than the `?`
which is much more complex than the fixed size, yet neither
particularly clear nor useful beyond the single purpose of matmul. (It
is just that that single purpose has fairly high weight...)

Agreed, though at least in principle there is the slightly broader use of case of handling arguments that are either matrices or column/row vectors.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: Allowing broadcasting of code dimensions in generalized ufuncs

Stephan Hoyer-2
In reply to this post by Matthew Harrigan
On Wed, May 30, 2018 at 5:01 PM Matthew Harrigan <[hidden email]> wrote:
"short-cut to automatically return False if m != n", that seems like a silent bug

I guess it depends on the use-cases. This is how np.array_equal() works: https://docs.scipy.org/doc/numpy/reference/generated/numpy.array_equal.html

We could even imagine incorporating this hypothetical "equality along some axes with broadcasting" functionality into axis/axes arguments for array_equal() if we choose this behavior.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
123