On Mon, 2020-07-06 at 12:39 -0600, Aaron Meurer wrote:

> I've been trying to figure out this behavior. It doesn't seem to be

> documented at

> https://numpy.org/doc/stable/reference/arrays.indexing.html

>

> >>> a = np.empty((2, 3))

> >>> a.shape

> (2, 3)

> >>> a[True].shape

> (1, 2, 3)

> >>> a[False].shape

> (0, 2, 3)

>

> It seems like indexing with a raw boolean (True or False) adds an

> axis

> with a dimension 1 or 0, resp.

>

> Except it only works once:

>

> >>> a[:,False]

> array([], shape=(2, 0, 3), dtype=float64)

> >>> a[:,False, False]

> array([], shape=(2, 0, 3), dtype=float64)

> >>> a[:,False,True].shape

> (2, 0, 3)

> >>> a[:,True,False].shape

> (2, 0, 3)

>

> The docs say "A single boolean index array is practically identical

> to

> x[obj.nonzero()]". I have a hard time seeing this as an extension of

> that, since indexing by `np.nonzero(False)` or `np.nonzero(True)`

> *replaces* the given axis.

>

> >>> a[np.nonzero(True)].shape

> (1, 3)

> >>> a[np.nonzero(False)].shape

> (0, 3)

>

> I think at best this behavior should be documented. I'm trying to

> understand the motivation for it, or if it's even intentional. And in

> particular, why do multiple boolean indices not insert multiple axes?

> It would actually be useful to be able to generically add length 0

> axes using an index, similar to how `newaxis` adds a length 1 axis.

It's fully intentional, as it is the correct generalization from an N-D

boolean index to include a 0-D boolean index.

To be fair, there is a footnote in the "Detailed notes" saying that

"the nonzero equivalence for Boolean arrays does not hold for zero

dimensional boolean arrays." This is for technical reasons, since

`nonzero` does not do useful things for 0-D input.
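
As a small illustration of where that equivalence holds and where it stops (a sketch only; the array values and the masks `mask` and `row_mask` are made up for this example):

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
mask = a > 2  # 2-D boolean index with the same shape as a

# For a mask with one or more dimensions, both forms select the same elements.
via_mask = a[mask]
via_nonzero = a[mask.nonzero()]
same = np.array_equal(via_mask, via_nonzero)

# A 1-D mask applied to the first axis behaves the same way.
row_mask = np.array([True, False])
rows_same = np.array_equal(a[row_mask], a[row_mask.nonzero()])

# A 0-D boolean index has no axis to replace, so a new one is inserted.
zero_d_shape = a[np.array(True)].shape  # (1, 2, 3)
```

The 0-D case is deliberately not passed through `nonzero` here, since that is exactly the case the footnote excludes.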

In any case, a boolean index always does the following:

1. It will remove as many dimensions as the index has, because this

is the number of dimensions effectively indexed by it.

2. It will add a single new dimension at the same place. The length of

this new dimension is the number of `True` elements.

3. If you have multiple advanced indices, you get annoying broadcasting

of all of them. That is *always* confusing for boolean indices.

0-D should not be too special there...

And this generalizes to 0-D just as well, even if it may be a bit

surprising at first.
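
The three rules above can be sketched with the `a` from the quoted session (the masks `mask2d` and `mask1d` are made-up examples, not anything from the docs):

```python
import numpy as np

a = np.empty((2, 3))

# Rules 1 and 2 with a 2-D mask: both indexed dimensions are removed and
# replaced by one dimension whose length is the number of True elements.
mask2d = np.array([[True, False, True],
                   [False, False, True]])
shape_2d = a[mask2d].shape             # (3,)

# 1-D mask on the last axis: one dimension removed, one added in its place.
mask1d = np.array([True, False, True])
shape_1d = a[:, mask1d].shape          # (2, 2)

# 0-D mask: zero dimensions removed, one added, of length 1 or 0.
shape_true = a[True].shape             # (1, 2, 3)
shape_false = a[False].shape           # (0, 2, 3)
shape_mid = a[:, False].shape          # (2, 0, 3)

# Rule 3: several 0-D booleans broadcast together into ONE new dimension,
# which is why a second boolean does not insert a second axis.
shape_two = a[:, True, True].shape     # (2, 1, 3)
shape_mixed = a[:, True, False].shape  # (2, 0, 3)
```

Seen this way, the "only works once" behavior is the same broadcasting that applies to any pair of advanced indices, with lengths 1 and 0 broadcasting to 0.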

I have written much of this more clearly once before in this NEP, which

may be a good read to _really_ understand it:

https://numpy.org/neps/nep-0021-advanced-indexing.html

In general, I wonder if going into much depth about how 0-D arrays are

not actually handled very specially is good. Yes, it's confusing

on its own, but it also seems a bit like overloading the user with

unnecessary knowledge?

Cheers,

Sebastian

>

> Aaron Meurer

> _______________________________________________

> NumPy-Discussion mailing list

> [hidden email]

> https://mail.python.org/mailman/listinfo/numpy-discussion
