# What is up with raw boolean indices (like a[False])?

I've been trying to figure out this behavior. It doesn't seem to be documented at https://numpy.org/doc/stable/reference/arrays.indexing.html

    >>> a = np.empty((2, 3))
    >>> a.shape
    (2, 3)
    >>> a[True].shape
    (1, 2, 3)
    >>> a[False].shape
    (0, 2, 3)

It seems like indexing with a raw boolean (True or False) adds an axis of length 1 or 0, respectively.

Except it only works once:

    >>> a[:,False]
    array([], shape=(2, 0, 3), dtype=float64)
    >>> a[:,False, False]
    array([], shape=(2, 0, 3), dtype=float64)
    >>> a[:,False,True].shape
    (2, 0, 3)
    >>> a[:,True,False].shape
    (2, 0, 3)

The docs say "A single boolean index array is practically identical to x[obj.nonzero()]". I have a hard time seeing this as an extension of that, since indexing by `np.nonzero(False)` or `np.nonzero(True)` *replaces* the given axis.

    >>> a[np.nonzero(True)].shape
    (1, 3)
    >>> a[np.nonzero(False)].shape
    (0, 3)

I think at the very least this behavior should be documented. I'm trying to understand the motivation for it, or whether it's even intentional. And in particular, why do multiple boolean indices not insert multiple axes? It would actually be useful to be able to generically add length 0 axes using an index, similar to how `newaxis` adds a length 1 axis.

Aaron Meurer

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
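For anyone who wants to reproduce the contrast described above, here is a minimal sketch that checks inserting an axis against replacing one. It assumes a reasonably recent NumPy. It uses `np.atleast_1d(...).nonzero()` in place of the post's `np.nonzero(True)`, because calling `nonzero` on 0-D input is deprecated or rejected depending on the NumPy version. The variable names are illustrative only.

```python
import numpy as np

a = np.empty((2, 3))

# A raw boolean inserts a new leading axis of length 1 (True) or 0 (False).
shape_true = a[True].shape
shape_false = a[False].shape

# np.newaxis also inserts an axis, but its length is always 1.
shape_newaxis = a[np.newaxis].shape

# An integer index built from nonzero() replaces axis 0 instead of
# inserting a new axis in front of it.
shape_nz_true = a[np.atleast_1d(True).nonzero()].shape
shape_nz_false = a[np.atleast_1d(False).nonzero()].shape

print(shape_true, shape_false, shape_newaxis, shape_nz_true, shape_nz_false)
```

On the versions this was written against, `a[True]` and `a[np.newaxis]` give the same shape, while the `nonzero` form drops the original first axis.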
On Mon, 2020-07-06 at 12:39 -0600, Aaron Meurer wrote:

> I've been trying to figure out this behavior. It doesn't seem to be documented at https://numpy.org/doc/stable/reference/arrays.indexing.html
>
>     >>> a = np.empty((2, 3))
>     >>> a.shape
>     (2, 3)
>     >>> a[True].shape
>     (1, 2, 3)
>     >>> a[False].shape
>     (0, 2, 3)
>
> It seems like indexing with a raw boolean (True or False) adds an axis with a dimension 1 or 0, resp.
>
> Except it only works once:
>
>     >>> a[:,False]
>     array([], shape=(2, 0, 3), dtype=float64)
>     >>> a[:,False, False]
>     array([], shape=(2, 0, 3), dtype=float64)
>     >>> a[:,False,True].shape
>     (2, 0, 3)
>     >>> a[:,True,False].shape
>     (2, 0, 3)
>
> The docs say "A single boolean index array is practically identical to x[obj.nonzero()]". I have a hard time seeing this as an extension of that, since indexing by `np.nonzero(False)` or `np.nonzero(True)` *replaces* the given axis.
>
>     >>> a[np.nonzero(True)].shape
>     (1, 3)
>     >>> a[np.nonzero(False)].shape
>     (0, 3)
>
> I think at best this behavior should be documented. I'm trying to understand the motivation for it, or if it's even intentional. And in particular, why do multiple boolean indices not insert multiple axes? It would actually be useful to be able to generically add length 0 axes using an index, similar to how `newaxis` adds a length 1 axis.

It's fully intentional, as it is the correct generalization from an N-D boolean index to include a 0-D boolean index. To be fair, there is a footnote in the "Detailed notes" saying that "the nonzero equivalence for Boolean arrays does not hold for zero dimensional boolean arrays." This is for technical reasons, since `nonzero` does not do useful things for 0-D input.

In any case, a boolean index always does the following:

1. It will remove as many dimensions as the index has, because this is the number of dimensions effectively indexed by it.
2. It will add a single new dimension at the same place. The length of this new dimension is the number of `True` elements.
3. If you have multiple advanced indices, you get annoying broadcasting of all of these. That is *always* confusing for boolean indices. 0-D should not be too special there...

And this generalizes to 0-D just as well, even if it may be a bit surprising at first.

I have written much of this more clearly once before in this NEP, which may be a good read to _really_ understand it: https://numpy.org/neps/nep-0021-advanced-indexing.html

In general, I wonder if going into much depth about how 0-D arrays are not actually handled in any special way is good. Yes, it's confusing on its own, but it also seems a bit like overloading the user with unnecessary knowledge?

Cheers,

Sebastian
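The three rules above can be sketched as a small shape calculation. This is only an illustration of the rules as stated, not NumPy's actual implementation. `predicted_shape` is a hypothetical helper, and treating a scalar boolean as a length-1 or length-0 index that broadcasts against other scalar booleans is an assumption inferred from the shapes reported in this thread. `np.broadcast_shapes` needs NumPy 1.20 or later.

```python
import numpy as np


def predicted_shape(shape, mask, axis=0):
    """Hypothetical helper: shape of a[:, ..., mask] with `axis` leading slices.

    Rule 1: drop mask.ndim dimensions starting at `axis`.
    Rule 2: put one new dimension there, of length count(True).
    """
    mask = np.asarray(mask, dtype=bool)
    new_len = int(np.count_nonzero(mask))
    return shape[:axis] + (new_len,) + shape[axis + mask.ndim:]


a = np.empty((2, 3, 4))

cases = [
    (np.array([[True, False, True], [False, False, True]]), 0),  # 2-D mask
    (np.array([True, False, True]), 1),                          # 1-D mask
    (True, 0),                                                   # 0-D mask
    (False, 1),                                                  # 0-D mask
]
results = []
for mask, axis in cases:
    index = (slice(None),) * axis + (np.asarray(mask),)
    results.append((a[index].shape, predicted_shape(a.shape, mask, axis)))

# Rule 3 (assumed reading): several scalar booleans broadcast together
# into one new dimension instead of adding one dimension each.
b = np.empty((2, 3))
broadcast_results = []
for b1 in (False, True):
    for b2 in (False, True):
        combined = np.broadcast_shapes((int(b1),), (int(b2),))
        broadcast_results.append((b[:, b1, b2].shape, (2,) + combined + (3,)))
```

With a 0-D mask, `mask.ndim` is 0, so no dimensions are removed and the new dimension shows up as an inserted axis. That matches the `a[True]` and `a[False]` results from the original question.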