Hi All,

The boolean binary '-' operator was deprecated back in NumPy 1.9 and changed to an error in 1.13. This caused a number of failures in downstream projects. The choices now are to continue the deprecation for another couple of releases, or simply give up on the change. For booleans, `a - b` was implemented as `a xor b`, which leads to the somewhat unexpected identity `a - b == b - a`, but it is a handy operator that allows simplification of some functions, `numpy.diff` among them. At this point I'm inclined to give up on the deprecation and retain the old behavior. It is a bit impure but perhaps we can consider it a feature rather than a bug.

The unary `-` operator for booleans was also deprecated in 1.9 and changed to an error in 1.13. There have been no complaints about that (yet), and it seems like a reasonable thing to do, so I am inclined to leave that error in place.

What do others think the correct way forward is?

Chuck

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
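To make the `numpy.diff` remark concrete, here is a minimal sketch of a boolean first difference written with the explicit `np.logical_xor` function, so it does not depend on how `-` is defined for booleans. The helper name `bool_diff` and the sample arrays are made up for illustration; this is not NumPy's own implementation.

```python
import numpy as np

def bool_diff(mask):
    """First difference of a 1-D boolean array, written with an explicit xor.

    Hypothetical helper: it mirrors what `a[1:] - a[:-1]` computed when
    boolean `-` meant xor, without relying on that operator.
    """
    mask = np.asarray(mask, dtype=bool)
    return np.logical_xor(mask[1:], mask[:-1])

mask = np.array([False, False, True, True, False])
edges = bool_diff(mask)  # True wherever neighbouring values differ

# xor is symmetric, which is the `a - b == b - a` identity from the message.
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])
symmetric = np.array_equal(np.logical_xor(a, b), np.logical_xor(b, a))
```

Here `edges` marks the positions where the mask switches value, which is the usual reason for differencing a boolean array.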
Hi Chuck
On Sun, Jun 25, 2017, at 09:32, Charles R Harris wrote:
What was the original motivation behind the deprecation? `xor` seems like exactly what one would expect when subtracting boolean arrays.
But, in principle, I'm not against the deprecation (we've had to fix a few problems that arose in skimage, but nothing big).
Stéfan
On 25.06.2017 18:45, Stefan van der Walt wrote:
I am against this deprecation, which seems to be for purely cosmetic reasons. Is there any practical drawback in that it makes subtraction commutative for booleans?

numpy should not be imposing a change of style when the existing subpar historical style does not cause actual bugs.

While I don't like it I can accept a deprecation warning that will never be acted upon.
On Sun, 2017-06-25 at 18:59 +0200, Julian Taylor wrote:
> While I don't like it I can accept a deprecation warning that will
> never be acted upon.

but more visible....

For the unary minus, there are good reasons. For subtract, I don't remember really, but I don't think there was any huge argument for it. Probably it was mostly that many feel that `False - True == -1`, as is the case in Python, while we have `np.False_ - np.True_ == np.True_`. And going to a deprecation would open up that possibility (though maybe you could go there directly).

Not that I am convinced of that option. So, I don't mind much either way, but unless there is a concrete plan with quite a bit of support we should maybe just go with the conservative option.

- Sebastian
In reply to this post by Stefan van der Walt
On Sun, Jun 25, 2017 at 9:45 AM, Stefan van der Walt <[hidden email]> wrote:
> What was the original motivation behind the deprecation? `xor` seems like
> exactly what one would expect when subtracting boolean arrays.
>
> But, in principle, I'm not against the deprecation (we've had to fix a few
> problems that arose in skimage, but nothing big).

I believe that this happened as part of a review of the whole arithmetic system for np.bool_. Traditionally, we have + is "or", binary - is "xor", and unary - is "not".

Here are some identities you might expect, if 'a' and 'b' are np.bool_ objects:

a - b = a + (-b)
a + b - b = a
bool(a + b) = bool(a) + bool(b)
bool(a - b) = bool(a) - bool(b)
bool(-a) = -bool(a)

But in fact none of these identities hold. Furthermore, the np.bool_ arithmetic operations are all confusing synonyms for operations that could be written more clearly using the proper boolean operators |, ^, ~, so they violate TOOWTDI. So I think the general idea was to deprecate all of this nonsense.

It looks like what actually happened is that binary - and unary - got deprecated a while back and are now raising errors in 1.13.0, but + did not. This is sort of unfortunate, because binary - is the only one of these that's somewhat defensible (it doesn't match the builtin bool type, but it does at least correspond to subtraction in Z/2, so identities like 'a - (b - b) = a' do hold).

I guess my preference would be:
1) deprecate +
2) move binary - back to deprecated-but-not-an-error
3) fix np.diff to use logical_xor when the inputs are boolean, since that seems to be what people expect
4) keep unary - as an error

And if we want to be less aggressive, then a reasonable alternative would be:
1) deprecate +
2) un-deprecate binary -
3) keep unary - as an error

-n

--
Nathaniel J. Smith -- https://vorpus.org
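The identities listed in this message can be checked exhaustively, since there are only two boolean values. The sketch below is a pure-Python model of the traditional np.bool_ arithmetic as described here (+ as "or", binary - as "xor", unary - as "not"); the function names `np_add`, `np_sub`, `np_neg`, and `holds` are invented for the example and no NumPy call is involved, so it says nothing about what any particular NumPy release does.

```python
from itertools import product

# Pure-Python model of the traditional np.bool_ arithmetic (an assumption
# taken from the message above, not a call into NumPy).
def np_add(a, b):
    return a or b      # + is "or"

def np_sub(a, b):
    return a != b      # binary - is "xor"

def np_neg(a):
    return not a       # unary - is "not"

BOOLS = (False, True)

def holds(identity, nargs):
    """True only if the identity is satisfied for every combination of inputs."""
    return all(identity(*args) for args in product(BOOLS, repeat=nargs))

results = {
    "a - b == a + (-b)": holds(lambda a, b: np_sub(a, b) == np_add(a, np_neg(b)), 2),
    "a + b - b == a": holds(lambda a, b: np_sub(np_add(a, b), b) == a, 2),
    # Right-hand sides below use Python's built-in bool arithmetic (ints).
    "bool(a + b) == bool(a) + bool(b)": holds(lambda a, b: np_add(a, b) == a + b, 2),
    "bool(a - b) == bool(a) - bool(b)": holds(lambda a, b: np_sub(a, b) == a - b, 2),
    "bool(-a) == -bool(a)": holds(lambda a: np_neg(a) == -a, 1),
    # The Z/2 identity that the message says does hold.
    "a - (b - b) == a": holds(lambda a, b: np_sub(a, np_sub(b, b)) == a, 2),
}

for name, ok in results.items():
    print(f"{name}: {'holds' if ok else 'fails for some inputs'}")
```

Under this model the first five identities each fail for at least one input, and only the Z/2 identity holds for all inputs.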
OMG deprecating + would be a nightmare. I can’t even begin to count the number of times I’ve used e.g. np.sum(arr == num)… Originally with a dtype cast but generally I’ve removed it because it worked.
… But I just saw the behaviour of `sum` is different from that of adding arrays together (where it indeed means `or`), which I agree is confusing. As long as the sum and mean behaviours are unchanged, I won’t raise too much of a fuss. =P
Generally, although one might expect xor, what *I* would expect is for the behaviour to match the Python bool type, which is not the case right now. So my vote would be to modify ***in NumPy 2.0*** the behaviour of + and - to match Python’s built-in bool (i.e. upcasting to int).
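A small sketch of the distinction drawn above: reductions such as `np.sum` and `np.mean` count the True entries, elementwise `+` on two boolean arrays stays boolean, and matching Python's built-in bool needs an explicit integer cast. The array contents and variable names are arbitrary.

```python
import numpy as np

arr = np.array([3, 1, 3, 2, 3])
num = 3

matches = arr == num                # boolean array
count = int(np.sum(matches))        # sum upcasts booleans, so this counts matches
fraction = float(np.mean(matches))  # mean gives the fraction of matches

# Elementwise `+` on two boolean arrays stays boolean and acts as logical or.
other = np.array([True, True, False, False, True])
added = matches + other

# Matching Python's built-in bool requires an explicit cast to an integer type.
as_int = matches.astype(int) + other.astype(int)
```

The explicit `astype(int)` is also a way to write code whose result does not depend on how boolean `+` is defined.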
And, in general, I’m in favour of something as foundational as NumPy, in version 1.x, to follow semantic versioning and not break APIs until 2.x.
Juan.
On Mon, Jun 26, 2017 at 6:14 PM, Juan Nunez-Iglesias <[hidden email]> wrote:
That's because xor corresponds to addition in Z/2 and every element is its own additive inverse.
Using '+' for 'or' and '*' for 'and' is pretty common and the variation of '+' for 'xor' was common back in the day because 'and' and 'xor' make boolean algebra a ring, which appealed to mathematicians as opposed to everyone else ;) You can see the same progression in measure theory where eventually intersection and xor (symmetric difference) were replaced with union and complement. Using '-' for xor is something I hadn't seen outside of numpy, but I suspect it must be standard somewhere.

I would leave '*' and '+' alone, as the breakage and inconvenience from removing them would be significant.

Chuck
On Jun 26, 2017 6:56 PM, "Charles R Harris" <[hidden email]> wrote:
'+' for 'xor' and '*' for 'and' is perfectly natural; that's just + and * in Z/2. It's not only a ring, it's a field! '+' for 'or' is much weirder; why would you use '+' for an operation that's not even invertible? I guess it's a semi-ring. But we have the '|' character right there; there's no expectation that every weird mathematical notation will be matched in numpy... The most notable is that '*' doesn't mean matrix multiplication.
'*' doesn't bother me, because it really does have only one sensible behavior; even built-in bool() effectively uses 'and' for '*'.

But, now I remember... The major issue here is that some people want dot(a, b) on Boolean matrices to use these semantics, right? Because in this particular case it leads to some useful connections to the matrix representation for logical relations [1]. So it's sort of similar to the diff() case. For the basic operation, using '|' or '^' is fine, but there are these derived operations like 'dot' and 'diff' where people have different expectations. I guess Juan's example of 'sum' is relevant here too. It's pretty weird that if 'a' and 'b' are one-dimensional boolean arrays, 'a @ b' and 'sum(a * b)' give totally different results.

So that's the fundamental problem: there are a ton of possible conventions that are each appealing in one narrow context, and they all contradict each other, so trying to shove them all into numpy simultaneously is messy.

I'm glad we at least seem to have succeeded in getting rid of unary '-', that one was particularly indefensible in the context of everything else :-). For the rest, I'm really not sure whether it's better to deprecate everything and tell people to use specialized tools for specialized purposes (e.g. add a 'logical_dot'), or to special case the high-level operations people want (make 'dot' and 'diff' continue to work, but deprecate + and -), or just leave the whole incoherent mish-mash alone.
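The 'a @ b' versus 'sum(a * b)' mismatch mentioned above can be seen on a three-element example. This is only an illustration with made-up arrays; it assumes the boolean `@` and `*` semantics described in this thread.

```python
import numpy as np

a = np.array([True, True, False])
b = np.array([True, True, True])

inner = a @ b          # boolean inner product: "is any a[i] and b[i] true?"
total = np.sum(a * b)  # `*` acts as and, then sum counts the True entries

print(bool(inner), int(total))
```

The first result is a single truth value and the second is a count, even though both expressions read like the same inner product.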
Forgive my ignorance, but what is "Z/2"?
On Tue, Jun 27, 2017 at 3:01 PM, Benjamin Root <[hidden email]> wrote:
> Forgive my ignorance, but what is "Z/2"?

https://groupprops.subwiki.org/wiki/Cyclic_group:Z2
On Tue, Jun 27, 2017 at 3:09 PM, Robert Kern <[hidden email]> wrote:
> On Tue, Jun 27, 2017 at 3:01 PM, Benjamin Root <[hidden email]> wrote:
>> Forgive my ignorance, but what is "Z/2"?
>
> https://groupprops.subwiki.org/wiki/Cyclic_group:Z2
> https://en.wikipedia.org/wiki/Cyclic_group

This might be a slightly better link?
https://en.wikipedia.org/wiki/Modular_arithmetic#Integers_modulo_n

Anyway, it's a math-nerd way of saying "the integers modulo two", i.e. the numbers 0 and 1 with * as AND and + as XOR. But the nice thing about Z/2 is that if you know some abstract algebra, then one of the most fundamental theorems is that if p is prime then Z/p is a "field", meaning that * and + are particularly well-behaved. And 2 is a prime, so pointing out that the bools with AND and XOR is the same as Z/2 is a way of saying "this way of defining * and + is internally consistent and well-behaved".

-n

--
Nathaniel J. Smith -- https://vorpus.org
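For readers who prefer to see the claim checked rather than stated, the following pure-Python sketch verifies by brute force that arithmetic modulo two matches XOR and AND, and spot-checks a few of the properties that make Z/2 well-behaved. The names `add`, `mul`, and `Z2` are chosen for the example; this is a finite check, not a proof of the general theorem about Z/p.

```python
from itertools import product

Z2 = (0, 1)

def add(a, b):
    return (a + b) % 2  # addition modulo two

def mul(a, b):
    return (a * b) % 2  # multiplication modulo two

pairs = list(product(Z2, repeat=2))
triples = list(product(Z2, repeat=3))

add_is_xor = all(add(a, b) == (a ^ b) for a, b in pairs)
mul_is_and = all(mul(a, b) == (a & b) for a, b in pairs)

# Every element is its own additive inverse, so "-a" is a and "a - b" is "a + b".
self_inverse = all(add(a, a) == 0 for a in Z2)

# Multiplication distributes over addition.
distributive = all(
    mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples
)

# Every nonzero element has a multiplicative inverse.
nonzero_invertible = all(any(mul(a, x) == 1 for x in Z2) for a in Z2 if a != 0)
```

The `self_inverse` check is the reason subtraction and addition coincide in Z/2, which is where the `a - b == b - a` identity comes from.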
It seems to me that after a healthy post-deprecation cycle, and if we choose to keep the Z/2 meaning of __sub__, it might be worth reintroducing __neg__ as a no-op? AFAICT, this is consistent with the Z/2 interpretation?

Eric
In reply to this post by Nathaniel Smith
On 6/27/2017 5:35 PM, Nathaniel Smith wrote:
> I remember... The major issue here is that some people want dot(a, b) on Boolean matrices to use these semantics, right?

Yes; this has worked in the past, and loss of this functionality is unexpected. That said, I haven't used this outside of a teaching context, so for me it only affects some course notes. I suppose loss of this behavior could be somewhat mitigated if numpy could provide an optimized general inner product (along the line of the Wolfram Language's `Inner`). But note that numpy.linalg.matrix_power is also lost (e.g., the derived graphs that allow an intuitive representation of transitive closure).

fwiw,
Alan
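As a rough sketch of the use case described here, a boolean matrix product and a transitive closure can be written with explicit logical operations only. The helper names `logical_dot` and `transitive_closure` are hypothetical (NumPy does not ship functions with these names), and the broadcasting approach is chosen for clarity rather than speed.

```python
import numpy as np

def logical_dot(a, b):
    """Boolean matrix product: out[i, j] = any(a[i, k] and b[k, j]).

    Hypothetical helper built from broadcasting; memory use grows with
    n * k * m, so it is only meant for small teaching examples.
    """
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return np.logical_and(a[:, :, None], b[None, :, :]).any(axis=1)

def transitive_closure(adj):
    """Reachability by paths of length >= 1, iterated to a fixed point."""
    closure = np.asarray(adj, dtype=bool)
    while True:
        extended = closure | logical_dot(closure, closure)
        if np.array_equal(extended, closure):
            return closure
        closure = extended

# Directed chain 0 -> 1 -> 2.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=bool)
reach = transitive_closure(adj)
```

In `reach`, node 0 reaches both 1 and 2, which is the derived graph that a boolean matrix power would otherwise provide.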
In reply to this post by Eric Wieser
My two ¢: keep things as they are. There is just too much code that uses the C definition of bools, 0=False, 1=True. Coupled with casting every outcome that is unequal to 0 as True, * as AND, + as OR, and - as XOR makes sense (and -True would indeed be True, but I'm quite happy to have that one removed...).

I lost track a little, but isn't this way also consistent with python, the one difference being that numpy does an implicit cast to bool on the result?

-- Marten
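The "C definition plus a cast back to bool" reading can be checked in plain Python by doing the arithmetic on the integers 0 and 1 and then converting any nonzero result to True. This is a model of the argument in this message, with variable names invented for the example; it does not call NumPy.

```python
from itertools import product

BOOLS = (False, True)
pairs = list(product(BOOLS, repeat=2))

# Do the arithmetic on 0 and 1, then cast the result back to bool.
add_is_or = all(bool(int(a) + int(b)) == (a or b) for a, b in pairs)
mul_is_and = all(bool(int(a) * int(b)) == (a and b) for a, b in pairs)
sub_is_xor = all(bool(int(a) - int(b)) == (a != b) for a, b in pairs)

# -1 is nonzero, so "-True" casts back to True under this rule.
neg_true_is_true = bool(-int(True))
```

Under this rule `-` comes out as xor because `a - b` is nonzero exactly when `a` and `b` differ.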
Just as a comment: It would be really nice if NumPy could slow down the pace of deprecations, or at least make the warnings about deprecations more visible. It seems like every release breaks some subset of our test suite (we only had one or two cases of using the binary - operator on boolean arrays so it wasn't a big deal this time). For projects that don't have resources for ongoing maintenance this is a recipe for bitrot...
About visibility of deprecations: this is *very* tricky - if we make it more visible, every user is going to see deprecation warnings all the time, about things they can do nothing about, because they occur inside other packages. I think in the end the only choice is to have automated testing that turns warnings into errors, so that these warnings are caught early enough.

That said, I've certainly taken note of this as an example of the importance of not changing things just because the current implementation is not quite logical; there should be a real benefit to the change. In fairness, though, I'd argue at least a few of the deprecation warnings are the result of dealing with bit-rot within numpy!

-- Marten
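One way to get "automated testing that turns warnings into errors" with only the standard library is the `warnings` filter machinery. In the sketch below, `legacy_call` is a hypothetical stand-in for downstream code that triggers a deprecation, and `fails_under_strict_warnings` is an invented helper name.

```python
import warnings

def legacy_call():
    """Stand-in for library code that uses a deprecated feature (hypothetical)."""
    warnings.warn("this behaviour is deprecated", DeprecationWarning, stacklevel=2)
    return 42

def fails_under_strict_warnings(func):
    """Return True if calling func raises once DeprecationWarning is an error."""
    with warnings.catch_warnings():
        # Inside this block every DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False

caught = fails_under_strict_warnings(legacy_call)
```

The same effect is available for a whole test run by starting the interpreter with `python -W error::DeprecationWarning`, so a deprecation shows up as a failing test long before it becomes a hard error.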
In reply to this post by Marten van Kerkwijk
On Wed, Jun 28, 2017 at 10:48 AM, Marten van Kerkwijk <[hidden email]> wrote:
> My two ¢: keep things as they are.

I'm also in favor of practicality beating mathematical purity.

AFAIK, the hybrid behavior between boolean and the diff/sum/dot behavior works pretty well when working, e.g., with masks, as for example in masked array stats.

Josef