Hi again,
Thanks for the responses to my question!
Robert's answer worked very well for me, except for one small issue:
This line:
close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
returns each difference twice: once for j compared to i and once for i compared to j.
For example, for this input:
MatA = [[10,20,30],[40,50,60]]
MatB = [[10,30,30],[40,50,160]]
My old code will return:
0,1,20,30
1,3,60,160
Your code returns:
0,1,20,30
1,3,60,160
0,1,30,20
1,3,160,60
I could simply cut "close_mask" in half so that each difference is reported only once, but that does not seem efficient.
Any ideas?
Also, what should I change to support 3D arrays as well?
Thanks again,
Nissim.
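
On the 3D question, one possible sketch (not code from the thread) is to replace the per-axis index arrays with `np.argwhere`, which returns one row of indices per differing cell for any number of dimensions. The function name `far_cells` and the choice of `rtol=0.0` (so that the threshold acts as a purely absolute tolerance) are assumptions made for illustration.

```python
import numpy as np


def far_cells(a, b, threshold):
    """Return (index_tuple, a_value, b_value) for each cell differing by more than threshold."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    # Assumption: the threshold is absolute, so rtol is switched off explicitly.
    far_mask = ~np.isclose(a, b, rtol=0.0, atol=threshold, equal_nan=True)
    # np.argwhere gives one row of indices per True cell, whatever the dimensionality.
    return [(tuple(int(k) for k in idx), a[tuple(idx)].item(), b[tuple(idx)].item())
            for idx in np.argwhere(far_mask)]


# 2D input from the message above: each differing cell appears exactly once.
mat_a = [[10, 20, 30], [40, 50, 60]]
mat_b = [[10, 30, 30], [40, 50, 160]]
cells_2d = far_cells(mat_a, mat_b, 1.0)

# 3D input: the same function works unchanged.
cube_a = np.zeros((2, 2, 2))
cube_b = cube_a.copy()
cube_b[1, 0, 1] = 5.0
cells_3d = far_cells(cube_a, cube_b, 1.0)
```

Because the mask has exactly one entry per cell, a cell can only be listed once; a doubled listing would have to come from somewhere else in the calling code.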
-----Original Message-----
From: NumPy-Discussion [[hidden email]] On Behalf Of [hidden email]
Sent: Wednesday, May 17, 2017 8:17 PM
To: [hidden email]
Subject: NumPy-Discussion Digest, Vol 128, Issue 18
Today's Topics:
1. Compare NumPy arrays with threshold and return the
differences (Nissim Derdiger)
2. Re: Compare NumPy arrays with threshold and return the
differences (Paul Hobson)
3. Re: Compare NumPy arrays with threshold and return the
differences (Robert Kern)
----------------------------------------------------------------------
Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
From: Nissim Derdiger <[hidden email]>
To: "[hidden email]" <[hidden email]>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and
return the differences
Message-ID:
Content-Type: text/plain; charset="us-ascii"
Hi,
In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold.
The comparison itself can easily be done with the "allclose" function, like this:
Threshold = 0.1
if np.allclose(Arr1, Arr2, Threshold, equal_nan=True):
    print('Same')
But this comparison does not return which cells are not the same.
The easiest (yet naive) way to know which cells are not the same is to use simple for loops, like this:
def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
    if not Arr1.shape == Arr2.shape:
        return ['Arrays size not the same']
    Dimensions = Arr1.shape
    Diff = []
    for i in range(Dimensions[0]):
        for j in range(Dimensions[1]):
            if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ','
                            + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
    return Diff
(and the same for 3D arrays, with one more for loop). This approach is very slow when the arrays are big and full of non-equal cells.
If the arrays are not the same, is there a fast, straightforward way to get a list of the unequal cells? Maybe some built-in function in NumPy itself?
Thanks!
Nissim
------------------------------
Message: 2
Date: Wed, 17 May 2017 10:13:46 -0700
From: Paul Hobson <[hidden email]>
To: Discussion of Numerical Python <[hidden email]>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
and return the differences
Message-ID:
Content-Type: text/plain; charset="utf-8"
I would do something like:
diff_is_large = (array1 - array2) > threshold
index_at_large_diff = numpy.nonzero(diff_is_large)
array1[index_at_large_diff].tolist()
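
One caveat worth illustrating: a signed difference only flags cells where the first array is the larger one. A minimal sketch of the same idea using `np.abs` follows; the sample arrays are taken from the follow-up message in this thread, and the names `signed_hits`, `values_1`, and `values_2` are made up for the example.

```python
import numpy as np

array1 = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
array2 = np.array([[10.0, 30.0, 30.0], [40.0, 50.0, 160.0]])
threshold = 1.0

# Signed difference: only catches cells where array1 exceeds array2.
signed_hits = np.nonzero((array1 - array2) > threshold)

# Absolute difference: catches a large gap in either direction.
index_at_large_diff = np.nonzero(np.abs(array1 - array2) > threshold)

values_1 = array1[index_at_large_diff].tolist()  # [20.0, 60.0]
values_2 = array2[index_at_large_diff].tolist()  # [30.0, 160.0]
```

Note that comparisons against NaN evaluate to False, so with this approach a NaN in either array is never reported as a difference, which is not the same behavior as `equal_nan=True` in `np.isclose`.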
------------------------------
Message: 3
Date: Wed, 17 May 2017 10:16:09 -0700
From: Robert Kern <[hidden email]>
To: Discussion of Numerical Python <[hidden email]>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
and return the differences
Message-ID:
Content-Type: text/plain; charset="utf-8"
Use `close_mask = np.isclose(Arr1, Arr2, Threshold, equal_nan=True)` to return a boolean mask the same shape as the arrays, which is True where the elements are close and False where they are not. You can invert it to get a boolean mask which is True where they are "far" with respect to the threshold: `far_mask = ~close_mask`. Then you can use `i_idx, j_idx = np.nonzero(far_mask)` to get arrays of the `i` and `j` indices where the values are far. For example:
for i, j in zip(i_idx, j_idx):
    print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold))
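
Put together, the steps above might look like the following self-contained sketch for the 2D case. The function name `cells_not_close` is hypothetical, and passing the threshold as `atol=` with `rtol=0.0` is an assumption that it is meant as an absolute tolerance; the positional form in the message above is interpreted by NumPy as a relative tolerance instead.

```python
import numpy as np


def cells_not_close(arr1, arr2, threshold):
    """Return one report line per 2D cell whose values differ by more than threshold."""
    arr1 = np.asarray(arr1)
    arr2 = np.asarray(arr2)
    if arr1.shape != arr2.shape:
        return ["Arrays size not the same"]
    # True where the two arrays agree within the (assumed absolute) threshold.
    close_mask = np.isclose(arr1, arr2, rtol=0.0, atol=threshold, equal_nan=True)
    # Row and column indices of the cells that are "far".
    i_idx, j_idx = np.nonzero(~close_mask)
    return ["{0},{1},{2},{3},{4},Fail".format(i, j, arr1[i, j], arr2[i, j], threshold)
            for i, j in zip(i_idx, j_idx)]


report = cells_not_close([[10, 20, 30], [40, 50, 60]],
                         [[10, 30, 30], [40, 50, 160]], 1.0)
```

The Python-level loop now runs only over the differing cells rather than over every cell, which is where most of the speedup over the nested loops would be expected to come from.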
--
Robert Kern
End of NumPy-Discussion Digest, Vol 128, Issue 18
*************************************************
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
On Thu, May 18, 2017 at 5:07 AM, Nissim Derdiger <[hidden email]> wrote:
> Hi again,
> Thanks for the responses to my question!
> Roberts answer worked very well for me, except for 1 small issue:
>
> This line:
> close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
> returns each difference twice – once j in compare to I and once for I in compare to j

No, it returns a boolean array the same size as MatA and MatB. It literally can't contain "each difference twice". Maybe there is something else in your code that is producing the doubling that you see, possibly in the printing of the results.

I'm not seeing the behavior that you speak of. Please post your complete code that produced the doubled output that you see.

import numpy as np

MatA = np.array([[10,20,30],[40,50,60]])
MatB = np.array([[10,30,30],[40,50,160]])
Threshold = 1.0

# Note the `atol=` here. I missed it before.
close_mask = np.isclose(MatA, MatB, atol=Threshold, equal_nan=True)
far_mask = ~close_mask
i_idx, j_idx = np.nonzero(far_mask)
for i, j in zip(i_idx, j_idx):
    print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, MatA[i, j], MatB[i, j], Threshold))

I get the following output:

$ python isclose.py
0, 1, 20, 30, 1.0, Fail
1, 2, 60, 160, 1.0, Fail

--
Robert Kern
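
To see why the `atol=` keyword matters, here is a small sketch with made-up values. The third positional parameter of `np.isclose` is `rtol`, and two values count as close when `|a - b| <= atol + rtol * |b|`. Setting `rtol=0.0` in the second call is an extra assumption, made here so that the threshold is purely absolute.

```python
import numpy as np

a = np.array([100.0, 100.0])
b = np.array([100.0, 105.0])
threshold = 0.1

# Positional third argument is rtol: the allowed gap for the second pair is
# 1e-08 + 0.1 * 105, roughly 10.5, so a difference of 5 still counts as close.
as_rtol = np.isclose(a, b, threshold)

# Keyword atol with rtol switched off: the allowed gap is 0.1 everywhere.
as_atol = np.isclose(a, b, rtol=0.0, atol=threshold)

print(as_rtol.tolist())  # [True, True]
print(as_atol.tolist())  # [True, False]
```

In both calls the result has the same shape as the inputs, with one boolean per cell.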