# Compare NumPy arrays with threshold and return the differences


## Compare NumPy arrays with threshold and return the differences

Hi,

In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold. The comparison itself can be done easily with the `allclose` function, like this:

    import numpy as np

    Threshold = 0.1
    if np.allclose(Arr1, Arr2, Threshold, equal_nan=True):
        print('Same')

But this comparison does not return which cells are not the same.

The easiest (yet naive) way to know which cells are not the same is to use simple for loops, like this:

    def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
        if not Arr1.shape == Arr2.shape:
            return ['Arrays size not the same']
        Dimensions = Arr1.shape
        Diff = []
        for i in range(Dimensions[0]):
            for j in range(Dimensions[1]):
                if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                    Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ','
                                + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
        # return only after every row has been checked
        return Diff

(and the same for 3D arrays, with one more for loop)

This way is very slow when the arrays are big and full of non-equal cells.

Is there a fast, straightforward way, in case they are not the same, to get a list of the unequal cells? Maybe some built-in function in NumPy itself?

Thanks!
Nissim

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
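A side note on the `allclose` call in the question: the third positional argument of `np.allclose` is `rtol`, a relative tolerance, so `Threshold` there is scaled by the magnitude of the second array instead of acting as a fixed cut-off. The sketch below uses made-up sample values and assumes an absolute threshold is what is wanted.

```python
import numpy as np

# Hypothetical sample data, chosen so the two interpretations disagree.
a = np.array([100.0, 1.0])
b = np.array([105.0, 1.05])

# Third positional argument is rtol: the allowed gap is atol + rtol * abs(b),
# so a difference of 5 passes when b is around 100.
same_relative = np.allclose(a, b, 0.1, equal_nan=True)

# Absolute threshold: switch rtol off and pass the threshold as atol.
same_absolute = np.allclose(a, b, rtol=0.0, atol=0.1, equal_nan=True)

print(same_relative, same_absolute)
```

With these values the relative check accepts the arrays while the absolute check rejects them, which may or may not be the intended behaviour for a given data set.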

## Re: Compare NumPy arrays with threshold and return the differences

I would do something like:

    import numpy

    # abs() so that differences in either direction are caught
    diff_is_large = numpy.abs(array1 - array2) > threshold
    index_at_large_diff = numpy.nonzero(diff_is_large)
    array1[index_at_large_diff].tolist()

On Wed, May 17, 2017 at 9:50 AM, Nissim Derdiger wrote:

> Is there a fast, straightforward way, in case they are not the same, to get a list of the unequal cells? Maybe some built-in function in NumPy itself?
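The mask-and-`nonzero` idea above can be wrapped into a small helper. This is only a sketch: the function name `large_diff_cells` and the NaN policy (NaN in both arrays counts as equal, NaN in only one counts as a difference, mirroring `equal_nan=True`) are assumptions, not something stated in the thread.

```python
import numpy as np

def large_diff_cells(array1, array2, threshold):
    """Return (indices, values1, values2) for cells with |array1 - array2| > threshold."""
    array1 = np.asarray(array1, dtype=float)
    array2 = np.asarray(array2, dtype=float)
    if array1.shape != array2.shape:
        raise ValueError("arrays must have the same shape")

    # Comparisons involving NaN evaluate to False, so NaN cells are not flagged here.
    diff_is_large = np.abs(array1 - array2) > threshold
    # Assumed policy: flag cells where exactly one side is NaN.
    nan_mismatch = np.isnan(array1) != np.isnan(array2)

    index_at_large_diff = np.nonzero(diff_is_large | nan_mismatch)
    return index_at_large_diff, array1[index_at_large_diff], array2[index_at_large_diff]

# Hypothetical sample data.
a = np.array([[1.0, 2.0, np.nan], [4.0, np.nan, 6.0]])
b = np.array([[1.05, 2.5, np.nan], [3.0, 5.0, 6.0]])

idx, va, vb = large_diff_cells(a, b, 0.1)
# Turn the per-axis index arrays into a list of (i, j) tuples.
cells = list(zip(*(axis.tolist() for axis in idx)))
print(cells)
```

Returning the values from both arrays alongside the indices avoids a second lookup when building a report.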

## Re: Compare NumPy arrays with threshold and return the differences

On Wed, May 17, 2017 at 9:50 AM, Nissim Derdiger wrote:

> Is there a fast, straightforward way, in case they are not the same, to get a list of the unequal cells? Maybe some built-in function in NumPy itself?

Use `close_mask = np.isclose(Arr1, Arr2, Threshold, equal_nan=True)` to return a boolean mask with the same shape as the arrays, which is True where the elements are close and False where they are not. You can invert it to get a boolean mask which is True where they are "far" with respect to the threshold: `far_mask = ~close_mask`. Then you can use `i_idx, j_idx = np.nonzero(far_mask)` to get arrays of the `i` and `j` indices where the values are far. For example:

    for i, j in zip(i_idx, j_idx):
        print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold))

--
Robert Kern
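For the 3D case mentioned in the question, the same `isclose` mask can be walked with `np.argwhere`, which yields one index row per "far" cell regardless of the number of dimensions. The sketch below is one possible generalization, not the method given in the reply: the helper name `far_cell_report` is hypothetical, and it assumes the threshold is meant as an absolute tolerance (`atol`) with `rtol` switched off.

```python
import numpy as np

def far_cell_report(arr1, arr2, threshold):
    """Build one CSV-style row per cell whose values differ by more than threshold."""
    if arr1.shape != arr2.shape:
        raise ValueError("arrays must have the same shape")
    # Assumption: threshold is absolute, so rtol is disabled and atol is used.
    close_mask = np.isclose(arr1, arr2, rtol=0.0, atol=threshold, equal_nan=True)
    rows = []
    # Each row of argwhere's result is the full index of one "far" cell.
    for index in np.argwhere(~close_mask):
        index = tuple(index.tolist())
        fields = list(index) + [arr1[index], arr2[index], threshold, "Fail"]
        rows.append(", ".join(str(f) for f in fields))
    return rows

# Hypothetical 3D sample data with a single differing cell.
a3 = np.zeros((2, 2, 2))
b3 = a3.copy()
b3[1, 0, 1] = 0.5

for row in far_cell_report(a3, b3, 0.1):
    print(row)
```

The Python loop only runs over the cells that actually differ, so the cost of the comparison itself stays inside NumPy; if nearly every cell differs, building the strings will still dominate the run time.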