# How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

4 messages

## How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

I have written a question in:

It was recommended by numpy to send this subject to the mailing lists. The question is as follows. I would appreciate it if you could advise me on how to solve the problem.

First, a small example of two lists:

```python
F = [[1,2,3],[3,2,7],[4,4,1],[5,6,3],[1,3,7]]  # (1*5) 5 lists
S = [[1,3,7],[6,8,1],[3,2,7]]                  # (1*3) 3 lists
```

I want to get a Boolean matrix marking the lists of F that also appear in S:

```
[False, True, False, False, True]  # (1*5) 5 Booleans for 5 lists of F
```

Using `IM = reduce(np.in1d, (F, S))` gives a result for each number in each list of F:

```
[ True  True  True  True  True  True False False  True False  True  True
  True  True  True]  # (1*15)
```

Using `IM = reduce(np.isin, (F, S))` also gives a result for each number in each list of F, but in another shape:

```
[[ True  True  True]
 [ True  True  True]
 [False False  True]
 [False  True  True]
 [ True  True  True]]  # (5*3)
```

The correct result is produced by `IM = [i in S for i in F]` for the example lists, but not when I use this code on my two main, larger numpy arrays of lists:

https://drive.google.com/file/d/1YUUdqxRu__9-fhE1542xqei-rjB3HOxX/view?usp=sharing
numpy array: 3036 lists

https://drive.google.com/file/d/1FrggAa-JoxxoRqRs8NVV_F69DdVdiq_m/view?usp=sharing
numpy array: 300 lists

It gives the wrong answer. For the main files it should give 3036 Booleans, of which only 300 are `True`. I don't understand why it gives wrong answers. It seems to be applied only to the 3rd element of each list in F. I would prefer to use `reduce` with `np.in1d` or `np.isin` rather than the last method. How can each of the three methods above be made to work?

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
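
For readers following along, here is a minimal sketch of why the `np.isin` call above returns a 5*3 result, and of one way to get one Boolean per list instead. It assumes F and S can be converted to regular 2D integer arrays with the same number of columns. The broadcasting comparison is an illustration, not something proposed in the thread, and the names `elementwise` and `row_mask` are hypothetical.

```python
import numpy as np

F = np.array([[1, 2, 3], [3, 2, 7], [4, 4, 1], [5, 6, 3], [1, 3, 7]])
S = np.array([[1, 3, 7], [6, 8, 1], [3, 2, 7]])

# np.isin tests every scalar of F against the flattened values of S,
# so the result has the shape of F (5, 3), not one Boolean per row.
elementwise = np.isin(F, S)
print(elementwise.shape)  # (5, 3)

# Compare every row of F with every row of S through broadcasting:
# (5, 1, 3) == (1, 3, 3) gives (5, 3, 3). A pair of rows matches only
# when all 3 entries agree, and a row of F is kept when any row of S matches.
row_mask = (F[:, None, :] == S[None, :, :]).all(axis=-1).any(axis=-1)
print(row_mask.tolist())  # [False, True, False, False, True]
```

The intermediate array holds `len(F) * len(S) * 3` Booleans, which is small for 3036 and 300 rows but grows quickly for much larger inputs.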

## Re: How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

On Sun, Mar 14, 2021 at 3:06 PM Ali Sheikholeslam <[hidden email]> wrote:

> The correct result is produced by `IM = [i in S for i in F]` for the example lists, but when I use this code on my two main, larger numpy arrays of lists (3036 lists and 300 lists), it gives the wrong answer. [...] How can each of the three methods above be made to work?

Thank you for providing the data. Can you show a complete, runnable code sample that fails? There are several things that could go wrong here, and we can't be sure which one it is without the exact code that you ran.

In general, you may well have problems with the floating point data that you are not seeing with your integer examples.

FWIW, I would continue to use something like the `IM = [i in S for i in F]` list comprehension for data of this size. You aren't getting any benefit from trying to convert to arrays and using our array set operations. They are written for 1D arrays of numbers, not 2D arrays (attempting to treat them as 1D arrays of lists), and won't really work on your data.

--
Robert Kern
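
The thread does not establish what actually went wrong, so the following is only a sketch of two candidate causes, using made-up float data rather than the linked files. First, when both operands are numpy arrays, `row in S` evaluates `(S == row).any()`, so a single equal element is enough to produce `True`. Second, exact equality can miss rows whose floats differ only by rounding. The helper `rows_in` and its tolerance defaults are hypothetical, not part of NumPy.

```python
import numpy as np

# Made-up data: 0.1 + 0.2 differs from 0.3 in the last bits.
F = np.array([[1.0, 2.0, 3.0], [0.1 + 0.2, 2.0, 7.0], [4.0, 4.0, 7.0]])
S = np.array([[0.3, 2.0, 7.0], [6.0, 8.0, 1.0]])

# With ndarray operands, `row in S` is (S == row).any(): one equal
# element anywhere in S is enough, so non-matching rows report True.
loose = [row in S for row in F]

# Exact row-wise comparison: the rounding difference hides the real match.
exact = (F[:, None, :] == S[None, :, :]).all(axis=-1).any(axis=-1)

def rows_in(F, S, rtol=1e-9, atol=0.0):
    """Return one Boolean per row of F: True if some row of S matches
    within the given tolerance (hypothetical helper, assumed defaults)."""
    F = np.asarray(F, dtype=float)
    S = np.asarray(S, dtype=float)
    close = np.isclose(F[:, None, :], S[None, :, :], rtol=rtol, atol=atol)
    return close.all(axis=-1).any(axis=-1)

strict = rows_in(F, S)

print(loose)            # [True, True, True]
print(exact.tolist())   # [False, False, False]
print(strict.tolist())  # [False, True, False]
```

If the inputs are plain Python lists of lists, as in the small example, `i in S` compares whole lists and the first pitfall does not apply. A suitable tolerance depends on how the data were produced.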