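It was the first time I tried to create a structured array in numpy. Usually I use pandas for heterogeneous arrays, but it is one more dependency for my project.

It took me some time (really, much more than some) to understand the problem with structured array creation. As an example:

I had a list of lists of this kind:
b=[[ 1, 10.3, 12.1, 2.12 ],...]

And tried:
np.array(b, dtype='i4,f4,f4,f4')

Which raises a weird exception:
TypeError: a bytes-like object is required, not 'int'

Two hours later I found that I need a list of tuples. I didn't find any help in the documentation and could not realize that the problem was with the inner lists...

Why is there such a restriction, a 'list of tuples', to create a structured array? What is the idea behind it? Why not a list of lists, or a tuple of lists, or ...?

Also, the exception does not help at all...

p.s.: It looks like dtype also accepts only a list of tuples, but I cannot grasp the idea behind these restrictions.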
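For readers who hit the same error, here is a minimal sketch of the workaround described in the post: turn each inner list into a tuple before calling np.array. The second row of b is made-up sample data, since the post elides the rest of the list.

```python
import numpy as np

# Sample rows shaped like the list in the post; the second row is
# hypothetical, because the original list is elided after the first row.
b = [[1, 10.3, 12.1, 2.12],
     [2, 11.5, 13.0, 3.40]]

# Convert each inner list to a tuple so that NumPy reads it as one
# record with four fields rather than as a further array dimension.
records = [tuple(row) for row in b]
arr = np.array(records, dtype='i4,f4,f4,f4')

print(arr.shape)        # (2,)
print(arr.dtype.names)  # ('f0', 'f1', 'f2', 'f3')
print(arr['f0'])        # [1 2]
```

The comma-separated dtype string does not name the fields, so NumPy falls back to the default names f0, f1, f2 and f3.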
Hi Kirill,
The idea is that each tuple assigns a name to the field and a data type. There are a variety of ways to create structured arrays, but they all involve giving both a name and data type to each field (I think). See https://docs.scipy.org/doc/numpy/user/basics.rec.html

Jon

On Fri, Mar 24, 2017 at 5:09 AM, <[hidden email]> wrote:
> From: Kirill Balunov <[hidden email]>
> To: [hidden email]
> Date: Thu, 23 Mar 2017 21:16:28 +0300
> Subject: [Numpy-discussion] Structured array creation with list of lists and others
>
> It was the first time I tried to create a structured array in numpy. Usually I use pandas for heterogeneous arrays, but it is one more dependency to my project.
>
> It took me some time (really, much more than some), to understand the problem with structured array creation. As example:
>
> I had list of list of this kind:
> b=[[ 1, 10.3, 12.1, 2.12 ],...]
>
> And tried:
> np.array(b, dtype='i4,f4,f4,f4')
>
> Which raises some weird exception:
> TypeError: a bytes-like object is required, not 'int'
>
> Two hours later I found that I need list of tuples. I didn't find any help in documentation and could not realize that the problem with the inner lists...
>
> Why there is such restriction - 'list of tuples' to create structured array? What is the idea behind that, why not list of lists, or tuple of lists or ...?
>
> Also the exception does not help at all...
> p.s.: It looks like that dtype also accepts only list of tuples. But I can not catch the idea for this restrictions.

--
________________________________________________________
Jonathan D. Slavin          Harvard-Smithsonian CfA
[hidden email]              60 Garden Street, MS 83
phone: (617) 496-7981       Cambridge, MA 02138-1516
cell: (781) 363-0035        USA
________________________________________________________
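As a rough illustration of the "variety of ways" mentioned above, the sketch below builds the same layout three times. The field names id, x, y and z are invented for the example and do not come from the thread.

```python
import numpy as np

# List of (name, type) tuples: each tuple describes one field.
# The names 'id', 'x', 'y', 'z' are hypothetical.
dt_tuples = np.dtype([('id', 'i4'), ('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

# Dictionary form: names and formats are given as parallel lists.
dt_dict = np.dtype({'names': ['id', 'x', 'y', 'z'],
                    'formats': ['i4', 'f4', 'f4', 'f4']})

# Comma-separated string: only the types are given, and NumPy
# supplies the default field names f0, f1, f2, f3.
dt_string = np.dtype('i4,f4,f4,f4')

print(dt_tuples.names)       # ('id', 'x', 'y', 'z')
print(dt_tuples == dt_dict)  # True
print(dt_string.names)       # ('f0', 'f1', 'f2', 'f3')
```

In the string form the names are generated rather than written out, but every field still ends up with both a name and a type.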
On 03/23/2017 02:16 PM, Kirill Balunov wrote:
> It was the first time I tried to create a structured array in numpy.
> Usually I use pandas for heterogeneous arrays, but it is one more
> dependency to my project.
>
> It took me some time (really, much more than some), to understand the
> problem with structured array creation. As example:
>
> I had list of list of this kind:
> b=[[ 1, 10.3, 12.1, 2.12 ],...]
>
> And tried:
> np.array(b, dtype='i4,f4,f4,f4')
>
> Which raises some weird exception:
> TypeError: a bytes-like object is required, not 'int'
>
> Two hours later I found that I need list of tuples. I didn't find any help
> in documentation and could not realize that the problem with the inner
> lists...
>
> Why there is such restriction - 'list of tuples' to create structured
> array? What is the idea behind that, why not list of lists, or tuple of
> lists or ...?
>
> Also the exception does not help at all...
> p.s.: It looks like that dtype also accepts only list of tuples. But I can
> not catch the idea for this restrictions.
>

The problem is that numpy needs to distinguish between multidimensional arrays and structured elements. A "list of lists" will often trigger numpy's broadcasting rules, which is not what you want here. For instance, should numpy interpret your input list as a 2d array of dimension Lx4 containing integer elements, or a 1d array of length L of structs with 4 fields?

In this particular case maybe numpy could, in principle, figure it out from what you gave it by calculating that the innermost dimension is the same length as the number of fields. But there are other cases (such as assignment) where similar ambiguities arise that are harder to resolve. So to preserve our sanity we want to require that structures be formatted as tuples all the time.
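The ambiguity described above can be sketched as follows. The data values are made up; the point is only that lists contribute dimensions while tuples mark where one record begins and ends.

```python
import numpy as np

# Hypothetical rows in the same shape as the list from the question.
rows = [[1, 10.3, 12.1, 2.12],
        [2, 11.5, 13.0, 3.40]]

# With no structured dtype, each level of list nesting is a dimension,
# so this is an ordinary 2-D array with L = 2 rows and 4 columns.
plain = np.array(rows)
print(plain.shape)       # (2, 4)

# With tuples and a structured dtype, each tuple is a single element,
# so the result is a 1-D array of L = 2 records with 4 fields each.
structured = np.array([tuple(r) for r in rows], dtype='i4,f4,f4,f4')
print(structured.shape)  # (2,)

# Because only tuples are treated as records, lists stay free to
# describe shape: a list of lists of tuples is a 2-D array of records.
grid = np.array([[(1, 1.0), (2, 2.0)],
                 [(3, 3.0), (4, 4.0)]], dtype='i4,f4')
print(grid.shape)        # (2, 2)
```

The last case shows why a length check on the innermost list would not be enough in general: with two-field records, an inner list of length 2 could equally be two columns or one record.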
I have a draft of potential updated structured array docs you can read here:
https://gist.github.com/ahaldane/7d1873d33d4d0f80ba7a54ccf1052eee

See the section "Assignment from Python Native Types (Tuples)", which hopefully better warns that tuples are needed. Let me know if you think something is missing from the draft.

(WARNING: the section about multi-field assignment in the doc draft is incorrect for current numpy - that's what I'm proposing for the next release. The rest of the docs are accurate for current numpy)

Agreed that the error message could be changed.

Allan
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
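As a small sketch of what assignment from tuples looks like in practice (this is not taken from the draft itself, and the field names id and x are hypothetical):

```python
import numpy as np

# A zero-filled array of three records; 'id' and 'x' are made-up names.
arr = np.zeros(3, dtype=[('id', 'i4'), ('x', 'f4')])

# A single tuple fills the fields of one record, in field order.
arr[0] = (7, 1.5)

# A list of tuples fills several records at once.
arr[1:] = [(8, 2.5), (9, 3.5)]

print(arr['id'])  # [7 8 9]
print(arr['x'])   # [1.5 2.5 3.5]
```

Only single-record and slice assignment are shown here. Multi-field assignment is left out on purpose, since the message above notes that its behaviour was about to change.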
Allan, thank you for your draft!

I agree with you that in the general case (though not in mine) it would be hard to resolve all the corner cases. I also think that someone who reads the numpy reference linearly will get some insight that a list of tuples is necessary (but that was not my case).

For me, one problem is that in some cases numpy allows a lot of freedom, while in others it is unnecessarily strict. Another is the exception messages (but this is certainly subjective).

2017-03-24 19:48 GMT+03:00 Allan Haldane <[hidden email]>:
