ndarray subclassing

ctw
Hi!

I ran into some strange (at least to me) issues with subclasses of
ndarray. The following minimal class definition illustrates the
problem:

====================================================

import numpy as np
class TestArray(np.ndarray):
    def __new__(cls, data, info=None, dtype=None, copy=False):
        subarr = np.array(data, dtype=dtype, copy=copy)
        subarr = subarr.view(cls)
        return subarr

    def __array_finalize__(self,obj):
        print("self: ", self.shape)
        print("obj: ", obj.shape)

=====================================================

When I run this code interactively with IPython and then generate
TestArray instances, __array_finalize__ seems to get called when
printing out arrays with more than 1 dimension and self.shape seems to
drop a dimension. Everything works fine if the array has just 1
dimension:

In [3]: x = TestArray(np.arange(5))
self:  (5,)
obj:  (5,)

In [4]: x
Out[4]: TestArray([0, 1, 2, 3, 4])

This is all expected behavior.
However things change when the array is 2-D:

In [5]: x = TestArray(np.zeros((2,3)))
self:  (2, 3)
obj:  (2, 3)

In [6]: x
Out[6]: self:  (3,)
obj:  (2, 3)
self:  (3,)
obj:  (2, 3)

TestArray([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

Now when printing out the array, __array_finalize__ seems to get
called twice and each time self seems to only refer to one row of the
array. Can anybody explain what is going on and why? This behavior
seems to lead to problems when the __array_finalize__ method performs
checks on the shape of the array. In the matrix class this seems to be
circumvented with a special _getitem flag that bypasses the shape
checks in __array_finalize__ and an analogous solution works for my
class, too. However, I'm still puzzled by this behavior and am hoping
that somebody here can shed some light on it.

Thanks!
CTW
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: ndarray subclassing

Travis Oliphant-5
ctw wrote:

> Hi!
>
> I ran into some strange (at least to me) issues with subclasses of
> ndarray. The following minimal class definition illustrates the
> problem:
>
> ====================================================
>
> import numpy as np
> class TestArray(np.ndarray):
>     def __new__(cls, data, info=None, dtype=None, copy=False):
>         subarr = np.array(data, dtype=dtype, copy=copy)
>         subarr = subarr.view(cls)
>         return subarr
>
>     def __array_finalize__(self,obj):
>         print("self: ", self.shape)
>         print("obj: ", obj.shape)
>
> =====================================================
>
> When I run this code interactively with IPython and then generate
> TestArray instances, __array_finalize__ seems to get called when
> printing out arrays with more than 1 dimension and self.shape seems to
> drop a dimension. Everything works fine if the array has just 1
> dimension:
>
> In [3]: x = TestArray(np.arange(5))
> self:  (5,)
> obj:  (5,)
>
> In [4]: x
> Out[4]: TestArray([0, 1, 2, 3, 4])
>
> This is all expected behavior.
> However things change when the array is 2-D:
>
> In [5]: x = TestArray(np.zeros((2,3)))
> self:  (2, 3)
> obj:  (2, 3)
>
> In [6]: x
> Out[6]: self:  (3,)
> obj:  (2, 3)
> self:  (3,)
> obj:  (2, 3)
>
> TestArray([[ 0.,  0.,  0.],
>        [ 0.,  0.,  0.]])
>  

You are just seeing the result of __repr__.   The printing code works by
accessing slices of the array.  These slices create new instances of
your TestArray class which have a smaller number of dimensions.  That's all.
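
A minimal sketch of that effect, assuming Python 3 and a current NumPy (both newer than this thread). The `calls` list is an illustrative addition, not part of the original class; it records the shapes that __array_finalize__ sees. Newer NumPy releases may print an array without slicing the subclass, so the sketch takes a row explicitly, which is the kind of access the printing code is described as performing:

```python
import numpy as np

calls = []  # (self.shape, obj.shape) pairs seen by __array_finalize__


class TestArray(np.ndarray):
    def __new__(cls, data, dtype=None):
        # View casting: __array_finalize__ runs with obj set to the base array.
        return np.asarray(data, dtype=dtype).view(cls)

    def __array_finalize__(self, obj):
        if obj is None:  # explicit construction, nothing to compare against
            return
        calls.append((self.shape, obj.shape))


x = TestArray(np.zeros((2, 3)))
calls.clear()  # forget the call made while constructing x

row = x[0]  # taking one row creates a new, 1-D TestArray
print(type(row).__name__, calls)
```

Here `self` is the new one-row instance and `obj` is the 2-D array it was taken from, which matches the `(3,)` / `(2, 3)` pairs in the session above.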

Is the printing code causing you other kinds of problems?

-Travis


Re: ndarray subclassing

ctw
On Thu, May 1, 2008, Travis E. Oliphant wrote:
> You are just seeing the result of __repr__.   The printing code works by
> accessing slices of the array.  These slices create new instances of
> your TestArray class which have a smaller number of dimensions.  That's all.

Ahh, that makes sense. Thanks so much for the quick reply!

> Is the printing code causing you other kinds of problems?

The problem was that my __array_finalize__ code did some checking on
the shape of the array. In short, one of the attributes of my class is
a list that has one entry for each dimension. __array_finalize__
checks that the length of this list matches the number of dimensions
and throws an exception if it doesn't. This leads to an
exception being thrown whenever I try to print the contents of an
array with more than one dimension, because I haven't yet implemented a
__getitem__ method that adjusts this attribute when slices are taken.
I didn't realize that this was what was going on and was very puzzled by
the results. It looks like I can make things work by using a _getitem
flag, as is done in the matrix class.
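
A rough sketch of that workaround, assuming Python 3 and a current NumPy. `LabeledArray` and its `dims` attribute are hypothetical stand-ins for the class described above, not the actual code. The `_getitem` flag follows the pattern used by the matrix class: __getitem__ sets it on the source array, and __array_finalize__ skips the shape check when it sees the flag set on `obj`:

```python
import numpy as np


class LabeledArray(np.ndarray):
    """Hypothetical subclass: `dims` holds one label per dimension."""

    def __new__(cls, data, dims):
        arr = np.asarray(data).view(cls)
        arr.dims = list(dims)
        if len(arr.dims) != arr.ndim:
            raise ValueError("need exactly one label per dimension")
        return arr

    def __array_finalize__(self, obj):
        self._getitem = False
        self.dims = getattr(obj, "dims", None)
        if not isinstance(obj, LabeledArray):
            return  # explicit construction or view of a plain ndarray
        if obj._getitem:
            return  # created inside __getitem__: skip the shape check
        if self.dims is not None and len(self.dims) != self.ndim:
            raise ValueError("dims does not match the number of dimensions")

    def __getitem__(self, index):
        # Flag the source array so __array_finalize__ can tell that the
        # new instance comes from indexing.
        self._getitem = True
        try:
            out = np.ndarray.__getitem__(self, index)
        finally:
            self._getitem = False
        return out


x = LabeledArray(np.zeros((2, 3)), dims=["row", "col"])
row = x[0]  # goes through __getitem__, so the check is skipped
print(row.shape, row.dims)
```

Note that this only bypasses the check: `row.dims` still carries both labels, so a complete __getitem__ would also need to adjust the attribute for the dimensions that were dropped.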

Thanks again for clearing this up! I think it would be great if
somebody with write access to the wiki could make a note of this on
the subclasses page: http://www.scipy.org/Subclasses

CTW