How to concatenate two arrays without duplicating memory?


How to concatenate two arrays without duplicating memory?

V. Armando Sole
Hello,

Let's say we have two arrays A and B of shapes (10000, 2000) and (10000,
4000).

If I do C=numpy.concatenate((A, B), axis=1), I get a new array of
dimension (10000, 6000) with duplication of memory.

I am looking for a way to have a non contiguous array C in which the
"left" (10000, 2000) elements point to A and the "right" (10000, 4000)
elements point to B.

Any hint will be appreciated.

Thanks,

Armando


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: How to concatenate two arrays without duplicating memory?

Gael Varoquaux
On Wed, Sep 02, 2009 at 09:40:49AM +0200, "V. Armando Solé" wrote:
> Let's say we have two arrays A and B of shapes (10000, 2000) and (10000,
> 4000).

> If I do C=numpy.concatenate((A, B), axis=1), I get a new array of
> dimension (10000, 6000) with duplication of memory.

> I am looking for a way to have a non contiguous array C in which the
> "left" (10000, 2000) elements point to A and the "right" (10000, 4000)
> elements point to B.

You cannot in the numpy memory model. The numpy memory model defines an
array as something that has regular strides to jump from an element to
the next one.
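
A small illustration of what "regular strides" means, using tiny stand-in arrays instead of the (10000, 2000) and (10000, 4000) ones from the question. This is only a sketch; the shapes and the names `left` and `C` are made up for the example, and the byte counts assume the default float64 dtype.

```python
import numpy as np

# Small stand-ins for the arrays in the question.
A = np.zeros((4, 2))
B = np.zeros((4, 3))

# Each array is one memory block plus a fixed byte step per axis.
print(A.strides)  # (16, 8): 16 bytes to the next row, 8 to the next column
print(B.strides)  # (24, 8)

# A slice of an existing array is still "one block + regular strides",
# so it can be a view that shares memory with its parent.
C = np.zeros((4, 5))
left = C[:, :2]
print(left.strides)               # (40, 8): same row step as C
print(np.shares_memory(left, C))  # True

# A and B are two unrelated blocks, so no single pair of strides
# can walk across both of them.
print(np.shares_memory(A, B))     # False
```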

Gaël

Re: How to concatenate two arrays without duplicating memory?

Citi, Luca
As Gaël pointed out you cannot create A, B and then C
as the concatenation of A and B without duplicating
the vectors.

> I am looking for a way to have a non contiguous array C in which the
> "left" (10000, 2000) elements point to A and the "right" (10000, 4000)
> elements point to B.

But you can still re-link A to the left elements
and B to the right ones afterwards by using views into C.

>>> C=numpy.concatenate((A, B), axis=1)
>>> A,B = C[:,:2000], C[:,2000:]
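
Spelled out as a runnable script with small made-up arrays, the re-linking above might look like the following. The concatenation itself still copies once; after the names are rebound, A and B are views into C and the original buffers can be freed if nothing else references them.

```python
import numpy as np

A = np.arange(6.0).reshape(3, 2)
B = np.arange(9.0).reshape(3, 3)

C = np.concatenate((A, B), axis=1)   # this step copies A and B once
n = A.shape[1]
A, B = C[:, :n], C[:, n:]            # rebind the names to views into C

A[0, 0] = -1.0                       # writes through to C
print(C[0, 0])                       # -1.0
print(np.shares_memory(A, C), np.shares_memory(B, C))  # True True
```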

Best,
Luca

Re: How to concatenate two arrays without duplicating memory?

V. Armando Sole
In reply to this post by Gael Varoquaux
Gael Varoquaux wrote:
> You cannot in the numpy memory model. The numpy memory model defines an
> array as something that has regular strides to jump from an element to
> the next one.
>  
I expected problems in the suggested case (concatenating columns), but I
did not expect the problem to be so severe as to affect the case of row
concatenation as well.

I guess I am still considering a 2D array as an array of pointers and
that does not apply to numpy arrays.

Thanks for the info.

Armando


Re: How to concatenate two arrays without duplicating memory?

V. Armando Sole
In reply to this post by Citi, Luca
Citi, Luca wrote:
> As Gaël pointed out you cannot create A, B and then C
> as the concatenation of A and B without duplicating
> the vectors.
>  
> But you can still re-link A to the left elements
> and B to the right ones afterwards by using views into C.
>  

Thanks for the hint. In my case the A array is already present and the
contents of the B array can be read from disk.

At least I have two workarounds making use of your suggested solution of
re-linking:

- create the C array, copy the contents of A into it, and read the
contents of B directly into C. The memory of A is duplicated for a while.

- save the array A to disk, create the array C, read the contents of A
and B into it, and re-link A and B. No duplication, but ugly.
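
The first workaround could be sketched roughly as below. The file name, the use of `.npy` files, and the tiny shapes are assumptions made for the example; the thread does not say how B is stored on disk. `np.load(..., mmap_mode="r")` is used here so that B is not first loaded into a separate in-memory array.

```python
import os
import tempfile
import numpy as np

# Hypothetical setup: A is in memory, B lives in a .npy file on disk.
A = np.ones((5, 2))
tmpdir = tempfile.mkdtemp()
b_path = os.path.join(tmpdir, "B.npy")
np.save(b_path, np.full((5, 4), 2.0))

# Map B read-only instead of loading it into RAM.
B_disk = np.load(b_path, mmap_mode="r")

# Allocate the final array once, then fill both parts in place.
n = A.shape[1]
C = np.empty((A.shape[0], n + B_disk.shape[1]), dtype=A.dtype)
C[:, :n] = A          # A exists twice until the old A is released
C[:, n:] = B_disk     # data is copied from the file mapping into C

del B_disk                    # drop the mapping
A, B = C[:, :n], C[:, n:]     # re-link; the original A can now be freed
```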

Thanks,

Armando



Re: How to concatenate two arrays without duplicating memory?

Sebastian Haase-3
Hi,
Depending on your needs, you might be interested in my "minimal
implementation" of what I call a mock-ndarray.
I needed something like this to analyze higher-dimensional stacks of 2D
images, and what I needed was mostly the indexing features of
nd-arrays.
A mockarray is initialized with a list of nd-arrays. The result is a
mock array having one additional dimension "in front".
>>> a = N.arange(9)
>>> b = N.arange(9)
>>> a.shape=3,3
>>> b.shape=3,3
>>> c = F.mockNDarray(a,b)
>>> c.shape
(2, 3, 3)
>>> c[2,2,2]
>>> c[1,2,2]
8

No memory copy is done.
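
For readers without Priithon at hand, the idea can be approximated in a few lines. The class below is a hypothetical stand-in written for this illustration (the name `StackView` is made up); it is not the actual mockNDarray code and only supports an integer as the first index.

```python
import numpy as np

class StackView:
    """Hypothetical minimal sketch of the idea; not Priithon's mockNDarray."""

    def __init__(self, *arrays):
        if len({a.shape for a in arrays}) != 1:
            raise ValueError("all arrays must have the same shape")
        self._arrays = list(arrays)   # references only, no data copied

    @property
    def shape(self):
        # One extra dimension "in front" of the common shape.
        return (len(self._arrays),) + self._arrays[0].shape

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        first, rest = index[0], index[1:]
        if not isinstance(first, (int, np.integer)):
            raise TypeError("this sketch only supports an integer first index")
        # Pick the array, then let NumPy handle the remaining indices.
        return self._arrays[first][rest]

a = np.arange(9).reshape(3, 3)
b = np.arange(9).reshape(3, 3)
c = StackView(a, b)
print(c.shape)                      # (2, 3, 3)
print(c[1, 2, 2])                   # 8
print(np.shares_memory(c[0], a))    # True
```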

I put the module file here
http://drop.io/kpu4bib/asset/mockndarray-py
Otherwise this is part of my (BSD) "Priithon" image analysis framework.

Regards
Sebastian Haase

On Wed, Sep 2, 2009 at 11:31 AM, "V. Armando Solé"<[hidden email]> wrote:

> Citi, Luca wrote:
>> As Gaël pointed out you cannot create A, B and then C
>> as the concatenation of A and B without duplicating
>> the vectors.
>>
>> But you can still re-link A to the left elements
>> and B to the right ones afterwards by using views into C.
>>
>
> Thanks for the hint. In my case the A array is already present and the
> contents of the B array can be read from disk.
>
> At least I have two workarounds making use of your suggested solution of
> re-linking:
>
> - create the C array, copy the contents of A to it and read the contents
> of B directly into C with duplication of the memory of A during some time.
>
> - save the array A in disk, create the array C, read the contents of A
> and B into it and re-link A and B with no duplication but ugly.
>
> Thanks,
>
> Armando

Re: How to concatenate two arrays without duplicating memory?

Sturla Molden-2
In reply to this post by V. Armando Sole
V. Armando Solé skrev:
> I am looking for a way to have a non contiguous array C in which the
> "left" (10000, 2000) elements point to A and the "right" (10000, 4000)
> elements point to B.
>
> Any hint will be appreciated.

If you know in advance that A and B are going to be concatenated, you can
allocate C first and make A and B views into it:

import numpy as np

C = np.zeros((10000, 6000))
A = C[:,:2000]   # view of the left 2000 columns
B = C[:,2000:]   # view of the right 4000 columns

Now C is A and B concatenated horizontally.
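
One thing to watch with this approach: data has to be written into the views in place. A tiny sketch with made-up shapes and values shows the difference between writing through a view and rebinding the name.

```python
import numpy as np

C = np.zeros((4, 6))
A = C[:, :2]
B = C[:, 2:]

# In-place writes go through the views into C.
A[...] = 1.0
B[:] = 2.0
print(C[1])   # [1. 1. 2. 2. 2. 2.]

# Rebinding the name does NOT write into C; it just points A elsewhere.
A = np.full((4, 2), 9.0)
print(C[0, 0])                 # 1.0
print(np.shares_memory(A, C))  # False
```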

If you can't do this, you could write the data to a temporary file and
read it back, but it would be slow.

Sturla

Re: How to concatenate two arrays without duplicating memory?

Sturla Molden-2
In reply to this post by Sebastian Haase-3
Sebastian Haase skrev:
> A mockarray is initialized with a list of nd-arrays. The result is a
> mock array having one additional dimension "in front".
This is important, because often in the case of 'concatenation' a real
concatenation is not needed. But then there is a common tool called
Matlab, which, unlike Python, has no concept of lists, yet makes
numerical programmers think it does. C = [A, B] is a horizontal
concatenation in Matlab. Too much exposure to Matlab cripples the mind
easily.
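
To make the point about lists concrete: in Python, `[A, B]` is just a list holding references to the two arrays, and many computations can run over the blocks one at a time. The arrays and the column-sum computation below are made up for the illustration.

```python
import numpy as np

A = np.ones((3, 2))
B = np.full((3, 4), 2.0)

blocks = [A, B]          # a Python list of references; nothing is copied
print(blocks[0] is A)    # True

# Reduce block by block; only the small per-block results are joined.
col_sums = np.concatenate([blk.sum(axis=0) for blk in blocks])
print(col_sums)          # [3. 3. 6. 6. 6. 6.]
```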

Sturla

 

Re: How to concatenate two arrays without duplicating memory?

Sebastian Haase-3
I forgot to mention I also support transpose.
-S.


On Wed, Sep 2, 2009 at 5:23 PM, Sturla Molden<[hidden email]> wrote:

> Sebastian Haase skrev:
>> A mockarray is initialized with a list of nd-arrays. The result is a
>> mock array having one additional dimension "in front".
> This is important, because often in the case of  'concatenation' a real
> concatenation is not needed. But then there is a common tool called
> Matlab, which unlike Python has no concept of lists and make numerical
> programmers think they do. C = [A, B] is a horizontal concatenation in
> Matlab. Too much exposure to Matlab cripples the mind easily.
>
> Sturla