Initializing array from buffer

Andrea Arteaga
Hello.
Using the numpy.frombuffer function [1] one can initialize a NumPy array from an existing Python object that implements the buffer protocol [2]. This is great, but currently this function supports only 1D buffers, even if the provided buffer is multidimensional and exposes all information about its structure (shape, strides, data type).

Apparently, one can extract every kind of buffer information from the buffer of a NumPy array (pointer, number of dimensions, shape, strides, suboffsets, ...), but the other way around is only partially implemented: providing a multidimensional buffer does not mean being able to create a NumPy array that uses that buffer with the desired structure.

My use case is the following: we have some 3D arrays in our C++ framework. The ordering of the elements in these arrays is not necessarily C or Fortran style: it might be IJK (i.e. C style, third dimension contiguous in memory), KJI (i.e. Fortran style, first dimension contiguous) or, e.g., IKJ. Moreover, we add some padding to optimize aligned access. This kind of memory structure cannot be expressed simply as 'C' or 'Fortran', but it can be perfectly expressed using the Python buffer protocol by providing the shape and the strides. We would like to export this structure to a NumPy array that can access the same memory locations in a consistent way and perform operations such as initializing the content or plotting it.

Is this currently possible?
If not, is it planned to implement such a feature?

==========

Maybe just to clarify, I can show an example entirely in Python. Assume a is a 2D NumPy array:

a = np.ones((10,20))

It contains information about its structure which can be portably accessed using its data member:

a.data.format == 'd'
a.data.ndim == 2
a.data.shape == (10,20)
a.data.strides == (160,8)

Unfortunately, when initializing an array b from this buffer, the structure of the buffer is "downgraded" to a one-dimensional shape:

b = np.frombuffer(a.data)

b.ndim == 1
b.shape == (200,)
b.strides == (8,)

I wish b had the same multidimensional structure as a.

(This is of course a very simple example. In my use case I would initialize b with my own buffer instead of that of another NumPy array.)

Best regards

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: Initializing array from buffer

Sturla Molden
Andrea Arteaga <[hidden email]> wrote:

> My use case is the following: we have some 3D arrays in our C++
> framework. The ordering of the elements in these arrays is not necessarily
> C or Fortran style: it might be IJK (i.e. C style, third dimension
> contiguous in memory), KJI (i.e. Fortran style, first dimension
> contiguous) or, e.g., IKJ. Moreover, we add some padding to optimize
> aligned access. This kind of memory structure cannot be expressed simply
> as 'C' or 'Fortran', but it can be perfectly expressed using the Python
> buffer protocol by providing the shape and the strides. We would like to
> export this structure to a NumPy array that can access the same memory
> locations in a consistent way and perform operations such as initializing
> the content or plotting it.
>
> Is this currently possible?
> If not, is it planned to implement such a feature?

If you are already coding in C++, just use PyArray_New or
PyArray_NewFromDescr:

http://docs.scipy.org/doc/numpy/reference/c-api.array.html#c.PyArray_New
http://docs.scipy.org/doc/numpy/reference/c-api.array.html#c.PyArray_NewFromDescr

Apart from that, numpy.array and numpy.asarray can also accept a PEP 3118
buffer.

Sturla

Re: Initializing array from buffer

Stéfan van der Walt
In reply to this post by Andrea Arteaga
Hi Andrea

On 2014-11-16 19:42:09, Andrea Arteaga <[hidden email]> wrote:

> My use case is the following: we have some 3D arrays in our C++
> framework. The ordering of the elements in these arrays is not necessarily
> C or Fortran style: it might be IJK (i.e. C style, third dimension
> contiguous in memory), KJI (i.e. Fortran style, first dimension
> contiguous) or, e.g., IKJ. Moreover, we add some padding to optimize
> aligned access. This kind of memory structure cannot be expressed simply
> as 'C' or 'Fortran', but it can be perfectly expressed using the Python
> buffer protocol by providing the shape and the strides. We would like to
> export this structure to a NumPy array that can access the same memory
> locations in a consistent way and perform operations such as initializing
> the content or plotting it.
>
> Is this currently possible?
> If not, is it planned to implement such a feature?

This looks like something that should be accomplished fairly easily
using the ``__array_interface__`` dictionary, as described here:

http://docs.scipy.org/doc/numpy/reference/arrays.interface.html

Any object that exposes a suitable dictionary named
``__array_interface__`` may be converted to a NumPy array. It has the
following important keys:

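    shape: tuple of integers, one extent per dimension
    typestr: string giving the element type and byte order, e.g. '<f8'
    data: 2-tuple, e.g. (20495857, True); the pointer to the data as an integer, and a boolean indicating whether the memory is read-only
    strides: tuple of byte strides, or None for a C-contiguous array
    version: 3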
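
A minimal pure-Python sketch of the dictionary described above, under stated assumptions: the class name PaddedField and its constructor arguments are hypothetical, and a ctypes array stands in for memory that the C++ framework would own. It describes a KJI (first index contiguous) layout whose first dimension is padded from 3 to 4 elements.

```python
import ctypes
import numpy as np


class PaddedField:
    """Hypothetical wrapper exposing foreign memory via __array_interface__."""

    def __init__(self, storage, shape, strides, dtype=np.float64):
        self._storage = storage  # keep the memory alive while the wrapper lives
        self._interface = {
            "shape": tuple(shape),
            "typestr": np.dtype(dtype).str,               # e.g. '<f8'
            "data": (ctypes.addressof(storage), False),   # (address, read-only flag)
            "strides": tuple(strides),                    # byte strides per axis
            "version": 3,
        }

    @property
    def __array_interface__(self):
        return self._interface


# Logical shape (ni, nj, nk), stored KJI with the I extent padded to ni_pad.
ni, nj, nk, ni_pad = 3, 2, 2, 4
itemsize = np.dtype(np.float64).itemsize
storage = (ctypes.c_double * (ni_pad * nj * nk))(*range(ni_pad * nj * nk))
strides = (itemsize, ni_pad * itemsize, ni_pad * nj * itemsize)

field = np.asarray(PaddedField(storage, (ni, nj, nk), strides))
field[2, 1, 1] = -1.0  # writes through to the wrapped memory
```

The resulting array is a view, so the padding elements remain in storage but are never visited through field. The wrapper does not own the memory in any deeper sense; with a real C++ allocation, keeping it alive is the caller's responsibility.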

Regards
Stéfan

Re: Initializing array from buffer

Edison Gustavo Muenz
In reply to this post by Andrea Arteaga
Have you tried using the C-API to create the array? This link might be of help: http://docs.scipy.org/doc/numpy/reference/c-api.array.html#creating-arrays

I know that Boost.Python can handle this.

Re: Initializing array from buffer

Andrea Arteaga
Hi.
Yesterday I tried to make use of the C API, but I did not manage to get anything useful working. The reference is very well done, but I miss a tutorial that walks through some examples. Do you know of any?

The array interface sounds like a very good solution. In a sense it is a NumPy-specific version of the buffer protocol, a bit simpler but less generic. It looks clean and very easy to implement. I will try this way.

Thank you so much for the useful links.
Andrea Arteaga

Re: Initializing array from buffer

Nathaniel Smith
In reply to this post by Andrea Arteaga
On Sun, Nov 16, 2014 at 5:42 PM, Andrea Arteaga <[hidden email]> wrote:
> Hello.
> Using the numpy.frombuffer function [1] one can initialize a numpy array
> using an existing python object that implements the buffer protocol [2].
> This is great, but currently this function supports only 1D buffers, even if
> the provided buffer is multidimensional and it exposes all information about
> its structure (shape, strides, data type).

np.frombuffer is not often used, and I'm not sure if it's been updated
for the new py3 buffer protocol. (The old buffer protocol only
supported 1d buffers.)

Have you tried just using np.asarray? It seems to work fine with
multidimensional memoryview objects at least, which use the new buffer
protocol:

In [12]: a = np.ones((2, 3))

In [13]: a_buf = memoryview(a)

In [14]: a_buf
Out[14]: <memory at 0x7ffd071a2d60>

In [15]: np.asarray(a_buf)
Out[15]:
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])
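
A short sketch extending the session above to a buffer that is not contiguous, to check that shape and strides survive the round trip. The variable names are illustrative only; slicing off the last column stands in for row padding.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)
padded = a[:, :3]            # 3 x 3 window; each row is followed by one unused element
buf = memoryview(padded)     # exports ndim, shape and strides via the buffer protocol

b = np.asarray(buf)          # rebuilds an array from the exported structure
b[1, 2] = -1.0               # b is expected to share memory with a
```

Here b should report the same shape and byte strides as padded, and the assignment should be visible in a, which suggests that np.asarray, unlike np.frombuffer, honors multidimensional buffer information.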

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: Initializing array from buffer

Robert McGibbon
In reply to this post by Sturla Molden
The np.ndarray constructor takes a strides argument and a buffer. Is it not sufficiently flexible?

-Robert
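
A minimal sketch of this suggestion applied to the IKJ-with-padding layout from the original question. The extents and the padding amount are made up for illustration, and a bytearray stands in for memory that the C++ framework would own.

```python
import numpy as np

ni, nj, nk = 2, 3, 4      # logical extents along I, J, K
nj_pad = 4                # J varies fastest in memory and is padded from 3 to 4
itemsize = np.dtype(np.float64).itemsize

# Stand-in for the framework's allocation (zero-initialized).
raw = bytearray(ni * nk * nj_pad * itemsize)

# IKJ storage order: I slowest, then K, then J contiguous. Strides are in bytes.
strides = (nk * nj_pad * itemsize, itemsize, nj_pad * itemsize)
field = np.ndarray(shape=(ni, nj, nk), dtype=np.float64,
                   buffer=raw, strides=strides)

field[1, 2, 3] = 5.0                          # indexed logically as [i, j, k]
flat = np.frombuffer(raw, dtype=np.float64)   # flat view of the same memory
```

The element written at [1, 2, 3] lands at flat offset 1*16 + 3*4 + 2 = 30, i.e. i * (nk * nj_pad) + k * nj_pad + j. The shape and strides have to be passed explicitly because the buffer argument is only used as a block of bytes.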

Re: Initializing array from buffer

Sturla Molden
On 18/11/14 04:21, Robert McGibbon wrote:
> The np.ndarray constructor
> <http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html> takes
> a strides argument and a buffer. Is it not sufficiently flexible?
>
> -Robert

AFAIK the buffer argument is not a memory address but an object
exporting the old buffer protocol. We can abuse __array_interface__
to do this, though I prefer the C API functions. Wrapping a C
pointer with __array_interface__ then becomes something like this
(Cython, not tested, but should work):


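import numpy as np
from libc.stdint cimport uintptr_t

cdef class wrapper_array:
    # Describes raw memory to NumPy through the array interface (version 3).
    # The wrapper does not own the memory; the caller must keep it alive.
    cdef readonly object __array_interface__

    def __init__(self, addr, shape, dtype, order='C', strides=None, offset=0):
        dtype = np.dtype(dtype)
        shape = tuple(shape)
        # strides=None already means C-contiguous, so only the Fortran
        # case needs explicit strides.
        if strides is None and order != 'C':
            strides = _get_fortran_strides(shape, dtype)
        self.__array_interface__ = dict(
            data = (addr + offset, False),   # (address, read-only flag)
            descr = dtype.descr,
            shape = shape,
            strides = strides,
            typestr = dtype.str,
            version = 3,
        )

cdef object _get_fortran_strides(shape, dtype):
    # Byte strides for column-major layout: the first axis varies fastest.
    return tuple(int(dtype.itemsize * n)
                 for n in np.cumprod((1,) + shape[:-1]))

# cdef rather than def: a Python-callable function cannot take a void* argument.
cdef object wrap_pointer(void *addr, shape, dtype, order, strides, offset):
    """Wraps a C pointer with an ndarray"""
    return np.asarray(wrapper_array(<uintptr_t> addr, shape,
                                    dtype, order, strides, offset))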


https://github.com/sturlamolden/sharedmem-numpy/blob/master/sharedmem/array.py
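
For comparison, a pure-Python sketch of the other route: turning a raw address into an object that exports the buffer protocol, so that it can be handed to the np.ndarray constructor. The helper name array_from_address is hypothetical, and a ctypes array stands in for a C++ allocation whose rows are padded from 3 to 4 doubles.

```python
import ctypes
import numpy as np


def array_from_address(address, nbytes, shape, strides, dtype=np.float64):
    """Hypothetical helper: view nbytes of memory at address as an ndarray."""
    # from_address neither copies nor takes ownership; the caller must keep
    # the underlying memory alive for as long as the array is in use.
    buf = (ctypes.c_char * nbytes).from_address(address)
    return np.ndarray(shape=shape, dtype=dtype, buffer=buf, strides=strides)


# Stand-in for a C++ allocation: 2 rows of 3 doubles, each row padded to 4.
storage = (ctypes.c_double * 8)()
arr = array_from_address(ctypes.addressof(storage), ctypes.sizeof(storage),
                         shape=(2, 3), strides=(32, 8))
arr[1, 2] = 7.0   # row 1, column 2 maps to flat element 1*4 + 2 = 6
```

This stays within documented ctypes and NumPy calls, at the cost of passing the address through Python as a plain integer.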



Sturla


       



Re: Initializing array from buffer

Andrea Arteaga
Thanks everybody for suggesting many different ways to achieve this result.

While all of them seem valid methods, I decided to use the constructor, as proposed by Robert:

> The np.ndarray constructor takes a strides argument and a buffer.

I could easily do this from within C++ in a clean way and without making use of NumPy-specific C code. It seems a bit redundant to me that you have to provide the strides and the shape even though the buffer object already contains this information, but I suppose this is done to support an older buffer protocol, in which only 1D arrays could be defined.

I have it working.
Thanks once more.

All the best
Andrea

