Handle type conversion in C API


Benoit Gschwind
Hello,

I am writing a Python binding for one of our libraries. The binding is
intended to vectorize the function call. For example:

double foo(double, double) will be bound to a Python call:

<numpy.array of double> module.foo(<numpy.array>, <numpy.array>)

and the function foo will be called like:

for (int i = 0; i < size; ++i)
        outarr[i] = foo(inarr0[i], inarr1[i]);

My question is how to handle type conversion of the input arrays,
preferably in an efficient manner, given that each input array may
require a different input type.

Currently I simply enforce the type and no conversion is performed,
but I would like to relax that. I have thought of several
possibilities, starting with the obvious solution of recasting inside
the inner loop:

for (int i = 0; i < size; ++i) {
        in0 = (inarr0[i] needs recast) ? recast(inarr0[i]) : inarr0[i];
        [... same for all input parameters ...]
        outarr[i] = foo(in0, in1, ...);
}

This solution is memory efficient, but not actually computationally
efficient.

The second solution is to copy and recast the entire input arrays, but
that is not memory efficient. My final thought is to mix the two by
chunking the second method, i.e. converting N inputs in a row, then
applying the function to them, and so on until the whole array is
processed.
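In Python terms, the chunked strategy would look something like the
sketch below (the real code would of course be C; `chunked_apply` and
the lambda kernel are illustrative names, not part of our binding):

```python
import numpy as np

def chunked_apply(kernel, arrays, dtypes, out_dtype, chunk=4096):
    """Apply an elementwise kernel, converting inputs chunk by chunk.

    Only one chunk-sized temporary per input is alive at a time, so the
    memory overhead is bounded by chunk * itemsize per input instead of
    a full converted copy of each array.
    """
    size = len(arrays[0])
    out = np.empty(size, dtype=out_dtype)
    for start in range(0, size, chunk):
        stop = min(start + chunk, size)
        # Convert just this slice of each input to its required type.
        converted = [a[start:stop].astype(dt, copy=False)
                     for a, dt in zip(arrays, dtypes)]
        for i in range(stop - start):
            out[start + i] = kernel(*(c[i] for c in converted))
    return out
```

Trading the chunk size lets you interpolate between the per-element
recast (chunk=1) and the full-copy approach (chunk=size).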

Thus my questions are:
 - Is there another way to do what I want?
 - Is there an existing or recommended way to do it?

And a side question: I use PyArray_FROM_OTF, but I do not fully
understand its semantics. If I pass a Python list, it is converted to
the desired type and requirements; when I pass a non-contiguous array,
it is converted to a contiguous one; but when I pass a NumPy array of
a type other than the one specified, I do not get the conversion. Is
there a function that does the conversion unconditionally? Did I miss
something?

Thank you in advance for your help.

Best regards



_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Handle type conversion in C API

Eric Moore-11
Hi Benoit, 

Since you have a function that takes two scalars to one scalar, it sounds to me as though you would be best off creating a ufunc. This will then handle the conversion of, and looping over, the arrays for you. The documentation is available here: https://numpy.org/doc/1.18/user/c-info.ufunc-tutorial.html.
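To illustrate what you get for free: at the Python level an existing
ufunc such as np.hypot already accepts integer arrays and plain lists,
casting and looping internally, which is exactly the behavior a custom
ufunc for foo would inherit:

```python
import numpy as np

# np.hypot only has floating-point loops, but int32 inputs are safely
# cast to float64 automatically -- no manual conversion at the call site.
x = np.array([3, 6], dtype=np.int32)
y = np.array([4, 8], dtype=np.int32)
print(np.hypot(x, y))                  # [ 5. 10.]

# Plain Python lists are converted to arrays as well.
print(np.hypot([3.0, 6.0], [4.0, 8.0]))
```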

Regards,

Eric

On Tue, Mar 10, 2020 at 12:28 PM Benoit Gschwind <[hidden email]> wrote:


Re: Handle type conversion in C API

Benoit Gschwind
Hello Eric,

Thank you for pointing out ufuncs. I implemented my binding using
them; it works and is simpler than my previous implementation, but I
still do not have the flexibility to dynamically recast an input
array, i.e. using an int64 array as input for an int32 parameter. For
instance, to test my code I use numpy.random, which provides int64
arrays, so I have to convert the arrays manually before calling my
ufunc, which is somewhat annoying.
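Concretely, the manual conversion I mean looks like this (using np.add
here only as a stand-in for my bound ufunc, which has just an int32
loop):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=8)   # dtype is int64 by default
b = rng.integers(0, 100, size=8)

my_ufunc = np.add   # stand-in for the bound ufunc

# Every call site needs an explicit cast before the ufunc is invoked:
result = my_ufunc(a.astype(np.int32), b.astype(np.int32))
print(result.dtype)                # int32
```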

For functions that have 1 or 2 parameters it is practical to provide 4
variants of the function, but with 6-8 parameters it becomes more
difficult.

Best regards

On Tuesday 10 March 2020 at 13:13 -0400, Eric Moore wrote:



Re: Handle type conversion in C API

Eric Moore-11
There are a variety of ways to resolve your issues. You can try the optional arguments casting, dtype, or signature that work for all ufuncs; see https://numpy.org/doc/1.18/reference/ufuncs.html#optional-keyword-arguments. These allow you to override the default type checks. What you'll find for many ufuncs, for instance in scipy.special, is that the underlying function in Cython, C, or Fortran is only defined for doubles, but the ufunc also has a signature for f->f, which just casts the input and output in the loop. Generally, downcasting is not permitted without explicit opt-in, e.g. int64 to int32, since this is not a safe cast: there are many values that an int64 can hold that an int32 cannot. Generally speaking, you do have to manage the types of your arrays when the defaults aren't what you want. There really isn't any way around it.
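For example, with a stock ufunc like np.add, the dtype argument
selects the int32 loop even when the inputs are int64; the
int64 -> int32 cast is 'same_kind', so the default casting rule
permits it once the loop is chosen:

```python
import numpy as np

a = np.array([1, 2], dtype=np.int64)
b = np.array([3, 4], dtype=np.int64)

# Default type resolution keeps int64:
print(np.add(a, b).dtype)          # int64

# dtype= overrides loop selection; inputs are cast down to int32
# ('same_kind' cast, allowed under the default casting='same_kind'):
r = np.add(a, b, dtype=np.int32)
print(r, r.dtype)                  # [4 6] int32
```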

Eric 

On Wed, Mar 11, 2020 at 4:43 AM Benoit Gschwind <[hidden email]> wrote: