numpy1.2 : make sorts unary ufuncs

numpy1.2 : make sorts unary ufuncs

Charles R Harris
The signature for a ufunc is something like

@TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)

Which contains all the info necessary to do a sort. Means and other such functions could also be implemented that way.
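
To make the claim concrete, here is a minimal sketch of a sort written against that inner-loop signature. The name `DOUBLE_sort_sketch`, the `intp` typedef, and the choice of insertion sort are assumptions made for illustration; this is not code from NumPy.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */

/* Hypothetical kernel: reads n doubles from args[0] and writes them,
 * sorted in ascending order, to args[1]. Strides are in bytes. */
static void
DOUBLE_sort_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp n = dimensions[0];
    intp is = steps[0], os = steps[1];
    char *ip = args[0], *op = args[1];
    intp i, j;

    (void)func;  /* unused in this sketch */

    /* Insertion sort: place each input element into the sorted output. */
    for (i = 0; i < n; i++) {
        double v = *(double *)(ip + i * is);
        for (j = i; j > 0 && *(double *)(op + (j - 1) * os) > v; j--) {
            *(double *)(op + j * os) = *(double *)(op + (j - 1) * os);
        }
        *(double *)(op + j * os) = v;
    }
}
```

The pointer, count, and stride arguments are enough for the kernel to find every element of one 1-d slice, which is all a sort along an axis needs.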

Chuck

_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: numpy1.2 : make sorts unary ufuncs

Robert Kern-2
On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
<[hidden email]> wrote:
> The signature for a ufunc is something like
>
> @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
>
> Which contains all the info necessary to do a sort. Means and other such
> functions could also be implemented that way.

axis?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> <[hidden email]> wrote:
> > The signature for a ufunc is something like
> >
> > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> >
> > Which contains all the info necessary to do a sort. Means and other such
> > functions could also be implemented that way.
>
> axis?

I believe Travis is already selecting an axis to get the best cache usage, so it just needs a tweak. Axis = None is the special case.

Chuck




Re: numpy1.2 : make sorts unary ufuncs

Robert Kern-2
On Fri, Apr 18, 2008 at 11:58 PM, Charles R Harris
<[hidden email]> wrote:

>
> On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> >
> > On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> > <[hidden email]> wrote:
> > > The signature for a ufunc is something like
> > >
> > > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > >
> > > Which contains all the info necessary to do a sort. Means and other such
> > > functions could also be implemented that way.
> >
> > axis?
>
> I believe Travis is already selecting an axis to get the best cache usage,
> so it just needs a tweak. Axis = None is the special case.

So what is the benefit here? Why bother?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Fri, Apr 18, 2008 at 11:02 PM, Robert Kern <[hidden email]> wrote:
> On Fri, Apr 18, 2008 at 11:58 PM, Charles R Harris
> <[hidden email]> wrote:
> >
> > On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> > >
> > > On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> > > <[hidden email]> wrote:
> > > > The signature for a ufunc is something like
> > > >
> > > > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > > >
> > > > Which contains all the info necessary to do a sort. Means and other such
> > > > functions could also be implemented that way.
> > >
> > > axis?
> >
> > I believe Travis is already selecting an axis to get the best cache usage,
> > so it just needs a tweak. Axis = None is the special case.
>
> So what is the benefit here? Why bother?

Code reuse and a unified concept. Why duplicate all the complication as we do now? Argsorts become binary ufuncs, etc. Besides, I'm trying to track down a ufunc call bug, feeling annoyed, and it would be nice to have all the crap in one location so we can work on cleaning it up.

Chuck




Re: numpy1.2 : make sorts unary ufuncs

Robert Kern-2
On Sat, Apr 19, 2008 at 12:18 AM, Charles R Harris
<[hidden email]> wrote:

> On Fri, Apr 18, 2008 at 11:02 PM, Robert Kern <[hidden email]> wrote:
> > On Fri, Apr 18, 2008 at 11:58 PM, Charles R Harris
> > <[hidden email]> wrote:
> > >
> > > On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> > > >
> > > > On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> > > > <[hidden email]> wrote:
> > > > > The signature for a ufunc is something like
> > > > >
> > > > > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > > > >
> > > > > Which contains all the info necessary to do a sort. Means and other such
> > > > > functions could also be implemented that way.
> > > >
> > > > axis?
> > >
> > > I believe Travis is already selecting an axis to get the best cache usage,
> > > so it just needs a tweak. Axis = None is the special case.
> >
> > So what is the benefit here? Why bother?
>
> Code reuse and a unified concept. Why duplicate all the complication as we
> do now? Argsorts become binary ufuncs, etc. Besides, I'm trying to track
> down a ufunc call bug, feeling annoyed, and it would be nice to have all the
> crap in one location so we can work on cleaning it up.

I have a suspicion that tweaking ufuncs to allow the special case of
sorting will negate the benefits. It smells like an abuse of the
machinery rather than a natural merging of similar functionality.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Fri, Apr 18, 2008 at 11:23 PM, Robert Kern <[hidden email]> wrote:
> On Sat, Apr 19, 2008 at 12:18 AM, Charles R Harris
> <[hidden email]> wrote:
> > On Fri, Apr 18, 2008 at 11:02 PM, Robert Kern <[hidden email]> wrote:
> > > On Fri, Apr 18, 2008 at 11:58 PM, Charles R Harris
> > > <[hidden email]> wrote:
> > > >
> > > > On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> > > > >
> > > > > On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> > > > > <[hidden email]> wrote:
> > > > > > The signature for a ufunc is something like
> > > > > >
> > > > > > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > > > > >
> > > > > > Which contains all the info necessary to do a sort. Means and other such
> > > > > > functions could also be implemented that way.
> > > > >
> > > > > axis?
> > > >
> > > > I believe Travis is already selecting an axis to get the best cache usage,
> > > > so it just needs a tweak. Axis = None is the special case.
> > >
> > > So what is the benefit here? Why bother?
> >
> > Code reuse and a unified concept. Why duplicate all the complication as we
> > do now? Argsorts become binary ufuncs, etc. Besides, I'm trying to track
> > down a ufunc call bug, feeling annoyed, and it would be nice to have all the
> > crap in one location so we can work on cleaning it up.
>
> I have a suspicion that tweaking ufuncs to allow the special case of
> sorting will negate the benefits. It smells like an abuse of the
> machinery rather than a natural merging of similar functionality.
 
Don't think so, the ufunc inner loops are all over a given axis with the other indices varying. I think anything that processes arrays along an axis is a natural for the framework. And if we add bells and whistles, like threads, it might be nice to keep everything in one place.
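
As a rough illustration of "inner loops over a given axis with the other indices varying", the sketch below drives a 1-d kernel along either axis of a small 2-d array. The driver `call_along_axis`, the kernel `DOUBLE_cumsum_sketch`, and the `intp` and `loop1d` typedefs are hypothetical names for this example, not part of the ufunc machinery.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */
typedef void (*loop1d)(char **, intp *, intp *, void *);

/* Hypothetical 1-d kernel: running sum of n doubles, args[0] -> args[1]. */
static void
DOUBLE_cumsum_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i, n = dimensions[0];
    char *ip = args[0], *op = args[1];
    double total = 0.0;

    (void)func;  /* unused in this sketch */
    for (i = 0; i < n; i++, ip += steps[0], op += steps[1]) {
        total += *(double *)ip;
        *(double *)op = total;
    }
}

/* Hypothetical driver for a C-contiguous rows x cols array of doubles:
 * the kernel runs along `axis` while the other index varies out here. */
static void
call_along_axis(loop1d kernel, double *in, double *out,
                intp rows, intp cols, int axis)
{
    intp item = (intp)sizeof(double);
    intp n = (axis == 0) ? rows : cols;            /* length of each slice  */
    intp nouter = (axis == 0) ? cols : rows;       /* number of slices      */
    intp inner = (axis == 0) ? cols * item : item; /* stride inside a slice */
    intp outer = (axis == 0) ? item : cols * item; /* stride between slices */
    intp k;

    for (k = 0; k < nouter; k++) {
        char *args[2];
        intp dims[1];
        intp steps[2];

        args[0] = (char *)in + k * outer;
        args[1] = (char *)out + k * outer;
        dims[0] = n;
        steps[0] = inner;
        steps[1] = inner;
        kernel(args, dims, steps, NULL);
    }
}
```

Only the strides and the slice count change between `axis == 0` and `axis == 1`; the kernel itself is the same, which is the sense in which any along-an-axis operation fits this calling pattern.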

Chuck




Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Fri, Apr 18, 2008 at 11:29 PM, Charles R Harris <[hidden email]> wrote:


> On Fri, Apr 18, 2008 at 11:23 PM, Robert Kern <[hidden email]> wrote:
> > On Sat, Apr 19, 2008 at 12:18 AM, Charles R Harris
> > <[hidden email]> wrote:
> > > On Fri, Apr 18, 2008 at 11:02 PM, Robert Kern <[hidden email]> wrote:
> > > > On Fri, Apr 18, 2008 at 11:58 PM, Charles R Harris
> > > > <[hidden email]> wrote:
> > > > >
> > > > > On Fri, Apr 18, 2008 at 10:53 PM, Robert Kern <[hidden email]> wrote:
> > > > > >
> > > > > > On Fri, Apr 18, 2008 at 11:47 PM, Charles R Harris
> > > > > > <[hidden email]> wrote:
> > > > > > > The signature for a ufunc is something like
> > > > > > >
> > > > > > > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > > > > > >
> > > > > > > Which contains all the info necessary to do a sort. Means and other such
> > > > > > > functions could also be implemented that way.
> > > > > >
> > > > > > axis?
> > > > >
> > > > > I believe Travis is already selecting an axis to get the best cache usage,
> > > > > so it just needs a tweak. Axis = None is the special case.
> > > >
> > > > So what is the benefit here? Why bother?
> > >
> > > Code reuse and a unified concept. Why duplicate all the complication as we
> > > do now? Argsorts become binary ufuncs, etc. Besides, I'm trying to track
> > > down a ufunc call bug, feeling annoyed, and it would be nice to have all the
> > > crap in one location so we can work on cleaning it up.
> >
> > I have a suspicion that tweaking ufuncs to allow the special case of
> > sorting will negate the benefits. It smells like an abuse of the
> > machinery rather than a natural merging of similar functionality.
>
> Don't think so, the ufunc inner loops are all over a given axis with the other indices varying. I think anything that processes arrays along an axis is a natural for the framework. And if we add bells and whistles, like threads, it might be nice to keep everything in one place.

You can also do matrix multiplication in a ufunc. It's got two inputs and an output. Needs two axis specs, but other than that goes in quite nicely. IIRC, BLAS already has the strides built in.
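
A minimal sketch of that idea, under stated assumptions: a strided inner-product kernel with two inputs and one output, plus a plain driver that supplies the row stride of one matrix and the column stride of the other. `DOUBLE_dot_sketch`, `matmul_sketch`, and the `intp` typedef are hypothetical, and no BLAS call is made here.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */

/* Hypothetical kernel: inner product of two strided vectors of length
 * dimensions[0]; the scalar result is written to args[2]. */
static void
DOUBLE_dot_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i, n = dimensions[0];
    intp is1 = steps[0], is2 = steps[1];
    char *i1 = args[0], *i2 = args[1];
    double sum = 0.0;

    (void)func;  /* unused in this sketch */
    for (i = 0; i < n; i++, i1 += is1, i2 += is2) {
        sum += (*(double *)i1) * (*(double *)i2);
    }
    *(double *)args[2] = sum;
}

/* Hypothetical driver: c = a * b for C-contiguous a (m x k), b (k x p),
 * c (m x p). The two "axis specs" show up as the two input strides. */
static void
matmul_sketch(double *a, double *b, double *c, intp m, intp k, intp p)
{
    intp item = (intp)sizeof(double);
    intp i, j;

    for (i = 0; i < m; i++) {
        for (j = 0; j < p; j++) {
            char *args[3];
            intp dims[1];
            intp steps[3];

            args[0] = (char *)(a + i * k);      /* row i of a       */
            args[1] = (char *)(b + j);          /* column j of b    */
            args[2] = (char *)(c + i * p + j);  /* one output cell  */
            dims[0] = k;
            steps[0] = item;      /* walk along a row      */
            steps[1] = p * item;  /* walk down a column    */
            steps[2] = 0;         /* output is one scalar  */
            DOUBLE_dot_sketch(args, dims, steps, NULL);
        }
    }
}
```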

Chuck



Re: numpy1.2 : make sorts unary ufuncs

Travis Oliphant-5
Charles R Harris wrote:
> The signature for a ufunc is something like
>
> @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
>
> Which contains all the info necessary to do a sort. Means and other
> such functions could also be implemented that way.
I don't think the ufunc is the right place for this.    Perhaps sorting
can be viewed however as a "general function" which I'd like to add.  
Maybe, in that case the ufunc could be a special case of the "general
function" concept.   But, that is the direction I would expect the
discussion to go.

The ufunc object is a very particular kind of thing.   The 1-d inner
loop signature is a key part of it, but there is more to it than that.  
I don't see  how the sorting function can be pushed into this concept.

But, there is a case to be made for perhaps figuring out how to merge
the data-type functions and the ufuncs into one idea that allows for
better code re-use.

-Travis



Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Sat, Apr 19, 2008 at 12:20 AM, Travis E. Oliphant <[hidden email]> wrote:
> Charles R Harris wrote:
> > The signature for a ufunc is something like
> >
> > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> >
> > Which contains all the info necessary to do a sort. Means and other
> > such functions could also be implemented that way.
> I don't think the ufunc is the right place for this.    Perhaps sorting
> can be viewed however as a "general function" which I'd like to add.
> Maybe, in that case the ufunc could be a special case of the "general
> function" concept.   But, that is the direction I would expect the
> discussion to go.
>
> The ufunc object is a very particular kind of thing.   The 1-d inner
> loop signature is a key part of it, but there is more to it than that.
> I don't see  how the sorting function can be pushed into this concept.

Yes, but the inner loop is just something that uses the array values along that axis to produce another set of values, i.e., it is a vector valued function of vectors. So is a sort, so is argsort, so is the inner product, so on and so forth. That's what we have here:

typedef void (*PyUFuncGenericFunction) (char **, npy_intp *, npy_intp *, void *);

 No difference that I can see. It is the call function in PyUFuncObject that matters.
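
To illustrate "a vector valued function of vectors", here is a sketch of an argsort kernel whose type matches the shape of `PyUFuncGenericFunction`. The names `generic_loop` and `DOUBLE_argsort_sketch`, the `intp` typedef, and the treatment of argsort as one input and one index output are all assumptions for this example.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for npy_intp */

/* Same shape as PyUFuncGenericFunction, declared locally so the
 * example compiles without the NumPy headers. */
typedef void (*generic_loop)(char **, intp *, intp *, void *);

/* Hypothetical kernel: for n doubles at args[0], write to args[1] the
 * indices (as intp) that would sort them in ascending order. */
static void
DOUBLE_argsort_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp n = dimensions[0];
    intp is = steps[0], os = steps[1];
    char *ip = args[0], *op = args[1];
    intp i, j;

    (void)func;  /* unused in this sketch */

    /* Insertion sort on indices, comparing the values they point to. */
    for (i = 0; i < n; i++) {
        double v = *(double *)(ip + i * is);
        for (j = i; j > 0; j--) {
            intp prev = *(intp *)(op + (j - 1) * os);
            if (*(double *)(ip + prev * is) <= v) {
                break;
            }
            *(intp *)(op + j * os) = prev;
        }
        *(intp *)(op + j * os) = i;
    }
}

/* The kernel is assignable to the generic loop type without a cast. */
static generic_loop argsort_loop = DOUBLE_argsort_sketch;
```

The output element type differs from the input type, but the kernel signature does not care: it only sees byte pointers, a count, and byte strides.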


> But, there is a case to be made for perhaps figuring out how to merge
> the data-type functions and the ufuncs into one idea that allows for
> better code re-use.

I don't think we can get all of them, but I think we can get most.

Chuck




Re: numpy1.2 : make sorts unary ufuncs

Robert Kern-2
On Sat, Apr 19, 2008 at 1:55 AM, Charles R Harris
<[hidden email]> wrote:

> Yes, but the inner loop is just something that uses the array values along
> that axis to produce another set of values, i.e., it is a vector valued
> function of vectors. So is a sort, so is argsort, so is the inner product,
> so on and so forth. That's what we have here:
>
> typedef void (*PyUFuncGenericFunction) (char **, npy_intp *, npy_intp *,
> void *);
>
>  No difference that I can see. It is the call function in PyUFuncObject that
> matters.

I believe this is the disconnect. From my perspective, the fact that
the inner kernel function of a ufunc has a sufficient argument list to
do a sort isn't important. The signature of that kernel function isn't
what makes a ufunc; it's all of the code around it that does
broadcasting, type matching and manipulation, etc. If we're changing
that code to accommodate sorting, we haven't gained anything. We've
just moved some code around; possibly we've reduced the line count,
but I fear that we will muddy ufunc implementation with non-ufunc
functionality and special cases.

If you want to go down this road, I think you need to do what Travis
suggests: factor out some of the common code between ufuncs and sorts
into a "superclass" (not really, but you get the idea), and then
implement ufuncs and sorts based on that. I think trying to shove
sorts into ufuncs-qua-ufuncs is a bad idea. There is more than one
path to code reuse.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

Re: numpy1.2 : make sorts unary ufuncs

Travis Oliphant-5
Charles R Harris wrote:

>
>
> On Sat, Apr 19, 2008 at 12:20 AM, Travis E. Oliphant
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Charles R Harris wrote:
>     > The signature for a ufunc is something like
>     >
>     > @TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void
>     *func)
>     >
>     > Which contains all the info necessary to do a sort. Means and other
>     > such functions could also be implemented that way.
>     I don't think the ufunc is the right place for this.    Perhaps
>     sorting
>     can be viewed however as a "general function" which I'd like to add.
>     Maybe, in that case the ufunc could be a special case of the "general
>     function" concept.   But, that is the direction I would expect the
>     discussion to go.
>
>     The ufunc object is a very particular kind of thing.   The 1-d inner
>     loop signature is a key part of it, but there is more to it than that.
>     I don't see  how the sorting function can be pushed into this concept.
>
>
> Yes, but the inner loop is just something that uses the array values
> along that axis to produce another set of values, i.e., it is a vector
> valued function of vectors. So is a sort, so is argsort, so is the
> inner product, so on and so forth. That's what we have here:
>
> typedef void (*PyUFuncGenericFunction) (char **, npy_intp *, npy_intp
> *, void *);
>
>  No difference that I can see. It is the call function in
> PyUFuncObject that matters.
Yes, you are right that the low-level 1-d loop signature is similar (or
can be made similar) to other signatures.  I'm not sure what is gained
by that, though.   Can you explain exactly what you want to do?

In my mind, the call function is exactly what makes a ufunc a ufunc, so
I'm not sure why discussing sort in the same breath is meaningful.  
Perhaps we are just running around each other's definitions.

-Travis





Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Sat, Apr 19, 2008 at 1:12 AM, Robert Kern <[hidden email]> wrote:
> On Sat, Apr 19, 2008 at 1:55 AM, Charles R Harris
> <[hidden email]> wrote:
>
> > Yes, but the inner loop is just something that uses the array values along
> > that axis to produce another set of values, i.e., it is a vector valued
> > function of vectors. So is a sort, so is argsort, so is the inner product,
> > so on and so forth. That's what we have here:
> >
> > typedef void (*PyUFuncGenericFunction) (char **, npy_intp *, npy_intp *,
> > void *);
> >
> >  No difference that I can see. It is the call function in PyUFuncObject that
> > matters.
>
> I believe this is the disconnect. From my perspective, the fact that
> the inner kernel function of a ufunc has a sufficient argument list to
> do a sort isn't important. The signature of that kernel function isn't
> what makes a ufunc; it's all of the code around it that does
> broadcasting, type matching and manipulation, etc. If we're changing
> that code to accommodate sorting, we haven't gained anything. We've
> just moved some code around; possibly we've reduced the line count,
> but I fear that we will muddy ufunc implementation with non-ufunc
> functionality and special cases.
>
> If you want to go down this road, I think you need to do what Travis
> suggests: factor out some of the common code between ufuncs and sorts
> into a "superclass" (not really, but you get the idea), and then
> implement ufuncs and sorts based on that. I think trying to shove
> sorts into ufuncs-qua-ufuncs is a bad idea. There is more than one
> path to code reuse.

Right now we have:

typedef struct {
    PyObject_HEAD
    int nin, nout, nargs;
    int identity;
    PyUFuncGenericFunction *functions;
    void **data;
    int ntypes;
    int check_return;
    char *name, *types;
    char *doc;
    void *ptr;
    PyObject *obj;
    PyObject *userloops;
} PyUFuncObject;
 
Which could be derived from something slightly more general. We could also leave out reduce, accumulate, etc., which are special cases. We then have common code for registration, etc. The call function still has to check types, dispatch the calls for the axis, maybe create output arrays, as for maximum.reduce, and so on. Broadcasting isn't applicable to unary type things and many functions, say in argsort, look unary from the top, so that doesn't enter in.
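
One way to picture "derived from something slightly more general" is struct embedding, sketched below. Every name here (`FuncBaseSketch`, `UFuncSketch`, `SortFuncSketch`, `find_loop`, the `default_axis` field) is hypothetical, `PyObject_HEAD` is left out so the example compiles without the Python headers, and the split of fields between base and derived structs is only a guess at what such a design might look like.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */
typedef void (*generic_loop)(char **, intp *, intp *, void *);

/* Hypothetical common base: the loop table and its type signatures. */
typedef struct {
    int nin, nout, nargs;
    generic_loop *functions;   /* one 1-d loop per registered type   */
    void **data;               /* per-loop extra data                */
    int ntypes;
    const char *name, *types;  /* types holds ntypes * nargs chars   */
} FuncBaseSketch;

/* A ufunc would add what is specific to ufuncs (reduce identity, ...). */
typedef struct {
    FuncBaseSketch base;
    int identity;
    int check_return;
} UFuncSketch;

/* A sort-like function would add its own extras instead. */
typedef struct {
    FuncBaseSketch base;
    int default_axis;
} SortFuncSketch;

/* Shared lookup code: index of the loop whose first argument has the
 * given type character, or -1 if none is registered. */
static int
find_loop(const FuncBaseSketch *f, char typechar)
{
    int i;

    for (i = 0; i < f->ntypes; i++) {
        if (f->types[i * f->nargs] == typechar) {
            return i;
        }
    }
    return -1;
}
```

Because the base struct is the first member, code such as `find_loop` can serve both kinds of object, which is the registration and dispatch reuse the paragraph above is after.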

Chuck


Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Sat, Apr 19, 2008 at 1:29 AM, Charles R Harris <[hidden email]> wrote:
<snip>

> On Sat, Apr 19, 2008 at 1:12 AM, Robert Kern <[hidden email]> wrote:
> > On Sat, Apr 19, 2008 at 1:55 AM, Charles R Harris
> > <[hidden email]> wrote:
> >
> > > Yes, but the inner loop is just something that uses the array values along
> > > that axis to produce another set of values, i.e., it is a vector valued
> > > function of vectors. So is a sort, so is argsort, so is the inner product,
> > > so on and so forth. That's what we have here:
> > >
> > > typedef void (*PyUFuncGenericFunction) (char **, npy_intp *, npy_intp *,
> > > void *);
> > >
> > >  No difference that I can see. It is the call function in PyUFuncObject that
> > > matters.
> >
> > I believe this is the disconnect. From my perspective, the fact that
> > the inner kernel function of a ufunc has a sufficient argument list to
> > do a sort isn't important. The signature of that kernel function isn't
> > what makes a ufunc; it's all of the code around it that does
> > broadcasting, type matching and manipulation, etc. If we're changing
> > that code to accommodate sorting, we haven't gained anything. We've
> > just moved some code around; possibly we've reduced the line count,
> > but I fear that we will muddy ufunc implementation with non-ufunc
> > functionality and special cases.
> >
> > If you want to go down this road, I think you need to do what Travis
> > suggests: factor out some of the common code between ufuncs and sorts
> > into a "superclass" (not really, but you get the idea), and then
> > implement ufuncs and sorts based on that. I think trying to shove
> > sorts into ufuncs-qua-ufuncs is a bad idea. There is more than one
> > path to code reuse.
>
> Right now we have:
>
> typedef struct {
>     PyObject_HEAD
>     int nin, nout, nargs;
>     int identity;
>     PyUFuncGenericFunction *functions;
>     void **data;
>     int ntypes;
>     int check_return;
>     char *name, *types;
>     char *doc;
>     void *ptr;
>     PyObject *obj;
>     PyObject *userloops;
> } PyUFuncObject;
>
> Which could be derived from something slightly more general. We could also leave out reduce, accumulate, etc., which are special cases. We then have common code for registration, etc. The call function still has to check types, dispatch the calls for the axis, maybe create output arrays, as for maximum.reduce, and so on. Broadcasting isn't applicable to unary type things and many functions, say in argsort, look unary from the top, so that doesn't enter in.

For instance

static void
BOOL_@kind@(char **args, intp *dimensions, intp *steps, void *func)
{
    register intp i;
    intp is1=steps[0],is2=steps[1],os=steps[2], n=dimensions[0];
    char *i1=args[0], *i2=args[1], *op=args[2];
    Bool in1, in2;
    for(i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) {
        in1 = (*((Bool *)i1) != 0);
        in2 = (*((Bool *)i2) != 0);
        *((Bool *)op)= in1 @OP@ in2;
    }
}

It looks to me like broadcasting is achieved by adjusting the step size. The only bothersome detail here is getting the count from the first dimension, that looks a bit fragile.
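
The stride observation can be tried in isolation. In the sketch below a binary kernel of the same shape is called with a step of 0 for its second input, so one scalar is reused for every element. `DOUBLE_add_sketch`, `add_scalar`, and the `intp` typedef are hypothetical names for this example; how the real call function sets up the steps is not shown here.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */

/* Hypothetical binary kernel with the same layout as BOOL_@kind@:
 * two inputs, one output, count taken from dimensions[0]. */
static void
DOUBLE_add_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i;
    intp is1 = steps[0], is2 = steps[1], os = steps[2], n = dimensions[0];
    char *i1 = args[0], *i2 = args[1], *op = args[2];

    (void)func;  /* unused in this sketch */
    for (i = 0; i < n; i++, i1 += is1, i2 += is2, op += os) {
        *(double *)op = *(double *)i1 + *(double *)i2;
    }
}

/* Hypothetical caller: broadcast one scalar against n elements by
 * giving the scalar operand a step of 0 bytes. */
static void
add_scalar(double *x, double scalar, double *out, intp n)
{
    char *args[3];
    intp dims[1];
    intp steps[3];

    args[0] = (char *)x;
    args[1] = (char *)&scalar;
    args[2] = (char *)out;
    dims[0] = n;
    steps[0] = (intp)sizeof(double);
    steps[1] = 0;  /* never advance: the same value is read each time */
    steps[2] = (intp)sizeof(double);
    DOUBLE_add_sketch(args, dims, steps, NULL);
}
```

The kernel has no notion of broadcasting; it just adds whatever step it is handed, so the caller decides whether an operand advances.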

Chuck



Re: numpy1.2 : make sorts unary ufuncs

Travis Oliphant-5
Charles R Harris wrote:

>
>
> On Sat, Apr 19, 2008 at 1:29 AM, Charles R Harris
> <[hidden email] <mailto:[hidden email]>> wrote:
> <snip>
>
>
>     On Sat, Apr 19, 2008 at 1:12 AM, Robert Kern
>     <[hidden email] <mailto:[hidden email]>> wrote:
>
>         On Sat, Apr 19, 2008 at 1:55 AM, Charles R Harris
>         <[hidden email] <mailto:[hidden email]>>
>         wrote:
>
>         > Yes, but the inner loop is just something that uses the
>         array values along
>         > that axis to produce another set of values, i.e., it is a
>         vector valued
>         > function of vectors. So is a sort, so is argsort, so is the
>         inner product,
>         > so on and so forth. That's what we have here:
>         >
>         > typedef void (*PyUFuncGenericFunction) (char **, npy_intp *,
>         npy_intp *,
>         > void *);
>         >
>         >  No difference that I can see. It is the call function in
>         PyUFuncObject that
>         > matters.
>
>         I believe this is the disconnect. From my perspective, the
>         fact that
>         the inner kernel function of a ufunc has a sufficient argument
>         list to
>         do a sort isn't important. The signature of that kernel
>         function isn't
>         what makes a ufunc; it's all of the code around it that does
>         broadcasting, type matching and manipulation, etc. If we're
>         changing
>         that code to accommodate sorting, we haven't gained anything.
>         We've
>         just moved some code around; possibly we've reduced the line
>         count,
>         but I fear that we will muddy ufunc implementation with non-ufunc
>         functionality and special cases.
>
>         If you want to go down this road, I think you need to do what
>         Travis
>         suggests: factor out some of the common code between ufuncs
>         and sorts
>         into a "superclass" (not really, but you get the idea), and then
>         implement ufuncs and sorts based on that. I think trying to shove
>         sorts into ufuncs-qua-ufuncs is a bad idea. There is more than one
>         path to code reuse.
>
>
>     Right now we have:
>
>     typedef struct {
>         PyObject_HEAD
>         int nin, nout, nargs;
>         int identity;
>         PyUFuncGenericFunction *functions;
>         void **data;
>         int ntypes;
>         int check_return;
>         char *name, *types;
>         char *doc;
>         void *ptr;
>         PyObject *obj;
>         PyObject *userloops;
>     } PyUFuncObject;
>      
>     Which could be derived from something slightly more general. We
>     could also leave out reduce, accumulate, etc., which are special
>     cases. We then have common code for registration, etc. The call
>     function still has to check types, dispatch the calls for the
>     axis, maybe create output arrays, as for maximum.reduce, and so
>     on. Broadcasting isn't applicable to unary type things and many
>     functions, say in argsort, look unary from the top, so that
>     doesn't enter in.
>
>
> For instance
>
> static void
> BOOL_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> {
>     register intp i;
>     intp is1=steps[0],is2=steps[1],os=steps[2], n=dimensions[0];
>     char *i1=args[0], *i2=args[1], *op=args[2];
>     Bool in1, in2;
>     for(i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) {
>         in1 = (*((Bool *)i1) != 0);
>         in2 = (*((Bool *)i2) != 0);
>         *((Bool *)op)= in1 @OP@ in2;
>     }
> }
>
> It looks to me like broadcasting is achieved by adjusting the step
> size. The only bothersome detail here is getting the count from the
> first dimension, that looks a bit fragile.
It shouldn't be fragile.   It's a historical accident that the signature
looks like that.  This is the signature inherited from Numeric.  All of
scipy-special would have to be changed in order to change it.

Perhaps the thinking was that there would be multiple "counts" to keep
track of at some time.  But, I'm not sure.   I've only seen the "first"
entry used so dimensions is really just ptr_to_int rather than any kind
of "shape".
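
Read that way, a caller can pass the address of a single count where the signature says `dimensions`. The sketch below does exactly that; `DOUBLE_square_sketch`, `square_all`, and the `intp` typedef are hypothetical names, and the example only shows the calling convention described above, not NumPy's own call function.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* assumption: stand-in for NumPy's intp */

/* Hypothetical unary kernel that, like the loops discussed above,
 * only ever reads dimensions[0]. */
static void
DOUBLE_square_sketch(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i, n = dimensions[0];
    char *ip = args[0], *op = args[1];

    (void)func;  /* unused in this sketch */
    for (i = 0; i < n; i++, ip += steps[0], op += steps[1]) {
        double v = *(double *)ip;
        *(double *)op = v * v;
    }
}

/* Hypothetical caller: the "dimensions" argument is just &n. */
static void
square_all(double *x, double *out, intp n)
{
    char *args[2];
    intp steps[2];

    args[0] = (char *)x;
    args[1] = (char *)out;
    steps[0] = (intp)sizeof(double);
    steps[1] = (intp)sizeof(double);
    DOUBLE_square_sketch(args, &n, steps, NULL);
}
```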

-Travis O.


Re: numpy1.2 : make sorts unary ufuncs

Charles R Harris


On Sat, Apr 19, 2008 at 11:40 AM, Travis E. Oliphant <[hidden email]> wrote:
> Charles R Harris wrote:
> >
> >
> > On Sat, Apr 19, 2008 at 1:29 AM, Charles R Harris
> > <[hidden email] <mailto:[hidden email]>> wrote:
> > <snip>
> >
> >
> >     On Sat, Apr 19, 2008 at 1:12 AM, Robert Kern
> >     <[hidden email] <mailto:[hidden email]>> wrote:
> >
> >         On Sat, Apr 19, 2008 at 1:55 AM, Charles R Harris
> >         <[hidden email] <mailto:[hidden email]>>
> >         wrote:
> >
> >         > Yes, but the inner loop is just something that uses the
> >         array values along
> >         > that axis to produce another set of values, i.e., it is a
> >         vector valued
> >         > function of vectors. So is a sort, so is argsort, so is the
> >         inner product,
> >         > so on and so forth. That's what we have here:
> >         >
> >         > typedef void (*PyUFuncGenericFunction) (char **, npy_intp *,
> >         npy_intp *,
> >         > void *);
> >         >
> >         >  No difference that I can see. It is the call function in
> >         PyUFuncObject that
> >         > matters.
> >
> >         I believe this is the disconnect. From my perspective, the
> >         fact that
> >         the inner kernel function of a ufunc has a sufficient argument
> >         list to
> >         do a sort isn't important. The signature of that kernel
> >         function isn't
> >         what makes a ufunc; it's all of the code around it that does
> >         broadcasting, type matching and manipulation, etc. If we're
> >         changing
> >         that code to accommodate sorting, we haven't gained anything.
> >         We've
> >         just moved some code around; possibly we've reduced the line
> >         count,
> >         but I fear that we will muddy ufunc implementation with non-ufunc
> >         functionality and special cases.
> >
> >         If you want to go down this road, I think you need to do what
> >         Travis
> >         suggests: factor out some of the common code between ufuncs
> >         and sorts
> >         into a "superclass" (not really, but you get the idea), and then
> >         implement ufuncs and sorts based on that. I think trying to shove
> >         sorts into ufuncs-qua-ufuncs is a bad idea. There is more than one
> >         path to code reuse.
> >
> >
> >     Right now we have:
> >
> >     typedef struct {
> >         PyObject_HEAD
> >         int nin, nout, nargs;
> >         int identity;
> >         PyUFuncGenericFunction *functions;
> >         void **data;
> >         int ntypes;
> >         int check_return;
> >         char *name, *types;
> >         char *doc;
> >         void *ptr;
> >         PyObject *obj;
> >         PyObject *userloops;
> >     } PyUFuncObject;
> >
> >     Which could be derived from something slightly more general. We
> >     could also leave out reduce, accumulate, etc., which are special
> >     cases. We then have common code for registration, etc. The call
> >     function still has to check types, dispatch the calls for the
> >     axis, maybe create output arrays, as for maximum.reduce, and so
> >     on. Broadcasting isn't applicable to unary type things and many
> >     functions, say in argsort, look unary from the top, so that
> >     doesn't enter in.
> >
> >
> > For instance
> >
> > static void
> > BOOL_@kind@(char **args, intp *dimensions, intp *steps, void *func)
> > {
> >     register intp i;
> >     intp is1=steps[0],is2=steps[1],os=steps[2], n=dimensions[0];
> >     char *i1=args[0], *i2=args[1], *op=args[2];
> >     Bool in1, in2;
> >     for(i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) {
> >         in1 = (*((Bool *)i1) != 0);
> >         in2 = (*((Bool *)i2) != 0);
> >         *((Bool *)op)= in1 @OP@ in2;
> >     }
> > }
> >
> > It looks to me like broadcasting is achieved by adjusting the step
> > size. The only bothersome detail here is getting the count from the
> > first dimension, that looks a bit fragile.
> It shouldn't be fragile.   It's a historical accident that the signature
> looks like that.  This is the signature inherited from Numeric.  All of
> scipy-special would have to be changed in order to change it.
>
> Perhaps the thinking was that there would be multiple "counts" to keep
> track of at some time.  But, I'm not sure.   I've only seen the "first"
> entry used so dimensions is really just ptr_to_int rather than any kind
> of "shape".
 
Ah, that was my mistake, then.

Chuck

