# ufunc for sum of squared difference

13 messages

## ufunc for sum of squared difference

I was reading this and got thinking about whether a ufunc could compute the sum of squared differences in a single pass without a temporary array. The Python code below demonstrates a possible approach.

```python
import numpy as np

x = np.arange(10)
c = 1.0

def add_square_diff(x1, x2):
    return x1 + (x2 - c) ** 2

ufunc = np.frompyfunc(add_square_diff, 2, 1)
# reduce seeds the accumulator with x[0] itself, so swap it out
# for its squared difference afterwards
print(ufunc.reduce(x) - x[0] + (x[0] - c) ** 2)
print(np.sum(np.square(x - c)))
```

I have (at least) 4 questions:

1. Is it possible to pass run-time constants to a ufunc written in C for use in its inner loop, and if so how?
2. Is it possible to pass an initial value to reduce to avoid the clean-up required for the first element?
3. Does that ufunc work, or are there special cases which cause it to fall apart?
4. Would a very specialized ufunc such as this be considered for incorporation in numpy, since it would help reduce the time and memory of functions already in numpy?

Thank you,
Matt

## Re: ufunc for sum of squared difference

On Fr, 2016-11-04 at 13:11 -0400, Matthew Harrigan wrote:

> I was reading this and got thinking about if a ufunc could compute
> the sum of squared differences in a single pass without a temporary
> array. [...]
> 1. Is it possible to pass run time constants to a ufunc written in C
> for use in its inner loop, and if so how?

I don't think it's anticipated, since a ufunc could in most cases use a third argument, but a 3-arg ufunc can't be reduced. Not sure if there might be some trickery possible.

> 2. Is it possible to pass an initial value to reduce to avoid the
> clean up required for the first element?

This is the identity normally. But the identity can only be 0, 1 or -1 right now, I think. The identity is what the output array gets initialized with (which effectively makes it the first value passed into the inner loop).

> 3. Does that ufunc work, or are there special cases which cause it to
> fall apart?
> 4. Would a very specialized ufunc such as this be considered for
> incorporating in numpy since it would help reduce time and memory of
> functions already in numpy?

Might be mixing up things, however, IIRC the single-pass approach has bad numerical accuracy, so I doubt that it is a good default algorithm.

- Sebastian
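[Editorial note: the accuracy concern can be made concrete. A strict left-to-right accumulation, which is what a frompyfunc-based reduce does, can lose precision that a correctly rounded sum keeps; numpy's own add reduction additionally uses pairwise summation, so a hand-rolled sequential loop can be less accurate even when each arithmetic operation looks identical. A small sketch with deliberately contrived data (the values are chosen only to force catastrophic cancellation):]

```python
import math

# a sequence whose naive running sum cancels catastrophically:
# 1e16 + 1.0 rounds back to 1e16 in float64 (the ulp there is 2.0)
x = [1e16, 1.0, -1e16] * 1000

acc = 0.0
for v in x:          # strict left-to-right accumulation, as a
    acc += v         # binary ufunc reduce would perform it
print(acc)           # 0.0 — every 1.0 is absorbed and lost

print(math.fsum(x))  # 1000.0 — correctly rounded sum
```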

## Re: ufunc for sum of squared difference

I didn't notice identity before. Seems like frompyfunc always sets it to None. If it were zero, maybe it would work as desired here.

In the "writing your own ufunc" doc, I was wondering if the pointer to data could be used to get a constant at runtime. If not, what could that be used for?

```c
static void double_logit(char **args, npy_intp *dimensions,
                         npy_intp *steps, void *data)
```

Why would the numerical accuracy be any different? The subtraction and square operations look identical, and I thought np.sum just calls np.add.reduce, so the reduction step uses the same code and would therefore have the same accuracy.

Thanks

On Fri, Nov 4, 2016 at 1:56 PM, Sebastian Berg wrote:

> Might be mixing up things, however, IIRC the single-pass approach has
> bad numerical accuracy, so I doubt that it is a good default
> algorithm.
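[Editorial note: the identity observation above is easy to verify directly; a small sketch:]

```python
import numpy as np

# frompyfunc never attaches an identity, so reduce must seed the
# accumulator with the first array element instead
f = np.frompyfunc(lambda a, b: a + b, 2, 1)
print(f.identity)        # None
print(np.add.identity)   # 0

x = np.arange(10)
# for plain addition the missing identity is harmless: the first
# element passes through unchanged either way
print(f.reduce(x), np.add.reduce(x))  # 45 45
```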
## Re: ufunc for sum of squared difference

On Fri, 11 Nov 2016 11:25:58 -0500, Matthew Harrigan <[hidden email]> wrote:

> I started a ufunc to compute the sum of square differences here.
> It is about 4x faster and uses half the memory compared to
> np.sum(np.square(x-c)).

Hi Matt,

Using *blas* you already win a factor of two (maybe more, depending on your blas implementation):

```
% python -m timeit -s "import numpy as np; x = np.linspace(0, 1, int(1e7))" "np.sum(np.square(x-2.))"
10 loops, best of 3: 135 msec per loop
% python -m timeit -s "import numpy as np; x = np.linspace(0, 1, int(1e7))" "y = x - 2.; np.dot(y, y)"
10 loops, best of 3: 70.2 msec per loop
```

Cheers,

-- Jérôme Kieffer
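[Editorial note: the blas trick relies on the identity sum((x-c)**2) == dot(x-c, x-c). A quick correctness check (the array size here is illustrative; timings are machine-dependent):]

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10**5)
y = x - 2.0

ssd_dot = np.dot(y, y)             # BLAS inner product, only y as temporary
ssd_ref = np.sum(np.square(y))     # np.square(y) allocates a second temporary
print(np.allclose(ssd_dot, ssd_ref))  # True
```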

## Re: ufunc for sum of squared difference

On Mon, Nov 14, 2016 at 10:38 AM, Jerome Kieffer wrote:

> Using *blas* you win already a factor two (maybe more depending on
> your blas implementation)

Yeah, but it's not so obvious what's happening "under the hood". Consider this (on an old Win7 machine):

```
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)]

np.__version__
'1.11.1'

x = np.linspace(0, 1, int(1e6))
timeit np.sum(np.square(x - 2.))
10 loops, best of 3: 23 ms per loop

y = x - 2.
timeit np.dot(y, y)
The slowest run took 18.60 times longer than the fastest. This could mean
that an intermediate result is being cached.
1000 loops, best of 3: 1.78 ms per loop

timeit np.dot(y, y)
1000 loops, best of 3: 1.73 ms per loop
```

Best,
eat
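[Editorial note: the "slowest run took 18.60 times longer" warning reflects warm-up effects such as cache state. Taking the minimum over several repeats with the stdlib timeit module makes such comparisons more robust; a sketch, with illustrative sizes:]

```python
import timeit

setup = "import numpy as np; y = np.linspace(0, 1, 10**5) - 2.0"

# take the best of several repeats so one-off effects (page faults,
# cache warm-up, frequency scaling) don't dominate the measurement
times = timeit.repeat("np.dot(y, y)", setup=setup, repeat=5, number=100)
print(min(times) / 100)  # seconds per call, best of 5 repeats
```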

## Re: ufunc for sum of squared difference

Still slower, and worse, it uses 2x the memory for the intermediate temporary array.

I propose allowing implicit reductions with ufuncs. Specifically, if out is provided with shape[axis] = 1, then pass it on to the ufunc with a stride of 0. That should allow this to work:

```python
x = np.arange(10)

def add_square_diff(x1, x2, x3):
    return x1 + (x2 - x3) ** 2

result = np.zeros(1)
np.frompyfunc(add_square_diff, 3, 1)(result, x, np.mean(x), result)
```

Essentially it creates a reduce for a function which isn't binary. I think this would be generally useful. For instance, finding the min and max in one pass would be nice:

```python
def minmax(x1, x2, x3):
    return min(x1, x3), max(x2, x3)

minn = np.array([np.inf])
maxx = np.array([-np.inf])
np.frompyfunc(minmax, 3, 2)(minn, maxx, x, minn, maxx)
```

Note it also allows arbitrary initial values or identities to be specified, possibly determined at run time. I think this would make ufuncs even more universal.
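[Editorial note: as far as I can tell, current NumPy rejects an out array that broadcasts against the inputs this way, so the proposal can only be emulated. A pure-Python sketch of the intended semantics — the helper name implicit_reduce is made up for illustration, and the run-time constant is simply carried in the state tuple here, whereas the proposal would broadcast it as a regular input:]

```python
import numpy as np

def implicit_reduce(func, x, *state):
    """Fold a state tuple through func for every element of x,
    emulating the proposed stride-0 'out' reduction."""
    for v in x:
        out = func(*state, v)
        state = out if isinstance(out, tuple) else (out,)
    return state

x = np.arange(10.0)

# sum of squared differences about a run-time constant
def add_square_diff(acc, c, v):
    return acc + (v - c) ** 2, c

ssd, _ = implicit_reduce(add_square_diff, x, 0.0, np.mean(x))

# min and max in a single pass, with arbitrary initial values
def minmax(lo, hi, v):
    return min(lo, v), max(hi, v)

lo, hi = implicit_reduce(minmax, x, np.inf, -np.inf)
print(ssd, lo, hi)
```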

## Re: ufunc for sum of squared difference

On Mon, Nov 14, 2016 at 5:40 PM, Matthew Harrigan wrote:

> Essentially it creates a reduce for a function which isn't binary. I
> think this would be generally useful.

NumPy already has a generic enough interface for creating such ufuncs. In fact, it's called a "generalized ufunc". I think you could already write "implicit reductions" using gufuncs?
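[Editorial note: one way to sketch the gufunc idea without writing C, assuming numpy >= 1.12 where np.vectorize grew a signature argument: a function with core dimension (n) reducing to a scalar. A hand-written C gufunc would have the same core signature; the loop below is only a slow pure-Python stand-in.]

```python
import numpy as np

def ssd(x, c):
    # single pass over the core dimension, no temporary array
    acc = 0.0
    for v in x:
        acc += (v - c) ** 2
    return acc

# gufunc-style wrapper with core signature (n),() -> ()
gu_ssd = np.vectorize(ssd, signature='(n),()->()')

x = np.arange(10.0)
print(gu_ssd(x, 1.0))              # 205.0
print(np.sum(np.square(x - 1.0)))  # 205.0
```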