dot/tensordot limitations

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

dot/tensordot limitations

Anne Archibald
Timothy Hochberg has proposed a generalization of the matrix mechanism
to support manipulating arrays of linear algebra objects. For example,
one might have an array of matrices one wants to apply to an array of
vectors, to yield an array of vectors:

In [88]: A = np.repeat(np.eye(3)[np.newaxis,...],2,axis=0)

In [89]: A
Out[89]:
array([[[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]],

       [[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]]])

In [90]: V = np.array([[1,0,0],[0,1,0]])

Currently, it is very clumsy to handle this kind of situation even
with arrays, keeping track of dimensions by hand. For example if one
wants to multiply A by V "elementwise", one cannot simply use dot:

In [92]: np.dot(A,V.T)
Out[92]:
array([[[ 1.,  0.],
        [ 0.,  1.],
        [ 0.,  0.]],

       [[ 1.,  0.],
        [ 0.,  1.],
        [ 0.,  0.]]])


tensordot() suffers from the same problem. It arises because when you
combine two multiindexed objects there are a number of ways you can do
it:

A: Treat all indices as different and form all pairwise products
(outer product):
In [93]: np.multiply.outer(A,V).shape
Out[93]: (2, 3, 3, 2, 3)

B: Contract the outer product on a pair of indices:
In [98]: np.tensordot(A,V,axes=(-1,-1)).shape
Out[98]: (2, 3, 2)

C: Take the diagonal along a pair of indices:
In [111]: np.diagonal(np.multiply.outer(A,V),axis1=0,axis2=3).shape
Out[111]: (3, 3, 3, 2)

What we want in this case is a combination of B and C:
In [106]: np.diagonal(np.tensordot(A,V,axes=(-1,-1)),axis1=0,axis2=2).T.shape
Out[106]: (2, 3)

 but it cannot be done without constructing a larger array and pulling
out its diagonal.

If this seems like an exotic thing to want to do, suppose instead we
have two arrays of vectors, and we want to evaluate the array of dot
products. None of no.dot, np.tensordot, or np.inner produce the
results you want. You have to multiply elementwise and sum. (This also
precludes automatic use of BLAS, if indeed tensordot uses BLAS.)

Does it make sense to implement a generalized tensordot that can do
this, or is it the sort of thing one should code by hand every time?

Is there any way we can make it easier to do simple common operations
like take the elementwise matrix-vector product of A and V?


The more general issue, of making linear algebra natural by keeping
track of which indices are for elementwise operations, and which
should be used for dots (and how), is a whole other kettle of fish. I
think for that someone should think hard about writing a full-featured
multilinear algebra package (might as well keep track of coordinate
transformations too while one was at it) if this is intended.


Anne
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: dot/tensordot limitations

Nadav Horesh
You open here a Pandora box:
What should I do if I want to run an operation on elements of an array which are not in the library. The usual answers are either use more memory ( (A*B).sum(axis=1) ), or more time ([dot(A[i],B[i]) for i in len(A)]).

Would your issue can be rephrase like this:
A way to iterate over array in the C level, where the function/operation is defined in the C level, without the overhead of python?

  Nadav.


-----הודעה מקורית-----
מאת: [hidden email] בשם Anne Archibald
נשלח: ד 30-אפריל-08 01:41
אל: Discussion of Numerical Python
נושא: [Numpy-discussion] dot/tensordot limitations
 
Timothy Hochberg has proposed a generalization of the matrix mechanism
to support manipulating arrays of linear algebra objects. For example,
one might have an array of matrices one wants to apply to an array of
vectors, to yield an array of vectors:

In [88]: A = np.repeat(np.eye(3)[np.newaxis,...],2,axis=0)

In [89]: A
Out[89]:
array([[[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]],

       [[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]]])

In [90]: V = np.array([[1,0,0],[0,1,0]])

Currently, it is very clumsy to handle this kind of situation even
with arrays, keeping track of dimensions by hand. For example if one
wants to multiply A by V "elementwise", one cannot simply use dot:

In [92]: np.dot(A,V.T)
Out[92]:
array([[[ 1.,  0.],
        [ 0.,  1.],
        [ 0.,  0.]],

       [[ 1.,  0.],
        [ 0.,  1.],
        [ 0.,  0.]]])


tensordot() suffers from the same problem. It arises because when you
combine two multiindexed objects there are a number of ways you can do
it:

A: Treat all indices as different and form all pairwise products
(outer product):
In [93]: np.multiply.outer(A,V).shape
Out[93]: (2, 3, 3, 2, 3)

B: Contract the outer product on a pair of indices:
In [98]: np.tensordot(A,V,axes=(-1,-1)).shape
Out[98]: (2, 3, 2)

C: Take the diagonal along a pair of indices:
In [111]: np.diagonal(np.multiply.outer(A,V),axis1=0,axis2=3).shape
Out[111]: (3, 3, 3, 2)

What we want in this case is a combination of B and C:
In [106]: np.diagonal(np.tensordot(A,V,axes=(-1,-1)),axis1=0,axis2=2).T.shape
Out[106]: (2, 3)

 but it cannot be done without constructing a larger array and pulling
out its diagonal.

If this seems like an exotic thing to want to do, suppose instead we
have two arrays of vectors, and we want to evaluate the array of dot
products. None of no.dot, np.tensordot, or np.inner produce the
results you want. You have to multiply elementwise and sum. (This also
precludes automatic use of BLAS, if indeed tensordot uses BLAS.)

Does it make sense to implement a generalized tensordot that can do
this, or is it the sort of thing one should code by hand every time?

Is there any way we can make it easier to do simple common operations
like take the elementwise matrix-vector product of A and V?


The more general issue, of making linear algebra natural by keeping
track of which indices are for elementwise operations, and which
should be used for dots (and how), is a whole other kettle of fish. I
think for that someone should think hard about writing a full-featured
multilinear algebra package (might as well keep track of coordinate
transformations too while one was at it) if this is intended.


Anne
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: dot/tensordot limitations

Travis Oliphant-5
Nadav Horesh wrote:
> You open here a Pandora box:
> What should I do if I want to run an operation on elements of an array which are not in the library. The usual answers are either use more memory ( (A*B).sum(axis=1) ), or more time ([dot(A[i],B[i]) for i in len(A)]).
>
> Would your issue can be rephrase like this:
> A way to iterate over array in the C level, where the function/operation is defined in the C level, without the overhead of python?
>  
My plan for this is the general function listed on the NumPy Trac area
along with a weave-like kernel creation from Python code.

http://www.scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions

I'd love to get time to do this or mentor someone else in doing it.

-Travis

_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: dot/tensordot limitations

Travis Oliphant-5
In reply to this post by Nadav Horesh
Nadav Horesh wrote:
> You open here a Pandora box:
> What should I do if I want to run an operation on elements of an array which are not in the library. The usual answers are either use more memory ( (A*B).sum(axis=1) ), or more time ([dot(A[i],B[i]) for i in len(A)]).
>
> Would your issue can be rephrase like this:
> A way to iterate over array in the C level, where the function/operation is defined in the C level, without the overhead of python?
>  

Quoting from the GeneralLoopingFunction wiki page:




There seems to be a general need for looping over functions that work on
whole arrays and return other arrays. The function has information
associated with it that states what the basic dimensionality of the
inputs are as well as the corresponding dimensionality of the outputs.
This is in addition to information like supported data-types and so forth.

Then, when "larger" arrays are provided as inputs, the extra dimensions
should be "broadcast" to create a looping that calls the underlying
construct on the different sub-arrays and produces multiple outputs.

Thus, let's say we have a function, basic_inner, that takes two vectors
and returns a scalar where the signature of the function is known to
take two 1-d arrays of the same shape and return a scalar.

Then, when this same function is converted to a general looping function:

loopfuncs.inner(a, b)

where a is (3,5,N) and b is (5,N) will return an output that is (3,5)
whose elements are constructed by calling the underlying function
repeatedly on each piece. Perl has a notion of threading that captures a
part of this idea, I think. The concept is to have a more general
function than the ufuncs where the signature includes dimensionality so
that the function does more than "element-by-element" but does
"sub-array" by "sub-array".

Such a facility would prove very useful, I think and would abstract many
operations very well.



-Travis



_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: dot/tensordot limitations

Charles R Harris


On Tue, Apr 29, 2008 at 10:31 PM, Travis E. Oliphant <[hidden email]> wrote:
Nadav Horesh wrote:
> You open here a Pandora box:
> What should I do if I want to run an operation on elements of an array which are not in the library. The usual answers are either use more memory ( (A*B).sum(axis=1) ), or more time ([dot(A[i],B[i]) for i in len(A)]).
>
> Would your issue can be rephrase like this:
> A way to iterate over array in the C level, where the function/operation is defined in the C level, without the overhead of python?
>

Quoting from the GeneralLoopingFunction wiki page:




There seems to be a general need for looping over functions that work on
whole arrays and return other arrays. The function has information
associated with it that states what the basic dimensionality of the
inputs are as well as the corresponding dimensionality of the outputs.
This is in addition to information like supported data-types and so forth.

Then, when "larger" arrays are provided as inputs, the extra dimensions
should be "broadcast" to create a looping that calls the underlying
construct on the different sub-arrays and produces multiple outputs.

Thus, let's say we have a function, basic_inner, that takes two vectors
and returns a scalar where the signature of the function is known to
take two 1-d arrays of the same shape and return a scalar.

Then, when this same function is converted to a general looping function:

loopfuncs.inner(a, b)

where a is (3,5,N) and b is (5,N) will return an output that is (3,5)
whose elements are constructed by calling the underlying function
repeatedly on each piece. Perl has a notion of threading that captures a
part of this idea, I think. The concept is to have a more general
function than the ufuncs where the signature includes dimensionality so
that the function does more than "element-by-element" but does
"sub-array" by "sub-array".

Such a facility would prove very useful, I think and would abstract many
operations very well.

Basically, element wise operations on elements that are subarrays; generalized ufuncs, if you will. Sorting would be a unary type operation in that context ;)

Chuck


_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|

Re: dot/tensordot limitations

Charles R Harris
In reply to this post by Anne Archibald


On Tue, Apr 29, 2008 at 4:41 PM, Anne Archibald <[hidden email]> wrote:
Timothy Hochberg has proposed a generalization of the matrix mechanism
to support manipulating arrays of linear algebra objects. For example,
one might have an array of matrices one wants to apply to an array of
vectors, to yield an array of vectors:

In [88]: A = np.repeat(np.eye(3)[np.newaxis,...],2,axis=0)

In [89]: A
Out[89]:
array([[[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]],

      [[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]]])

In [90]: V = np.array([[1,0,0],[0,1,0]])

Currently, it is very clumsy to handle this kind of situation even
with arrays, keeping track of dimensions by hand. For example if one
wants to multiply A by V "elementwise", one cannot simply use dot:

Let A have dimensions LxMxN and b dimensions LxN, then sum(A*b[:,newaxis,:], axis=-1) will do the trick.

Example:

In [1]: A = ones((2,2,2))

In [2]: b = array([[1,2],[3,4]])

In [3]: A*b[:,newaxis,:]
Out[3]:
array([[[ 1.,  2.],
        [ 1.,  2.]],

       [[ 3.,  4.],
        [ 3.,  4.]]])

In [4]: sum(A*b[:,newaxis,:], axis=-1)
Out[4]:
array([[ 3.,  3.],
       [ 7.,  7.]])

Chuck




Chuck



_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion