einsum

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

einsum

Mark Wiebe
I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing.

In testing it, it is also faster than many of NumPy's built-in functions, except for dot and inner.  At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings:

In [1]: import numpy as np
In [2]: a = np.arange(25).reshape(5,5)

In [3]: timeit np.einsum('ii', a)
100000 loops, best of 3: 3.45 us per loop
In [4]: timeit np.trace(a)
100000 loops, best of 3: 9.8 us per loop

In [5]: timeit np.einsum('ii->i', a)
1000000 loops, best of 3: 1.19 us per loop
In [6]: timeit np.diag(a)
100000 loops, best of 3: 7 us per loop

In [7]: b = np.arange(30).reshape(5,6)

In [8]: timeit np.einsum('ij,jk', a, b)
10000 loops, best of 3: 11.4 us per loop
In [9]: timeit np.dot(a, b)
100000 loops, best of 3: 2.8 us per loop

In [10]: a = np.arange(10000.)

In [11]: timeit np.einsum('i->', a)
10000 loops, best of 3: 22.1 us per loop
In [12]: timeit np.sum(a)
10000 loops, best of 3: 25.5 us per loop

-Mark

The documentation:

    einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')

    Evaluates the Einstein summation convention on the operands.

    Using the Einstein summation convention, many common multi-dimensional
    array operations can be represented in a simple fashion.  This function
    provides a way compute such summations.

    The best way to understand this function is to try the examples below,
    which show how many common NumPy functions can be implemented as
    calls to einsum.
    
    The subscripts string is a comma-separated list of subscript labels,
    where each label refers to a dimension of the corresponding operand.
    Repeated subscripts labels in one operand take the diagonal.  For example,
    ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``.
    
    Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)``
    is equivalent to ``np.inner(a,b)``.  If a label appears only once,
    it is not summed, so ``np.einsum('i', a)`` produces a view of ``a``
    with no changes.

    The order of labels in the output is by default alphabetical.  This
    means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while
    ``np.einsum('ji', a)`` takes its transpose.

    The output can be controlled by specifying output subscript labels
    as well.  This specifies the label order, and allows summing to be
    disallowed or forced when desired.  The call ``np.einsum('i->', a)``
    is equivalent to ``np.sum(a, axis=-1)``, and
    ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.

    It is also possible to control how broadcasting occurs using
    an ellipsis.  To take the trace along the first and last axes,
    you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix
    product with the left-most indices instead of rightmost, you can do
    ``np.einsum('ij...,jk...->ik...', a, b)``.

    When there is only one operand, no axes are summed, and no output
    parameter is provided, a view into the operand is returned instead
    of a new array.  Thus, taking the diagonal as ``np.einsum('ii->i', a)``
    produces a view.

    Parameters
    ----------
    subscripts : string
        Specifies the subscripts for summation.
    operands : list of array_like
        These are the arrays for the operation.
    out : None or array
        If provided, the calculation is done into this array.
    dtype : None or data type
        If provided, forces the calculation to use the data type specified.
        Note that you may have to also give a more liberal ``casting``
        parameter to allow the conversions.
    order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be Fortran contiguous. 'F' means it should be Fortran contiguous,
        'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise.
        'K' means it should be as close to the layout as the inputs as
        is possible, including arbitrarily permuted axes.
    casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe'
        Controls what kind of data casting may occur.  Setting this to
        'unsafe' is not recommended, as it can adversely affect accumulations.
        'no' means the data types should not be cast at all. 'equiv' means
        only byte-order changes are allowed. 'safe' means only casts
        which can preserve values are allowed. 'same_kind' means only
        safe casts or casts within a kind, like float64 to float32, are
        allowed.  'unsafe' means any data conversions may be done.

    Returns
    -------
    output : ndarray
        The calculation based on the Einstein summation convention.

    See Also
    --------
    dot, inner, outer, tensordot

    
    Examples
    --------

    >>> a = np.arange(25).reshape(5,5)
    >>> b = np.arange(5)
    >>> c = np.arange(6).reshape(2,3)

    >>> np.einsum('ii', a)
    60
    >>> np.trace(a)
    60

    >>> np.einsum('ii->i', a)
    array([ 0,  6, 12, 18, 24])
    >>> np.diag(a)
    array([ 0,  6, 12, 18, 24])

    >>> np.einsum('ij,j', a, b)
    array([ 30,  80, 130, 180, 230])
    >>> np.dot(a, b)
    array([ 30,  80, 130, 180, 230])

    >>> np.einsum('ji', c)
    array([[0, 3],
           [1, 4],
           [2, 5]])
    >>> c.T
    array([[0, 3],
           [1, 4],
           [2, 5]])

    >>> np.einsum(',', 3, c)
    array([[ 0,  3,  6],
           [ 9, 12, 15]])
    >>> np.multiply(3, c)
    array([[ 0,  3,  6],
           [ 9, 12, 15]])

    >>> np.einsum('i,i', b, b)
    30
    >>> np.inner(b,b)
    30

    >>> np.einsum('i,j', np.arange(2)+1, b)
    array([[0, 1, 2, 3, 4],
           [0, 2, 4, 6, 8]])
    >>> np.outer(np.arange(2)+1, b)
    array([[0, 1, 2, 3, 4],
           [0, 2, 4, 6, 8]])

    >>> np.einsum('i...->', a)
    array([50, 55, 60, 65, 70])
    >>> np.sum(a, axis=0)
    array([50, 55, 60, 65, 70])

    >>> a = np.arange(60.).reshape(3,4,5)
    >>> b = np.arange(24.).reshape(4,3,2)
    >>> np.einsum('ijk,jil->kl', a, b)
    array([[ 4400.,  4730.],
           [ 4532.,  4874.],
           [ 4664.,  5018.],
           [ 4796.,  5162.],
           [ 4928.,  5306.]])
    >>> np.tensordot(a,b, axes=([1,0],[0,1]))
    array([[ 4400.,  4730.],
           [ 4532.,  4874.],
           [ 4664.,  5018.],
           [ 4796.,  5162.],
           [ 4928.,  5306.]])


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Joshua Holbrook
On Wed, Jan 26, 2011 at 11:27 AM, Mark Wiebe <[hidden email]> wrote:
> I wrote a new function, einsum, which implements Einstein summation
> notation, and I'd like comments/thoughts from people who might be interested
> in this kind of thing.

This sounds really cool! I've definitely considered doing something
like this previously, but never really got around to seriously
figuring out any sensible API.

Do you have the source up somewhere? I'd love to try it out myself.

--Josh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
On Wed, Jan 26, 2011 at 1:36 PM, Joshua Holbrook <[hidden email]> wrote:
On Wed, Jan 26, 2011 at 11:27 AM, Mark Wiebe <[hidden email]> wrote:
> I wrote a new function, einsum, which implements Einstein summation
> notation, and I'd like comments/thoughts from people who might be interested
> in this kind of thing.

This sounds really cool! I've definitely considered doing something
like this previously, but never really got around to seriously
figuring out any sensible API.

Do you have the source up somewhere? I'd love to try it out myself.

You can check out the new_iterator branch from here:


Cloning into numpy...

-Mark

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Joshua Holbrook
On Wed, Jan 26, 2011 at 12:48 PM, Mark Wiebe <[hidden email]> wrote:

> On Wed, Jan 26, 2011 at 1:36 PM, Joshua Holbrook <[hidden email]>
> wrote:
>>
>> On Wed, Jan 26, 2011 at 11:27 AM, Mark Wiebe <[hidden email]> wrote:
>> > I wrote a new function, einsum, which implements Einstein summation
>> > notation, and I'd like comments/thoughts from people who might be
>> > interested
>> > in this kind of thing.
>>
>> This sounds really cool! I've definitely considered doing something
>> like this previously, but never really got around to seriously
>> figuring out any sensible API.
>>
>> Do you have the source up somewhere? I'd love to try it out myself.
>
> You can check out the new_iterator branch from here:
> https://github.com/m-paradox/numpy
> $ git clone https://github.com/m-paradox/numpy.git
> Cloning into numpy...
> -Mark
>

Thanks for the link!

How closely coupled is this new code with numpy's internals? That is,
could you factor it out into its own package? If so, then people could
have immediate use out of it without having to integrate it into numpy
proper.

--Josh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
On Wed, Jan 26, 2011 at 2:01 PM, Joshua Holbrook <[hidden email]> wrote:
<snip>
How closely coupled is this new code with numpy's internals? That is,
could you factor it out into its own package? If so, then people could
have immediate use out of it without having to integrate it into numpy
proper.

The code depends heavily on the iterator I wrote, and I think the idea itself depends on having a good dynamic multi-dimensional array library.  When the numpy-refactor branch is complete, this would be part of libndarray, and could be used directly from C without depending on Python.

-Mark

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Robert Kern-2
On Wed, Jan 26, 2011 at 16:43, Mark Wiebe <[hidden email]> wrote:

> On Wed, Jan 26, 2011 at 2:01 PM, Joshua Holbrook <[hidden email]>
> wrote:
>>
>> <snip>
>> How closely coupled is this new code with numpy's internals? That is,
>> could you factor it out into its own package? If so, then people could
>> have immediate use out of it without having to integrate it into numpy
>> proper.
>
> The code depends heavily on the iterator I wrote, and I think the idea
> itself depends on having a good dynamic multi-dimensional array library.
>  When the numpy-refactor branch is complete, this would be part of
> libndarray, and could be used directly from C without depending on Python.

It think his real question is whether einsum() and the iterator stuff
can live in a separate module that *uses* a released version of numpy
rather than a development branch.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Joshua Holbrook
>
> It think his real question is whether einsum() and the iterator stuff
> can live in a separate module that *uses* a released version of numpy
> rather than a development branch.
>
> --
> Robert Kern
>

Indeed, I would like to be able to install and use einsum() without
having to install another version of numpy. Even if it depends on
features of a new numpy, it'd be nice to have it be a separate module.

--Josh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Hanno Klemm
In reply to this post by Mark Wiebe

Mark,

interesting idea. Given the fact that in 2-d euclidean metric, the  
Einstein summation conventions are only a way to write out  
conventional matrix multiplications, do you consider at some point to  
include a non-euclidean metric in this thing? (As you have in special  
relativity, for example)

Something along the lines:

eta = np.diag(-1,1,1,1)

a = np.array(1,2,3,4)
b = np.array(1,1,1,1)

such that

einsum('i,i', a,b, metric=eta) = -1 + 2 + 3 + 4

I don't know how useful it would be, just a thought,
Hanno


Am 26.01.2011 um 21:27 schrieb Mark Wiebe:

> I wrote a new function, einsum, which implements Einstein summation  
> notation, and I'd like comments/thoughts from people who might be  
> interested in this kind of thing.

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Gael Varoquaux
On Thu, Jan 27, 2011 at 12:18:30AM +0100, Hanno Klemm wrote:
> interesting idea. Given the fact that in 2-d euclidean metric, the  
> Einstein summation conventions are only a way to write out  
> conventional matrix multiplications, do you consider at some point to  
> include a non-euclidean metric in this thing? (As you have in special  
> relativity, for example)

In my experience, Einstein summation conventions are quite
incomprehensible for people who haven't studies relativity (they aren't
used much outside some narrow fields of physics). If you start adding
metrics, you'll make it even harder for people to follow.

My 2 cents,

Gaël
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
In reply to this post by Joshua Holbrook

On Wed, Jan 26, 2011 at 3:05 PM, Joshua Holbrook <[hidden email]> wrote:
>
> It think his real question is whether einsum() and the iterator stuff
> can live in a separate module that *uses* a released version of numpy
> rather than a development branch.
>
> --
> Robert Kern
>

Indeed, I would like to be able to install and use einsum() without
having to install another version of numpy. Even if it depends on
features of a new numpy, it'd be nice to have it be a separate module.

--Josh

Ah, sorry for misunderstanding.  That would actually be very difficult, as the iterator required a fair bit of fixes and adjustments to the core.  The new_iterator branch should be 1.5 ABI compatible, if that helps.

-Mark

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
In reply to this post by Hanno Klemm
On Wed, Jan 26, 2011 at 3:18 PM, Hanno Klemm <[hidden email]> wrote:

Mark,

interesting idea. Given the fact that in 2-d euclidean metric, the
Einstein summation conventions are only a way to write out
conventional matrix multiplications, do you consider at some point to
include a non-euclidean metric in this thing? (As you have in special
relativity, for example)

Something along the lines:

eta = np.diag(-1,1,1,1)

a = np.array(1,2,3,4)
b = np.array(1,1,1,1)

such that

einsum('i,i', a,b, metric=eta) = -1 + 2 + 3 + 4

This particular example is already doable as follows:

>>> eta = np.diag([-1,1,1,1])
>>> eta
array([[-1,  0,  0,  0],
       [ 0,  1,  0,  0],
       [ 0,  0,  1,  0],
       [ 0,  0,  0,  1]])
>>> a = np.array([1,2,3,4])
>>> b = np.array([1,1,1,1])
>>> np.einsum('i,j,ij', a, b, eta)
8

I think that's right, did I understand you correctly?

Cheers,
Mark

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Hanno Klemm

Am 27.01.2011 um 00:29 schrieb Mark Wiebe:

On Wed, Jan 26, 2011 at 3:18 PM, Hanno Klemm <[hidden email]> wrote:

Mark,

interesting idea. Given the fact that in 2-d euclidean metric, the
Einstein summation conventions are only a way to write out
conventional matrix multiplications, do you consider at some point to
include a non-euclidean metric in this thing? (As you have in special
relativity, for example)

Something along the lines:

eta = np.diag(-1,1,1,1)

a = np.array(1,2,3,4)
b = np.array(1,1,1,1)

such that

einsum('i,i', a,b, metric=eta) = -1 + 2 + 3 + 4

This particular example is already doable as follows:

>>> eta = np.diag([-1,1,1,1])
>>> eta
array([[-1,  0,  0,  0],
       [ 0,  1,  0,  0],
       [ 0,  0,  1,  0],
       [ 0,  0,  0,  1]])
>>> a = np.array([1,2,3,4])
>>> b = np.array([1,1,1,1])
>>> np.einsum('i,j,ij', a, b, eta)
8

I think that's right, did I understand you correctly?

Cheers,
Mark

Yes, that's what I had in mind. Thanks. 



_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Benjamin Root-2
In reply to this post by Gael Varoquaux
On Wednesday, January 26, 2011, Gael Varoquaux
<[hidden email]> wrote:

> On Thu, Jan 27, 2011 at 12:18:30AM +0100, Hanno Klemm wrote:
>> interesting idea. Given the fact that in 2-d euclidean metric, the
>> Einstein summation conventions are only a way to write out
>> conventional matrix multiplications, do you consider at some point to
>> include a non-euclidean metric in this thing? (As you have in special
>> relativity, for example)
>
> In my experience, Einstein summation conventions are quite
> incomprehensible for people who haven't studies relativity (they aren't
> used much outside some narrow fields of physics). If you start adding
> metrics, you'll make it even harder for people to follow.
>
> My 2 cents,
>
> Gaël
>

Just to dispel the notion that Einstein notation is only used in the
study of relativity, I can personally attest that Einstein notation is
used in the field of fluid dynamics and some aspects of meteorology.
This is really a neat idea and I support the idea of packaging it as a
separate module.

Ben Root
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Joshua Holbrook
> Ah, sorry for misunderstanding.  That would actually be very difficult,
> as the iterator required a fair bit of fixes and adjustments to the core.
> The new_iterator branch should be 1.5 ABI compatible, if that helps.

I see. Perhaps the fixes and adjustments can/should be included with
numpy standard, even if the Einstein notation package is made a
separate module.

> Just to dispel the notion that Einstein notation is only used in the
> study of relativity, I can personally attest that Einstein notation is
> used in the field of fluid dynamics and some aspects of meteorology.

Einstein notation is also used in solid mechanics.

--Josh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

T J-4
On Wed, Jan 26, 2011 at 5:02 PM, Joshua Holbrook
<[hidden email]> wrote:
>> Ah, sorry for misunderstanding.  That would actually be very difficult,
>> as the iterator required a fair bit of fixes and adjustments to the core.
>> The new_iterator branch should be 1.5 ABI compatible, if that helps.
>
> I see. Perhaps the fixes and adjustments can/should be included with
> numpy standard, even if the Einstein notation package is made a
> separate module.
>
<snip>
> Indeed, I would like to be able to install and use einsum() without
> having to install another version of numpy. Even if it depends on
> features of a new numpy, it'd be nice to have it be a separate module.

I don't really understand the desire to have this single function
exist in a separate package.  If it requires the new version of NumPy,
then you'll have to install/upgrade either way...and if it comes as
part of that new NumPy, then you are already set.  Doesn't a separate
package complicate things unnecessarily?  It make sense to me if
einsum consisted of many functions (such as Bottleneck).
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

josef.pktd
In reply to this post by Benjamin Root-2
On Wed, Jan 26, 2011 at 7:35 PM, Benjamin Root <[hidden email]> wrote:

> On Wednesday, January 26, 2011, Gael Varoquaux
> <[hidden email]> wrote:
>> On Thu, Jan 27, 2011 at 12:18:30AM +0100, Hanno Klemm wrote:
>>> interesting idea. Given the fact that in 2-d euclidean metric, the
>>> Einstein summation conventions are only a way to write out
>>> conventional matrix multiplications, do you consider at some point to
>>> include a non-euclidean metric in this thing? (As you have in special
>>> relativity, for example)
>>
>> In my experience, Einstein summation conventions are quite
>> incomprehensible for people who haven't studies relativity (they aren't
>> used much outside some narrow fields of physics). If you start adding
>> metrics, you'll make it even harder for people to follow.
>>
>> My 2 cents,
>>
>> Gaël
>>
>
> Just to dispel the notion that Einstein notation is only used in the
> study of relativity, I can personally attest that Einstein notation is
> used in the field of fluid dynamics and some aspects of meteorology.
> This is really a neat idea and I support the idea of packaging it as a
> separate module.

So, if I read the examples correctly we finally get dot along an axis

np.einsum('ijk,ji->', a, b)
np.einsum('ijk,jik->k', a, b)

or something like this.

the notation might require getting used to but it doesn't look worse
than figuring out what tensordot does.

The only disadvantage I see, is that choosing the axes to operate on
in a program or function requires string manipulation.

Josef


>
> Ben Root
> _______________________________________________
> NumPy-Discussion mailing list
> [hidden email]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Jonathan Rocher
In reply to this post by Mark Wiebe
Nice function, and wonderful that it speeds some tasks up.

some feedback: the following notation is a little counter intuitive to me:
    >>> np.einsum('i...->', a)
    array([50, 55, 60, 65, 70])
    >>> np.sum(a, axis=0)
    array([50, 55, 60, 65, 70])
Since there is nothing after the ->, I expected a scalar not a vector. I might suggest 'i...->...'

Just noticed also a typo in the doc:
     order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be Fortran contiguous. 'F' means it should be Fortran contiguous,
should be changed to
    order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be C contiguous. 'F' means it should be Fortran contiguous,


Hope this helps,
Jonathan

On Wed, Jan 26, 2011 at 2:27 PM, Mark Wiebe <[hidden email]> wrote:
I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing.

In testing it, it is also faster than many of NumPy's built-in functions, except for dot and inner.  At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings:

In [1]: import numpy as np
In [2]: a = np.arange(25).reshape(5,5)

In [3]: timeit np.einsum('ii', a)
100000 loops, best of 3: 3.45 us per loop
In [4]: timeit np.trace(a)
100000 loops, best of 3: 9.8 us per loop

In [5]: timeit np.einsum('ii->i', a)
1000000 loops, best of 3: 1.19 us per loop
In [6]: timeit np.diag(a)
100000 loops, best of 3: 7 us per loop

In [7]: b = np.arange(30).reshape(5,6)

In [8]: timeit np.einsum('ij,jk', a, b)
10000 loops, best of 3: 11.4 us per loop
In [9]: timeit np.dot(a, b)
100000 loops, best of 3: 2.8 us per loop

In [10]: a = np.arange(10000.)

In [11]: timeit np.einsum('i->', a)
10000 loops, best of 3: 22.1 us per loop
In [12]: timeit np.sum(a)
10000 loops, best of 3: 25.5 us per loop

-Mark

The documentation:

    einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')

    Evaluates the Einstein summation convention on the operands.

    Using the Einstein summation convention, many common multi-dimensional
    array operations can be represented in a simple fashion.  This function
    provides a way compute such summations.

    The best way to understand this function is to try the examples below,
    which show how many common NumPy functions can be implemented as
    calls to einsum.
    
    The subscripts string is a comma-separated list of subscript labels,
    where each label refers to a dimension of the corresponding operand.
    Repeated subscripts labels in one operand take the diagonal.  For example,
    ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``.
    
    Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)``
    is equivalent to ``np.inner(a,b)``.  If a label appears only once,
    it is not summed, so ``np.einsum('i', a)`` produces a view of ``a``
    with no changes.

    The order of labels in the output is by default alphabetical.  This
    means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while
    ``np.einsum('ji', a)`` takes its transpose.

    The output can be controlled by specifying output subscript labels
    as well.  This specifies the label order, and allows summing to be
    disallowed or forced when desired.  The call ``np.einsum('i->', a)``
    is equivalent to ``np.sum(a, axis=-1)``, and
    ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.

    It is also possible to control how broadcasting occurs using
    an ellipsis.  To take the trace along the first and last axes,
    you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix
    product with the left-most indices instead of rightmost, you can do
    ``np.einsum('ij...,jk...->ik...', a, b)``.

    When there is only one operand, no axes are summed, and no output
    parameter is provided, a view into the operand is returned instead
    of a new array.  Thus, taking the diagonal as ``np.einsum('ii->i', a)``
    produces a view.

    Parameters
    ----------
    subscripts : string
        Specifies the subscripts for summation.
    operands : list of array_like
        These are the arrays for the operation.
    out : None or array
        If provided, the calculation is done into this array.
    dtype : None or data type
        If provided, forces the calculation to use the data type specified.
        Note that you may have to also give a more liberal ``casting``
        parameter to allow the conversions.
    order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be Fortran contiguous. 'F' means it should be Fortran contiguous,
        'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise.
        'K' means it should be as close to the layout as the inputs as
        is possible, including arbitrarily permuted axes.
    casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe'
        Controls what kind of data casting may occur.  Setting this to
        'unsafe' is not recommended, as it can adversely affect accumulations.
        'no' means the data types should not be cast at all. 'equiv' means
        only byte-order changes are allowed. 'safe' means only casts
        which can preserve values are allowed. 'same_kind' means only
        safe casts or casts within a kind, like float64 to float32, are
        allowed.  'unsafe' means any data conversions may be done.

    Returns
    -------
    output : ndarray
        The calculation based on the Einstein summation convention.

    See Also
    --------
    dot, inner, outer, tensordot

    
    Examples
    --------

    >>> a = np.arange(25).reshape(5,5)
    >>> b = np.arange(5)
    >>> c = np.arange(6).reshape(2,3)

    >>> np.einsum('ii', a)
    60
    >>> np.trace(a)
    60

    >>> np.einsum('ii->i', a)
    array([ 0,  6, 12, 18, 24])
    >>> np.diag(a)
    array([ 0,  6, 12, 18, 24])

    >>> np.einsum('ij,j', a, b)
    array([ 30,  80, 130, 180, 230])
    >>> np.dot(a, b)
    array([ 30,  80, 130, 180, 230])

    >>> np.einsum('ji', c)
    array([[0, 3],
           [1, 4],
           [2, 5]])
    >>> c.T
    array([[0, 3],
           [1, 4],
           [2, 5]])

    >>> np.einsum(',', 3, c)
    array([[ 0,  3,  6],
           [ 9, 12, 15]])
    >>> np.multiply(3, c)
    array([[ 0,  3,  6],
           [ 9, 12, 15]])

    >>> np.einsum('i,i', b, b)
    30
    >>> np.inner(b,b)
    30

    >>> np.einsum('i,j', np.arange(2)+1, b)
    array([[0, 1, 2, 3, 4],
           [0, 2, 4, 6, 8]])
    >>> np.outer(np.arange(2)+1, b)
    array([[0, 1, 2, 3, 4],
           [0, 2, 4, 6, 8]])

    >>> np.einsum('i...->', a)
    array([50, 55, 60, 65, 70])
    >>> np.sum(a, axis=0)
    array([50, 55, 60, 65, 70])

    >>> a = np.arange(60.).reshape(3,4,5)
    >>> b = np.arange(24.).reshape(4,3,2)
    >>> np.einsum('ijk,jil->kl', a, b)
    array([[ 4400.,  4730.],
           [ 4532.,  4874.],
           [ 4664.,  5018.],
           [ 4796.,  5162.],
           [ 4928.,  5306.]])
    >>> np.tensordot(a,b, axes=([1,0],[0,1]))
    array([[ 4400.,  4730.],
           [ 4532.,  4874.],
           [ 4664.,  5018.],
           [ 4796.,  5162.],
           [ 4928.,  5306.]])


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion




--
Jonathan Rocher,
Enthought, Inc.
[hidden email]
1-512-536-1057
http://www.enthought.com



_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
On Wed, Jan 26, 2011 at 6:41 PM, Jonathan Rocher <[hidden email]> wrote:
Nice function, and wonderful that it speeds some tasks up.

some feedback: the following notation is a little counter intuitive to me:
    >>> np.einsum('i...->', a)
    array([50, 55, 60, 65, 70])
    >>> np.sum(a, axis=0)
    array([50, 55, 60, 65, 70])
Since there is nothing after the ->, I expected a scalar not a vector. I might suggest 'i...->...'

Hmm, the dimension that's left is a a broadcast dimension, and the dimension labeled 'i' did go away.  I suppose disallowing the empty output string and forcing a '...' is reasonable.  Would disallowing broadcasting by default be a good approach?  Then,
einsum('ii->i', a) would only except two dimensional inputs, and you would have to specify einsum('...ii->...i', a) to get the current default behavior for it.

Just noticed also a typo in the doc:

     order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be Fortran contiguous. 'F' means it should be Fortran contiguous,
should be changed to
    order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be C contiguous. 'F' means it should be Fortran contiguous,


Thanks,
Mark 

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Mark Wiebe
In reply to this post by josef.pktd


On Wed, Jan 26, 2011 at 5:23 PM, <[hidden email]> wrote:
<snip>
So, if I read the examples correctly we finally get dot along an axis

np.einsum('ijk,ji->', a, b)
np.einsum('ijk,jik->k', a, b)

or something like this.

the notation might require getting used to but it doesn't look worse
than figuring out what tensordot does.

I thought of various extensions to the notation, but the idea is tricky enough as is I think. Decoding a regex-like syntax probably wouldn't help. 

The only disadvantage I see, is that choosing the axes to operate on
in a program or function requires string manipulation.
 
One possibility would be for the Python exposure to accept lists or tuples of integers.  The subscript 'ii' could be [(0,0)], and 'ij,jk->ik' could be [(0,1), (1,2), (0,2)].  Internally it would convert this directly to a C-string to pass to the API function.

-Mark


_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: einsum

Joshua Holbrook
>>
>> The only disadvantage I see, is that choosing the axes to operate on
>> in a program or function requires string manipulation.
>
>
> One possibility would be for the Python exposure to accept lists or tuples
> of integers.  The subscript 'ii' could be [(0,0)], and 'ij,jk->ik' could be
> [(0,1), (1,2), (0,2)].  Internally it would convert this directly to a
> C-string to pass to the API function.
> -Mark
>

What if you made objects i, j, etc. such that i*j = (0, 1) and
etcetera? Maybe you could generate them with something like (i, j, k)
= einstein((1, 2, 3)) .

Feel free to disregard me since I haven't really thought too hard
about things and might not even really understand what the problem is
*anyway*. I'm just trying to help brainstorm. :)

--Josh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
12
Loading...