SciPy 2014 BoF NumPy Participation


SciPy 2014 BoF NumPy Participation

mandli
Hello everyone,

As one of the co-chairs in charge of organizing the birds-of-a-feather sessions at the SciPy conference this year, I wanted to ask on the NumPy list whether there is enough interest to hold a NumPy-centered BoF this year. The BoF format would be up to those who lead the discussion. A couple of formats used in the past are a panel of a few of the lead devs with a Q&A session, or an open Q&A with an audience-guided list of topics. I can help facilitate the organization, but we would really like to get something organized this year (last year NumPy was the only major project that was not really represented in the BoF sessions).

Thanks!

Kyle Mandli (and via proxy Matt McCormick)



_______________________________________________
NumPy-Discussion mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: SciPy 2014 BoF NumPy Participation

Charles R Harris



On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli <[hidden email]> wrote:
> Hello everyone,
>
> As one of the co-chairs in charge of organizing the birds-of-a-feather sessions at the SciPy conference this year, I wanted to ask on the NumPy list whether there is enough interest to hold a NumPy-centered BoF this year. The BoF format would be up to those who lead the discussion. A couple of formats used in the past are a panel of a few of the lead devs with a Q&A session, or an open Q&A with an audience-guided list of topics. I can help facilitate the organization, but we would really like to get something organized this year (last year NumPy was the only major project that was not really represented in the BoF sessions).


I'll be at the conference, but I don't know who else will be there. I feel that NumPy has matured to the point where most of the current work is cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd like to see discussed is where we go from here. One option to look at is Blaze, which looks to have matured a lot in the last year. The problem with making it a NumPy replacement is that NumPy has become quite widespread, with downloads from PyPI running at about 3 million per year. With that much penetration it may be difficult for a new core like Blaze to gain traction. So I'd also like to discuss ways to bring the two branches of development together at some point and explore what NumPy can do to pave the way. Mind, there are definitely things that would be nice to add to NumPy (a better type system, missing values, etc.), but doing that is difficult given the current design.

Chuck


Re: SciPy 2014 BoF NumPy Participation

Nathaniel Smith
On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris
<[hidden email]> wrote:

> I'll be at the conference, but I don't know who else will be there. I feel
> that NumPy has matured to the point where most of the current work is
> cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd
> like to see discussed is where do we go from here. One option to look at is
> Blaze, which looks to have matured a lot in the last year. The problem with
> making it a NumPy replacement is that NumPy has become quite widespread,
> with downloads from PyPI running at about 3 million per year. With that much
> penetration it may be difficult for a new core like Blaze to gain traction.
> So I'd like to also discuss ways to bring the two branches of development
> together at some point and explore what NumPy can do to pave the way. Mind,
> there are definitely things that would be nice to add to NumPy, a better
> type system, missing values, etc., but doing that is difficult given the
> current design.

I won't be at the conference unfortunately (I'm on the wrong continent
and have family commitments then anyway), but I think there's lots of
exciting stuff that can be done in numpy-land.

We absolutely could rewrite the dtype system, and this would
straightforwardly give us excellent support for missing values, units,
categorical data, automatic differentiation, better datetimes, etc.
etc. -- and make numpy much more friendly in general to third-party
extensions.

I'd like to see the ufunc system revisited in the light of all the
things we know now, to make gufuncs more first-class, provide better
support for user-defined types, more flexible loop selection (e.g.
make it possible to implement np.add.reduce(a, type="kahan")), etc.;
one goal would be to convert a lot of ufunc-like functions (np.mean
etc.) into being real ufuncs, and then they'd automatically benefit
from __numpy_ufunc__, which would also massively improve
interoperability with alternative array types like blaze.
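
To make the loop-selection example concrete, here is a rough sketch of what a Kahan (compensated) summation loop computes. Note that `type="kahan"` above is a wished-for option, not an existing argument of `np.add.reduce`; the helper names `kahan_sum` and `naive_sum` are made up for this illustration, and the pure-Python loops only stand in for what a compiled inner loop would do.

```python
import math

import numpy as np


def kahan_sum(values):
    """Compensated (Kahan) summation of a 1-D sequence of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = float(x) - compensation      # re-inject what was lost earlier
        t = total + y                    # low-order bits of y may be lost here
        compensation = (t - total) - y   # recover (the negative of) the lost part
        total = t
    return total


def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for x in values:
        total += float(x)
    return total


# One large value followed by many values too small to change it one at a time.
a = np.array([1.0] + [1e-16] * 100000)

reference = math.fsum(a)      # correctly rounded sum from the standard library
builtin = np.add.reduce(a)    # NumPy's own reduction; accuracy depends on its internal loop

print("naive error:", naive_sum(a) - reference)
print("kahan error:", kahan_sum(a) - reference)
print("np.add.reduce error:", builtin - reference)
```

The naive loop never moves away from 1.0 because each tiny addend is rounded off, while the compensated loop carries the rounding error forward and stays within rounding distance of the reference.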

I'd like to see support for extensible label-based indexing, like pandas.

I'd like to see the internals migrating out of C and into
Cython -- we have hundreds of lines of code that could be replaced
with a few lines of Cython and no-one would notice. (Combining this
with a cffi cython backend and pypy would be pretty interesting
too...)

I'd like to see sparse ndarrays, with integration into the ufunc
looping machinery so all ufuncs just work. Or even better, I'd like to
see the right hooks added so that anyone can write a sparse ndarray
package using only public APIs, and have all ufuncs just work. (I was
going to put down deferred/loop-fused/out-of-core computation as a
wishlist item too, but if we do it right then this too could be
implemented by anyone without needing to be baked into numpy proper.)
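
As a rough sketch of the "right hooks" idea: the override hook discussed here as `__numpy_ufunc__` was eventually released in NumPy (1.13 and later) under the name `__array_ufunc__`. Assuming such a NumPy version, a third-party container can intercept ufunc calls using only public APIs. The class `ToySparseVector` below is purely hypothetical and far too simple to be a real sparse array; it only shows the shape of the mechanism.

```python
import numpy as np


class ToySparseVector:
    """Hypothetical 1-D sparse vector: only non-zero entries are stored."""

    def __init__(self, length, entries):
        self.length = length
        self.entries = dict(entries)  # index -> non-zero value

    def to_dense(self):
        dense = np.zeros(self.length)
        for index, value in self.entries.items():
            dense[index] = value
        return dense

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only handle plain calls such as np.negative(v); decline reductions,
        # out= arguments and everything else.
        if method != "__call__" or kwargs:
            return NotImplemented

        if len(inputs) == 1:
            # A unary ufunc that maps 0 to 0 keeps the sparsity pattern, so it
            # is enough to apply it to the stored entries.
            with np.errstate(all="ignore"):
                keeps_zeros = bool(ufunc(0.0) == 0.0)
            if keeps_zeros:
                new_entries = {i: ufunc(v) for i, v in self.entries.items()}
                return ToySparseVector(self.length, new_entries)

        # Fallback: densify every sparse input and let NumPy do the work.
        dense_inputs = [
            x.to_dense() if isinstance(x, ToySparseVector) else x
            for x in inputs
        ]
        return ufunc(*dense_inputs)


v = ToySparseVector(4, {1: 2.0})

negated = np.negative(v)          # stays sparse: negative(0) == 0
shifted = np.add(np.ones(4), v)   # densified: the result is a plain ndarray

print(negated.entries)
print(shifted)
```

NumPy itself never needs to know about the class; any ufunc that receives an instance hands control to `__array_ufunc__`, which is the property the wishlist item is after.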

All of these things would take some work and care, but I think they
could all be done incrementally and without breaking backwards
compatibility. Compare to ipython, which -- as Fernando likes to point
out :-) -- went from a little console program to its current
distributed-notebook-skynet-whatever-it-is by merging one working PR
at a time. Certainly these changes would be much easier and less
disruptive than any plan that involves throwing out numpy and starting
over. But they also do help smooth the way for an incremental
transition to a world where numpy is regularly used alongside other
libraries.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: SciPy 2014 BoF NumPy Participation

Travis Oliphant-6
In reply to this post by mandli
I will be at the conference as will Mark Wiebe for at least part of the time.  Others from the Blaze team like Andy Terrel and Matthew Rocklin will also be available at least part of the time (so it depends on when the BoF is).   I'm sure they will all have opinions about this.   I would be happy to be involved with a discussion around the future of NumPy as it is one of the things I've been thinking about for quite a while.   

Obviously, what happens will be more a function of what people have resources to do than just what is discussed, but it is helpful to get people from multiple projects discussing what they are working on and how it relates or could relate to a possible NumPy 2.0 effort.    I'm happy to participate. 

My bias is that I do not believe it is going to be practically possible to simply modify NumPy itself directly. This was the original direction we considered when we started Continuum -- and we spent some time and money in that direction -- but it's a difficult problem that would require a lot of time, patience, and testing from multiple people. I'm not sure IPython is the right project to compare against here, as its user story is quite different. NumPy is already a hybrid that evolved from Numeric.

Of course it is likely *technically* feasible.    We could replace every implementation detail with something different --- but not without likely impact on users and more cost than it would be just to re-write sections.   However, the challenge is more about the user-base (especially the silent but large user-base), the semantic expectations of that user base, and the challenge that exists in really creating a test suite that covers the entire surface area of actual NumPy use.   

Even relatively simple changes can have significant impact at this point.  Nathaniel has laid out a fantastic list of great features.  These are the kind of features I have been eager to see as well.  This is why I have been working to fund and help explore these ideas in the Numba array object as well as in Blaze.    Gnumpy, Theano, Pandas, and other projects also have useful tales to tell regarding a potential NumPy 2.0.   

Ultimately, I do think it is time to talk seriously about NumPy 2.0 and what it might look like. I personally think it looks a lot more like a rewrite than a continuation of the modifications of Numeric that became NumPy 1.0. Right out of the gate, for example, I would make sure that NumPy 2.0 objects somehow used PyObject_VAR_HEAD, so that they were variable-sized objects where the strides and dimension information is stored directly in the object structure itself instead of being allocated separately (which requires additional loads and stores from memory). This would be a relatively simple change, but it can't be done while preserving ABI compatibility. It may also, at this point, have an impact on Cython code, or other code that is deeply aware of the NumPy code structure. Some of the changes that should be made will ultimately require a porting exercise for new code -- at which point, why not just use a new project?
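
For readers less familiar with why the location of the strides and dimensions matters: every element access has to consult them. The sketch below only illustrates that arithmetic from Python, assuming a C-contiguous `int64` array; it does not show the C object layout itself, and `element_offset` is a helper name invented for the example.

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)


def element_offset(index, strides):
    """Byte offset of an element from the start of the data buffer."""
    return sum(i * s for i, s in zip(index, strides))


# For a C-contiguous 3x4 int64 array the strides are (32, 8) bytes.
offset = element_offset((2, 1), a.strides)

# Read the element straight out of the raw bytes using that offset.
raw = a.tobytes()
value = np.frombuffer(raw, dtype=np.int64, count=1, offset=offset)[0]

# A transposed view reuses the same data and only swaps shape and strides.
t = a.T
same_offset = element_offset((1, 2), t.strides)

print(a.strides, offset, value)
print(t.strides, same_offset)
```

Since this lookup sits on the path of every indexing operation, keeping the shape and strides in the same allocation as the object header saves the extra pointer dereference described above.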

Dynd (which is a separate but related project from Blaze) is actually a pretty good start to a NumPy 2.0 already:  https://github.com/ContinuumIO/dynd-python and https://github.com/ContinuumIO/libdynd (C++ library).

It can be provided with a backwards-compatible API without too much difficulty so that extension modules built for NumPy 1.X would still work.   Numba can support Dynd and Numba's array object provides a useful, deferred-expression evaluation mechanism along with JIT compilation when desired that can support the GPU.    

I would make the case that by the end of the year this combination of Dynd plus Numba (and its array object) could easily provide much of the functionality needed for a solid NumPy++. Separate from that, Blaze provides a pluggable mechanism so that array-oriented computations can be done on a large variety of backends (including distributed systems).

I agree that users of NumPy should not have to see a big API change in 2.0 --- but any modification of indexing or calculations would present slightly different semantics in certain corner cases --- which I think will be unavoidable in NumPy 2.0 regardless of how it is created.   I also think NumPy 2.0 should take the opportunity to look hard at the API and what can be simplified (do we have the right collection of methods?).  I'm also a big fan of introducing a common "array of structure" object that has a smaller API footprint than Pandas but has indexing and group-by functionality. 

Fortunately, with the buffer protocol in Python, multiple array objects can easily co-exist in the Python ecosystem with no memory copies.   I think that is where we are headed and I don't see it as a bad thing.   I think agreeing on how to describe types would be very beneficial (it's an under-developed part of the buffer protocol).  This is exactly why we have made datashape an independent project that other projects can use as a data-type-description mini-language:  https://github.com/ContinuumIO/datashape
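
A minimal illustration of the no-copy point, using only the standard library's `array` module and NumPy: both objects below expose or consume the buffer protocol, so they can share one block of memory. The variable names are arbitrary.

```python
import array

import numpy as np

# A standard-library container that knows nothing about NumPy.
source = array.array("d", [1.0, 2.0, 3.0])

# NumPy wraps the same memory through the buffer protocol; nothing is copied.
shared = np.frombuffer(source, dtype=np.float64)

# A write through one object is visible through the other.
source[0] = 10.0
print(shared)

# The type description that travels with the buffer is a struct-style
# format string plus an item size and shape.
view = memoryview(shared)
print(view.format, view.itemsize, view.shape)
```

The format string (`"d"` here) is the part of the protocol that describes types, and it is limited to struct-style codes, which is why a richer shared type-description language is of interest.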

I think that a really good project for an enterprising young graduate student, post-doc, or professor (who is willing to delay their PhD or risk their tenure) would be to re-write the ufunc system using more modern techniques and put generalized ufuncs front and center as Nathaniel described.    

It sounds like many agree that we can improve the ufunc object implementation.    A new ufunc system is an entirely achievable goal and could even be shipped as an "add-on" project external from NumPy for several years before being adopted fully.    I know at least 4 people with demo-ware versions of a new ufunc-object that could easily replace current NumPy ufuncs eventually.    If you are interested in that, I would love to share what I know with you. 
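
For anyone who has not met generalized ufuncs, the core idea is a signature that names the "core" dimensions an operation consumes, with broadcasting applied over whatever dimensions remain. The sketch below uses `np.vectorize` with its `signature` argument (available in NumPy 1.12 and later) as a stand-in; it loops in Python and is not a compiled gufunc, and `_dot_1d` and `row_dot` are names made up for the example.

```python
import numpy as np


def _dot_1d(x, y):
    """Inner product of two 1-D arrays of equal length."""
    return np.sum(x * y)


# "(n),(n)->()" reads: take two vectors of the same length n, return a scalar.
# Any leading dimensions are looped over and broadcast, as with ordinary ufuncs.
row_dot = np.vectorize(_dot_1d, signature="(n),(n)->()")

a = np.arange(6.0).reshape(2, 3)   # two vectors of length 3
b = np.ones(3)                     # one vector, broadcast against both rows

result = row_dot(a, b)
print(result)
```

A redesigned ufunc system that puts this signature-based form front and center would let functions such as `np.mean` be expressed in the same framework as elementwise operations.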

After spending quite a bit of time thinking about this over the past 2 years, interacting with many in the user community outside of this list, and working with people as they explore a few options --- I do have a fair set of opinions.   But, there are also a lot of possibilities and many opportunities.  I'm looking forward to seeing what emerges in the coming months and years and cooperating where possible with others having overlapping interests.  

Best,

-Travis








--

Travis Oliphant
CEO
Continuum Analytics, Inc.


Re: SciPy 2014 BoF NumPy Participation

Sebastian Berg
In reply to this post by Nathaniel Smith
On Wed, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote:

> I won't be at the conference unfortunately (I'm on the wrong continent
> and have family commitments then anyway), but I think there's lots of
> exciting stuff that can be done in numpy-land.
>
I would like to come, but to be honest I have not planned to yet, and it
doesn't fit too well with the stuff I mostly work on right now. So we will
have to see.

- Sebastian


Re: SciPy 2014 BoF NumPy Participation

David Cournapeau
I won't be able to make it to SciPy this year, sadly.

I concur with Nathaniel that we can do a lot of things without a full rewrite -- it is all too easy to see what is gained with a rewrite and lose sight of what is lost. I have yet to see a really strong argument for a full rewrite. It may be easier to do a rewrite for a core when you have a few full-time people, but that's a different story for a community effort like NumPy.

The main issue preventing new features in NumPy is the lack of internal architecture at the C level, but that is nothing that could not be fixed by refactoring. Using Cython to move away from the Python C API would be great, though we need to talk with the Cython people so that we can share common code between multiple extensions using Cython, to avoid binary size explosion.

There are things that may require some backward-incompatible changes in the C API, but that's much more acceptable than a significant break at the Python level.

David


On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg <[hidden email]> wrote:
I would like to come, but to be honest I have not planned to yet, and it
doesn't fit too well with the stuff I mostly work on right now. So we will
have to see.

- Sebastian


Re: SciPy 2014 BoF NumPy Participation

mandli
It sounds like there is a lot to discuss come July, and I am sure there will be others "willing" to voice their opinions as well. The primary goal in all of this would be to have a constructive discussion concerning the future of NumPy. Do you guys have a feeling for what might be the most effective way to do this? A panel comes to mind, but then people for the panel would have to be chosen. In the past I know that we have simply gathered in a circle and discussed, which works as well. Whatever the case, I would be very appreciative if someone could volunteer to "lead" the discussion and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything.

Kyle


On Wed, Jun 4, 2014 at 5:09 AM, David Cournapeau <[hidden email]> wrote:
I won't be able to make it to SciPy this year, sadly.

I concur with Nathaniel that we can do a lot of things without a full rewrite -- it is all too easy to see what is gained with a rewrite and lose sight of what is lost. I have yet to see a really strong argument for a full rewrite. It may be easier to do a rewrite for a core when you have a few full-time people, but that's a different story for a community effort like NumPy.

The main issue preventing new features in NumPy is the lack of internal architecture at the C level, but that is nothing that could not be fixed by refactoring. Using Cython to move away from the Python C API would be great, though we need to talk with the Cython people so that we can share common code between multiple extensions using Cython, to avoid binary size explosion.

There are things that may require some backward-incompatible changes in the C API, but that's much more acceptable than a significant break at the Python level.

David


On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg <[hidden email]> wrote:
On Mi, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote:
> On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris
> <[hidden email]> wrote:
> > On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli <[hidden email]> wrote:
> >>
> >> Hello everyone,
> >>
> >> As one of the co-chairs in charge of organizing the birds-of-a-feather
> >> sesssions at the SciPy conference this year, I wanted to solicit through the
> >> NumPy list to see if we could get enough interest to hold a NumPy centered
> >> BoF this year.  The BoF format would be up to those who would lead the
> >> discussion, a couple of ideas used in the past include picking out a few of
> >> the lead devs to be on a panel and have a Q&A type of session or an open Q&A
> >> with perhaps audience guided list of topics.  I can help facilitate
> >> organization of something but we would really like to get something
> >> organized this year (last year NumPy was the only major project that was not
> >> really represented in the BoF sessions).
> >
> > I'll be at the conference, but I don't know who else will be there. I feel
> > that NumPy has matured to the point where most of the current work is
> > cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd
> > like to see discussed is where do we go from here. One option to look at is
> > Blaze, which looks to have matured a lot in the last year. The problem with
> > making it a NumPy replacement is that NumPy has become quite widespread,
> > with downloads from PyPi running at about 3 million per year. With that much
> > penetration it may be difficult for a new core like Blaze to gain traction.
> > So I'd like to also discuss ways to bring the two branches of development
> > together at some point and explore what NumPy can do to pave the way. Mind,
> > there are definitely things that would be nice to add to NumPy, a better
> > type system, missing values, etc., but doing that is difficult given the
> > current design.
>
> I won't be at the conference unfortunately (I'm on the wrong continent
> and have family commitments then anyway), but I think there's lots of
> exciting stuff that can be done in numpy-land.
>

I wouldn't like to come, but to be honest have not planned to yet and it
doesn't fit too well with the stuff I work on mostly right now. So will
have to see.

- Sebastian

> We absolutely could rewrite the dtype system, and this would
> straightforwardly give us excellent support for missing values, units,
> categorical data, automatic differentiation, better datetimes, etc.
> etc. -- and make numpy much more friendly in general to third-party
> extensions.
>
> I'd like to see the ufunc system revisited in the light of all the
> things we know now, to make gufuncs more first-class, provide better
> support for user-defined types, more flexible loop selection (e.g.
> make it possible to implement np.add.reduce(a, type="kahan")), etc.;
> one goal would be to convert a lot of ufunc-like functions (np.mean
> etc.) into being real ufuncs, and then they'd automatically benefit
> from __numpy_ufunc__, which would also massively improve
> interoperability with alternative array types like blaze.
>
> I'd like to see support for extensible label-based indexing, like pandas.
>
> Internally, I'd like to see internal migrating out of C and into
> Cython -- we have hundreds of lines of code that could be replaced
> with a few lines of Cython and no-one would notice. (Combining this
> with a cffi cython backend and pypy would be pretty interesting
> too...)
>
> I'd like to see sparse ndarrays, with integration into the ufunc
> looping machinery so all ufuncs just work. Or even better, I'd like to
> see the right hooks added so that anyone can write a sparse ndarray
> package using only public APIs, and have all ufuncs just work. (I was
> going to put down deferred/loop-fused/out-of-core computation as a
> wishlist item too, but if we do it right then this too could be
> implemented by anyone without needing to be baked into numpy proper.)
>
> All of these things would take some work and care, but I think they
> could all be done incrementally and without breaking backwards
> compatibility. Compare to ipython, which -- as Fernando likes to point
> out :-) -- went from a little console program to its current
> distributed-notebook-skynet-whatever-it-is by merging one working PR
> at a time. Certainly these changes would be much easier and less
> disruptive than any plan that involves throwing out numpy and starting
> over. But they also do help smooth the way for an incremental
> transition to a world where numpy is regularly used alongside other
> libraries.
>
> -n
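
For readers unfamiliar with the `type="kahan"` idea in the quoted text: that keyword is hypothetical in the thread, and as far as I know `np.add.reduce` does not accept it. What it would select is a compensated-summation inner loop. Here is a minimal pure-Python sketch of Kahan summation; the function names `naive_sum` and `kahan_sum` are made up for illustration and are not NumPy API.

```python
import math

import numpy as np


def naive_sum(values):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for x in values:
        total += float(x)
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation: carry the rounding error forward."""
    total = 0.0
    compensation = 0.0  # low-order bits lost by the previous additions
    for x in values:
        y = float(x) - compensation     # re-inject the error lost so far
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


# One large value followed by many values too small to register on their own.
a = np.array([1.0] + [1e-16] * 100_000)

print(naive_sum(a))      # the tiny terms are rounded away one by one
print(kahan_sum(a))      # compensated result
print(math.fsum(a))      # correctly rounded reference from the standard library
print(np.add.reduce(a))  # what NumPy's built-in reduction gives
```

A loop like `kahan_sum`, selectable per reduction, is the sort of thing "more flexible loop selection" would allow.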
>
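
The `__numpy_ufunc__` hook mentioned in the quoted text later shipped under the name `__array_ufunc__` (NumPy 1.13 and later). Below is a rough sketch of how a third-party array type can intercept ufunc calls through it. The `Logged` class is a hypothetical toy written for this example; it is not part of NumPy or Blaze and ignores details such as the `out=` argument.

```python
import numpy as np


class Logged:
    """Toy array wrapper that records which ufunc calls produced it."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any Logged inputs, then defer to the regular ufunc machinery.
        unwrapped = [x.data if isinstance(x, Logged) else x for x in inputs]
        result = getattr(ufunc, method)(*unwrapped, **kwargs)
        # Re-wrap the result and extend the call history.
        out = Logged(result)
        out.calls = self.calls + [f"{ufunc.__name__}.{method}"]
        return out


x = Logged([1, 2])
r = np.add(x, 1)      # dispatched with method == "__call__"
s = np.add.reduce(r)  # dispatched with method == "reduce"
print(r.data, s.data, s.calls)
```

Functions that are not ufuncs, such as `np.mean`, do not pass through this hook, which is why converting them into real ufuncs would make them benefit automatically.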



Re: SciPy 2014 BoF NumPy Participation

Stéfan van der Walt
In reply to this post by mandli
Hi Kyle

Kyle Mandli writes:

> The BoF format would be up to those who would lead
> the discussion, a couple of ideas used in the past include picking out a
> few of the lead devs to be on a panel and have a Q&A type of session or an
> open Q&A with perhaps audience guided list of topics.

Unfortunately I won't be at the conference this year, but if I were I'd
have enjoyed seeing a couple of short presentations, drawn from, e.g.,
some of the people involved in this discussion (Nathan can perhaps join
in via Google Hangout), about possible future directions.  That way one
can sketch out the playing field to seed the discussion.  In addition,
those sketches would provide a useful update to all those watching the
conference remotely via video.

Regards
Stéfan

Re: SciPy 2014 BoF NumPy Participation

Chris Barker - NOAA Federal
In reply to this post by mandli
On Thu, Jun 5, 2014 at 1:32 PM, Kyle Mandli <[hidden email]> wrote:
> In the past I know that we have simply gathered in a circle and discussed, which works as well.  Whatever the case, if someone could volunteer to "lead" the discussion

It's my experience that a really good facilitator can make all the difference in how productive this kind of discussion is. I have no idea how to find such a facilitator (it's a pretty rare skill), but it would be nice to try, rather than taking whoever is willing to do the bureaucratic part....

> and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything I would be very appreciative.

Someone could still take on the organizer role while trying to find a facilitator...

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]


Re: SciPy 2014 BoF NumPy Participation

mandli
Hi Everyone,

Fernando Perez has volunteered to act as a facilitator for a BoF on NumPy if there is still interest in having this discussion.  I can make sure that it is not scheduled at the same time as the other planned BoFs that may interest this crowd.  If someone could take the lead on organizing and submit something to the conference website, with me helping to gather interested parties, I would be much obliged.

Kyle



Re: SciPy 2014 BoF NumPy Participation

Charles R Harris



I can submit something to the conference website next week when I get back in town. We need to make sure it fits with the schedule of the Continuum folks.

<snip>

Chuck



Re: SciPy 2014 BoF NumPy Participation

Charles R Harris
In reply to this post by mandli
Hi Kyle,

I see you have reserved a slot on July 10 at 1:30 PM in room 204. That looks good to me. Along with discussing the future development of NumPy, I'd like to propose adopting something like the Blaze standard for datetime, which is similar to the current Pandas treatment, but with a different time base. Hmm... there seems to be a proliferation of time implementations; we should try to pick just one.

<snip>

Chuck
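
To make the "time base" point concrete: NumPy's own `datetime64` stores a 64-bit integer count of some unit since the 1970 epoch, and pandas has traditionally used the nanosecond unit. The thread does not say which time base Blaze used, so the sketch below only compares two of NumPy's units as an illustration of what changes when the time base changes.

```python
import numpy as np

# The BoF slot mentioned above, stored with two different time bases.
stamp = "2014-07-10T13:30"
in_seconds = np.datetime64(stamp, "s")
in_nanoseconds = np.datetime64(stamp, "ns")

# Same instant, different underlying integers.
same_instant = bool(in_seconds == in_nanoseconds)
raw_seconds = int(in_seconds.astype("int64"))
raw_nanoseconds = int(in_nanoseconds.astype("int64"))
print(same_instant, raw_seconds, raw_nanoseconds)

# A finer unit buys resolution at the cost of range: an int64 count of
# nanoseconds reaches only about this many years either side of 1970.
NANOSECONDS_PER_YEAR = 1e9 * 86400 * 365.25
span_years = np.iinfo(np.int64).max / NANOSECONDS_PER_YEAR
print(round(span_years))
```

Any two libraries that disagree on the unit have to convert at the boundary, which is one reason to settle on a single convention.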



Re: SciPy 2014 BoF NumPy Participation

Chris Barker - NOAA Federal
On Jun 27, 2014, at 8:44 PM, Charles R Harris <[hidden email]> wrote:
>
> Hi Kyle,
>
> [...] I'd like to propose adopting something like the Blaze standard for datetime, [...]

+1 for some focused discussion of datetime. This has been lingering
far too long.

-Chris

Re: SciPy 2014 BoF NumPy Participation

Fernando Perez
Hi folks,

I've just created a page on the numpy wiki:


I'd appreciate it if people could put down on that page the titles of the topics they'd like to see discussed, plus a brief summary of each (with links to this or other relevant threads).

BoFs can be very useful but they can also easily devolve into random digressions. I'll do my best to fulfill my role as discussion facilitator by pushing for an agenda for the discussion that can fit realistically in the time we have, and trying to keep us on track.

It would help a lot if people put down their ideas in advance, so we can sort/refine/prioritize a little bit. Hopefully this will lead to a more productive conversation.

Cheers

f


--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail


Re: SciPy 2014 BoF NumPy Participation

Charles R Harris



I've been thinking more along the lines of a planning session than a panel discussion, something like the round table discussions that preceded implementation of python3 support and the move to github. Perhaps we could arrange something like that as a follow up on the BOF meetup.

<snip>

Chuck


Re: SciPy 2014 BoF NumPy Participation

Charles R Harris



I've added a preliminary list of topics. I'm sure there is more to be added.

Chuck


Re: SciPy 2014 BoF NumPy Participation

Fernando Perez
Great, thanks! And we can certainly try to either move into planning at the end, or plan for such afterwards.

Best

f


--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail


Re: SciPy 2014 BoF NumPy Participation

mandli
I am really excited to see that we have a great agenda for the BoF. I hope that the discussion will be fruitful!

Kyle

