Reminder: weekly status meeting

Reminder: weekly status meeting

mattip
Hi everyone,

The team at BIDS meets once a week to discuss progress, priorities, and
roadblocks.  While our priorities are broadly determined by the project
roadmap [0], we would like to provide an opportunity for the community
to give more regular and detailed feedback on our work.

We therefore invite you to join us for our weekly calls,
each **Wednesday from 12:00 to 13:00 Pacific Time**.

Detail of the next meeting (2018-10-24) is given in the agenda [1],
which is a living document. Feel free to add topics you wish to discuss.

We hope to see you there!
Best regards,
Stéfan, Tyler, Matti

[0] https://www.numpy.org/neps/index.html
[1] https://hackmd.io/5WZ6VwQKSbSR_4Ng65pUFw?both

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Reminder: weekly status meeting

Stefan van der Walt
Hi all,

On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> We therefore invite you to join us for our weekly calls,
> each **Wednesday from 12:00 to 13:00 Pacific Time**.
>
> Detail of the next meeting (2018-10-24) is given in the agenda

This week's meeting notes are at:

https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md

Stéfan

Re: Reminder: weekly status meeting

einstein.edison
Hi!

Sorry to miss this week’s meeting.

If I may point out an inaccuracy in the notes: in PyData/Sparse most things are implemented from the ground up without relying on scipy.sparse. The only parts that do rely on it are `sparse.matmul`, `sparse.dot` and `sparse.tensordot`, as well as a few conversions to/from SciPy. If these could depend on Cython wrappers instead, that’d be nice.

I should probably update the docs on that. If anyone is willing to discuss pydata/sparse with me, I’ll be available for a meeting anytime.

Best Regards,
Hameer Abbasi


Re: Reminder: weekly status meeting

Tyler Reddy
What exactly would you like Cython wrappers for? Some of the C++ code in scipy/sparse/sparsetools?

I see you have COO.from_scipy_sparse(x) in some pydata/sparse code paths, which presumably you'd like to avoid or improve?


Re: Reminder: weekly status meeting

einstein.edison
Hi everyone,

Like I said, we just use those to coerce SciPy arrays to native ones for compatibility. You could remove all those and the package would work fine, as long as you were using native PyData/Sparse arrays.

The only core functionality dependent on scipy.sparse is matrix multiplication and the like. Everything else is for inter-operability.
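
To make the "coerce SciPy arrays to native ones" step concrete, here is a minimal pure-Python sketch of what a CSR-to-COO conversion involves: expanding the compressed row pointers into explicit row coordinates. The function name `csr_to_coo` and the plain-list inputs are assumptions for illustration; this is not the actual code behind `COO.from_scipy_sparse`.

```python
# Toy CSR -> COO conversion in pure Python (illustrative only; not the
# pydata/sparse or SciPy implementation).

def csr_to_coo(indptr, indices, data):
    """Expand CSR row pointers into explicit (row, col, value) triplets."""
    rows, cols, vals = [], [], []
    for row in range(len(indptr) - 1):
        # Entries of this row live in the half-open slice
        # indptr[row]:indptr[row + 1] of `indices` and `data`.
        for k in range(indptr[row], indptr[row + 1]):
            rows.append(row)
            cols.append(indices[k])
            vals.append(data[k])
    return rows, cols, vals


# 3x3 matrix [[1, 0, 2], [0, 0, 3], [4, 5, 0]] in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1, 2, 3, 4, 5]

rows, cols, vals = csr_to_coo(indptr, indices, data)
print(list(zip(rows, cols, vals)))
```

Only the row coordinates need to be reconstructed; the column indices and values carry over unchanged, which is why such conversions are cheap compatibility shims rather than core functionality.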

Best Regards,
Hameer Abbasi


Re: Reminder: weekly status meeting

Stefan van der Walt
Hi Hameer,

On Fri, 26 Oct 2018 10:47:09 +0200, Hameer Abbasi wrote:
> The only core functionality dependent on scipy.sparse is matrix
> multiplication and the like. Everything else is for inter-operability.

Thank you for commenting here.

As you know, I am enthusiastic about seeing an `sparray` equivalent to
`spmatrix`.  When we last spoke, my recollection was that it would be
beneficial to `pydata/sparse`.  Is this still correct?

If not, are we now in a situation where it would be more helpful to
build `sparray` based on `pydata/sparse`?

If we can have a good sparse array API in place in SciPy, it may
significantly simplify code in various other libraries (I'm thinking of
scikit-learn, e.g.).

Best regards,
Stéfan

Re: Reminder: weekly status meeting

einstein.edison
Hi Stefan!

PyData/Sparse is pretty far along; by January or so we should have a CSR/CSC replacement that is ND. It needs optimisation in a lot of cases, but the API is compatible with NumPy and works pretty well already IMO.

PyData/Sparse is pretty much independent of any changes to scipy.sparse at this point. We build on top of NumPy, not scipy.sparse.

Feel free to use any or all of my code for sparray, although I think Ralf Gommers, Matthew Rocklin and others were of the opinion that the data structure should stay in PyData/Sparse and linear algebra and csgraph etc should go into SciPy.

Best Regards,
Hameer Abbasi


Re: Reminder: weekly status meeting

ralfgommers


On Sat, Oct 27, 2018 at 6:10 AM Hameer Abbasi <[hidden email]> wrote:
> Hi Stefan!
>
> PyData/Sparse is pretty far along, by January or so we should have a CSR/CSC replacement that is ND. It needs optimisation in a lot of cases but the API is compatible with NumPy and works pretty well already IMO.
>
> PyData/Sparse is pretty much independent of any changes to scipy.sparse at this point. We build on top of NumPy, not scipy.sparse.
>
> Feel free to use any or all of my code for sparray, although I think Ralf Gommers, Matthew Rocklin and others were of the opinion that the data structure should stay in PyData/Sparse and linear algebra and csgraph etc should go into SciPy.

Just to make sure we're talking about the same things here: Stefan, I think with "sparray" you mean "an n-D sparse array implementation that lives in SciPy", nothing more specific? In that case pydata/sparse is the one implementation, and including it in scipy.sparse would make it "sparray". I'm currently indeed leaning towards depending on pydata/sparse rather than including it in scipy.

Cheers,
Ralf




Re: Reminder: weekly status meeting

Stefan van der Walt
On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> Just to make sure we're talking about the same things here: Stefan, I think
> with "sparray" you mean "an n-D sparse array implementation that lives in
> SciPy", nothing more specific? In that case pydata/sparse is the one
> implementation, and including it in scipy.sparse would make it "sparray".
> I'm currently indeed leaning towards depending on pydata/sparse rather than
> including it in scipy.

I want to double check: when we last spoke, it seemed as though certain
refactorings inside of SciPy (specifically, sparray was mentioned) would
simplify the life of pydata/sparse devs.  That no longer seems to be the
case?

If our recommended route is to tell users to use pydata/sparse instead
of SciPy (for the sparse array object), we probably want to get rid of
our own internal implementation, and deprecate spmatrix (or, build
spmatrix on top of pydata/sparse)?

Once we can define a clear API for sparse arrays, we can include some
algorithms that ingest those objects in SciPy.  But, I'm not sure we
have an API in place that will allow handover of such objects to the
existing C/FORTRAN-level code.

Stéfan

Re: Reminder: weekly status meeting

ralfgommers


On Sat, Oct 27, 2018 at 11:10 AM Stefan van der Walt <[hidden email]> wrote:
> On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > Just to make sure we're talking about the same things here: Stefan, I think
> > with "sparray" you mean "an n-D sparse array implementation that lives in
> > SciPy", nothing more specific? In that case pydata/sparse is the one
> > implementation, and including it in scipy.sparse would make it "sparray".
> > I'm currently indeed leaning towards depending on pydata/sparse rather than
> > including it in scipy.
>
> I want to double check: when we last spoke, it seemed as though certain
> refactorings inside of SciPy (specifically, sparray was mentioned) would
> simplify the life of pydata/sparse devs.  That no longer seems to be the
> case?

There's no such thing as `sparray` anywhere in SciPy. There are two inactive projects to create an n-D sparse array implementation, one of which is called sparray (https://github.com/perimosocordiae/sparray). And there's one very active project doing the same thing: https://github.com/pydata/sparse


> If our recommended route is to tell users to use pydata/sparse instead
> of SciPy (for the sparse array object), we probably want to get rid of
> our own internal implementation, and deprecate spmatrix

Doc-deprecate I think; the sparse matrix classes in SciPy are very heavily used, so it doesn't make sense to start emitting deprecation warnings for them. But at some point we'll want to point users to pydata/sparse for new code.
 
> (or, build
> spmatrix on top of pydata/sparse)?

It's the matrix vs. array semantics that are the issue, so not sure that building one on top of the other would be useful.
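
The semantic clash can be sketched with two toy dict-of-keys containers that share storage but disagree on what `*` means. The class names `ToySparseArray` and `ToySparseMatrix` are hypothetical and exist only for this illustration; they are not SciPy or pydata/sparse classes.

```python
# Toy dict-of-keys sparse containers (hypothetical names) contrasting
# array semantics with matrix semantics for the `*` operator.

class ToySparseArray:
    """2-D sparse container with array semantics: `*` is elementwise."""

    def __init__(self, shape, entries):
        self.shape = shape
        self.entries = dict(entries)  # {(row, col): value}, zeros omitted

    def __mul__(self, other):
        # Elementwise product: only coordinates stored in both operands survive.
        common = self.entries.keys() & other.entries.keys()
        return type(self)(
            self.shape, {k: self.entries[k] * other.entries[k] for k in common}
        )

    def __matmul__(self, other):
        # Matrix product: sum over the shared inner index.
        out = {}
        for (i, k), a in self.entries.items():
            for (k2, j), b in other.entries.items():
                if k == k2:
                    out[(i, j)] = out.get((i, j), 0) + a * b
        return type(self)((self.shape[0], other.shape[1]), out)


class ToySparseMatrix(ToySparseArray):
    """Same storage, matrix semantics: `*` is the matrix product."""

    def __mul__(self, other):
        return self @ other


a_entries = {(0, 0): 1, (0, 1): 2, (1, 1): 3}  # [[1, 2], [0, 3]]
b_entries = {(0, 1): 1, (1, 0): 1}             # [[0, 1], [1, 0]]

array_product = ToySparseArray((2, 2), a_entries) * ToySparseArray((2, 2), b_entries)
matrix_product = ToySparseMatrix((2, 2), a_entries) * ToySparseMatrix((2, 2), b_entries)
print(array_product.entries)
print(matrix_product.entries)
```

The same expression `a * b` yields different results depending on the class, so code written against one convention silently changes meaning under the other. Layering one class on top of the other does not remove that ambiguity.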


> Once we can define a clear API for sparse arrays, we can include some
> algorithms that ingest those objects in SciPy.  But, I'm not sure we
> have an API in place that will allow handover of such objects to the
> existing C/FORTRAN-level code.

I don't think the constructors for sparse matrix/array care about C/F order. pydata/sparse is pure Python (and uses Numba). For reusing scipy.sparse.linalg and scipy.sparse.csgraph, I think you're right that it will need some careful design work. Not sure anyone has thought about that in a lot of detail yet.

There are interesting API questions probably, such as how to treat explicit zeros (that debate still isn't settled for the matrix classes IIRC). And there's an interesting transition puzzle to figure out (which also includes np.matrix). At the moment the discussion on that is spread out over many mailing list threads and GitHub issues; at some point we'll need to summarize that. Probably around the time that the CSR/CSC replacement that Hameer mentioned is finished.

Cheers,
Ralf





Re: Reminder: weekly status meeting

einstein.edison
In reply to this post by Stefan van der Walt
On Saturday, Oct 27, 2018 at 12:10 AM, Stefan van der Walt <[hidden email]> wrote:
> On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > Just to make sure we're talking about the same things here: Stefan, I think
> > with "sparray" you mean "an n-D sparse array implementation that lives in
> > SciPy", nothing more specific? In that case pydata/sparse is the one
> > implementation, and including it in scipy.sparse would make it "sparray".
> > I'm currently indeed leaning towards depending on pydata/sparse rather than
> > including it in scipy.
>
> I want to double check: when we last spoke, it seemed as though certain
> refactorings inside of SciPy (specifically, sparray was mentioned) would
> simplify the life of pydata/sparse devs. That no longer seems to be the
> case?

Hi! I can’t recall having said this; perhaps you inferred it from the docs (it’s on the front page, so that isn’t unreasonable). We should update that sometime.

That said, we use very little of scipy.sparse in PyData/Sparse. When Matt Rocklin was maintaining the project, that was the case, but even in the later days he shifted much of his code to pure NumPy. I followed that path further, not out of unwillingness to depend on it, but out of desire for generality.

In its current state, the only things in PyData/Sparse that depend on scipy.sparse are:
  • Conversion to/from scipy.sparse spmatrix classes
  • A bit of linear algebra, i.e. dot, tensordot, matmul.

Best Regards,
Hameer Abbasi



Re: Reminder: weekly status meeting

einstein.edison
In reply to this post by ralfgommers


On Saturday, Oct 27, 2018 at 6:11 AM, Ralf Gommers <[hidden email]> wrote:



> I don't think the constructors for sparse matrix/array care about C/F order. pydata/sparse is pure Python (and uses Numba). For reusing scipy.sparse.linalg and scipy.sparse.csgraph you're right I think that that will need some careful design work. Not sure anyone has thought about that in a lot of detail yet.


They don’t yet. That is a planned feature, allowing an arbitrary permutation of input coordinates.

> There are interesting API questions probably, such as how to treat explicit zeros (that debate still isn't settled for the matrix classes IIRC).


Explicit zeros are easier now: just use a `fill_value` of NaN and work with zeros as usual.
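
A rough sketch of the idea, using a toy coordinate-format container rather than the real library: with a NaN background, a stored zero is an ordinary value and cannot be confused with a missing entry. The `ToyCOO` class and its `todense` method are assumptions made up for this illustration and do not reproduce the pydata/sparse API.

```python
import math

# Toy coordinate-format container (hypothetical; not pydata/sparse) showing
# how the background value changes the meaning of a stored zero.

class ToyCOO:
    """Stores only the given entries; every other position is `fill_value`."""

    def __init__(self, shape, entries, fill_value=0.0):
        self.shape = shape
        self.entries = dict(entries)  # {(row, col): value}
        self.fill_value = fill_value

    def todense(self):
        n_rows, n_cols = self.shape
        return [
            [self.entries.get((i, j), self.fill_value) for j in range(n_cols)]
            for i in range(n_rows)
        ]


# Position (1, 1) holds a zero that was actually stored.
stored = {(0, 0): 1.5, (1, 1): 0.0}

zero_background = ToyCOO((2, 2), stored)
nan_background = ToyCOO((2, 2), stored, fill_value=math.nan)

print(zero_background.todense())  # stored zero looks like the background
print(nan_background.todense())   # stored zero stands out against NaN
```

With the default background the dense view cannot tell the stored zero at (1, 1) from the unstored positions; with a NaN background the zero is just another value and the unstored positions read as NaN.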

Best Regards,
Hameer Abbasi
