Hi everyone, I was lurking on the "deprecating numpy.matrix" discussion in the other thread, and I figured the mailing list might be a better place for something I've been meaning to ask the dev members. I thought mailing lists were something we dumped together with ICQ and GeoCities, but apparently not :-)

The "assume_a" keyword introduced here is hopefully modular enough that, should there be any need for more structures, we can simply keep adding to the list without breaking backwards compatibility. At the very least it offers more options than what we currently have.

The part that I would like to discuss requires a bit of an intro, so please bear with me. Let me copy/paste the relevant part from the old PR: in many places online we can witness rants about numpy/scipy not letting users know about the conditioning, for example Mike Croucher's blog and numpy/numpy#3755. Since we don't have any central backslash function that optimizes
depending on the input data, should we create a function (let's name it after the matlab equivalent for the time being)?

I'm sure you are aware, but just for completeness: linear equation solvers are often built around the concept of a polyalgorithm, which is a fancy way of saying that the array is tested consecutively for certain structures, with the checks ordered so that the simpler structures are tested earlier. E.g., first check for a diagonal matrix, then upper/lower triangular, then permuted triangular, then symmetric, and so on. Here is another example from Advanpix: http://www.advanpix.com/2016/

Now, according to what I have coded and optimized as much as I can, a pure-Python implementation of these checks adds unacceptable overhead. It would definitely be a noticeable slowdown if this were bolted onto the existing linalg.solve; however, I think it is certainly doable at the low-level C/FORTRAN level. Cython is certainly faster than pure Python, but I think only a straight C/FORTRAN implementation would cut it. Note that we only need the discovery of the structure; we can then pass the array to the dedicated solver as is. Hence I'm not saying that we should recode the existing solve functionality: we already have the branching in place to the ?GE/SY/HE/POSVX routines.

-------

The second issue concerns the current linalg.solve function when trying to solve for the right inverse, e.g. xA = b. Again with some copy/paste: right inversion is currently a bit annoying. If we would like to compute, say, BA^{-1}, then the user has to explicitly transpose the equation to avoid using an explicit inv (whose use should be discouraged anyway): x = scipy.linalg.solve(A.T, B.T).T. Since the expert drivers come with a trans switch that can internally decide whether to solve the transposed or the regular equation, these routines can avoid the A.T off the shelf.
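As a sketch of the two points above, here is the double-transpose workaround and the assume_a keyword in action (the matrices are made-up, well-conditioned illustrations):

```python
import numpy as np
from scipy.linalg import solve

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)  # well-conditioned test matrix
B = rng.standard_normal((2, 4))

# Right inversion x A = B currently needs explicit transposes on the user side:
x = solve(A.T, B.T).T
print(np.allclose(x @ A, B))  # → True

# With the assume_a keyword the symmetric ?SYSV driver can be selected directly:
S = A + A.T  # symmetrize so assume_a='sym' is a valid assumption
y = solve(S, B.T, assume_a='sym')
print(np.allclose(S @ y, B.T))  # → True
```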
I am wondering what might be the best way to add an "r_inv" keyword such that the B.T is also handled at the FORTRAN level, so the user can simply write solve(A, B, r_inv=True). Since we don't have a backslash operation, we could at least provide this much convenience, I guess. I would love to have a go at it, but I'm definitely not competent enough in C/FORTRAN at the production level, so I was wondering whether I could get some help with this.

Anyway, I hope I have made my point with a rather lengthy post. Please let me know if this is a plausible feature.

ilhan

PS: In case gmail links aren't parsed, here are the inline links:
SO thread: http://stackoverflow.com/
linsolve/mldivide page: http://nl.mathworks.com/help/
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, Jan 9, 2017 at 4:17 AM, Ilhan Polat <[hidden email]> wrote:
Note that you're proposing a new scipy feature (right?) on the numpy list...

This sounds like a good idea to me. As a former heavy Matlab user, I remember a lot of things to dislike, but the "\" behavior was quite nice.
How much is a noticeable slowdown? Note that we still have the current interfaces available for users who know what they need, so a nice convenience function that is, say, 5-10% slower would not be the end of the world.

Ralf
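To put a rough number on that question, here is a back-of-the-envelope measurement of NumPy-level structure probes against a full dense solve. The probe set is only illustrative and the ratio is machine- and size-dependent:

```python
import timeit

import numpy as np
from scipy.linalg import solve

n = 500
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned
b = rng.standard_normal(n)

def probes(A):
    # Illustrative checks: strictly-lower zero, strictly-upper zero, symmetry.
    np.any(np.tril(A, -1))
    np.any(np.triu(A, 1))
    np.array_equal(A, A.T)

t_solve = min(timeit.repeat(lambda: solve(A, b), number=5, repeat=3))
t_probe = min(timeit.repeat(lambda: probes(A), number=5, repeat=3))
print(f"probes cost {100 * t_probe / t_solve:.1f}% of a dense solve")
```

The O(n^2) probes shrink relative to the O(n^3) factorization as n grows, which is why the overhead question depends strongly on the matrix size.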
On Mon, Jan 9, 2017 at 6:27 AM, Ilhan Polat <[hidden email]> wrote:
All these checks sound a bit expensive if we almost always have completely unstructured arrays that don't satisfy any special matrix pattern.

In analogy to the type proliferation in Julia for handling those cases: is there a way to attach information to numpy arrays that, for example, signals that a 2-D array is hermitian, banded, diagonal, or ...?

(After a second thought: maybe completely unstructured is not too expensive to detect if the checks are short-circuited: one off-diagonal element nonzero means not diagonal, two opposite off-diagonal elements different means not symmetric, ...)

Josef
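That short-circuiting idea might look like the following in Python. The function name, return labels, and check ordering are a sketch, not an existing scipy API, and np.any/np.array_equal stop as soon as NumPy's internal reduction finds a counterexample:

```python
import numpy as np

def detect_structure(A):
    """Cheap structure probe for a square 2-D array (illustrative only).

    Returns one of 'diagonal', 'upper_triangular', 'lower_triangular',
    'symmetric', 'hermitian', or 'generic'.
    """
    lower_zero = not np.any(np.tril(A, -1))  # everything below the diagonal zero?
    upper_zero = not np.any(np.triu(A, 1))   # everything above the diagonal zero?
    if lower_zero and upper_zero:
        return 'diagonal'
    if lower_zero:
        return 'upper_triangular'
    if upper_zero:
        return 'lower_triangular'
    if np.array_equal(A, A.T):
        return 'symmetric'
    if np.iscomplexobj(A) and np.array_equal(A, A.conj().T):
        return 'hermitian'
    return 'generic'

print(detect_structure(np.diag([1.0, 2.0, 3.0])))       # → diagonal
print(detect_structure(np.triu(np.ones((3, 3)))))       # → upper_triangular
print(detect_structure(np.array([[2.0, 1.0],
                                 [1.0, 2.0]])))         # → symmetric
print(detect_structure(np.array([[1.0, 2.0],
                                 [3.0, 4.0]])))         # → generic
```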
Indeed, generic is the cheapest structure to discover, even in the worst case where only the last off-diagonal element checked is nonzero. In pseudocode: first remove the diagonal and check the remaining parts for nonzeros, then check the upper triangle, then the lower, then morally-triangular structure from the zero pattern if any, then bandedness, and so on.

If you have access to matlab, you can set the sparse monitor to verbose mode with spparms('spumoni', 1) and perform a backslash operation on sparse matrices. It will spit out what it does during the checks:

A = sparse([0 2 0 1 0; 4 -1 -1 0 0; 0 0 0 3 -6; -2 0 0 0 2; 0 0 4 2 0]);
B = sparse([8; -1; -18; 8; 20]);
spparms('spumoni', 1)
x = A\B

So every test in the polyalgorithm is cheaper than the next one. I'm not exactly sure what might be the best strategy yet, hence the question. It's really interesting that LAPACK doesn't have this type of fast check.

On Mon, Jan 9, 2017 at 8:30 PM, <[hidden email]> wrote:
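For reference, the matlab snippet above has a direct scipy counterpart, minus the structure reporting (scipy's sparse solver has no spumoni-style verbose switch):

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

# Same system as the matlab example
A = csc_matrix(np.array([[ 0.,  2.,  0.,  1.,  0.],
                         [ 4., -1., -1.,  0.,  0.],
                         [ 0.,  0.,  0.,  3., -6.],
                         [-2.,  0.,  0.,  0.,  2.],
                         [ 0.,  0.,  4.,  2.,  0.]]))
b = np.array([8., -1., -18., 8., 20.])

x = spsolve(A, b)
print(np.allclose(x, [1.0, 2.0, 3.0, 4.0, 5.0]))  # → True
```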
On Mon, Jan 9, 2017 at 5:09 PM, Ilhan Polat <[hidden email]> wrote:

> So every test in the polyalgorithm is cheaper than the next one. I'm not exactly sure what might be the best strategy yet, hence the question. It's really interesting that LAPACK doesn't have this type of fast check.

In Fortran LAPACK, if you have a specially structured matrix, you usually use the explicit packed storage and call the appropriate function type on it. It's only when you move to a system that has only a generic, unstructured dense matrix data type that it makes sense to do those kinds of checks.

Robert Kern
Yes, that's precisely the case, but once we know the structure we can just choose the appropriate solver anyhow, with a little bit of overhead. What I mean is that, to my knowledge, FORTRAN routines for checking for triangularness etc. are absent.

On Tue, Jan 10, 2017 at 2:29 AM, Robert Kern <[hidden email]> wrote:
On Mon, Jan 9, 2017 at 7:10 PM, Ilhan Polat <[hidden email]> wrote:
> Yes, that's precisely the case, but once we know the structure we can just choose the appropriate solver anyhow, with a little bit of overhead. What I mean is that, to my knowledge, FORTRAN routines for checking for triangularness etc. are absent.

I'm responding to that. The reason they don't have FORTRAN routines for testing for structure inside a generic dense matrix is that in FORTRAN it's more natural (and efficient) to just use the explicit packed structure and the associated routines instead. You would only use a generic dense matrix if you know that there isn't structure in the matrix. So there are no routines for detecting structure in generic dense matrices.

--
Robert Kern
I've done some benchmarking, and it seems that packed storage comes with a runtime penalty, which agrees with a few links I've found online:

https://blog.debroglie.net/2013/09/01/lapack-and-packed-storage/
http://stackoverflow.com/questions/8941678/lapack-are-operations-on-packed-storage-matrices-faster

"The access of individual elements in packed stored matrices is expected to be more costly than in full storage, because of the more complicated indexing necessary."

Hence, I am not sure the absence of these checks is justified simply by the existence of dedicated solvers for prescribed structures. The existence of these polyalgorithms in matlab, and their absence in LAPACK, should not imply that FORTRAN users always know the structure of their matrices. I will also ask on the LAPACK message board for some context.

On Tue, Jan 10, 2017 at 4:16 AM, Robert Kern <[hidden email]> wrote:
I agree that this seems more like a scipy feature than a numpy feature. Users with structured matrices often use a sparse matrix format, though the API for using them in solvers could use some work. (I have a work-in-progress PR along those lines here: https://github.com/scipy/scipy/pull/6331)

Perhaps this polyalgorithm approach could be used to dispatch sparse matrices to the appropriate solver, while optionally checking dense matrices for structure before dispatching them as well. Usage might look like:

# if A is sparse, use scipy.sparse.linalg.solve, otherwise use scipy.linalg.solve
scipy.linalg.generic_solve(A, b)

# converts A to banded representation and calls scipy.linalg.solveh_banded,
# regardless of whether A is sparse or dense
scipy.linalg.generic_solve(A, b, symmetric=True, banded=(-5, 5))

# runs possibly-expensive checks, then dispatches to the appropriate solver
scipy.linalg.generic_solve(A, b, detect_structure=True)

(I'm not advocating for "generic_solve" as the final name, I just needed a placeholder.)

On Tue, Jan 10, 2017 at 4:58 AM, Ilhan Polat <[hidden email]> wrote:
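As a concrete illustration of the banded case above, here is how a symmetric banded system maps onto the existing scipy.linalg.solveh_banded, which such a dispatcher would call under the hood (the tridiagonal matrix is just a made-up example):

```python
import numpy as np
from scipy.linalg import solve, solveh_banded

n = 6
# Symmetric positive-definite tridiagonal matrix (1-D Poisson operator)
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.arange(1.0, n + 1)

# Generic dense solve, no structure exploited:
x_dense = solve(A, b)

# Banded (upper) storage: row 0 holds the superdiagonal, row 1 the main diagonal
ab = np.zeros((2, n))
ab[0, 1:] = -1.0
ab[1, :] = 2.0
x_banded = solveh_banded(ab, b)

print(np.allclose(x_dense, x_banded))  # → True
```

The banded path does O(n) work per right-hand side instead of O(n^3), which is the payoff a structure-detecting dispatcher would be buying with its check overhead.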