lstsq underdetermined behaviour

lstsq underdetermined behaviour

Romesh Abeysuriya
Hi all,

I'm solving an underdetermined system using `numpy.linalg.lstsq` and
trying to track down its behavior for underdetermined systems. In
previous versions of numpy (e.g. 1.14) in `linalg.py` the definition
for `lstsq` calls `dgelsd` for real inputs, which I think means that
the underdetermined system is solved with the minimum-norm solution
(that is, minimizing the norm of the solution vector, in addition to
minimizing the residual). In 1.15 the call is instead to
`_umath_linalg.lstsq_m` and I'm not sure what this actually ends up
doing - does this end up being the same as `dgelsd`? If so, it would
be great if the documentation for `numpy.linalg.lstsq` stated that it
is returning the minimum-norm solution (as it stands, it reads as
undefined, so in theory I don't think one can rely on any particular
solution being returned for an underdetermined system).

Cheers,
Romesh
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: lstsq underdetermined behaviour

Eric Wieser
On Sun, 18 Nov 2018 at 20:04 Romesh Abeysuriya <[hidden email]> wrote:
> In 1.15 the call is instead to `_umath_linalg.lstsq_m` and I'm not sure what this actually ends up doing - does this end up being the same as `dgelsd`?

When the arguments are real, yes. What changed is that the dispatching now happens in C, which was done as a step towards the incomplete https://github.com/numpy/numpy/issues/8720.
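
For anyone who wants to check this empirically rather than read the C source, here is a minimal sketch. It does not prove which LAPACK routine is called; it only compares the output of `numpy.linalg.lstsq` on a small underdetermined real system against two independent ways of computing the minimum-norm solution. The matrix shape, the seed and the variable names are my own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # 3 equations, 5 unknowns: underdetermined
b = rng.standard_normal(3)

# Solution returned by lstsq (rcond=None selects the machine-precision default)
x_lstsq, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# Minimum-norm solution via the pseudoinverse
x_pinv = np.linalg.pinv(A) @ b

# Minimum-norm solution via the closed form A^T (A A^T)^{-1} b,
# which is valid here because A has full row rank
x_formula = A.T @ np.linalg.solve(A @ A.T, b)

print(np.allclose(x_lstsq, x_pinv))     # lstsq agrees with the pseudoinverse
print(np.allclose(x_lstsq, x_formula))  # and with the closed form
print(np.allclose(A @ x_lstsq, b))      # the residual is zero
```

If all three checks print True on your installation, the returned vector is the minimum-norm one for this input, which is consistent with the `dgelsd` behaviour described above.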

I'm not an expert - but aren't "minimum norm" and "least squares" two ways to state the same thing?

Eric

Re: lstsq underdetermined behaviour

Charles R Harris


On Sun, Nov 18, 2018 at 9:24 PM Eric Wieser <[hidden email]> wrote:
> I'm not an expert - but aren't "minimum norm" and "least squares" two ways to state the same thing?

If there aren't enough data points to uniquely determine the minimizing solution, the solution vector of shortest length is returned. In practice it is pretty useless because it depends on the column scaling and there is generally no natural metric in the solution space.
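
The column-scaling dependence can be seen in a tiny example. This is only a sketch with a made-up one-equation system: the same equation is solved once as written and once with the second variable expressed in different units (a factor of 10, chosen arbitrarily), and the two minimum-norm answers are then compared in the original units.

```python
import numpy as np

# One equation, two unknowns: x1 + x2 = 2
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x)  # minimum-norm solution: [1, 1]

# Change units of the second variable: x2 = 10 * y2, so the column scales by 10
S = np.diag([1.0, 10.0])
A_scaled = A @ S                       # [[1, 10]]
y, *_ = np.linalg.lstsq(A_scaled, b, rcond=None)

# Convert the rescaled answer back to the original variables
x_from_scaled = S @ y
print(x_from_scaled)                   # approximately [0.0198, 1.9802]

# Both vectors satisfy the original equation exactly, but they differ
print(np.allclose(A @ x, b), np.allclose(A @ x_from_scaled, b))
```

Both results solve the same equation; which one counts as "shortest" depends entirely on the units chosen for each column.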

<snip>

Chuck 

Re: lstsq underdetermined behaviour

Romesh Abeysuriya
Thanks both! Yes, I guess it's typically 'least squares' referring to
the residual vector, and 'minimum norm' referring to the solution
vector. That's certainly how the documentation for `dgelsd` frames it.
In my case, the minimum norm solution can be sensibly interpreted (and
in particular, it guarantees that the solution is 0 for missing
variables), so it's great to know that I can rely on this being
returned.
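
To make the distinction concrete, here is a small sketch. I am assuming that a "missing variable" means a column of the matrix that is entirely zero, which is my reading and not something spelled out above; the numbers are invented. Any value for that variable gives the same (zero) residual, so every such vector is a least-squares solution, but only the one with a zero in that position has minimum norm.

```python
import numpy as np

# The third column is all zeros: that variable never appears in any equation
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 0.0]])
b = np.array([3.0, 1.0])

x_min, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_min)  # approximately [1, 1, 0]

# Another exact solution: shift along the null space of A
x_other = x_min + np.array([0.0, 0.0, 5.0])

# Same residual (both are least-squares solutions) ...
print(np.linalg.norm(A @ x_min - b), np.linalg.norm(A @ x_other - b))
# ... but different solution norms (only x_min is minimum-norm)
print(np.linalg.norm(x_min), np.linalg.norm(x_other))
```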

Cheers,
Romesh



On Mon, Nov 19, 2018 at 12:30 PM Charles R Harris <[hidden email]> wrote:
> If there aren't enough data points to uniquely determine the minimizing solution, the solution vector of shortest length is returned. In practice it is pretty useless because it depends on the column scaling and there is generally no natural metric in the solution space.