Hi,

I hope this is the appropriate place to ask something like

this, otherwise please let me know (or feel free to ignore

this). Also I hope that I do not misunderstood something or

did some silly mistake. If so, please let me know as well!

TLDR:

When scaling the covariance matrix based on the residuals,

scipy.optimize.curve_fit uses a factor of chisq(popt)/(M-N)

(with M=number of point, N=number of parameters) and

numpy.polyfit uses chisq(popt)/(M-N-2). I am wondering which

is correct.

I am somewhat confused about different results I am getting

for the covariance matrix of a simple linear fit, when

comparing `scipy.optimize.curve_fit` and `numpy.polyfit`. I

am aware, that `curve_fit` solves the more general non-linear

problem numerically, while `polyfit` finds an analytical

solution to the linear problem. However, both converge to the

same solution, so I suspect that this difference is not

important here. The difference, I am curious about is not in

the returned parameters but in the estimate of the

corresponding covariance matrix. As I understand, there are

two different ways to estimate it, based either on the

absolute values of the provided uncertainties or by

interpreting those only as weights and then scaling the

matrix to produce an appropriate reduced chisq. To that end,

curve_fit has the parameter:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html:

"absolute_sigma : bool, optional

If True, sigma is used in an absolute sense and the

estimated parameter covariance pcov reflects these absolute

values.

If False, only the relative magnitudes of the sigma values

matter. The returned parameter covariance matrix pcov is

based on scaling sigma by a constant factor. This constant

is set by demanding that the reduced chisq for the optimal

parameters popt when using the scaled sigma equals unity. In

other words, sigma is scaled to match the sample variance of

the residuals after the fit. Mathematically,

pcov(absolute_sigma=False) = pcov(absolute_sigma=True) *

chisq(popt)/(M-N)"

https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.htmlon the other hand, does not say anything about how the

covariance matrix is estimated. To my understanding, its

default should correspond to `absolute_sigma=False` for

`curve_fit`. As `polyfit` has a weight parameter instead of

an uncertainty parameter, I guess the difference in default

behavior is not that surprising.

However, even when specifying `absolute_sigma=False`,

`curve_fit` and `polyfit` produce different covariance

matrices as the applied scaling factors are

chisq(popt)/(M-N-2) for `polyfit`

(

https://github.com/numpy/numpy/blob/6a58e25703cbecb6786faa09a04ae2ec8221348b/numpy/lib/polynomial.py#L598-L605)

and chisq(popt)/(M-N) for `curve_fit`

(

https://github.com/scipy/scipy/blob/607a21e07dad234f8e63fcf03b7994137a3ccd5b/scipy/optimize/minpack.py#L781-L782).

The argument given in a comment to the scaling `polyfit` is:

"Some literature ignores the extra -2.0 factor in the

denominator, but it is included here because the covariance

of Multivariate Student-T (which is implied by a Bayesian

uncertainty analysis) includes it. Plus, it gives a slightly

more conservative estimate of uncertainty.",

but honestly, in a quick search, I was not able to find any

literature not ignoring the extra "factor". But obviously, I

could very well be misunderstanding something.

Nonetheless, as `curve_fit` ignores it as well, I was

wondering whether those two shouldn't give consistent

results and if so, which would be the correct solution.

Best,

Jonathan

_______________________________________________

NumPy-Discussion mailing list

[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion