Strange behaviour of linalg.svd() and linalg.eigh()

Matthieu Brucher-2
Hi,

I've implemented classical multidimensional scaling (MDS) for the scikit learn using both functions. Their behaviour surprised me for "big" arrays (10000 by 10000, symmetric because it is a similarity matrix).
linalg.svd() raises a memory error because it tries to allocate a (7000000,) array (in fact, bigger than that!). This is strange because the test was run on 64-bit Linux, so memory should not have been a problem.
linalg.eigh() fails to diagonalize the matrix: it returns NaN, which is not very useful.
A direct optimization of the underlying cost function does give me an adequate solution.
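
For readers who have not seen the method, here is a minimal sketch of classical MDS built on linalg.eigh(). It is not the code from this post: the function name classical_mds, the use of a pairwise-distance matrix as input, and the small random test data are all assumptions made for illustration.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS from a symmetric matrix of pairwise distances.

    Hypothetical helper written for illustration only.
    """
    dist = np.asarray(dist, dtype=np.float64)
    sq = dist ** 2
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J,
    # where J = I - (1/n) * ones * ones^T.
    row_mean = sq.mean(axis=1, keepdims=True)
    col_mean = sq.mean(axis=0, keepdims=True)
    b = -0.5 * (sq - row_mean - col_mean + sq.mean())
    # eigh assumes a symmetric matrix and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by rounding before taking the root.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Small synthetic check: 50 points in the plane (assumed data, not the 10000 x 10000 matrix).
rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(dist, n_components=2)
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
max_err = np.max(np.abs(recovered - dist))
print("largest distance error:", max_err)
```

On Euclidean input like this, the embedding should reproduce the original distances up to rounding, which gives a quick sanity check before moving to a large matrix.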

I cannot attach the matrix file (more than 700 MB when pickled), but if anyone has a clue, I'd be glad to hear it.
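
The file size is consistent with a dense 10000 by 10000 array of 8-byte floats. The arithmetic below is a rough sketch that assumes float64 storage (the post does not state the dtype) and counts only the arrays themselves, not any LAPACK workspace.

```python
# Back-of-the-envelope memory estimate, assuming a dense float64 matrix.
n = 10_000
bytes_per_float64 = 8

matrix_bytes = n * n * bytes_per_float64
matrix_mib = matrix_bytes / 2**20

# A full SVD of a square matrix returns U and Vt, each the same shape as the input.
svd_outputs_bytes = 2 * matrix_bytes

print(f"input matrix: {matrix_bytes} bytes ({matrix_mib:.1f} MiB)")
print(f"U and Vt together: {svd_outputs_bytes} bytes")
```

Under that assumption the input alone takes 800,000,000 bytes, and the two square factors of a full SVD take twice that again.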

Matthieu
--
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: Strange behaviour of linalg.svd() and linalg.eigh()

Matthieu Brucher-2
Hi,

I tried Matlab's eig() function on the same matrix. It diagonalizes the matrix and gives a correct result, which is not the case for linalg.eigh().
Strange.
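
When eigh() returns NaN, a few cheap checks on the input can help narrow things down. The thread does not establish the cause, so this is only a sketch of possible checks; the helper name diagnose and the tolerance are hypothetical.

```python
import numpy as np


def diagnose(matrix, tol=1e-10):
    """Report basic properties that eigh() relies on (hypothetical helper)."""
    a = np.asarray(matrix)
    finite = bool(np.isfinite(a).all())
    return {
        "dtype": str(a.dtype),
        "all_finite": finite,
        # Symmetry is only meaningful when every entry is finite.
        "symmetric": bool(finite and np.allclose(a, a.T, atol=tol)),
    }


# Tiny stand-ins for the real matrix (assumed data).
good = np.array([[2.0, 1.0], [1.0, 2.0]])
bad = good.copy()
bad[0, 1] = np.nan

report_good = diagnose(good)
report_bad = diagnose(bad)
print(report_good)
print(report_bad)

# eigvalsh computes only the eigenvalues of a symmetric matrix.
eigvals = np.linalg.eigvalsh(good)
print(eigvals)
```

If the input passes these checks and the result still contains NaN, the problem more likely lies in the underlying LAPACK build than in the data.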

Matthieu
