IPCA did not converge, numpy.linalg.LinAlgError: SVD did not converge #15996
Comments
What problem are you trying to solve with such large matrices? |
Yes, I understand it is a large matrix. In our problem the number of features is dynamic and can be very large in some cases (as discussed above). We need to find the principal features, and we are okay with a lower explained variance ratio here. We are using IncrementalPCA for memory optimization. Do you think IPCA is not suitable for a large number of features? I do not understand why it gives a convergence issue; it should be able to return a feature set with a lower variance ratio. IncrementalPCA does not have an option for the variance ratio. The LAPACK implementation should be triggered if it does not converge. |
One thing worth looking at is whether the approach you are using can be simplified, that is, whether the algorithm can be improved. The large array is somewhat suspicious in that regard. That is why I was asking for more details on what you were doing. |
Yes. The original data is this large. We need to run clustering on it, and before passing it to the clustering algorithm we apply principal component analysis. Also, are you planning to add more options to IncrementalPCA for the SVD solver ({'auto', 'full', 'arpack', 'randomized'})? |
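As an aside on the solver options mentioned above: scikit-learn's plain PCA already exposes an svd_solver parameter, including 'randomized', which is often a memory-friendlier choice when only a few components are needed from a wide matrix. A minimal sketch (not from the thread; the array here is a small stand-in for the real 18000-column data):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 300))  # stand-in for the real, much larger data

# Randomized SVD avoids a full decomposition when n_components is small.
pca = PCA(n_components=20, svd_solver="randomized", random_state=0)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (300, 20)
```

Whether this sidesteps the LAPACK convergence failure depends on the data, but it is a cheap experiment before filing against scikit-learn.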
Incremental PCA is a scikit-learn thing, not numpy. My naive thought is that incremental may not be the best approach here. What I am curious about is how the matrix is produced. |
Thanks for the reply. Sorry, I just realized I am on the NumPy GitHub; I should be asking this on scikit-learn. |
Closing as it seems the conversation has moved to another forum. Feel free to reopen if I've missed something. |
Incremental PCA is consistently giving a convergence issue with a DataFrame of shape (18000, 18000).
Reproducing code example:
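The original script was not included in the report; the following is a hypothetical reconstruction from the traceback below (variable names `ipca` and `df_data` come from the traceback, while the sizes, `n_components`, and `batch_size` are assumptions scaled down from the reported (18000, 18000) data):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
# Small stand-in for the reported (18000, 18000) DataFrame.
df_data = pd.DataFrame(rng.standard_normal((500, 500)))

# batch_size and n_components are illustrative guesses.
ipca = IncrementalPCA(n_components=50, batch_size=100)
data_ipca = ipca.fit_transform(df_data)
print(data_ipca.shape)  # (500, 50)
```

At the reported scale, the per-batch call to `linalg.svd` in `partial_fit` is where the LAPACK solver fails to converge.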
Error message:
Traceback:
Traceback (most recent call last):
File "ipca_script.py", line 8, in <module>
data_ipca = ipca.fit_transform(df_data)
File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/sklearn/base.py", line 553, in fit_transform
return self.fit(X, **fit_params).transform(X)
File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/sklearn/decomposition/incremental_pca.py", line 201, in fit
self.partial_fit(X[batch], check_input=False)
File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/sklearn/decomposition/incremental_pca.py", line 279, in partial_fit
U, S, V = linalg.svd(X, full_matrices=False)
File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/scipy/linalg/decomp_svd.py", line 132, in svd
numpy.linalg.LinAlgError: SVD did not converge
Numpy/Python version information:
NumPy 1.17.4; Python 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]