I am trying to fit a biclustering model, but it fails with an error saying the array contains infs and NaNs, even though I scanned the data with pd.isnull(DataFile).sum().
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from sklearn.datasets import samples_generator as sg
from sklearn.cluster.bicluster import SpectralCoclustering
from sklearn.metrics import consensus_score
DataFile=pd.read_csv("DatafilledProp.csv",sep='\t')
DataFile.drop(DataFile.columns[[0, 1]], axis=1, inplace=True)
plt.matshow(DataFile.as_matrix(), cmap=plt.cm.Blues)
plt.title("Original TransMapping")
data, row_idx, col_idx = sg._shuffle(DataFile.as_matrix(), random_state=0)
plt.matshow(data, cmap=plt.cm.Blues)
plt.title("Shuffled dataset")
plt.show()
Features=DataFile.values
model = SpectralCoclustering(n_clusters=10, random_state=0)
model.fit(Features)
This is the error I get:
  File "C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.1\visualstudio_py_util.py", line 106, in exec_file
    exec_code(code, file, global_variables)
  File "C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.1\visualstudio_py_util.py", line 82, in exec_code
    exec(code_obj, global_variables)
  File "D:\ClusteringDemo\DataPreparation.py\DataPreparation.py\Model.py", line 19, in <module>
    model.fit(Features)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\sklearn\cluster\bicluster\spectral.py", line 126, in fit
    self._fit(X)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\sklearn\cluster\bicluster\spectral.py", line 275, in _fit
    u, v = self._svd(normalized_data, n_sv, n_discard=1)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\sklearn\cluster\bicluster\spectral.py", line 139, in _svd
    **kwargs)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\sklearn\utils\extmath.py", line 299, in randomized_svd
    Q = randomized_range_finder(M, n_random, n_iter, random_state)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\sklearn\utils\extmath.py", line 226, in randomized_range_finder
    Q, R = linalg.qr(Y, mode='economic')
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\scipy\linalg\decomp_qr.py", line 127, in qr
    a1 = numpy.asarray_chkfinite(a)
  File "C:\Users\vinay.sawant\AppData\Local\Continuum\Anaconda\lib\site-packages\numpy\lib\function_base.py", line 613, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
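One thing worth checking before looking further: pd.isnull only flags missing values (NaN/None), so with pandas' default settings it does not count +inf or -inf. The sketch below counts both kinds of non-finite entries per column. The helper name non_finite_report and the small demo frame are hypothetical stand-ins for the real contents of DatafilledProp.csv, and a clean report would not rule out non-finite values being produced later inside the model.

```python
import numpy as np
import pandas as pd


def non_finite_report(df):
    """Count NaN and +/-inf entries per column of a numeric DataFrame."""
    values = df.to_numpy(dtype=float)
    return pd.DataFrame(
        {
            "nan": np.isnan(values).sum(axis=0),
            "inf": np.isinf(values).sum(axis=0),
        },
        index=df.columns,
    )


# Hypothetical stand-in for the data loaded from the CSV file.
demo = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [0.0, 2.0, np.nan]})

report = non_finite_report(demo)
print(report)

# pd.isnull sees the NaN in column "b" but not the inf in column "a".
isnull_total = int(pd.isnull(demo).sum().sum())
print("entries flagged by pd.isnull:", isnull_total)

# np.isfinite is False for NaN and for +/-inf, so it covers both cases.
all_finite = bool(np.isfinite(demo.to_numpy(dtype=float)).all())
print("all entries finite:", all_finite)
```

If the report shows only zeros for the real data, the input itself is finite and the problem lies elsewhere.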
Best answer
This question has already been answered here: https://stackoverflow.com/a/42764378/2649309
It could be a problem with the PCA implementation in scikit-learn 0.18.1.
See the bug report: https://github.com/scikit-learn/scikit-learn/issues/7568
The workaround described there is to use PCA with svd_solver='full'. So try this code:
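from sklearn.decomposition import PCA
# lm is the final estimator, defined elsewhere in the linked answer
pipe = [('pca', PCA(whiten=True, svd_solver='full')),
        ('clf', lm)]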
I was able to solve my problem with this.
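For reference, the quoted workaround can be sketched as a complete pipeline. This is a minimal illustration under stated assumptions, not the code from the linked answer: the synthetic data, the choice of LinearRegression for lm, and the variable names are all hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

# Synthetic data, used only so the example runs on its own.
rng = np.random.RandomState(0)
X = rng.rand(50, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * rng.randn(50)

# Assumption: lm is some linear model; the quoted snippet does not define it.
lm = LinearRegression()

# svd_solver='full' asks PCA for an exact LAPACK-based SVD
# instead of the randomized solver.
steps = [("pca", PCA(whiten=True, svd_solver="full")), ("clf", lm)]
model = Pipeline(steps)
model.fit(X, y)

predictions = model.predict(X)
print(predictions.shape)
print(bool(np.isfinite(predictions).all()))
```

Note that the quoted answer concerns PCA, while the question uses SpectralCoclustering. That estimator has its own svd_method parameter ('randomized' or 'arpack'); whether switching it avoids the error in the question is not confirmed by this thread.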
Regarding "python - ValueError: array must not contain infs or NaNs during Biclustering", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/35774081/