
python - Badly formatted matrix when using hstack?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:02:01

I have the following matrices:

>>> X1
shape: (2399, 39999)
type: scipy.sparse.csr.csr_matrix

>>> X2
shape: (2399, 333534)
type: scipy.sparse.csr.csr_matrix

>>> X3.reshape(-1, 1)
shape: (2399, 1)
type: <class 'numpy.ndarray'>

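How can I concatenate X1, X2, and X3 side by side (column-wise) to produce a new matrix of shape (2399, 373534), i.e. 39999 + 333534 + 1 columns? I know this can be done with scipy's hstack (vstack is the row-wise counterpart). However, when I try: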

X_combined = sparse.hstack([X1,X2,X3.T])

I get an error instead of the combined matrix:

ValueError: all the input array dimensions except for the concatenation axis must match exactly

So how can I correctly concatenate them into a single matrix?

Update

from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer(min_df=5)
X1 = count_vect.fit_transform(X)

from sklearn.feature_extraction.text import TfidfVectorizer
tdidf_vect = TfidfVectorizer()
X2 = tdidf_vect.fit_transform(X)

from hdbscan import HDBSCAN
clusterer = HDBSCAN().fit(X1)
X3 = clusterer.labels_
print(X3.shape)
print(type(X3))

Then:

In:

import scipy as sparse

X_combined = sparse.hstack([X1,X2,X3.reshape(-1,1)])

Out:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-14baa47e0993> in <module>()
5
6
----> 7 X_combined = sparse.hstack([X1,X2,X3.reshape(-1,1)])

/usr/local/lib/python3.5/site-packages/numpy/core/shape_base.py in hstack(tup)
284 # As a special case, dimension 0 of 1-dimensional arrays is "horizontal"
285 if arrs[0].ndim == 1:
--> 286 return _nx.concatenate(arrs, 0)
287 else:
288 return _nx.concatenate(arrs, 1)

ValueError: all the input arrays must have same number of dimensions

Best answer

The problem is your import. It should be

from scipy import sparse

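The top-level scipy module (which you generally should not use directly) re-exports NumPy functions, so with your import the name sparse.hstack actually resolves to NumPy's hstack, which does not know how to handle sparse matrices: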

>>> import scipy as sparse
>>> sparse.hstack
<function numpy.core.shape_base.hstack>

>>> # incorrect! Correct would be

>>> from scipy import sparse
>>> sparse.hstack
<function scipy.sparse.construct.hstack>
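A quick way to confirm which implementation a name resolves to is to look at the function's `__module__` attribute. The following is a minimal sketch; the exact module strings printed depend on the installed NumPy and SciPy versions, so they are not shown here.

```python
import numpy as np
from scipy import sparse

# Where does each hstack actually live? The exact strings vary by version,
# but the sparse one should sit under "scipy.sparse" and the other under "numpy".
print(sparse.hstack.__module__)
print(np.hstack.__module__)

# The two are different functions: only scipy.sparse.hstack understands
# sparse matrices.
print(sparse.hstack is np.hstack)  # False
```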

This is all mentioned in their documentation:

The scipy namespace itself only contains functions imported from numpy. These functions still exist for backwards compatibility, but should be imported from numpy directly.

Everything in the namespaces of scipy submodules is public. In general, it is recommended to import functions from submodule namespaces.
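Putting the fix together, here is a small self-contained sketch of the intended concatenation. The tiny matrices are made-up stand-ins for the question's X1 (counts), X2 (tf-idf) and X3 (cluster labels), not the real data. Converting the label column to a sparse matrix before stacking is an extra, optional step taken here so that every block has the same type.

```python
import numpy as np
from scipy import sparse  # the submodule, not "import scipy as sparse"

# Hypothetical stand-ins for the question's data: 3 rows instead of 2399.
X1 = sparse.csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [4, 0, 0]]))
X2 = sparse.csr_matrix(np.array([[0.5, 0.0], [0.0, 1.5], [0.0, 0.0]]))
X3 = np.array([0, -1, 1])  # 1-D label vector, like clusterer.labels_

# Turn the labels into an (n_rows, 1) sparse column so all blocks share
# the same number of rows and the same container type.
X3_col = sparse.csr_matrix(X3.reshape(-1, 1))

# Column-wise concatenation: 3 + 2 + 1 = 6 columns.
X_combined = sparse.hstack([X1, X2, X3_col], format="csr")
print(X_combined.shape)  # (3, 6)
```

Note that the labels are reshaped to a column with reshape(-1, 1). Transposing a 1-D array with .T, as in the first attempt in the question, leaves its shape unchanged and does not produce a column.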

Regarding "python - Badly formatted matrix when using hstack?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43075235/
