
python - MemoryError in Python when using sklearn

Reposted. Author: 行者123. Updated: 2023-12-01 09:31:23

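I am trying to test my logistic regression model, but I get a memory error and cannot resolve it. Is it because my sentences take up too much space? I would appreciate any help.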

From line 267 of my code:

self.X, self.y = self.transform_to_dataset(training_sentences,_pos__sentences)
self.clf = Pipeline([
    ('vectorizer', DictVectorizer(sparse=False)),
    ('classifier', LogisticRegression())])
self.clf.fit(self.X, self.y)

The error I get after running this:

Traceback (most recent call last):
  File "tagger_lr_chunk.py", line 342, in <module>
    tagger.train(data_dir + 'train.txt')
  File "tagger_lr_chunk.py", line 271, in train
    self.clf.fit(self.X, self.y)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.py", line 248, in fit
    Xt, fit_params = self._fit(X, y, **fit_params)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.py", line 213, in _fit
    **fit_params_steps[name])
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py", line 362, in __call__
    return self.func(*args, **kwargs)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.py", line 581, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 230, in fit_transform
    return self._transform(X, fitting=True)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 204, in _transform
    result_matrix = result_matrix.toarray()
  File "/home/selub/anaconda2/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 943, in toarray
    out = self._process_toarray_args(order, out)
  File "/home/selub/anaconda2/lib/python2.7/site-packages/scipy/sparse/base.py", line 1130, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError

Best answer

I solved this memory problem by changing the DictVectorizer parameter so that it produces a scipy.sparse matrix instead of a dense array:

self.X, self.y = self.transform_to_dataset(training_sentences,_pos__sentences)
self.clf = Pipeline([
    # sparse=True (the default) keeps the features as a scipy.sparse matrix
    ('vectorizer', DictVectorizer(sparse=True)),
    ('classifier', LogisticRegression())])
self.clf.fit(self.X, self.y)
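
The traceback shows that with sparse=False the vectorizer calls toarray(), which allocates a full rows x columns array through np.zeros. The following is a minimal sketch of the sparse variant on toy data. The feature dicts, labels, and variable names here are hypothetical stand-ins for whatever transform_to_dataset returns, which the question does not show. It also estimates how many bytes the dense array would need, which is a quick way to check whether a dense conversion could fit in memory at all.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy data: one feature dict and one tag per token.
X = [
    {"word": "the", "prev_word": "<s>"},
    {"word": "cat", "prev_word": "the"},
    {"word": "sat", "prev_word": "cat"},
    {"word": "Paris", "prev_word": "in"},
]
y = ["DT", "NN", "VBD", "NNP"]

# Vectorize once on its own to inspect the matrix the classifier will see.
vectorizer = DictVectorizer(sparse=True)
X_sparse = vectorizer.fit_transform(X)

# Bytes a dense array of the same shape would need (rows * columns * item size).
n_rows, n_cols = X_sparse.shape
dense_bytes = n_rows * n_cols * X_sparse.dtype.itemsize
print("shape:", X_sparse.shape, "stored values:", X_sparse.nnz)
print("dense array would need", dense_bytes, "bytes")

# Same pipeline as in the answer; LogisticRegression accepts sparse input.
clf = Pipeline([
    ("vectorizer", DictVectorizer(sparse=True)),
    ("classifier", LogisticRegression()),
])
clf.fit(X, y)
predictions = clf.predict(X)
print(predictions)
```

With string-valued features, each distinct feature=value pair becomes its own column, so the column count grows with the vocabulary while each row stores only a handful of nonzero entries. On a real tagging corpus that is presumably why the dense array was too large to allocate while the sparse matrix was not.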

Regarding "python - MemoryError in Python when using sklearn", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49949224/
