python - Trying to parallelize a parameter search in scikit-learn results in "SystemError: NULL result without error in PyObject_Call"

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:21:50

I am using the sklearn.grid_search.RandomizedSearchCV class from scikit-learn 0.14.1, and I get an error when running the following code:

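import scipy.stats
from sklearn import grid_search, preprocessing, svm
from sklearn.datasets import load_svmlight_file

# inputfile is the path to an svmlight-format data file
X, y = load_svmlight_file(inputfile)

# The sparse matrix is converted to a dense array before min-max scaling
min_max_scaler = preprocessing.MinMaxScaler()
X_scaled = min_max_scaler.fit_transform(X.toarray())

# Non-distribution entries are given as a list of candidate values
parameters = {'kernel': ['rbf'], 'C': scipy.stats.expon(scale=100), 'gamma': scipy.stats.expon(scale=.1)}

svr = svm.SVC()

classifier = grid_search.RandomizedSearchCV(svr, parameters, n_jobs=8)
classifier.fit(X_scaled, y)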

When I set the n_jobs parameter to a value greater than 1, I get the following error output:

Traceback (most recent call last):
  File "./svm_training.py", line 185, in <module>
    main(sys.argv[1:])
  File "./svm_training.py", line 63, in main
    gridsearch(inputfile, kerneltype, parameterfile)
  File "./svm_training.py", line 85, in gridsearch
    classifier.fit(X_scaled, y)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-x86_64.egg/sklearn/grid_search.py", line 860, in fit
    return self._fit(X, y, sampled_params)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-x86_64.egg/sklearn/grid_search.py", line 493, in _fit
    for parameters in parameter_iterable
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-x86_64.egg/sklearn/externals/joblib/parallel.py", line 519, in __call__
    self.retrieve()
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-x86_64.egg/sklearn/externals/joblib/parallel.py", line 419, in retrieve
    self._output.append(job.get())
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
SystemError: NULL result without error in PyObject_Call

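This seems to be related to Python's multiprocessing, but I am not sure how to work around it other than implementing the parallelization of the parameter search by hand. Has anyone run into a similar problem when trying to parallelize a randomized parameter search and managed to solve it?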

Best answer

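It turns out the problem was my use of MinMaxScaler. Because MinMaxScaler only accepts dense arrays, I was converting the sparse representation of the feature vectors to a dense array before scaling. Since the feature vectors have thousands of elements, my assumption is that the dense array caused a memory error when the parameter search was parallelized. I switched to StandardScaler instead, which accepts sparse arrays as input and should be a better fit for my problem space anyway.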
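
The switch described in this answer might look roughly like the sketch below. It is not the poster's actual code and rests on a few assumptions: a recent scikit-learn, where RandomizedSearchCV lives in sklearn.model_selection rather than the sklearn.grid_search module used in 0.14.1; a small synthetic sparse matrix standing in for the result of load_svmlight_file(inputfile); and small n_iter, cv and n_jobs values chosen only to keep the run short. StandardScaler accepts sparse input only when with_mean=False, because centering would turn the matrix dense.

```python
import numpy as np
import scipy.sparse as sp
from scipy.stats import expon
from sklearn.model_selection import RandomizedSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption) for: X, y = load_svmlight_file(inputfile)
rng = np.random.RandomState(0)
X = sp.random(200, 5000, density=0.01, format="csr", random_state=rng)
y = rng.randint(0, 2, size=200)

# with_mean=False skips centering, so the scaled matrix stays sparse.
scaler = StandardScaler(with_mean=False)
X_scaled = scaler.fit_transform(X)

# Rough memory footprint: sparse storage vs. the equivalent dense array.
sparse_bytes = (X_scaled.data.nbytes
                + X_scaled.indices.nbytes
                + X_scaled.indptr.nbytes)
dense_bytes = X_scaled.shape[0] * X_scaled.shape[1] * X_scaled.dtype.itemsize
print("sparse bytes:", sparse_bytes, "dense bytes:", dense_bytes)

# Fixed values go in a list; continuous ones use scipy.stats distributions.
param_distributions = {
    "kernel": ["rbf"],
    "C": expon(scale=100),
    "gamma": expon(scale=0.1),
}

# n_jobs > 1 runs the candidate fits in parallel worker processes.
search = RandomizedSearchCV(SVC(), param_distributions,
                            n_iter=5, cv=3, n_jobs=2, random_state=0)
search.fit(X_scaled, y)
print(search.best_params_)
```

Keeping the data sparse means each worker process receives a much smaller array, which fits the answer's hypothesis that the dense copy was what exhausted memory during the parallel search.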

Regarding python - Trying to parallelize a parameter search in scikit-learn results in "SystemError: NULL result without error in PyObject_Call", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23963542/
