
python - GridSearchCV initialization

Reposted. Author: 行者123. Updated: 2023-11-30 09:09:14

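I want to use GridSearchCV over a range of alpha values (the Laplace smoothing parameter) to find which one gives the best accuracy for a Bernoulli Naive Bayes model.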

import numpy as np

def binarize_pixels(data, threshold=0.784):
    # Initialize a new feature array with the same shape as the original data.
    binarized_data = np.zeros(data.shape)

    # Apply a threshold to each feature.
    for feature in range(data.shape[1]):
        binarized_data[:, feature] = data[:, feature] > threshold
    return binarized_data

binarized_train_data = binarize_pixels(mini_train_data)

def BNB():
    clf = BernoulliNB()
    clf.fit(binarized_train_data, mini_train_labels)
    scoring = clf.score(mini_train_data, mini_train_labels)
    predsNB = clf.predict(dev_data)
    print("Bernoulli binarized model accuracy: {:.4}".format(np.mean(predsNB == dev_labels)))

The model runs fine, but my GridSearch cross-validation does not:

pipeline = Pipeline([('classifier', BNB())])

def P8(alphas):
    gs_clf = GridSearchCV(pipeline, param_grid=alphas, refit=True)
    y_predictions = gs_clf.best_estimator_.predict(dev_data)
    print(classification_report(dev_labels, y_predictions))

alphas = {'alpha': [0.0, 0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]}
P8(alphas)

I get AttributeError: 'GridSearchCV' object has no attribute 'best_estimator_'

Best answer

The problem is in these two lines:

gs_clf = GridSearchCV(pipeline, param_grid = alphas, refit=True)
y_predictions = gs_clf.best_estimator_.predict(dev_data)

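Note that you have to fit the model before you can predict with it, i.e. call gs_clf.fit first. The best_estimator_ attribute only exists after fitting. See the following example from the documentation: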

>>> from sklearn import svm, datasets
>>> from sklearn.model_selection import GridSearchCV
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVC()
>>> clf = GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
...
GridSearchCV(cv=None, error_score=...,
estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
decision_function_shape=None, degree=..., gamma=...,
kernel='rbf', max_iter=-1, probability=False,
random_state=None, shrinking=True, tol=...,
verbose=False),
fit_params={}, iid=..., n_jobs=1,
param_grid=..., pre_dispatch=..., refit=..., return_train_score=...,
scoring=..., verbose=...)
>>> sorted(clf.cv_results_.keys())
...
['mean_fit_time', 'mean_score_time', 'mean_test_score',...
'mean_train_score', 'param_C', 'param_kernel', 'params',...
'rank_test_score', 'split0_test_score',...
'split0_train_score', 'split1_test_score', 'split1_train_score',...
'split2_test_score', 'split2_train_score',...
'std_fit_time', 'std_score_time', 'std_test_score', 'std_train_score'...]
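
Applied to the code in the question, the fix might look like the sketch below. This is only an illustration under some assumptions, not the asker's actual setup: the random arrays stand in for mini_train_data and dev_data, the pipeline is given a BernoulliNB instance directly (the BNB() helper in the question returns None), the threshold is passed through BernoulliNB's binarize parameter, and the grid key is written as classifier__alpha because parameters of a pipeline step are addressed as <step name>__<parameter>. alpha = 0.0 is left out of the grid here, since zero smoothing can trigger warnings depending on the scikit-learn version.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the question's data (assumption):
# 200 samples with 20 "pixel" features in [0, 1].
rng = np.random.RandomState(0)
X = rng.rand(200, 20)
y = (X[:, 0] > 0.5).astype(int)  # label depends on the first feature
X_train, y_train = X[:150], y[:150]
X_dev, y_dev = X[150:], y[150:]

# The pipeline step must be an estimator instance.
# binarize=0.784 applies the question's threshold inside BernoulliNB.
pipeline = Pipeline([("classifier", BernoulliNB(binarize=0.784))])

# Address the step's parameter as "<step name>__<parameter>".
param_grid = {"classifier__alpha": [0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]}

gs_clf = GridSearchCV(pipeline, param_grid=param_grid, refit=True)

# best_estimator_ is only created by fit().
has_before = hasattr(gs_clf, "best_estimator_")
gs_clf.fit(X_train, y_train)
has_after = hasattr(gs_clf, "best_estimator_")

y_predictions = gs_clf.best_estimator_.predict(X_dev)
print(gs_clf.best_params_)
```

With refit=True, gs_clf.predict(X_dev) would give the same predictions, because GridSearchCV delegates predict to the refitted best estimator.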

For "python - GridSearchCV initialization", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44479790/
