
python - Improving Python SKLearn cross-validation output

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:34:25

I use the code below to get the scores for my estimator.

Is there a faster way to cross-validate with several scores?

from sklearn.model_selection import cross_val_score  # formerly sklearn.cross_validation

for clfx, label in zip([clf0], ['Random Forest']):
    # Each call below reruns the full 5-fold cross-validation for one metric
    scores = cross_val_score(clfx, X, y, cv=5, scoring='accuracy')
    print("Accuracy : %0.3f (+/- %0.2f) [%s]" % (scores.mean(), scores.std(), label))
    scores = cross_val_score(clfx, X, y, cv=5, scoring='precision')
    print("Precision: %0.3f (+/- %0.2f) [%s] " % (scores.mean(), scores.std(), label))
    scores = cross_val_score(clfx, X, y, cv=5, scoring='recall')
    print("Recall : %0.3f (+/- %0.2f) [%s] \n" % (scores.mean(), scores.std(), label))

Output:

Accuracy : 0.82 (+/- 0.00) [Random Forest]
Precision: 0.50 (+/- 0.02) [Random Forest]
Recall : 0.13 (+/- 0.01) [Random Forest]

Is this overkill, and should I use the confusion matrix from a single run instead?

Best answer

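Unfortunately, `cross_val_score` takes a single scorer per call, so if you want to combine metrics I think you have to run the cross-validation iterations "by hand":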

import numpy as np
from sklearn.metrics import precision_score, accuracy_score, recall_score
from sklearn.model_selection import KFold  # sklearn.cross_validation was removed in 0.20

# clfx, X, y and label come from the question's code
all_scores = {'precision': [], 'recall': [], 'accuracy': []}
for train, test in KFold(n_splits=5).split(X):
    clfx.fit(X[train], y[train])
    y_pred = clfx.predict(X[test])
    # Metric functions take (y_true, y_pred); append one score per fold
    all_scores['precision'].append(precision_score(y[test], y_pred))
    all_scores['accuracy'].append(accuracy_score(y[test], y_pred))
    all_scores['recall'].append(recall_score(y[test], y_pred))

scores = all_scores['accuracy']
print("Accuracy : %0.3f (+/- %0.2f) [%s]" % (np.mean(scores), np.std(scores), label))
scores = all_scores['precision']
print("Precision: %0.3f (+/- %0.2f) [%s] " % (np.mean(scores), np.std(scores), label))
scores = all_scores['recall']
print("Recall : %0.3f (+/- %0.2f) [%s] \n" % (np.mean(scores), np.std(scores), label))
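
In scikit-learn 0.19 and later, `sklearn.model_selection.cross_validate` accepts several scorers in one call, so the manual loop above may not be needed on a recent install. The following is a minimal sketch, not the original answer's method: the synthetic dataset from `make_classification` and the `RandomForestClassifier` settings are assumptions standing in for the question's `X`, `y` and `clf0`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Assumption: synthetic binary data and a small forest stand in for X, y, clf0.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf0 = RandomForestClassifier(n_estimators=20, random_state=0)

metrics = ["accuracy", "precision", "recall"]

# One cross-validation pass: each fold's model is fitted once
# and then scored with every metric in the list.
results = cross_validate(clf0, X, y, cv=5, scoring=metrics)

for name in metrics:
    scores = results["test_" + name]  # array with one score per fold
    print("%-9s: %0.3f (+/- %0.2f) [%s]"
          % (name.capitalize(), scores.mean(), scores.std(), "Random Forest"))
```

The returned dictionary also holds `fit_time` and `score_time` per fold. Because the estimator is fitted once per fold rather than once per fold and metric, this should cost roughly a third of the three separate `cross_val_score` calls in the question.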

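If you like, you can also parallelize this with multiprocessing (built-in parallelism is one of the main advantages of using scikit-learn's cross-validation functions):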

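from multiprocessing import Pool
from sklearn.metrics import precision_score, accuracy_score, recall_score
from sklearn.model_selection import KFold

def score(cv_split, clfx=clfx, X=X, y=y):
    # Fit on one train/test split and return all three metrics for that fold
    train, test = cv_split
    clfx.fit(X[train], y[train])
    y_pred = clfx.predict(X[test])
    all_scores = {}
    # Metric functions take (y_true, y_pred)
    all_scores['precision'] = precision_score(y[test], y_pred)
    all_scores['accuracy'] = accuracy_score(y[test], y_pred)
    all_scores['recall'] = recall_score(y[test], y_pred)
    return all_scores

# On platforms that spawn workers (Windows, macOS), run this part under `if __name__ == "__main__":`
with Pool(6) as p:
    scores_by_run = p.map(score, KFold(n_splits=5).split(X))
# Turn the list of per-fold dicts into one dict of per-metric lists
all_scores = {k: [d[k] for d in scores_by_run] for k in scores_by_run[0]}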
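
On the question's closing idea of using a confusion matrix: one option is to collect out-of-fold predictions with `cross_val_predict` and derive all three metrics from a single pooled confusion matrix. This is a hedged sketch rather than part of the original answer; the synthetic data and classifier settings are assumptions, and it assumes binary labels 0 and 1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

# Assumption: synthetic binary data and a small forest stand in for X, y, clf0.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf0 = RandomForestClassifier(n_estimators=20, random_state=0)

# Every sample is predicted by the model of the fold in which it was held out.
y_pred = cross_val_predict(clf0, X, y, cv=5)

# For binary labels {0, 1}, ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp) if (tp + fp) else 0.0  # guard against no predicted positives
recall = tp / (tp + fn) if (tp + fn) else 0.0     # guard against no actual positives

print("Accuracy : %0.3f" % accuracy)
print("Precision: %0.3f" % precision)
print("Recall   : %0.3f" % recall)
```

These pooled values are computed over all held-out predictions at once, so they can differ slightly from the mean of the per-fold scores, and they do not give a per-fold standard deviation.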

Regarding "python - Improving Python SKLearn cross-validation output", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37334631/
