
python - sklearn - cross-validation with multiple scores


I want to compute the recall, precision, and F-measure of a cross-validation test for different classifiers. scikit-learn comes with cross_val_score, but unfortunately that method does not return multiple values.

I could compute such measures by calling cross_val_score three times, but that is not efficient. Is there a better solution?
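
For illustration, here is a minimal sketch of the three-call approach (the SVC/iris setup is just a placeholder, not part of my original code): each metric needs its own cross_val_score call, so the classifier is re-fitted once per fold per metric.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Placeholder data and classifier, purely for illustration.
iris = load_iris()
X, y = iris.data, iris.target == 1   # reduce to a binary task for precision/recall
clf = SVC(kernel='linear', C=1)

# One call per metric: the model is trained again for every metric on every fold.
precision = cross_val_score(clf, X, y, cv=5, scoring='precision')
recall = cross_val_score(clf, X, y, cv=5, scoring='recall')
f1 = cross_val_score(clf, X, y, cv=5, scoring='f1')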

For now I have written this function:

import numpy as np
from sklearn import metrics


def mean_scores(X, y, clf, skf):
    # Accumulate the flattened confusion matrix over all folds.
    cm = np.zeros(len(np.unique(y)) ** 2)
    for i, (train, test) in enumerate(skf):
        clf.fit(X[train], y[train])
        y_pred = clf.predict(X[test])
        cm += metrics.confusion_matrix(y[test], y_pred).flatten()

    # Average the counts over the folds and derive the measures.
    return compute_measures(*cm / skf.n_folds)


def compute_measures(tp, fp, fn, tn):
    """Computes effectiveness measures given a confusion matrix."""
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    fmeasure = 2 * (specificity * sensitivity) / (specificity + sensitivity)
    return sensitivity, specificity, fmeasure

It basically sums up the confusion-matrix values over the folds, and once you have the false positives, false negatives, and so on, you can easily compute recall, precision, etc. ... but I still don't like this solution :)
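
To make that last step concrete, here is a small self-contained sketch (the toy labels are made up for illustration) of turning a binary confusion matrix into precision, recall, and F1. Note that scikit-learn flattens a binary confusion matrix in the order tn, fp, fn, tp.

import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels, purely for illustration.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# For binary labels the flattened matrix is ordered (tn, fp, fn, tp).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)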

Best answer

Now in scikit-learn: cross_validate is a new function that can evaluate a model on multiple metrics. This feature is also available in GridSearchCV and RandomizedSearchCV (doc). It was merged recently in master and will be available in v0.19.

From the scikit-learn doc:

The cross_validate function differs from cross_val_score in two ways: 1. It allows specifying multiple metrics for evaluation. 2. It returns a dict containing training scores, fit-times and score-times in addition to the test score.

A typical use case looks like this:

from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate

iris = load_iris()
scoring = ['precision', 'recall', 'f1']
clf = SVC(kernel='linear', C=1, random_state=0)
scores = cross_validate(clf, iris.data, iris.target == 1, cv=5,
                        scoring=scoring, return_train_score=False)
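
The returned scores is a dict mapping each metric (plus fit and score times) to an array with one value per fold; a quick sketch of inspecting it (the key names follow the test_<metric> convention for the scoring list above):

# Each value is an array with one entry per fold.
print(sorted(scores.keys()))
# e.g. ['fit_time', 'score_time', 'test_f1', 'test_precision', 'test_recall']

for metric in scoring:
    print(metric, scores['test_' + metric].mean())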

See also this example.

Regarding "python - sklearn - cross-validation with multiple scores", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23339523/
