
python - Computing yScore for learning algorithms

Reposted. Author: 行者123. Updated: 2023-11-30 09:41:05

I'm fairly new to the Python ML ecosystem, and I need to plot a precision/recall curve as described in this example: https://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html. To do that, you need to compute y_score:

    # Create a simple classifier
    classifier = svm.LinearSVC(random_state=random_state)
    classifier.fit(X_train, y_train)
    y_score = classifier.decision_function(X_test)

So the question is: how do I compute the score using Multinomial Naive Bayes or LearningTree? In my code I have:

    print("MultinomialNB - countVectorizer")

    # countVectorizer is the asker's own helper; it vectorizes db and splits it
    xTrain, xTest, yTrain, yTest = countVectorizer(db)

    classifier = MultinomialNB()
    model = classifier.fit(xTrain, yTrain)
    yPred = model.predict(xTest)

    print("confusion Matrix of MNB/ cVectorizer:\n")
    print(confusion_matrix(yTest, yPred))
    print("\n")
    print("classificationReport Matrix of MNB/ cVectorizer:\n")
    print(classification_report(yTest, yPred))

    # start_time is assumed to have been set earlier with time.time()
    elapsed_time = time.time() - start_time
    print("elapsed Time: %.3fs" % elapsed_time)

The plotting function:

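    import matplotlib.pyplot as plt
    from sklearn.metrics import average_precision_score, precision_recall_curve

    def plotLearningAlgorithm(yTest, yScore, algName):
        # yScore must be 1-D: one score per test sample for the positive class
        precision, recall, _ = precision_recall_curve(yTest, yScore)

        step_kwargs = {'step': 'post'}
        plt.step(recall, precision, color='b', alpha=0.2, where='post')
        plt.fill_between(recall, precision, alpha=0.2, color='b', **step_kwargs)

        plt.xlabel('Recall')
        plt.ylabel('Precision')
        plt.ylim([0.0, 1.05])
        plt.xlim([0.0, 1.0])
        # average_precision was undefined in the original snippet; compute it here
        average_precision = average_precision_score(yTest, yScore)
        plt.title('2-class Precision-Recall ' + algName + ' curve: AP={0:0.2f}'.format(average_precision))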

The error raised when plotting:

<ipython-input-43-d07c3365bfc2> in MultinomialNaiveBayesOPT()
11 yPred = model.predict(xTest)
12
---> 13 plotLearningAlgorithm(yTest,model.predict_proba(xTest),"MultinomialNB - countVectorizer")
14
15 print("confusion Matrix of MNB/ cVectorizer:\n")

<ipython-input-42-260aac9918f2> in plotLearningAlgorithm(yTest, yScore, algName)
1 def plotLearningAlgorithm(yTest,yScore,algName):
2
----> 3 precision, recall, _ = precision_recall_curve(yTest, yScore)
4
5 step_kwargs = ({'step': 'post'}

/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/ranking.py in precision_recall_curve(y_true, probas_pred, pos_label, sample_weight)
522 fps, tps, thresholds = _binary_clf_curve(y_true, probas_pred,
523 pos_label=pos_label,
--> 524 sample_weight=sample_weight)
525
526 precision = tps / (tps + fps)

/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/ranking.py in _binary_clf_curve(y_true, y_score, pos_label, sample_weight)
398 check_consistent_length(y_true, y_score, sample_weight)
399 y_true = column_or_1d(y_true)
--> 400 y_score = column_or_1d(y_score)
401 assert_all_finite(y_true)
402 assert_all_finite(y_score)

/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in column_or_1d(y, warn)
758 return np.ravel(y)
759
--> 760 raise ValueError("bad input shape {0}".format(shape))
761
762

ValueError: bad input shape (9000, 2)

Here db contains the dataset, which is split into training and test sets. Any suggestions?

Solution:

    def plot_pr(y_pred, y_true, l):
        # l is the label treated as the positive class
        precision, recall, thresholds = precision_recall_curve(y_true, y_pred, pos_label=l)
        return precision, recall


    def plotPrecisionRecall(xTest, yTest, yPred, learningName, model):
        yPred_probability = model.predict_proba(xTest)
        # Keep only column 1, i.e. the probability of model.classes_[1]
        yPred_probability = yPred_probability[:, 1]
        # Constant scores give the "no skill" reference curve
        no_skill_probs = [0 for _ in range(len(yTest))]
        ns_precision, ns_recall, _ = precision_recall_curve(yTest, no_skill_probs, pos_label="L")
        precision, rec = plot_pr(yPred_probability, yTest, "L")
        plt.title(learningName)
        plt.plot(ns_recall, ns_precision, linestyle='--', label='No Skill')
        plt.plot(rec, precision, label='Skill')  # keyword is lowercase 'label'
        plt.xlabel("Recall")
        plt.ylabel("Precision")
        plt.legend()
        plt.show()

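It turns out the predict_proba output (shape (9000, 2) in the error above) has to be reduced to a single column before it is passed to precision_recall_curve:

    yPred_probability = yPred_probability[:, 1]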
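
Hard-coding column 1 works only when the positive class is the second entry of model.classes_, because predict_proba orders its columns the same way as classes_. Here is a minimal sketch that looks the column up instead. The toy data and the label "H" are made up for illustration; only "L" comes from the code above.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data: non-negative count features and string labels "H"/"L".
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 6))
y = np.where(X[:, 0] + X[:, 1] > 4, "L", "H")

model = MultinomialNB().fit(X[:150], y[:150])
proba = model.predict_proba(X[150:])  # 2-D: (n_samples, n_classes)

# Find the column of the positive class instead of assuming it is column 1.
pos_label = "L"
pos_col = list(model.classes_).index(pos_label)
y_score = proba[:, pos_col]           # 1-D: (n_samples,)

precision, recall, _ = precision_recall_curve(y[150:], y_score, pos_label=pos_label)
print(proba.shape, y_score.shape)
```

Passing proba itself instead of y_score is what triggers the ValueError shown in the traceback.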

Many thanks to @ignoring_gravity for giving me the right solution. I also plotted the no-skill line to make the chart easier to read.
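
For reference, here is a small sketch of what the constant-score "no skill" curve contains. The labels are hypothetical; only the positive label "L" is taken from the code above. With every sample given the same score, the curve has a single real operating point, whose precision equals the fraction of positive samples.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical labels: 3 positives ("L") out of 8 samples.
y_true = np.array(["L", "H", "H", "L", "H", "H", "H", "L"])
no_skill_probs = np.zeros(len(y_true))  # same score for every sample

ns_precision, ns_recall, _ = precision_recall_curve(
    y_true, no_skill_probs, pos_label="L"
)
positive_fraction = np.mean(y_true == "L")

print(ns_precision, ns_recall, positive_fraction)
```

plt.plot joins that point to the (recall 0, precision 1) endpoint that precision_recall_curve always appends, so the dashed line is sloped. A horizontal line at positive_fraction is another common way to draw the baseline.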

Best answer

What they call y_score is just the predicted probability output by your ML algorithm.

With multinomial NB and decision trees (I assume that's what you mean by LearningTree?), you can get it with the .predict_proba method:

    classifier = MultinomialNB()
    model = classifier.fit(xTrain, yTrain)
    yPred = model.predict_proba(xTest)
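
As a rough end-to-end sketch of the same idea, the snippet below fits both classifier types and keeps only the positive-class column of predict_proba as the score. The synthetic data, the 0/1 labels, and the tree's max_depth are assumptions made for the example, not part of the original question.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: count-like features, binary 0/1 labels (1 = positive).
rng = np.random.default_rng(42)
X = rng.integers(0, 10, size=(400, 8))
y = (X[:, 0] > X[:, 1]).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = {}
for name, clf in [
    ("MultinomialNB", MultinomialNB()),
    ("DecisionTree", DecisionTreeClassifier(max_depth=4, random_state=0)),
]:
    clf.fit(X_train, y_train)
    # predict_proba is 2-D; column 1 is the probability of class 1 here.
    y_score = clf.predict_proba(X_test)[:, 1]
    precision, recall, _ = precision_recall_curve(y_test, y_score)
    results[name] = average_precision_score(y_test, y_score)
    print(f"{name}: AP={results[name]:.2f}")
```

The precision and recall arrays from each iteration can be passed to a plotting function like the ones in the question.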

Regarding python - computing yScore for learning algorithms, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58660239/
