python - Facing ValueError: Target is multiclass but average='binary'

Reposted by: 太空狗 | Updated: 2023-10-30 02:36:48


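I am new to Python and machine learning. For my use case, I am trying to apply the Naive Bayes algorithm to my dataset.

I am able to compute the accuracy, but when I try to compute precision and recall, it throws the following error: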

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

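Can anyone suggest how I should proceed? I tried passing average='micro' to the precision and recall scores. It runs without any error, but it gives the same score for accuracy, precision, and recall.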

My dataset:

train_data.csv:

review,label
Colors & clarity is superb,positive
Sadly the picture is not nearly as clear or bright as my 40 inch Samsung,negative

test_data.csv:

review,label
The picture is clear and beautiful,positive
Picture is not clear,negative

My code:

from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import confusion_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score


def load_data(filename):
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0])

    return reviews, labels

X_train, y_train = load_data('/Users/abc/Sep_10/train_data.csv')
X_test, y_test = load_data('/Users/abc/Sep_10/test_data.csv')

vec = CountVectorizer() 

X_train_transformed =  vec.fit_transform(X_train) 

X_test_transformed = vec.transform(X_test)

clf= MultinomialNB()
clf.fit(X_train_transformed, y_train)

score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :" , score)

y_pred = clf.predict(X_test_transformed)
print(confusion_matrix(y_test,y_pred))

print("Precision Score : ",precision_score(y_test,y_pred,pos_label='positive'))
print("Recall Score :" , recall_score(y_test, y_pred, pos_label='positive') )

Best answer

You need to add the 'average' parameter. According to the documentation:

average : string, [None, ‘binary’ (default), ‘micro’, ‘macro’, ‘samples’, ‘weighted’]

This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

Do this:

print("Precision Score : ", precision_score(y_test, y_pred,
                                            pos_label='positive',
                                            average='micro'))
print("Recall Score : ", recall_score(y_test, y_pred,
                                      pos_label='positive',
                                      average='micro'))

Replace 'micro' with any of the options listed above except 'binary'. Also, in the multiclass setting there is no need to pass 'pos_label', because it is ignored anyway.
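As a rough illustration of the options above, here is a minimal sketch that reproduces the error and then tries the other averaging modes. The three-class labels are made up for this example and are not the asker's data.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical three-class labels, invented purely for illustration.
y_true = ["positive", "negative", "neutral", "positive", "negative", "positive"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "negative"]

# The default average='binary' is rejected when more than two classes appear.
try:
    precision_score(y_true, y_pred)
    binary_error = None
except ValueError as exc:
    binary_error = str(exc)
print("default average:", binary_error)

# Each non-binary averaging mode returns a single number.
scores = {}
for avg in ("micro", "macro", "weighted"):
    scores[avg] = (
        precision_score(y_true, y_pred, average=avg),
        recall_score(y_true, y_pred, average=avg),
    )
    print(f"{avg:>8}: precision={scores[avg][0]:.3f} recall={scores[avg][1]:.3f}")

# average=None returns one score per class, in the order given by `labels`.
class_order = ["negative", "neutral", "positive"]
per_class_precision = precision_score(
    y_true, y_pred, average=None, labels=class_order
)
print(dict(zip(class_order, per_class_precision.round(3))))
```

Which mode fits depends on the goal: 'macro' treats every class equally, 'weighted' weights each class by its number of true samples, and average=None is handy for seeing which class drags the score down.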

Update from the comments:

Yes, they can be equal. This is stated in the user guide here:

Note that for “micro”-averaging in a multiclass setting with all labels included will produce equal precision, recall and F, while “weighted” averaging may produce an F-score that is not between precision and recall.
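The reason is that, when every sample has exactly one label, each wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled totals match. A small sketch of that check, again on made-up labels rather than the asker's data:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical single-label, three-class data, invented for illustration.
y_true = ["positive", "negative", "neutral", "positive", "negative", "positive"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "negative"]

accuracy = accuracy_score(y_true, y_pred)
micro_precision = precision_score(y_true, y_pred, average="micro")
micro_recall = recall_score(y_true, y_pred, average="micro")
micro_f1 = f1_score(y_true, y_pred, average="micro")

# Hand count: every miss is one false positive and one false negative.
tp = sum(t == p for t, p in zip(y_true, y_pred))
fp = fn = len(y_true) - tp
manual_precision = tp / (tp + fp)
manual_recall = tp / (tp + fn)

print(accuracy, micro_precision, micro_recall, micro_f1)
print(manual_precision, manual_recall)
```

So identical accuracy, precision, and recall under average='micro' is expected behaviour, not a sign of a bug. Switching to 'macro' or average=None will usually give values that differ.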

Regarding python - Facing ValueError: Target is multiclass but average='binary', a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52269187/
