
python - Attribute error when designing a Naive Bayes classifier

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:11:14

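I am trying to create a simple Naive Bayes classifier that classifies data into two classes, as shown in the code below. However, I get the following error. Can anyone tell me what I am doing wrong?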

Traceback (most recent call last):
  File "NBC.py", line 33, in <module>
    test(['Apple', 'Banana'])
  File "NBC.py", line 16, in test
    prob_dist = classifier.prob_classify(lst)
  File "/home/***/.local/lib/python3.6/site-packages/nltk/classify/naivebayes.py", line 95, in prob_classify
    for fname in list(featureset.keys()):
AttributeError: 'list' object has no attribute 'keys'

“NBC.py”

from nltk.classify import NaiveBayesClassifier

dataFruits = ['Apple', 'Banana', 'Cherry', 'Grape', 'Guava',
              'Lemon', 'Mangos', 'Orange', 'Strawberry', 'Watermelon']

dataVeggies = ['Potato', 'Spinach', 'Carrot', 'Onion', 'Cabbage',
               'Barccoli', 'Tomatoe', 'Pea', 'Cucumber', 'Eggplant']

def create_features(word):
    my_dict = dict([(word, True)])
    return my_dict

def test(words):
    lst = [create_features(wd) for wd in words]

    prob_dist = classifier.prob_classify(lst)
    print(prob_dist.prob('fruit'))

class1 = [(create_features(item), 'fruit') for item in dataFruits]
#print(class1)

class2 = [(create_features(item), 'veggie') for item in dataVeggies]
#print(class2)

train_set = class1[:] + class2
print(train_set)

# Train
classifier = NaiveBayesClassifier.train(train_set)


# Predict
test(['Apple', 'Banana'])

Best answer

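What your code is trying to do is build a very simple classifier based on name features. Given its name, an item is classified as either 'fruit' or 'veggie'. The training set contains some names together with their respective classes.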

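The error you are getting is caused by the wrong format of the training and test sets. In particular, prob_classify expects a single feature dictionary, but test() passes it a list of dictionaries, which is why the traceback reports that a 'list' object has no attribute 'keys'. The training set is a list of feature sets (one feature set per training example) and should have a structure of the following form: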

training_set = [featureset1, featureset2, ...]

Each feature set is a pair (features, class), where features is a dictionary

{'f1': value1, 'f2': value2, ...}

and class is some value. For example, in your classifier the feature set for 'Apple' is:

({'Apple': True,
'Banana': False,
'Broccoli': False,
'Cabbage': False,
'Carrot': False,
'Cherry': False,
'Cucumber': False,
'Eggplant': False,
'Grape': False,
'Guava': False,
'Lemon': False,
'Mangos': False,
'Onion': False,
'Orange': False,
'Pea': False,
'Potato': False,
'Spinach': False,
'Strawberry': False,
'Tomato': False,
'Watermelon': False},
'fruit')
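The structure described above can be sketched in plain Python, without NLTK. The helper name one_hot_features and the three-word vocabulary are made up for illustration; the point is only to show the (features, class) pair and why a list of dictionaries is not itself a valid feature set.

```python
# Illustrative sketch: one_hot_features and vocabulary are hypothetical names,
# not part of NLTK or of the original code.
def one_hot_features(word, vocabulary):
    """Return a dict with one boolean entry per known name."""
    features = {name: False for name in vocabulary}
    features[word] = True
    return features

vocabulary = ['Apple', 'Banana', 'Carrot']

# One labeled training example: a (features, class) pair.
labeled_example = (one_hot_features('Apple', vocabulary), 'fruit')
features, label = labeled_example

# A single feature set is a dict, so it has .keys() ...
single = features
# ... but a list of feature sets does not, which is what the traceback reports.
several = [one_hot_features(w, vocabulary) for w in ['Apple', 'Banana']]

print(hasattr(single, 'keys'))
print(hasattr(several, 'keys'))
```

A classifier method that takes one feature set should therefore be given single, or one element of several at a time.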

Here is the corrected code:

from nltk.classify import NaiveBayesClassifier, accuracy

dataFruits = ['Apple', 'Banana', 'Cherry', 'Grape', 'Guava',
              'Lemon', 'Mangos', 'Orange', 'Strawberry', 'Watermelon']

dataVeggies = ['Potato', 'Spinach', 'Carrot', 'Onion', 'Cabbage',
               'Broccoli', 'Tomato', 'Pea', 'Cucumber', 'Eggplant']

def create_features(word, featureNames):
    # One boolean feature per known name; only the given word is True
    my_dict = dict([(w, False) for w in featureNames])
    my_dict[word] = True
    return my_dict

def test(word):
    # prob_classify takes a single feature dictionary
    lst = create_features(word, allFeatures)
    prob_dist = classifier.prob_classify(lst)
    print('{}'.format(word))
    print('Fruit probability: {:.2f}\tVeggie probability: {:.2f}'.format(prob_dist.prob('fruit'), prob_dist.prob('veggie')))
    return prob_dist

allFeatures = dataFruits + dataVeggies
class1 = [(create_features(item, allFeatures), 'fruit') for item in dataFruits]

class2 = [(create_features(item, allFeatures), 'veggie') for item in dataVeggies]

train_set = class1[:] + class2
test_set = [(create_features(item, allFeatures), 'fruit') for item in ['Apple', 'Banana']]

# Train
classifier = NaiveBayesClassifier.train(train_set)


# Predict
test('Strawberry')
test('Strawby')

# Accuracy on test set
print('Accuracy on test set: {:.2f}'.format(accuracy(classifier, test_set)))
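If the original intent was to classify several words in one call, as test(['Apple', 'Banana']) in the question suggests, one option is prob_classify_many, which NLTK classifiers provide for a list of feature sets. The following is a minimal sketch on a reduced data set; the names fruits, veggies, vocabulary and one_hot are chosen here for illustration only.

```python
from nltk.classify import NaiveBayesClassifier

# Reduced, illustrative data set (not the full lists from the answer).
fruits = ['Apple', 'Banana', 'Cherry']
veggies = ['Potato', 'Spinach', 'Carrot']
vocabulary = fruits + veggies

def one_hot(word):
    """One boolean feature per known name; only the given word is True."""
    features = {name: False for name in vocabulary}
    features[word] = True
    return features

train_set = ([(one_hot(w), 'fruit') for w in fruits] +
             [(one_hot(w), 'veggie') for w in veggies])
classifier = NaiveBayesClassifier.train(train_set)

# prob_classify_many takes a list of feature sets and returns
# one probability distribution per feature set, in the same order.
words = ['Apple', 'Carrot']
prob_dists = classifier.prob_classify_many([one_hot(w) for w in words])

for word, dist in zip(words, prob_dists):
    print('{}: fruit={:.2f} veggie={:.2f}'.format(
        word, dist.prob('fruit'), dist.prob('veggie')))
```

Calling prob_classify in a loop over the words would give the same distributions; the batch method mainly keeps the calling code shorter.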

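Here is a slightly better classifier, which may be what you had in mind (along the lines of the document classification example in http://www.nltk.org/book/ch06.html). Here the classifier simply predicts whether a basket contains more fruits or more vegetables. Based on this you can build more complex classifiers (with better features and more training data).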

from nltk.classify import NaiveBayesClassifier, accuracy

dataFruits = ['Apple', 'Banana', 'Cherry', 'Grape', 'Guava',
              'Lemon', 'Mangos', 'Orange', 'Strawberry', 'Watermelon']

dataVeggies = ['Potato', 'Spinach', 'Carrot', 'Onion', 'Cabbage',
               'Broccoli', 'Tomato', 'Pea', 'Cucumber', 'Eggplant']


def basket_features(basket):
    # One 'contains(name)' feature per known name
    basket_items = set(basket)
    features = {}
    for item in allFeatures:
        features['contains({})'.format(item)] = (item in basket_items)
    return features

def test(basket):
    lst = basket_features(basket)
    prob_dist = classifier.prob_classify(lst)
    print('Basket: {}'.format(basket))
    print('Fruit probability: {:.2f}\tVeggie probability: {:.2f}'.format(prob_dist.prob('fruit'), prob_dist.prob('veggie')))
    return prob_dist

allFeatures = dataFruits + dataVeggies
class1 = [(basket_features([item]), 'fruit') for item in dataFruits]

class2 = [(basket_features([item]), 'veggie') for item in dataVeggies]

train_set = class1[:] + class2

# Train
classifier = NaiveBayesClassifier.train(train_set)


# Predict
test(['Apple', 'Banana', 'Cherry', 'Carrot', 'Eggplant', 'Cabbage', 'Pea'])
test(['Apple', 'Banana', 'Mangos', 'Carrot', 'Eggplant', 'Cabbage', 'Pea', 'Cucumber'])
test(['Apple', 'Banana'])
test(['Apple', 'Banana', 'Grape'])

classifier.show_most_informative_features(5)
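To get a rough feel for where the printed probabilities come from, the Naive Bayes computation can be reproduced by hand. The sketch below is plain Python and rests on an assumption: that the classifier was trained with NLTK's default estimator, which as far as I know is expected likelihood estimation (add-0.5 smoothing) over the two feature values True and False. The function name posterior_fruit and its gamma parameter are hypothetical, and the result should only approximate what the trained classifier reports.

```python
# Hand-computed Naive Bayes posterior for the basket classifier.
# Assumption: add-gamma smoothing with gamma=0.5 over two feature values.
dataFruits = ['Apple', 'Banana', 'Cherry', 'Grape', 'Guava',
              'Lemon', 'Mangos', 'Orange', 'Strawberry', 'Watermelon']
dataVeggies = ['Potato', 'Spinach', 'Carrot', 'Onion', 'Cabbage',
               'Broccoli', 'Tomato', 'Pea', 'Cucumber', 'Eggplant']

def posterior_fruit(basket, fruits, veggies, gamma=0.5):
    """Return P(fruit | basket) under the stated smoothing assumption."""
    basket = set(basket)
    all_items = fruits + veggies
    scores = {}
    for label, class_items in (('fruit', fruits), ('veggie', veggies)):
        n = len(class_items)  # training examples with this label
        # Smoothed class prior over the two labels
        score = (n + gamma) / (len(all_items) + 2 * gamma)
        for item in all_items:
            # Each training basket holds exactly one item, so 'contains(item)'
            # is True in one example of the item's own class and in none
            # of the other class.
            count_true = 1 if item in class_items else 0
            count = count_true if item in basket else n - count_true
            score *= (count + gamma) / (n + 2 * gamma)
        scores[label] = score
    return scores['fruit'] / (scores['fruit'] + scores['veggie'])

print('{:.2f}'.format(posterior_fruit(['Apple'], dataFruits, dataVeggies)))
print('{:.2f}'.format(posterior_fruit(['Apple', 'Banana'], dataFruits, dataVeggies)))
```

Each additional fruit in the basket multiplies the fruit-to-veggie odds by a fixed factor, so a basket with equal numbers of fruits and vegetables comes out at 0.5 under this model.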

Regarding "python - Attribute error when designing a Naive Bayes classifier", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48479867/
