
python - Text analysis: unable to write Python program output to a csv or xls file

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:17:13

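Hi, I am trying to do sentiment analysis with a Naive Bayes classifier in Python 2.x. It reads the sentiments from a txt file and then gives a positive or negative output based on the sentiments in the sample txt file. I want the output in the same form as the input: for example, I have a text file with 1000 raw sentiments, and I want the output to show positive or negative for each one. Please help. Below is the code I am using.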

import math
import string

def Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string):
    y_values = [0,1]
    prob_values = [None, None]

    for y_value in y_values:
        posterior_prob = 1.0

        for word in test_string.split():
            word = word.lower().translate(None,string.punctuation).strip()
            if y_value == 0:
                if word not in negative:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= negative[word]
            else:
                if word not in positive:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= positive[word]

        if y_value == 0:
            prob_values[y_value] = posterior_prob * float(total_negative) / (total_negative + total_positive)
        else:
            prob_values[y_value] = posterior_prob * float(total_positive) / (total_negative + total_positive)

    total_prob_values = 0
    for i in prob_values:
        total_prob_values += i

    for i in range(0,len(prob_values)):
        prob_values[i] = float(prob_values[i]) / total_prob_values

    print prob_values

    if prob_values[0] > prob_values[1]:
        return 0
    else:
        return 1


if __name__ == '__main__':
    sentiment = open(r'C:/Users/documents/sample.txt')

    #Preprocessing of training set
    vocabulary = {}
    positive = {}
    negative = {}
    training_set = []
    TOTAL_WORDS = 0
    total_negative = 0
    total_positive = 0

    for line in sentiment:
        words = line.split()
        y = words[-1].strip()
        y = int(y)

        if y == 0:
            total_negative += 1
        else:
            total_positive += 1

        for word in words:
            word = word.lower().translate(None,string.punctuation).strip()
            if word not in vocabulary and word.isdigit() is False:
                vocabulary[word] = 1
                TOTAL_WORDS += 1
            elif word in vocabulary:
                vocabulary[word] += 1
                TOTAL_WORDS += 1

            #Training
            if y == 0:
                if word not in negative:
                    negative[word] = 1
                else:
                    negative[word] += 1
            else:
                if word not in positive:
                    positive[word] = 1
                else:
                    positive[word] += 1

    for word in vocabulary.keys():
        vocabulary[word] = float(vocabulary[word])/TOTAL_WORDS

    for word in positive.keys():
        positive[word] = float(positive[word])/total_positive

    for word in negative.keys():
        negative[word] = float(negative[word])/total_negative

    test_string = raw_input("Enter the review: \n")

    classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string)
    if classifier == 0:
        print "Negative review"
    else:
        print "Positive review"

Best answer

I checked the GitHub repository you posted in the comments. I tried to run the project but got some errors.

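Anyway, I went through the project structure and the files used to train the Naive Bayes algorithm, and I think the following code can be used to write the result data to an Excel file (i.e. .xls):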

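import xlsxwriter

# Classify every line of the test file and collect (sentence, label) pairs.
data_to_be_written = []
with open("test11.txt") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
        # The classifier returns 0 for a negative review and 1 for a positive one.
        result = 'Negative' if classifier == 0 else 'Positive'
        data_to_be_written.append((line, result))

# Create a workbook and add a worksheet.
# XlsxWriter writes the .xlsx format, so the .xlsx extension is used here
# instead of .xls (Excel warns when the extension and the format differ).
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for sentence, label in data_to_be_written:
    worksheet.write(row, col, sentence)
    worksheet.write(row, col + 1, label)
    row += 1

workbook.close()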

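For each line of the file containing the sentences to test, I call the classifier and build a structure holding the results to be written out.
I then loop over that structure and write it to the Excel file.
For this I used a Python site package called xlsxwriter.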

As I told you before, I had some problems running the project, so this code has not been tested either. It should work, but let me know if you run into trouble.
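Since the question also mentions csv output, the same per-line loop could target a csv file using only the standard library. The following is a minimal sketch, not part of the original project. It assumes Python 3 (the question uses Python 2.x, where the file would be opened in 'wb' mode without the newline and encoding arguments). The names write_predictions_csv and toy_classifier are hypothetical; toy_classifier only stands in for the trained classifier so the sketch can run on its own.

```python
import csv


def write_predictions_csv(lines, classify, out_path):
    """Classify each non-empty line and write (sentence, label) rows to out_path.

    `classify` is any callable that returns 0 (negative) or 1 (positive),
    matching the convention of Naive_Bayes_Classifier in the question.
    Returns the number of data rows written (the header is not counted).
    """
    rows_written = 0
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(["sentence", "sentiment"])
        for line in lines:
            sentence = line.strip()
            if not sentence:
                continue  # skip blank lines in the input
            label = "Negative" if classify(sentence) == 0 else "Positive"
            writer.writerow([sentence, label])
            rows_written += 1
    return rows_written


def toy_classifier(sentence):
    # Hypothetical stand-in for the trained classifier, for illustration only.
    return 1 if "good" in sentence.lower() else 0
```

With the question's script, lines could be the opened test file and classify could be a small wrapper such as lambda s: Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, s). The csv writer quotes sentences that contain commas, so each review stays in a single column.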

Regards

Regarding python - Text analysis: unable to write Python program output to a csv or xls file, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43779723/
