python-2.7 - sklearn: load training data by reading multiple files in a directory

Reposted. Author: 行者123. Updated: 2023-12-04 18:35:18
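I can load the data from a single file without any problem. However, whenever I try to load it from multiple files in a directory, I get the following error: AttributeError: 'NoneType' object has no attribute 'lower'. Please see my code below; any help would be appreciated. Thanks.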

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from nltk.corpus import stopwords
import numpy as np
import numpy.linalg as LA

import os
path = "C:\zircon"

def radfil():
    for file in os.listdir(path):
        current = os.path.join(path, file)
        if os.path.isfile(current):
            data = open(current, "rb").read()
            print data

train_set = [radfil()]
test_set = ["The sun in the sky is bright."]
stopWords = stopwords.words('english')

vectorizer = CountVectorizer(stop_words=stopWords, min_df=1)
#print vectorizer
transformer = TfidfTransformer()
#print transformer

trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()
testVectorizerArray = vectorizer.transform(test_set).toarray()
print 'Fit Vectorizer to train set', trainVectorizerArray
print 'Transform Vectorizer to test set', testVectorizerArray

Best answer

My guess is that the error comes from calling lower() on a value that is None. It most likely happens at this line:

trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()

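radfil() only prints what it reads and has no return statement, so it returns None and train_set ends up as [None]. Try collecting the data from the files and adding a return statement to radfil().
That is all I can offer without the full stack trace.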
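
As a minimal sketch of that suggestion, the helper below collects the contents of every file into a list and returns it. The name read_documents is hypothetical (it is not from the question), and the code assumes the files are UTF-8 text; both are illustrative choices, not the asker's actual setup.

```python
import io
import os


def read_documents(path):
    """Return a list holding the text of every regular file directly inside `path`."""
    documents = []
    # sorted() keeps the document order deterministic; os.listdir() order is arbitrary.
    for name in sorted(os.listdir(path)):
        current = os.path.join(path, name)
        if os.path.isfile(current):  # skip subdirectories
            # Assumption: the files are UTF-8 text; change `encoding` if they are not.
            with io.open(current, encoding="utf-8") as handle:
                documents.append(handle.read())
    return documents
```

With such a helper, train_set = read_documents(path) would replace train_set = [radfil()], so that vectorizer.fit_transform(train_set) receives one string per file instead of a single None.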

Regarding "python-2.7 - sklearn: load training data by reading multiple files in a directory", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/17425842/
