
python - Pickling a model with a vectorizer

Repost. Author: 行者123. Updated: 2023-11-30 09:47:59

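I am pickling a model for later use. I then load the model and run predict_proba on it. I get ValueError: X has 1 features per sample; expecting 319. I am not sure whether I am transforming the input correctly.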

import csv, pickle
from sklearn import svm

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.calibration import CalibratedClassifierCV
import numpy as np
import operator

train_data = []
train_labels = []
test_lables = []
test_lables.append("nah")

with open('training_file', 'r') as f:
    reader = csv.reader(f, dialect='excel', delimiter='\t')
    for row in reader:
        train_data.append(row[0])
        train_labels.append(row[1])

lables = []

for item in train_labels:
    if item in lables:
        continue
    else:
        lables.append(item)


def linear_svc(train_data, train_labels):

    vectorizer = TfidfVectorizer()
    train_vectors = vectorizer.fit_transform(train_data)
    classifier_linear = svm.LinearSVC()
    clf = CalibratedClassifierCV(classifier_linear)
    clf.fit(train_vectors, train_labels)

    with open('test', 'wb') as fi:
        pickle.dump(clf, fi)


def run_classifier():
    vectorizer = TfidfVectorizer()
    test_vectors = vectorizer.fit_transform(test_lables)
    with open('test', 'rb') as fi:
        clf = pickle.load(fi)
    prediction_linear = clf.predict_proba(test_vectors)
    return prediction_linear


#linear_svc(train_data, train_labels)
sorted_intent_probability = run_classifier()
print(sorted_intent_probability)

I call linear_svc() first, and the model is pickled. Then I call run_classifier(). What am I doing wrong here? Also, when I combine the two methods, it works fine:

def linear_svc(train_data, train_labels, test_lables):

    vectorizer = TfidfVectorizer()
    train_vectors = vectorizer.fit_transform(train_data)
    test_vectors = vectorizer.transform(test_lables)
    classifier_linear = svm.LinearSVC()
    clf = CalibratedClassifierCV(classifier_linear)

    clf.fit(train_vectors, train_labels)
    prediction_linear = clf.predict_proba(test_vectors)
    return prediction_linear

Do I also need to pickle the vectorizer and reuse it later?

Best answer

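I found the problem. When I created a new instance of TfidfVectorizer(), I was not using the same features that were used for training. I made the following change: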

linear_svc_model = clf.fit(train_vectors, train_labels)
model_object = []
model_object.append(linear_svc_model)
model_object.append(vectorizer)

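and pickled this model_object. Later, I unpickled both the classifier and the vectorizer and applied that same fitted vectorizer to the input string. It worked.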
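For readers who want to see the whole round trip, here is a minimal sketch of the approach described above. It is not the poster's exact code. The toy sentences, the labels, the function names train_and_save and load_and_predict, and the cv=3 setting are assumptions made so the example runs on a tiny dataset. The key point is that the loaded vectorizer is applied with transform(), not fit_transform(), so the test text is mapped onto the vocabulary learned during training.

```python
import pickle

from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for the rows of 'training_file'.
train_data = [
    "book a flight to paris", "reserve a plane ticket", "find me a flight",
    "fly me to london", "cheap flights tomorrow", "i need a plane ticket",
    "play some jazz", "put on a rock song", "play my favourite music",
    "start the playlist", "play a song by queen", "turn on some music",
]
train_labels = ["flight"] * 6 + ["music"] * 6


def train_and_save(path):
    """Fit the vectorizer and classifier, then pickle both together."""
    vectorizer = TfidfVectorizer()
    train_vectors = vectorizer.fit_transform(train_data)
    # cv=3 is an assumption chosen because the toy dataset is very small.
    clf = CalibratedClassifierCV(LinearSVC(), cv=3)
    clf.fit(train_vectors, train_labels)
    # Same layout as model_object above: [classifier, vectorizer].
    with open(path, "wb") as fh:
        pickle.dump([clf, vectorizer], fh)


def load_and_predict(path, texts):
    """Load both objects and reuse the fitted vectorizer on new text."""
    with open(path, "rb") as fh:
        clf, vectorizer = pickle.load(fh)
    # transform() keeps the training vocabulary, so the number of columns
    # matches what the classifier was fitted on.
    vectors = vectorizer.transform(texts)
    return clf.predict_proba(vectors)
```

Calling train_and_save('test') once and then load_and_predict('test', ["nah"]) should return one row of class probabilities instead of raising the feature-count ValueError. As with any pickle file, only load files you created yourself.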

Regarding python - Pickling a model with a vectorizer, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/49838863/
