python-3.x - Scikit Learn: Incorporate Naive Bayes Model Predictions into Logistic Regression?

Reposted. Author: 行者123. Updated: 2023-11-30 08:46:10
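I have data on various customer attributes (self-description and age), along with a binary outcome of whether each customer would buy a particular product: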

    {"would_buy": "No",
     "self_description": "I'm a college student studying biology",
     "Age": 19},

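I want to use MultinomialNB on self_description to predict would_buy, and then incorporate those predictions into a logistic regression model on would_buy that also takes Age as a covariate.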

Code for the text model so far (I'm new to scikit-learn!), with a simplified dataset:

from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Customer data that includes whether a customer would buy an item (what I'm interested in), their self-description, and their age.
data = [
{"would_buy": "No", "self_description": "I'm a college student studying biology", "Age": 19},
{"would_buy": "Yes", "self_description": "I'm a blue-collar worker", "Age": 20},
{"would_buy": "No", "self_description": "I'm a Stack Overflow denzien", "Age": 56},
{"would_buy": "No", "self_description": "I'm a college student studying economics", "Age": 20},
{"would_buy": "Yes", "self_description": "I'm a UPS worker", "Age": 35},
{"would_buy": "No", "self_description": "I'm a Stack Overflow denzien", "Age": 56}
]

def naive_bayes_model(customer_data):
    self_descriptions = [customer['self_description'] for customer in customer_data]
    decisions = [customer['would_buy'] for customer in customer_data]

    vectorizer = TfidfVectorizer(stop_words='english', ngram_range=(1,2))
    X = vectorizer.fit_transform(self_descriptions, decisions)
    naive_bayes = MultinomialNB(alpha=0.01)
    naive_bayes.fit(X, decisions)
    train(naive_bayes, X, decisions)

def train(classifier, X, y):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=22)
    classifier.fit(X_train, y_train)

    # classification_report expects the true labels first, then the predictions.
    print(classification_report(y_test, classifier.predict(X_test)))


def main():
    naive_bayes_model(data)



main()

Best answer

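The short answer is to use the predict_proba or predict_log_proba method on your trained naive_bayes to create inputs for the logistic regression model. These can be concatenated with the Age values to create the training and test sets for your LogisticRegression model.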
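To make the shapes concrete, here is a minimal sketch of what those two methods return and how the result can sit next to Age. The toy `texts`, `labels`, and `ages` below are made up for illustration and are not the question's data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data, used only to illustrate the shapes involved.
texts = ["college student studying biology", "blue-collar worker",
         "college student studying economics", "UPS worker"]
labels = ["No", "Yes", "No", "Yes"]
ages = np.array([19, 20, 20, 35])

vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(texts)
nb = MultinomialNB().fit(X_text, labels)

proba = nb.predict_proba(X_text)          # shape (n_samples, n_classes)
log_proba = nb.predict_log_proba(X_text)  # same shape, natural log of proba

# Columns follow nb.classes_ (sorted labels) and each row of proba sums to 1.
# Stacking the probabilities next to Age gives the logistic-regression input.
lr_features = np.hstack((proba, ages.reshape(-1, 1)))
```

Because the two probability columns sum to 1, one of them carries no extra information; keeping only a single column is a reasonable variation.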

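However, I do want to point out that the code as you have written it does not give you access to the naive_bayes model after training, because the model only exists inside naive_bayes_model and is never returned. So you will definitely need to restructure your code.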
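One possible restructuring, as a sketch only: have the function return the fitted vectorizer and classifier so later steps can reuse them. The name `fit_naive_bayes` and the two-record `sample` list are illustrative choices, not part of the original code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def fit_naive_bayes(customer_data):
    """Fit the text model and return both fitted objects to the caller."""
    self_descriptions = [c["self_description"] for c in customer_data]
    decisions = [c["would_buy"] for c in customer_data]

    vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
    X = vectorizer.fit_transform(self_descriptions)
    naive_bayes = MultinomialNB(alpha=0.01).fit(X, decisions)
    return vectorizer, naive_bayes


# Two records taken from the question's data, to keep the sketch self-contained.
sample = [
    {"would_buy": "No", "self_description": "I'm a college student studying biology", "Age": 19},
    {"would_buy": "Yes", "self_description": "I'm a blue-collar worker", "Age": 20},
]
vectorizer, naive_bayes = fit_naive_bayes(sample)

# The fitted objects are now available for later steps such as predict_proba.
new_text = vectorizer.transform(["I'm a UPS worker"])
probabilities = naive_bayes.predict_proba(new_text)
```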

That issue aside, this is how I would incorporate the output of naive_bayes into LogisticRegression:

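import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# `data` is the list of customer records defined in the question.
descriptions = np.array([customer['self_description'] for customer in data])
decisions = np.array([customer['would_buy'] for customer in data])
ages = np.array([customer['Age'] for customer in data])

# Vectorize the text, then split text features, ages, and labels together.
vectorizer = TfidfVectorizer(stop_words='english', ngram_range=(1,2))
desc_vec = vectorizer.fit_transform(descriptions, decisions)
naive_bayes = MultinomialNB(alpha=0.01)
desc_train, desc_test, age_train, age_test, dec_train, dec_test = train_test_split(desc_vec, ages, decisions, test_size=0.25, random_state=22)

# Fit Naive Bayes on the text, then use its class probabilities as features.
naive_bayes.fit(desc_train, dec_train)
nb_train_preds = naive_bayes.predict_proba(desc_train)
lr = LogisticRegression()
lr_X_train = np.hstack((nb_train_preds, age_train.reshape(-1, 1)))
lr.fit(lr_X_train, dec_train)

# Build the test features the same way and compute the accuracy on them.
lr_X_test = np.hstack((naive_bayes.predict_proba(desc_test), age_test.reshape(-1, 1)))
lr.score(lr_X_test, dec_test)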
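Once both models are fitted, scoring a new customer follows the same feature recipe: Naive Bayes probabilities first, then Age. The sketch below assumes the question's six records, skips the train/test split for brevity, and uses a hypothetical helper named `predict_would_buy`; the new description and age are invented inputs.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

# Records from the question; no train/test split here, to keep the sketch short.
data = [
    {"would_buy": "No", "self_description": "I'm a college student studying biology", "Age": 19},
    {"would_buy": "Yes", "self_description": "I'm a blue-collar worker", "Age": 20},
    {"would_buy": "No", "self_description": "I'm a Stack Overflow denzien", "Age": 56},
    {"would_buy": "No", "self_description": "I'm a college student studying economics", "Age": 20},
    {"would_buy": "Yes", "self_description": "I'm a UPS worker", "Age": 35},
    {"would_buy": "No", "self_description": "I'm a Stack Overflow denzien", "Age": 56},
]
descriptions = [c["self_description"] for c in data]
decisions = np.array([c["would_buy"] for c in data])
ages = np.array([c["Age"] for c in data])

# Stage 1: text model.
vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
desc_vec = vectorizer.fit_transform(descriptions)
naive_bayes = MultinomialNB(alpha=0.01).fit(desc_vec, decisions)

# Stage 2: logistic regression on [P(No), P(Yes), Age].
lr_X = np.hstack((naive_bayes.predict_proba(desc_vec), ages.reshape(-1, 1)))
lr = LogisticRegression().fit(lr_X, decisions)


def predict_would_buy(description, age):
    """Hypothetical helper: rebuild the stacked features for one customer."""
    nb_probs = naive_bayes.predict_proba(vectorizer.transform([description]))
    features = np.hstack((nb_probs, np.array([[age]])))
    return lr.predict(features)[0]


prediction = predict_would_buy("I'm a warehouse worker", 30)
```

The important detail is that the new text goes through `vectorizer.transform`, not `fit_transform`, so it is encoded with the vocabulary learned during training.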

Regarding "python-3.x - Scikit Learn: Incorporate Naive Bayes Model Predictions into Logistic Regression?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47873165/
