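I'm doing sentiment analysis with Keras to predict whether movie reviews are positive or negative. What I want is the raw data that my model predicted incorrectly. I can only get accuracy and loss from the model, but I'd like to get the subset of review texts that it got wrong. How can I do that?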
import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.layers import Dense
import keras
import numpy as np
import gc
from sklearn.model_selection import train_test_split
dataset=pd.read_csv('balanced_dataset.csv')
tk=Tokenizer(num_words=2000)
tk.fit_on_texts(dataset.review)
x=tk.texts_to_matrix(dataset.review)
y=dataset.label
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3,random_state=40)
model=keras.models.Sequential()
model.add(Dense(8,input_dim=2000))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['acc'])
del tk
gc.collect()
result=model.fit(x_train,y_train,batch_size=128,epochs=20,validation_split=0.1)
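Simply put, use model.predict(). It returns sigmoid probabilities rather than class labels, so threshold them (0.5 here, assuming label is 0/1) before comparing with y_test. Since x_test holds the tokenized matrix and not the text, use the index that y_test kept from dataset to look up the original reviews:
pred = model.predict(x_test)
pred_labels = (pred.ravel() > 0.5).astype(int)  # probabilities -> 0/1 labels
wrong = pred_labels != y_test.to_numpy()        # boolean mask of misclassified rows
subset_of_wrongly_predicted = dataset.review.loc[y_test.index[wrong]]  # raw review text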
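To try the idea without training a model, here is a minimal sketch on made-up data. The reviews, labels, and probabilities are invented for illustration, and the helper name misclassified is hypothetical. It assumes 0/1 labels, a 0.5 threshold, and predictions shaped (n, 1) as a Keras sigmoid output layer would return them.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the test split; all values below are made up.
reviews = pd.Series(
    ["great film", "terrible plot", "loved it", "boring"],
    index=[7, 2, 9, 4],  # shuffled index, as it would be after train_test_split
)
y_test = pd.Series([1, 0, 1, 0], index=reviews.index)

# Fake probabilities with shape (n, 1), mimicking model.predict(x_test)
pred = np.array([[0.91], [0.62], [0.35], [0.08]])


def misclassified(texts, y_true, probs, threshold=0.5):
    """Return the rows whose thresholded prediction differs from the true label."""
    flat = np.asarray(probs).ravel()
    pred_labels = (flat > threshold).astype(int)
    wrong = pred_labels != y_true.to_numpy()  # positional boolean mask
    return pd.DataFrame({
        "review": texts.loc[y_true.index[wrong]],  # look up text by original index
        "label": y_true[wrong],
        "predicted": pred_labels[wrong],
        "probability": flat[wrong],
    })


errors = misclassified(reviews, y_test, pred)
print(errors)
```

Comparing through to_numpy() keeps the comparison positional, while y_test.index carries the original row labels needed to find the text. Keeping the probability column also makes it easy to sort the errors by how confident the model was.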