
TensorFlow LSTM not learning for univariate time series binary classification on an imbalanced dataset

Reposted. Author: bug小助手. Updated: 2023-10-28 10:45:11



I have unbalanced univariate time series data: 7898 instances of class 0 and 1371 instances of class 1.
I'm using the LSTM model below, plotting the ROC curve and picking an optimal threshold for evaluation. But the model performs worse than, or only slightly better than, a random guess. It isn't learning.
Dataset: https://www.kaggle.com/datasets/avinemmatty/theft-data



I've applied normalization (0-1) and outlier removal on the dataset.
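
The question does not say how the 0-1 normalization was fitted. A minimal sketch of one common approach, assuming the scaling range is taken from the training split alone and then reused for validation and test so that no information leaks across splits. `fit_min_max` and `apply_min_max` are hypothetical helpers, not part of the original code, and the arrays are made-up numbers.

```python
import numpy as np

def fit_min_max(train):
    """Return the (min, max) of the training data only."""
    train = np.asarray(train, dtype=float)
    return float(train.min()), float(train.max())

def apply_min_max(x, lo, hi):
    """Scale x into [0, 1] using a previously fitted range; clip values outside it."""
    span = hi - lo if hi > lo else 1.0  # guard against a constant training set
    return np.clip((np.asarray(x, dtype=float) - lo) / span, 0.0, 1.0)

# Toy data for illustration only.
train = np.array([[0.0, 10.0], [5.0, 20.0]])
test = np.array([[30.0, 10.0]])

lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)  # 30.0 lies outside the train range and is clipped to 1.0
```

Whether a global range or a per-series range suits this dataset better is a judgment call that the question leaves open.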



Here is some of the code I'm using, along with the data preprocessing steps.
I apply SMOTE oversampling to the training data, so after SMOTE the class counts for train, test and val are as follows:



(CHK_STATE
 0.0    5528
 1.0    5528
 Name: count, dtype: int64,
 CHK_STATE
 0.0    1185
 1.0     205
 Name: count, dtype: int64,
 CHK_STATE
 0.0    1185
 1.0     206
 Name: count, dtype: int64)

Here is the code for plotting the ROC curve and getting the optimal threshold:



from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
import numpy as np

def plot_roc_curve(y_true, y_pred_prob):
    # Compute ROC curve and ROC area for each class
    fpr, tpr, thresholds = roc_curve(y_true, y_pred_prob)
    roc_auc = auc(fpr, tpr)

    # Find optimal threshold
    optimal_idx = np.argmax(tpr - fpr)
    optimal_threshold = thresholds[optimal_idx]
    print("Optimal Threshold:", optimal_threshold)

    # Plot the ROC curve
    plt.figure(figsize=(10,7))
    plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.scatter(fpr[optimal_idx], tpr[optimal_idx], marker='o', color='red', label='Optimal Threshold')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic (ROC) Curve')
    plt.legend(loc="lower right")
    plt.show()

    return optimal_threshold
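
The `np.argmax(tpr - fpr)` rule used here is Youden's J statistic. In the training code the function is called on the test set, so the threshold is tuned on the same data it is evaluated on, which tends to flatter the reported metrics. A minimal NumPy-only sketch, assuming the threshold is instead chosen on the validation split and then applied unchanged to the test scores. `youden_threshold` is a hypothetical helper and the scores are made-up numbers. It compares with `>=`, the convention `roc_curve` uses for its thresholds, whereas the training code compares with `>`.

```python
import numpy as np

def youden_threshold(y_true, scores):
    """Return the score threshold that maximizes TPR - FPR (Youden's J)."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()  # flattens a (n, 1) prediction array
    n_pos, n_neg = y_true.sum(), (~y_true).sum()
    best_j, best_t = -np.inf, None
    for t in np.unique(scores):  # every observed score is a candidate threshold
        pred = scores >= t
        tpr = (pred & y_true).sum() / n_pos
        fpr = (pred & ~y_true).sum() / n_neg
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return float(best_t)

# Toy validation and test scores, for illustration only.
y_val = [0, 0, 1, 1]
val_scores = [0.1, 0.4, 0.35, 0.8]
threshold = youden_threshold(y_val, val_scores)

test_scores = np.array([0.2, 0.5, 0.9])
test_pred = (test_scores >= threshold).astype("int32")
```

The loop is quadratic in the number of samples, which is fine for a validation set of about 1400 rows; `roc_curve` does the same job faster on larger data.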

Code for LSTM training:



def LSTM(X_train, X_test, X_val, y_train, y_test, y_val, class_weights):
    print('LSTM with 1D Convolution:')

    model = Sequential()

    # Adding 1D Convolutional layers
    model.add(Conv1D(filters=64, kernel_size=7, activation='relu', input_shape=(X_train.shape[1], 1)))
    model.add(MaxPooling1D(pool_size=2))

    # model.add(Conv1D(filters=128, kernel_size=7, activation='relu'))
    # model.add(MaxPooling1D(pool_size=2))

    # LSTM layers
    model.add(CuDNNLSTM(64, return_sequences=True))
    model.add(CuDNNLSTM(64))

    # Fully connected layers
    model.add(Dropout(0.2))
    model.add(Dense(64, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dense(1, activation='sigmoid'))

    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
    lr = 0.0001
    optimizer = optimizers.Adam(learning_rate=lr)
    print("Learning Rate:", lr)
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])

    X_train = X_train.to_numpy().reshape((X_train.shape[0], X_train.shape[1], 1))
    X_test = X_test.to_numpy().reshape((X_test.shape[0], X_test.shape[1], 1))
    X_val = X_val.to_numpy().reshape((X_val.shape[0], X_val.shape[1], 1))

    print("X_train Shape:", X_train.shape)
    print("X_test Shape:", X_test.shape)
    print("X_val Shape:", X_val.shape)
    print("y_train Shape:", y_train.shape)
    print("y_test Shape:", y_test.shape)
    print("y_val Shape:", y_val.shape)

    history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=200, shuffle=False, batch_size=256, callbacks=[early_stopping])
    plt.figure(figsize=(10,6))
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Validation'], loc='upper right')
    plt.show()

    y_pred_prob = model.predict(X_test)
    # Plot ROC curve for train, val also
    optimal_threshold = plot_roc_curve(y_test, y_pred_prob)
    prediction = (y_pred_prob > optimal_threshold).astype("int32")
    f1 = results(y_test, prediction)

    return prediction
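
Note that the function accepts `class_weights` but never passes it to `model.fit`. A minimal sketch of how balanced weights could be computed and supplied, assuming the pre-SMOTE training labels are used: 5528 of class 0 and, by subtraction from the counts in the question, 960 of class 1. `balanced_class_weights` is a hypothetical helper that follows the usual n_samples / (n_classes * class_count) rule. Class weights are sketched here as an alternative to SMOTE; applying both would correct for the imbalance twice.

```python
import numpy as np

def balanced_class_weights(y):
    """Map each class label to n_samples / (n_classes * class_count)."""
    y = np.asarray(y).astype(int).ravel()
    counts = np.bincount(y)  # assumes labels 0..k-1, each present at least once
    return {cls: float(len(y) / (len(counts) * n)) for cls, n in enumerate(counts)}

# Pre-SMOTE training labels reconstructed from the counts in the question
# (960 = 1371 - 205 - 206 is derived, not stated directly).
y_train_raw = np.concatenate([np.zeros(5528), np.ones(960)])
class_weight = balanced_class_weights(y_train_raw)

# With Keras this dictionary would be passed as:
#   model.fit(X_train, y_train, class_weight=class_weight, ...)
```

With these counts the minority class is weighted roughly 5.8 times more heavily than the majority class.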

Training output:



LSTM with 1D Convolution: Using 0.5 as threshold... 
Learning Rate: 0.0001
X_train Shape: (11056, 365, 1)
X_test Shape: (1390, 365, 1)
X_val Shape: (1391, 365, 1)
y_train Shape: (11056,)
y_test Shape: (1390,)
y_val Shape: (1391,)
Epoch 1/200 44/44 [==============================] - 6s 45ms/step - loss: 0.8197 - accuracy: 0.4955 - val_loss: 0.6875 - val_accuracy: 0.8519
Epoch 2/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7627 - accuracy: 0.4974 - val_loss: 0.6878 - val_accuracy: 0.8483
Epoch 3/200 44/44 [==============================] - 1s 32ms/step - loss: 0.7386 - accuracy: 0.5005 - val_loss: 0.6862 - val_accuracy: 0.8354
Epoch 4/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7217 - accuracy: 0.5078 - val_loss: 0.6870 - val_accuracy: 0.8224
Epoch 5/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7156 - accuracy: 0.5078 - val_loss: 0.6869 - val_accuracy: 0.8030
Epoch 6/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7125 - accuracy: 0.5021 - val_loss: 0.6882 - val_accuracy: 0.7843
Epoch 7/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7093 - accuracy: 0.4999 - val_loss: 0.6899 - val_accuracy: 0.7132
Epoch 8/200 44/44 [==============================] - 1s 28ms/step - loss: 0.7062 - accuracy: 0.5000 - val_loss: 0.6969 - val_accuracy: 0.3918
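
Two numbers in this log are worth decoding. A validation loss near 0.69 is ln 2, the binary cross-entropy of a model that outputs 0.5 for every sample, and the first-epoch validation accuracy of 0.8519 equals the share of class 0 in the validation set (1185 of 1391), which is what predicting class 0 for everything would score. This is consistent with outputs hovering just around 0.5, though it is a reading of the log rather than a confirmed diagnosis. A short check of the arithmetic, using only counts given in the question (`constant_prediction_bce` is a hypothetical helper):

```python
import math

def constant_prediction_bce(p, positive_rate):
    """Binary cross-entropy when every sample is assigned probability p of class 1."""
    return -(positive_rate * math.log(p) + (1 - positive_rate) * math.log(1 - p))

val_positive_rate = 206 / 1391  # class-1 share of the validation split
loss_at_half = constant_prediction_bce(0.5, val_positive_rate)  # ln 2, whatever the class mix
majority_accuracy = 1185 / 1391  # accuracy of always predicting class 0

print(round(loss_at_half, 4), round(majority_accuracy, 4))  # prints 0.6931 0.8519
```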



Clearly there is an issue in the learning process.
Can anyone suggest steps I should try?



One thing to note: if I set shuffle=True in model.fit, the ROC-AUC comes out much better. Why is this? I need deterministic results, though, so I set it to False.
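
A likely explanation, offered as a hypothesis rather than a confirmed diagnosis: imbalanced-learn's SMOTE appends the synthetic minority rows after the original rows, so without shuffling the last batches of every epoch contain only class 1 and the weights are dragged back and forth. The sketch below simulates labels with the counts from the question (960 original class-1 training rows is derived by subtraction) and shows that a fixed-seed permutation, applied once before `model.fit`, mixes the batches while staying reproducible. `batch_positive_fraction` is a hypothetical helper, and the seeds are arbitrary.

```python
import numpy as np

def batch_positive_fraction(y, batch_size=256):
    """Fraction of class-1 labels in each consecutive batch."""
    return np.array([y[i:i + batch_size].mean() for i in range(0, len(y), batch_size)])

# Simulated label order after SMOTE: original rows first, synthetic class-1 rows last.
rng = np.random.default_rng(0)
original = np.concatenate([np.zeros(5528), np.ones(960)])
rng.shuffle(original)
y_train = np.concatenate([original, np.ones(5528 - 960)])

unshuffled = batch_positive_fraction(y_train)

# A deterministic shuffle: the same seed always yields the same permutation.
perm = np.random.default_rng(42).permutation(len(y_train))
shuffled = batch_positive_fraction(y_train[perm])

print("single-class batches without shuffling:", int((unshuffled == 1.0).sum()))
print("class-1 share per batch after shuffling: min %.2f, max %.2f"
      % (shuffled.min(), shuffled.max()))
```

The same `perm` would be applied to both `X_train` and `y_train`. Seeding the framework (for example with `tf.keras.utils.set_random_seed`, available in recent TensorFlow 2 releases) is another route to repeatable runs with `shuffle=True`, although some GPU kernels may still introduce small run-to-run differences.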



I think the shapes of the inputs are fine. How do I improve my results? I'm stuck at this point. I tried undersampling too, but that didn't help either. Changing the model architecture or the learning rate doesn't help much. In my opinion the model should do much better than a random guess.



Classification Report as calculated by sklearn on the test dataset:



Accuracy 80.71942446043165
RMSE: 0.4390965217303406
MAE: 0.19280575539568345
F1: [88.97119342 23.42857143]
              precision    recall  f1-score   support

         0.0       0.87      0.91      0.89      1185
         1.0       0.28      0.20      0.23       205

    accuracy                           0.81      1390
   macro avg       0.58      0.56      0.56      1390
weighted avg       0.78      0.81      0.79      1390
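
The report can be read against a simple baseline. The per-class figures are consistent with a confusion matrix of TN = 1081, FP = 104, FN = 164, TP = 41; these four counts are inferred here from the reported F1 scores, accuracy and support, and are not stated in the question. Always predicting class 0 would reach 1185 / 1390, about 85.3% accuracy, so the 80.7% above falls below that trivial baseline. A short check of the arithmetic:

```python
# Confusion-matrix counts inferred from the report (an assumption, see above).
tn, fp, fn, tp = 1081, 104, 164, 41
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * tp / (2 * tp + fp + fn)
f1_0 = 2 * tn / (2 * tn + fp + fn)
majority_baseline = (tn + fp) / total  # accuracy of always predicting class 0

print("accuracy          %.4f" % accuracy)
print("majority baseline %.4f" % majority_baseline)
print("class 1: precision %.2f, recall %.2f, F1 %.4f" % (precision_1, recall_1, f1_1))
```

For a dataset this imbalanced, class-1 recall and precision (or ROC-AUC and PR-AUC) say more about the model than overall accuracy does.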

Comments

We never use dropout by default from the start; we use it only when there is reason to suspect overfitting (which is the exact opposite of the "my model doesn't learn" problem you report here). But this is not a programming question, so it is actually off-topic here. Please see the intro and NOTE in stackoverflow.com/tags/machine-learning/info

