
python - GridSearchCV/RandomizedSearchCV with an LSTM


I have been trying to tune the hyperparameters of an LSTM via RandomizedSearchCV.

My code is as follows:

X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))
X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))

print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.initializers import RandomNormal
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import TimeSeriesSplit, RandomizedSearchCV


def create_model(activation_1='relu', activation_2='relu',
                 neurons_input=1, neurons_hidden_1=1,
                 optimizer='Adam',
                 # input_shape=(X_train.shape[1], X_train.shape[2])
                 # input_shape=(X_train.shape[0], X_train.shape[1])  # input shape should be (timesteps, features)
                 ):
    model = Sequential()
    model.add(LSTM(neurons_input, activation=activation_1,
                   input_shape=(X_train.shape[1], X_train.shape[2]),
                   kernel_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42),
                   bias_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42)))

    model.add(Dense(2, activation='sigmoid'))

    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer)
    return model


clf = KerasClassifier(build_fn=create_model, epochs=10, verbose=0)

param_grid = {
    'clf__neurons_input': [20, 25, 30, 35],
    'clf__batch_size': [40, 60, 80, 100],
    'clf__optimizer': ['Adam', 'Adadelta']}

pipe = Pipeline([
    ('oversample', SMOTE(random_state=12)),
    ('clf', clf)
])

my_cv = TimeSeriesSplit(n_splits=5).split(X_train)

rs_keras = RandomizedSearchCV(pipe, param_grid, cv=my_cv, scoring='f1_macro',
                              refit='f1_macro', verbose=3, n_jobs=1, random_state=42)
rs_keras.fit(X_train, y_train)

I keep getting the error:

Found array with dim 3. Estimator expected <= 2.

This makes sense, since both GridSearch and RandomizedSearch expect arrays of shape [n_samples, n_features]. Does anyone have experience with, or suggestions for, how to work around this limitation?

Thanks.

Here is the full traceback of the error:

Traceback (most recent call last):

File "<ipython-input-2-b0be4634c98a>", line 1, in <module>
runfile('Scratch/prediction_lstm.py', wdir='/Simulations/2017-2018/Scratch')

File "\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)

File "\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "Scratch/prediction_lstm.py", line 204, in <module>
rs_keras.fit(X_train, y_train)

File "Anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 722, in fit
self._run_search(evaluate_candidates)

File "\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 1515, in _run_search
random_state=self.random_state))

File "\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 711, in evaluate_candidates
cv.split(X, y, groups)))

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 917, in __call__
if self.dispatch_one_batch(iterator):

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 182, in apply_async
result = ImmediateResult(func)

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 549, in __init__
self.results = batch()

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 225, in __call__
for func, args, kwargs in self.items]

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 225, in <listcomp>
for func, args, kwargs in self.items]

File "\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 528, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)

File "\Anaconda3\lib\site-packages\imblearn\pipeline.py", line 237, in fit
Xt, yt, fit_params = self._fit(X, y, **fit_params)

File "\Anaconda3\lib\site-packages\imblearn\pipeline.py", line 200, in _fit
cloned_transformer, Xt, yt, **fit_params_steps[name])

File "\Anaconda3\lib\site-packages\sklearn\externals\joblib\memory.py", line 342, in __call__
return self.func(*args, **kwargs)

File "\Anaconda3\lib\site-packages\imblearn\pipeline.py", line 576, in _fit_resample_one
X_res, y_res = sampler.fit_resample(X, y, **fit_params)

File "\Anaconda3\lib\site-packages\imblearn\base.py", line 80, in fit_resample
X, y, binarize_y = self._check_X_y(X, y)

File "\Anaconda3\lib\site-packages\imblearn\base.py", line 138, in _check_X_y
X, y = check_X_y(X, y, accept_sparse=['csr', 'csc'])

File "\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 756, in check_X_y
estimator=estimator)

File "\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 570, in check_array
% (array.ndim, estimator_name))

ValueError: Found array with dim 3. Estimator expected <= 2.

Best answer

This problem is not caused by scikit-learn. RandomizedSearchCV does not check the shape of the input; that is the job of the individual transformers and estimators, which verify that the data passed to them has the correct shape. As you can see from the stack trace, the error is raised by imblearn, because SMOTE requires the data to be 2-D in order to work.

To avoid this, you can reshape the data manually after SMOTE and before it is passed to the LSTM. There are several ways to achieve this:

1) You pass 2-D data (without the explicit reshape you currently do in the following lines):

X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))
X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))

to your pipeline, and after the SMOTE step but before clf, reshape the data into 3-D and then pass it to clf.

2) You pass your current 3-D data to the pipeline and convert it to 2-D for use with SMOTE. SMOTE then outputs new, oversampled 2-D data, which you reshape back into 3-D.
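For illustration, here is a minimal sketch of the data flow in option 2, done manually outside the pipeline for clarity (inside a Pipeline the two reshapes would have to live in custom steps around SMOTE); the variable names here are assumptions, not part of the original answer:

# Assumes X_train is already 3-D with shape (samples, 1, features), as in the question.
n_samples, n_steps, n_feats = X_train.shape

# Flatten to 2-D so SMOTE accepts it.
X_2d = X_train.reshape(n_samples, n_steps * n_feats)

from imblearn.over_sampling import SMOTE
X_res, y_res = SMOTE(random_state=12).fit_resample(X_2d, y_train)

# Reshape the oversampled data back to 3-D before feeding it to the LSTM.
X_res = X_res.reshape(X_res.shape[0], n_steps, n_feats)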

I think option 1 is the better choice. Even then, you can either:

  • use a custom class to convert the data from 2-D to 3-D, like the following (a minimal sketch of such a class is given after this list):

    pipe = Pipeline([
        ('oversample', SMOTE(random_state=12)),

        # Check out custom scikit-learn transformers.
        # You need to implement your reshape logic in the "transform()" method.
        ('reshaper', CustomReshaper()),
        ('clf', clf)
    ])
  • or use the already available Reshape class. I am using Reshape below.
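For reference, here is a minimal sketch of what such a CustomReshaper could look like; the class name follows the snippet above, but the implementation itself is an assumption, not part of the original answer. It restores the 3-D shape after SMOTE has produced 2-D output:

from sklearn.base import BaseEstimator, TransformerMixin

class CustomReshaper(BaseEstimator, TransformerMixin):
    # Turns 2-D (samples, features) arrays into 3-D (samples, 1, features),
    # so the LSTM downstream receives the shape it expects.

    def fit(self, X, y=None):
        # Nothing to learn; the reshape is purely structural.
        return self

    def transform(self, X):
        return X.reshape((X.shape[0], 1, X.shape[1]))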

So, using the Reshape layer, the modified code is (see the comments):

# Remove the following two lines, so the data is 2-D when it goes to "RandomizedSearchCV".

# X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))
# X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))


from keras.layers import Reshape


def create_model(activation_1='relu', activation_2='relu',
                 neurons_input=1, neurons_hidden_1=1,
                 optimizer='Adam'):

    model = Sequential()

    # Add this before the LSTM. The tuple denotes the last two dimensions of the input.
    model.add(Reshape((1, X_train.shape[1])))
    model.add(LSTM(neurons_input,
                   activation=activation_1,

                   # Since the data is now 2-D, the following needs to be changed
                   # from "(X_train.shape[1], X_train.shape[2])".
                   input_shape=(1, X_train.shape[1]),
                   kernel_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42),
                   bias_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42)))

    model.add(Dense(2, activation='sigmoid'))

    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer)
    return model
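The rest of the question's setup (the KerasClassifier wrapper, the SMOTE pipeline, and the randomized search) can then stay as it was, now fed the 2-D X_train; roughly:

clf = KerasClassifier(build_fn=create_model, epochs=10, verbose=0)

pipe = Pipeline([
    ('oversample', SMOTE(random_state=12)),
    ('clf', clf)
])

my_cv = TimeSeriesSplit(n_splits=5).split(X_train)

rs_keras = RandomizedSearchCV(pipe, param_grid, cv=my_cv, scoring='f1_macro',
                              refit='f1_macro', verbose=3, n_jobs=1, random_state=42)
rs_keras.fit(X_train, y_train)  # X_train is now 2-D: (n_samples, n_features)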

Regarding "python - GridSearchCV/RandomizedSearchCV with an LSTM", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55774632/
