
python - GridSearch implementation for Keras regression


I am trying to understand and implement the GridSearch method for Keras regression. Here is my simple, reproducible regression application.

import pandas as pd
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint


df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/slump/slump_test.data")
df.drop(['No','FLOW(cm)','Compressive Strength (28-day)(Mpa)'],1,inplace=True)

# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df.as_matrix(result).astype(np.float32), dummies.as_matrix().astype(np.float32)
    else:
        # Regression
        return df.as_matrix(result).astype(np.float32), df.as_matrix([target]).astype(np.float32)

x,y = to_xy(df,'SLUMP(cm)')


x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42)


#Create Model
model = Sequential()
model.add(Dense(128, input_dim=x.shape[1], activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

monitor = EarlyStopping(monitor='val_loss', min_delta=1e-5, patience=5, mode='auto')
checkpointer = ModelCheckpoint(filepath="best_weights.hdf5",save_best_only=True) # save best model

model.fit(x_train,y_train,callbacks=[monitor,checkpointer],verbose=0,epochs=1000)
#model.fit(x_train,y_train,validation_data=(x_test,y_test),callbacks=[monitor,checkpointer],verbose=0,epochs=1000)
pred = model.predict(x_test)

score = np.sqrt(metrics.mean_squared_error(pred,y_test))

print("(RMSE): {}".format(score))

If you run the code, you can see that the loss is not an unreasonably large number.

[Regression result screenshot]

Here is my reproducible GridSearch implementation. I first searched the web and found a GridSearch example for KerasClassifier, then tried to modify it for KerasRegressor. I am not sure whether my modifications are correct. Assuming the general approach is right, there must be something wrong with this code, because the loss values make no sense: the loss function is MSE, yet the output is negative, and unfortunately I cannot figure out where I went wrong.

from keras.wrappers.scikit_learn import KerasRegressor
import pandas as pd
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from sklearn.model_selection import GridSearchCV

df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/slump/slump_test.data")
df.drop(['No','FLOW(cm)','Compressive Strength (28-day)(Mpa)'],1,inplace=True)

#Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df.as_matrix(result).astype(np.float32), dummies.as_matrix().astype(np.float32)
    else:
        # Regression
        return df.as_matrix(result).astype(np.float32), df.as_matrix([target]).astype(np.float32)

x,y = to_xy(df,'SLUMP(cm)')


x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42)


def create_model(optimizer='adam'):
    # create model
    model = Sequential()
    model.add(Dense(128, input_dim=x.shape[1], activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['mse'])
    return model

model = KerasRegressor(build_fn=create_model, epochs=100, batch_size=10, verbose=0)

optimizer = ['SGD', 'RMSprop', 'Adagrad']
param_grid = dict(optimizer=optimizer)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(x_train, y_train)

#summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))

[Grid search result screenshot]

Best answer

I have tested your code, and I noticed that you are not passing a scoring function to GridSearchCV, so according to the scikit-learn documentation:

If None, the estimator’s default scorer (if available) is used.

It seems that 'neg_mean_absolute_error' (or one of these scoring functions for regression) is being used by default to score the model.

That is probably why it reports the best model as:

-75.820078 using {'optimizer':'Adagrad'}
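
As a minimal sketch of how to make the metric explicit (reusing model, param_grid, x_train and y_train from your second script; the scoring string and cv=3 here are my own choices, not something your code already does), you could pass scoring to GridSearchCV and convert the best score back into an RMSE comparable with your first script:

import numpy as np
from sklearn.model_selection import GridSearchCV

# Sketch only: `model`, `param_grid`, `x_train`, `y_train` are assumed to be the
# objects defined in the question above. scikit-learn scorers follow a
# "greater is better" convention, so error metrics such as MSE are negated;
# that is why the grid search prints negative numbers even though the Keras
# loss is plain MSE.
grid = GridSearchCV(estimator=model,
                    param_grid=param_grid,
                    scoring='neg_mean_squared_error',  # make the metric explicit
                    cv=3,                              # assumed 3-fold cross-validation
                    n_jobs=1)
grid_result = grid.fit(x_train, y_train)

# Flip the sign and take the square root to get an RMSE that can be compared
# with the RMSE printed by the first script.
best_rmse = np.sqrt(-grid_result.best_score_)
print("Best RMSE: {:.4f} using {}".format(best_rmse, grid_result.best_params_))

The values stored in cv_results_['mean_test_score'] stay negative either way; only the interpretation changes once the scoring function is stated explicitly.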

Regarding python - GridSearch implementation for Keras regression, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52551511/
