
python - GridSearch implementation for Keras regression


I am trying to understand and implement the GridSearch method for Keras regression. Here is my simple, reproducible regression application.

import pandas as pd
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint


df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/slump/slump_test.data")
df.drop(['No','FLOW(cm)','Compressive Strength (28-day)(Mpa)'],1,inplace=True)

# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df.as_matrix(result).astype(np.float32), dummies.as_matrix().astype(np.float32)
    else:
        # Regression
        return df.as_matrix(result).astype(np.float32), df.as_matrix([target]).astype(np.float32)

x,y = to_xy(df,'SLUMP(cm)')


x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42)


#Create Model
model = Sequential()
model.add(Dense(128, input_dim=x.shape[1], activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

monitor = EarlyStopping(monitor='val_loss', min_delta=1e-5, patience=5, mode='auto')
checkpointer = ModelCheckpoint(filepath="best_weights.hdf5",save_best_only=True) # save best model

model.fit(x_train,y_train,callbacks=[monitor,checkpointer],verbose=0,epochs=1000)
#model.fit(x_train,y_train,validation_data=(x_test,y_test),callbacks=[monitor,checkpointer],verbose=0,epochs=1000)
pred = model.predict(x_test)

score = np.sqrt(metrics.mean_squared_error(pred,y_test))

print("(RMSE): {}".format(score))

If you run the code, you can see that the loss is not an unreasonably large number.

[Regression result screenshot]

Here is my reproducible GridSearch implementation. I first searched the web and found a GridSearch example for KerasClassifier, then tried to modify it for KerasRegressor. I am not sure whether my modifications are correct. Assuming the general approach is right, there must be something wrong with this code, because the loss values make no sense: the loss function is MSE, yet the output is negative, and unfortunately I cannot figure out where I went wrong.

from keras.wrappers.scikit_learn import KerasRegressor
import pandas as pd
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from sklearn.model_selection import GridSearchCV

df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/slump/slump_test.data")
df.drop(['No','FLOW(cm)','Compressive Strength (28-day)(Mpa)'],1,inplace=True)

#Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df.as_matrix(result).astype(np.float32), dummies.as_matrix().astype(np.float32)
    else:
        # Regression
        return df.as_matrix(result).astype(np.float32), df.as_matrix([target]).astype(np.float32)

x,y = to_xy(df,'SLUMP(cm)')


x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42)


def create_model(optimizer='adam'):
    # create model
    model = Sequential()
    model.add(Dense(128, input_dim=x.shape[1], activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['mse'])
    return model

model = KerasRegressor(build_fn=create_model, epochs=100, batch_size=10, verbose=0)

optimizer = ['SGD', 'RMSprop', 'Adagrad']
param_grid = dict(optimizer=optimizer)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(x_train, y_train)

#summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))

[Grid search result screenshot]

Best answer

I have tested your code, and I noticed that you are not passing a scoring function to GridSearchCV, so according to the scikit-learn documentation:

If None, the estimator’s default scorer (if available) is used.

It seems that 'neg_mean_absolute_error' (or one of these scoring functions for regression) is being used by default to score the model.

That is probably why it reports the best model as:

-75.820078 using {'optimizer':'Adagrad'}
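
As a minimal sketch of how to make the metric explicit (reusing model, param_grid, x_train and y_train from your second script; the scoring string and cv=3 here are my own choices, not something your code already does), you could pass scoring to GridSearchCV and convert the best score back into an RMSE comparable with your first script:

import numpy as np
from sklearn.model_selection import GridSearchCV

# Sketch only: `model`, `param_grid`, `x_train`, `y_train` are assumed to be the
# objects defined in the question above. scikit-learn scorers follow a
# "greater is better" convention, so error metrics such as MSE are negated;
# that is why the grid search prints negative numbers even though the Keras
# loss is plain MSE.
grid = GridSearchCV(estimator=model,
                    param_grid=param_grid,
                    scoring='neg_mean_squared_error',  # make the metric explicit
                    cv=3,                              # assumed 3-fold cross-validation
                    n_jobs=1)
grid_result = grid.fit(x_train, y_train)

# Flip the sign and take the square root to get an RMSE that can be compared
# with the RMSE printed by the first script.
best_rmse = np.sqrt(-grid_result.best_score_)
print("Best RMSE: {:.4f} using {}".format(best_rmse, grid_result.best_params_))

The values stored in cv_results_['mean_test_score'] stay negative either way; only the interpretation changes once the scoring function is stated explicitly.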

Regarding python - GridSearch implementation for Keras regression, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52551511/
