
python - Hyperparameter optimization in sklearn and n_jobs > 1: Pickling

Reposted. Author: 行者123. Updated: 2023-12-01 03:11:42

I am stuck in a "pickle". Here is the structure of my code:

  • A base class that behaves like an abstract class
  • A subclass that can be instantiated
    • A method that sets parameters and calls RandomizedSearchCV or GridSearchCV with n_jobs=-1.
      • A local function create_model that creates the neural network model called by KerasClassifier or KerasRegressor (see this tutorial)

I get an error message saying that a local object cannot be pickled. If I change to n_jobs=1, there is no problem, so I suspect the issue is the combination of the local function and parallel processing. Is there a workaround? After some googling, it seems the serializer dill could work here (I even found a package called multiprocessing_on_dill), but I currently rely on sklearn's packages.
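The failure can be reproduced with nothing but the standard library (the names make_local and create_model below are illustrative, not the original code): pickle stores plain functions by their qualified name, and a function defined inside another function or method has a qualified name like make_local.<locals>.create_model that cannot be resolved on unpickling.

```python
import pickle

def make_local():
    # create_model is a "local object": its qualified name is
    # make_local.<locals>.create_model, which pickle cannot look up.
    def create_model():
        return "model"
    return create_model

local_fn = make_local()
try:
    pickle.dumps(local_fn)
    picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    picklable = False

print(picklable)  # False
```

joblib's default pickler hits the same limitation when sklearn ships the search's work to worker processes with n_jobs > 1, which is why n_jobs=1 (no pickling across processes) works fine.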

Best Answer

I found a "solution" to my problem. I was really puzzled why the examples here work with n_jobs=-1 but my code does not. The problem seems to be that the local function create_model resides inside a method of the subclass. If I make the local function a method of the subclass instead, I can set n_jobs > 1.

To recap, here is the structure of my code:

  • A base class that behaves like an abstract class
  • A subclass that can be instantiated
    • A method that sets parameters and calls RandomizedSearchCV or GridSearchCV with n_jobs=-1.
    • A method create_model that creates the neural network model called by KerasClassifier or KerasRegressor
The general idea of the code:

from abc import ABCMeta, abstractmethod

import numpy as np
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV


class MLAlgorithms(metaclass=ABCMeta):

    def __init__(self, X_train, y_train, X_test, y_test=None):
        """
        Constructor with train and test data.
        :param X_train: Train descriptor data
        :param y_train: Train observed data
        :param X_test: Test descriptor data
        :param y_test: Test observed data
        """
        ...

    @abstractmethod
    def setmlalg(self, mlalg):
        """
        Sets a machine learning algorithm.
        :param mlalg: Dictionary of the machine learning algorithm.
        """
        pass

    @abstractmethod
    def fitmlalg(self, mlalg, rid=None):
        """
        Fits a machine learning algorithm.
        :param mlalg: Machine learning algorithm
        """
        pass


class MLClassification(MLAlgorithms):
    """
    Main class for classification machine learning algorithms.
    """

    def setmlalg(self, mlalg):
        """
        Sets a classification machine learning algorithm.
        :param mlalg: Dictionary of the classification machine learning algorithm.
        """
        ...

    def fitmlalg(self, mlalg):
        """
        Fits a classification machine learning algorithm.
        :param mlalg: Classification machine learning algorithm
        """
        ...

    # Function to create the model, required for KerasClassifier.
    # Defined as a method of a module-level class (not as a local function)
    # so that it can be pickled when n_jobs > 1.
    def create_model(self, n_layers=1, units=10, input_dim=10, output_dim=1,
                     optimizer="rmsprop", loss="binary_crossentropy",
                     kernel_initializer="glorot_uniform", activation="sigmoid",
                     kernel_regularizer="l2", kernel_regularizer_weight=0.01,
                     lr=0.01, momentum=0.0, decay=0.0, nesterov=False,
                     rho=0.9, epsilon=1E-8,
                     beta_1=0.9, beta_2=0.999, schedule_decay=0.004):
        from keras.models import Sequential
        from keras.layers import Dense
        from keras import regularizers, optimizers

        # Build the kernel regularizer from its string name
        if kernel_regularizer.lower() == "l1":
            kernel_regularizer = regularizers.l1(l=kernel_regularizer_weight)
        elif kernel_regularizer.lower() == "l2":
            kernel_regularizer = regularizers.l2(l=kernel_regularizer_weight)
        elif kernel_regularizer.lower() == "l1_l2":
            kernel_regularizer = regularizers.l1_l2(l1=kernel_regularizer_weight,
                                                    l2=kernel_regularizer_weight)
        else:
            print("Warning: Kernel regularizer {0} not supported. "
                  "Using default 'l2' regularizer.".format(kernel_regularizer))
            kernel_regularizer = regularizers.l2(l=kernel_regularizer_weight)

        # Build the optimizer from its string name
        if optimizer.lower() == "sgd":
            optimizer = optimizers.sgd(lr=lr, momentum=momentum, decay=decay, nesterov=nesterov)
        elif optimizer.lower() == "rmsprop":
            optimizer = optimizers.rmsprop(lr=lr, rho=rho, epsilon=epsilon, decay=decay)
        elif optimizer.lower() == "adagrad":
            optimizer = optimizers.adagrad(lr=lr, epsilon=epsilon, decay=decay)
        elif optimizer.lower() == "adadelta":
            optimizer = optimizers.adadelta(lr=lr, rho=rho, epsilon=epsilon, decay=decay)
        elif optimizer.lower() == "adam":
            optimizer = optimizers.adam(lr=lr, beta_1=beta_1, beta_2=beta_2,
                                        epsilon=epsilon, decay=decay)
        elif optimizer.lower() == "adamax":
            optimizer = optimizers.adamax(lr=lr, beta_1=beta_1, beta_2=beta_2,
                                          epsilon=epsilon, decay=decay)
        elif optimizer.lower() == "nadam":
            optimizer = optimizers.nadam(lr=lr, beta_1=beta_1, beta_2=beta_2, epsilon=epsilon,
                                         schedule_decay=schedule_decay)
        else:
            print("Warning: Optimizer {0} not supported. "
                  "Using default 'sgd' optimizer.".format(optimizer))
            optimizer = "sgd"

        # Build the network: input layer, hidden layers, output layer
        model = Sequential()
        model.add(Dense(units=units, input_dim=input_dim,
                        kernel_initializer=kernel_initializer, activation=activation,
                        kernel_regularizer=kernel_regularizer))
        for layer_count in range(n_layers - 1):
            model.add(Dense(units=units, kernel_initializer=kernel_initializer,
                            activation=activation, kernel_regularizer=kernel_regularizer))
        model.add(Dense(units=output_dim,
                        kernel_initializer=kernel_initializer, activation=activation,
                        kernel_regularizer=kernel_regularizer))

        # Compile model
        model.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])
        return model


class MLRegression(MLAlgorithms):
    """
    Main class for regression machine learning algorithms.
    """
    ...

Regarding python - Hyperparameter optimization in sklearn and n_jobs > 1: Pickling, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42843465/
