tensorflow - Training multiple neural networks in parallel on the CPU with Keras

Reposted · Author: 行者123 · Updated: 2023-12-04 20:29:52

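I would like to train dozens of small neural networks in parallel on the CPU
in Keras with the TensorFlow backend.
By default, TensorFlow splits batches across the cores when training a single network, but my average core utilization is only around 50%.
It therefore seems like a good idea to assign the complete training of each network to a single core, so that less data has to be moved around.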

I cannot seem to find out how to specify this.

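Also note that the networks have different architectures, so combining everything into one graph would lead to sparser matrices and slower
execution.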

Best answer

There are a few key points to getting this to work:

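  • Use processes, not threads. Threads give asynchronous but not parallel execution, so only one CPU core would be used.
  • For practical purposes, building, compiling and fitting a neural network should all happen in the same process.
  • Each process needs to initialize its own separate TensorFlow graph and session.
  • After training the networks, you will probably want to serialize them for later use. It is important to use Keras's model.save(file_name) for this, not regular pickling.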

  • Implementation:

    Extend Python's Process class:

    from keras.layers import Dense
    from keras.models import Sequential
    from multiprocessing import Process, Queue
    import tensorflow as tf

    from train_val_set import TrainValSet


    class NNProcess(Process):
        """Builds, compiles, fits and saves nr_nets networks inside one child process."""

        def __init__(self, process_id: int, nr_nets: int, ret_queue: Queue):
            super(NNProcess, self).__init__()
            self.process_id = process_id
            self.neural_nets = []
            self.train_val_set = None
            self.nr_nets = nr_nets
            self.ret_queue = ret_queue

        def set_train_val(self, train_val_set: TrainValSet):
            self.train_val_set = train_val_set

        def get_session_config(self):
            # Restrict this process to a single CPU core and no GPU.
            num_cores = 1
            num_CPU = 1
            num_GPU = 0

            config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                                    inter_op_parallelism_threads=num_cores,
                                    allow_soft_placement=False,
                                    device_count={'CPU': num_CPU, 'GPU': num_GPU})

            return config

        def run(self):
            print("process " + str(self.process_id) + " starting...")

            # A fresh graph and session per process; everything below runs in the child.
            with tf.Session(graph=tf.Graph(), config=self.get_session_config()) as session:
                self.init_nets()
                self.compile()
                self.fit_nets(self.train_val_set)
                for i in range(0, self.nr_nets):
                    # Include the process id so files written by different processes
                    # do not overwrite each other. model.save writes HDF5, not a pickle.
                    file_name = ("p" + str(self.process_id) + "_"
                                 + self.neural_nets[i].name + "_" + str(i) + ".h5")
                    self.neural_nets[i].save(file_name)
                    self.ret_queue.put(file_name)
            print("process " + str(self.process_id) + " finished.")

        def compile(self):
            for neural_net in self.neural_nets:
                neural_net.compile(loss='categorical_crossentropy',
                                   optimizer='sgd',
                                   metrics=['accuracy'])

        def init_nets(self):
            # Placeholder architecture: input_dim and the output size must match the data.
            for i in range(0, self.nr_nets):
                model = Sequential()
                model.add(Dense(units=64, activation='relu', input_dim=100))
                model.add(Dense(units=10, activation='softmax'))
                self.neural_nets.append(model)

        def fit_nets(self, train_val_set: TrainValSet):
            for i in range(0, self.nr_nets):
                # Epochs and batch size are left at the Keras defaults.
                self.neural_nets[i].fit(train_val_set.x_train, train_val_set.y_train,
                                        validation_data=(train_val_set.x_val, train_val_set.y_val))
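
The `tf.ConfigProto` and `tf.Session` calls above belong to the TensorFlow 1.x API. On TensorFlow 2.x, the same per-process restriction could be sketched roughly as follows. This is not part of the original answer: `configure_single_core` and `train_in_child` are hypothetical names, the file name is illustrative, and the thread settings only take effect if they are applied before TensorFlow runs its first operation in that process.

```python
def configure_single_core(num_threads: int = 1) -> None:
    """Limit TensorFlow 2.x in the current process to num_threads CPU threads."""
    # Imported lazily so that TensorFlow is only initialized inside the child process.
    import tensorflow as tf

    tf.config.threading.set_intra_op_parallelism_threads(num_threads)
    tf.config.threading.set_inter_op_parallelism_threads(num_threads)
    # Hide all GPUs so that every operation is placed on the CPU.
    tf.config.set_visible_devices([], 'GPU')


def train_in_child(process_id: int, ret_queue) -> None:
    """Hypothetical worker: configure, build, compile and save inside one process."""
    configure_single_core(1)
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(100,)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='sgd',
                  metrics=['accuracy'])
    # model.fit(x_train, y_train) would go here; the data is omitted in this sketch.
    file_name = "net_" + str(process_id) + ".h5"  # illustrative file name
    model.save(file_name)
    # Only the file name travels back to the parent, not the model itself.
    ret_queue.put(file_name)
```

Each worker could then be started with `multiprocessing.Process(target=train_in_child, args=(i, queue))`, called from under an `if __name__ == "__main__":` guard.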

    Helper class:

    from pandas import DataFrame


    class TrainValSet:
        def __init__(self, df_train: DataFrame, df_val: DataFrame):
            self.x_train, self.y_train = self.get_x_y(df_train)
            self.x_val, self.y_val = self.get_x_y(df_val)

        def get_x_y(self, df: DataFrame):
            # All columns but the last are features; the last column is the label.
            X = df.iloc[:, 0:-1].values
            y = df.iloc[:, -1].values

            return X, y
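
The helper class takes the label from the last column, while the `categorical_crossentropy` loss used in the process class expects one-hot encoded targets. A minimal sketch of the split plus a one-hot conversion is shown below, assuming the label column holds integer class ids. The toy DataFrame and the names `split_features_labels` and `one_hot` are made up for illustration.

```python
import numpy as np
import pandas as pd


def split_features_labels(df: pd.DataFrame):
    """Mirror of get_x_y: the last column is the label, the rest are features."""
    X = df.iloc[:, 0:-1].values
    y = df.iloc[:, -1].values
    return X, y


def one_hot(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Turn integer class ids into one-hot rows by indexing an identity matrix."""
    return np.eye(num_classes)[labels.astype(int)]


# Hypothetical toy data: two feature columns followed by an integer label.
df = pd.DataFrame([[0.1, 0.2, 0],
                   [0.3, 0.4, 2],
                   [0.5, 0.6, 1]])
X, y = split_features_labels(df)
y_one_hot = one_hot(y, num_classes=3)
print(X.shape, y.shape, y_one_hot.shape)  # (3, 2) (3,) (3, 3)
```

If the labels are kept as integers instead, Keras's `sparse_categorical_crossentropy` loss accepts them directly.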

    Main file:

    import pandas as pd
    from multiprocessing import Manager
    import tensorflow as tf
    from keras import backend as K

    from train_val_set import TrainValSet
    from nn_process import NNProcess


    def load_train_val_test_datasets(dataset_dir: str, dataset_name: str):
        df_train = pd.read_csv(dataset_dir + dataset_name + "/" + dataset_name + "_train.csv", header=None)
        df_val = pd.read_csv(dataset_dir + dataset_name + "/" + dataset_name + "_val.csv", header=None)
        df_test = pd.read_csv(dataset_dir + dataset_name + "/" + dataset_name + "_test.csv", header=None)

        return df_train, df_val, df_test


    # config for prediction and evaluation only
    def get_session_config(num_cores):
        num_CPU = 1
        num_GPU = 0

        config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                                inter_op_parallelism_threads=num_cores,
                                allow_soft_placement=True,
                                device_count={'CPU': num_CPU, 'GPU': num_GPU})

        return config


    def train_test(nr_nets: int, nr_processes: int, dataset_dir: str):
        # dataset_dir is passed in because load_train_val_test_datasets requires it
        df_train, df_val, df_test = load_train_val_test_datasets(dataset_dir, 'MNIST')
        train_val_set = TrainValSet(df_train, df_val)
        nets_per_proc = int(nr_nets/nr_processes)

        # Managed queue through which the child processes return the saved file names
        nn_queue = Manager().Queue()

        processes = []

        for i in range(0, nr_processes):
            nn_process = NNProcess(i, nets_per_proc, nn_queue)
            nn_process.set_train_val(train_val_set)
            processes.append(nn_process)

        for nn_process in processes:
            nn_process.start()

        for nn_process in processes:
            nn_process.join()

        # Session in the parent process, used for prediction and evaluation only
        tf_session = tf.Session(config=get_session_config(4))
        K.set_session(tf_session)

        # ...
        # load neural nets from files
        # do predictions
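
Note that `int(nr_nets/nr_processes)` in `train_test` rounds down, so with 10 networks and 4 processes only 8 networks would be trained. One possible way to hand out the remainder is sketched below. `split_counts` is a hypothetical helper, not part of the original answer.

```python
def split_counts(nr_nets: int, nr_processes: int) -> list:
    """Distribute nr_nets over nr_processes as evenly as possible."""
    base, remainder = divmod(nr_nets, nr_processes)
    # The first `remainder` processes each take one extra network.
    return [base + 1 if i < remainder else base for i in range(nr_processes)]


print(split_counts(10, 4))  # [3, 3, 2, 2]
```

Each `NNProcess` could then be created with its own entry from this list instead of the shared `nets_per_proc` value.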

    Regarding tensorflow - training multiple neural networks in parallel on the CPU with Keras, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48961330/
