
Tensorflow 2.0 - tf.estimator.DNNClassifier training on a large dataset


I am trying to train a DNNClassifier:

labels = ['BENIGN', 'Syn', 'UDPLag', 'UDP', 'LDAP', 'MSSQL', 'NetBIOS', 'WebDDoS']

# Build a DNN
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[30, 10],
    n_classes=len(labels),
    label_vocabulary=labels)

def input_fn(features, labels, training=True, batch_size=32):
    '''
    An input function for training or evaluating
    '''
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle and repeat if you are in training mode.
    if training:
        dataset = dataset.shuffle(1000).repeat()
    return dataset.batch(batch_size)

# Train the model
classifier.train(
    input_fn=lambda: input_fn(train_features, train_label, training=True),
    steps=5000)
Training works fine, until I use a larger dataset:
train_features.shape
>>> (15891114, 20)
train_label.shape
>>> (15891114,)
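For scale, here is a rough back-of-the-envelope estimate (my own calculation, not from the original post) of the raw size of that feature array, assuming float64 values as the dtype-cast warning further down suggests:

rows, cols = 15_891_114, 20
bytes_per_value = 8  # float64
print(f"~{rows * cols * bytes_per_value / 2**30:.1f} GiB")  # ~2.4 GiB for the features alone

On top of that, tf.data.Dataset.from_tensor_slices embeds in-memory arrays into the Estimator's graph, which can mean additional copies during training.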
I am using Google Colaboratory, and as soon as training starts my session crashes for exceeding the available RAM (12 GB):
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/adagrad.py:106: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
Before training starts only about 1 GB of RAM is in use, but once training begins the RAM saturates quickly.

I got it working by feeding chunks of the dataframe used to train/evaluate the model.
Still, it is not clear to me why the RAM saturates when I feed the whole dataframe to train or evaluate the Estimator.
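A minimal sketch of the chunking workaround described above, assuming the features and labels live in a pandas DataFrame named train_df with a 'Label' column (train_df, 'Label' and CHUNK_SIZE are hypothetical names):

CHUNK_SIZE = 1_000_000  # hypothetical chunk size

# Train on one slice of the dataframe at a time, so that only one chunk
# is materialized in the input pipeline per train() call.
for start in range(0, len(train_df), CHUNK_SIZE):
    chunk = train_df.iloc[start:start + CHUNK_SIZE]
    chunk_features = chunk.drop(columns=['Label'])
    chunk_label = chunk['Label']
    classifier.train(
        input_fn=lambda: input_fn(chunk_features, chunk_label, training=True),
        steps=500)  # cap steps per chunk, since input_fn repeats indefinitely

Each train() call resumes from the latest checkpoint in the estimator's model_dir, so the chunks are trained on cumulatively.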

Best answer

I copied your Google Colab and the data files into "My Drive" and trained the estimator: your code works as-is. I was able to train the DNN without any problem.

I checked that I was indeed using the large dataset.

I did get an out-of-RAM message when re-running some Jupyter notebook cells, but never after restarting the kernel and then doing Run all cells. Maybe the problem lies with Jupyter? Try writing the code into a .py file (placed in Drive) and then running it from the Colab notebook with subprocess, as in the sketch below; that might solve your problem.
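A minimal sketch of that suggestion, assuming the training code has been saved as /content/drive/MyDrive/train_dnn.py (the path is hypothetical):

import subprocess

# Run the training script in a separate process so its memory is released
# when the process exits, independently of the notebook kernel.
result = subprocess.run(
    ['python3', '/content/drive/MyDrive/train_dnn.py'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print(result.stdout.decode())
print(result.stderr.decode())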

Regarding Tensorflow 2.0 - tf.estimator.DNNClassifier training on a large dataset, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62190677/
