tensorflow - Keras: class_weight vs. sample_weights in fit_generator

Reposted by: 行者123  Updated: 2023-12-03 23:39:30

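In Keras (using TensorFlow as the backend), I am building a model that works with a huge dataset that has highly imbalanced classes (labels). To be able to run the training process, I created a generator that feeds chunks of data to fit_generator.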

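According to the documentation for fit_generator, the output of the generator can be either the tuple (inputs, targets) or the tuple (inputs, targets, sample_weights). With that in mind, here are a few questions: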

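My understanding is that class_weight regards the weights of all classes for the entire dataset, whereas sample_weights regards the weights of all classes for each individual chunk created by the generator. Is that correct? If not, can someone elaborate on the matter?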

Is it necessary to give both class_weight to fit_generator and sample_weights as an output for each chunk? If yes, then why? If not, which one is better to give?

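If I should give sample_weights for each chunk, how do I map the weights if some of the classes are missing from a specific chunk? Let me give an example. In my overall dataset, I have 7 possible classes (labels). Because these classes are highly imbalanced, when I create smaller chunks of data as output for fit_generator, some of the classes are missing from a specific chunk. How should I create the sample_weights for these chunks?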

Best answer

My understanding is that the class_weight regards the weights of all classes for the entire dataset whereas the sample_weights regards the weights of all classes for each individual chunk created by the generator. Is that correct? If not, can someone elaborate on the matter?

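class_weight affects the relative weight of each class in the calculation of the objective function. sample_weights, as the name suggests, allows further control over the relative weight of samples that belong to the same class.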
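To make the distinction concrete, here is a minimal NumPy sketch (not Keras's internal code) of how a per-class weight can be expanded into one weight per sample and then used in a weighted loss. The weight values, the helper names class_weight_to_sample_weight and weighted_mean_loss, and the normalization by the sum of the weights are assumptions chosen for illustration; Keras's own reduction may differ between versions.

```python
import numpy as np

# Hypothetical class weights for a 3-class problem (illustrative values only).
class_weight = {0: 1.0, 1: 5.0, 2: 20.0}


def class_weight_to_sample_weight(labels, class_weight):
    """Look up one weight per sample from its integer class label."""
    return np.array([class_weight[int(c)] for c in labels], dtype=np.float64)


def weighted_mean_loss(per_sample_loss, sample_weight):
    """Weighted average of per-sample losses (one possible normalization)."""
    per_sample_loss = np.asarray(per_sample_loss, dtype=np.float64)
    return float(np.sum(per_sample_loss * sample_weight) / np.sum(sample_weight))


labels = np.array([0, 0, 1, 2])                  # integer class ids
per_sample_loss = np.array([0.1, 0.2, 0.5, 1.0])  # made-up loss values

# Every sample of the same class receives the same weight here;
# sample weights would let two samples of one class differ as well.
weights = class_weight_to_sample_weight(labels, class_weight)
loss = weighted_mean_loss(per_sample_loss, weights)
```

In this sketch a class weight is simply a sample weight that is constant within each class, which is why sample weights are the more general of the two.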

Is it necessary to give both the class_weight to the fit_generator and then the sample_weights as an output for each chunk? If yes, then why? If not then which one is better to give?

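It depends on your application. Class weights are useful when training on highly skewed datasets; for example, a classifier that detects fraudulent transactions. Sample weights are useful when you do not have equal confidence in the samples in your batch. A common example is performing regression on measurements with variable uncertainty.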
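For the regression case, one common convention (an assumption here, not something the answer prescribes) is to weight each measurement by the inverse of its variance, so that noisier measurements contribute less. The helper name inverse_variance_weights and the rescaling to a mean of 1 are illustrative choices.

```python
import numpy as np


def inverse_variance_weights(sigma):
    """Return per-sample weights proportional to 1 / sigma**2, rescaled to mean 1.

    `sigma` holds the (assumed known) standard deviation of each measurement.
    """
    sigma = np.asarray(sigma, dtype=np.float64)
    raw = 1.0 / sigma ** 2
    return raw / raw.mean()


# Two hypothetical measurements: the second is twice as uncertain as the first.
sample_weights = inverse_variance_weights([1.0, 2.0])
```

An array like this would be the third element of the tuple yielded by the generator, alongside the inputs and targets of the same chunk.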

If I should give the sample_weights for each chunk, how do I map the weights if some of the classes are missing from a specific chunk? Let me give an example. In my overall dataset, I have 7 possible classes (labels). Because these classes are highly imbalanced, when I create smaller chunks of data as an output from the fit_generator, some of the classes are missing from the specific chunk. How should I create the sample_weights for these chunks?

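This is not an issue. sample_weights is defined on a per-sample basis and is independent of the class. For this reason, the documentation states that (inputs, targets, sample_weights) should be the same length.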
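A minimal sketch of such a generator, assuming integer class labels and a class-to-weight mapping computed once from the whole dataset (inverse class frequency is used here purely as an example). The names chunk_generator and global_class_weight and the toy data are hypothetical. Because the weight of each sample is looked up from its own label, a chunk that lacks some classes needs no special handling.

```python
import numpy as np


def chunk_generator(x, y, class_weight, chunk_size):
    """Yield (inputs, targets, sample_weights) tuples, looping indefinitely."""
    n = len(x)
    while True:
        for start in range(0, n, chunk_size):
            xb = x[start:start + chunk_size]
            yb = y[start:start + chunk_size]
            # One weight per sample, taken from the dataset-wide mapping.
            wb = np.array([class_weight[int(c)] for c in yb], dtype=np.float32)
            yield xb, yb, wb


# Toy imbalanced dataset with 3 classes (illustrative data only).
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 0])
x = np.arange(len(y), dtype=np.float32).reshape(-1, 1)

# Inverse-frequency weights over the WHOLE dataset; this assumes every
# class occurs at least once overall, otherwise the division fails.
counts = np.bincount(y, minlength=3)
global_class_weight = {c: len(y) / (3 * counts[c]) for c in range(3)}

gen = chunk_generator(x, y, global_class_weight, chunk_size=4)
first_x, first_y, first_w = next(gen)   # this chunk contains only class 0
second_x, second_y, second_w = next(gen)
```

If the targets are one-hot encoded rather than integer ids, the class of each sample would first have to be recovered, for example with np.argmax along the class axis.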

The function _weighted_masked_objective in engine/training.py contains an example of sample_weights being applied.

Regarding tensorflow - Keras: class_weight vs. sample_weights in fit_generator, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43459317/
