python - Softmax sampling in a Keras model

Reposted by: 行者123. Last updated: 2023-12-01 00:47:41


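Some approaches I have considered:

Subclassing the Model class: Sampled softmax in tensorflow keras

Subclassing the Layer class: How can I use TensorFlow's sampled softmax loss function in a Keras model?

Of the two, the Model approach is cleaner. The Layer approach is a bit hacky: it feeds the targets in as part of the input, and then it is bye-bye to multi-output models.

I need some help with subclassing the Model class. Specifically: 1) Unlike the first approach, I would like to take an arbitrary number of layers, as we do when specifying a standard Keras model. For example,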

class LanguageModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

2) I would like to incorporate the following code into the model class, but I want the model class to recognize it:

def call(self, y_true, inputs):
    """Reshape y_true and inputs so that they fit each other."""
    inputs = tf.reshape(inputs, (-1, self.hidden_size))
    labels = tf.reshape(y_true, (-1, 1))
    # Placeholders as in the question: the real shapes are not given.
    weights = tf.Variable(tf.float64)
    biases = tf.Variable(tf.float64)
    loss = tf.nn.sampled_softmax_loss(
        weights=weights,
        biases=biases,
        labels=labels,
        inputs=inputs,
        ...,
        partition_strategy="div")
    logits = tf.matmul(inputs, tf.transpose(weights))
    logits = tf.nn.bias_add(logits, biases)
    y_pred = tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=inputs[1],
        logits=logits)

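3) I think I need some pointers on which parts of the Model class in the functional API I should be working with, given that I have to write a custom loss function like the one above. I guess the problem is getting access to the weights inside the tf.nn.sampled_softmax_loss call.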

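Best answer

The simplest way I can think of is to define a loss that ignores the result of the output layer.

The full Colab is here: https://colab.research.google.com/drive/1Rp3EUWnBE1eCcaisUju9TwSTswQfZOkS

The loss function. Note that it assumes the output layer is a Dense(activation='softmax') and that it ignores y_pred. As a result, during training and evaluation, where the loss is used, the actual output of the dense layer is a NOP.

The output layer is used when making predictions.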

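import tensorflow as tf  # the snippet uses TF 1.x-era arguments (partition_strategy)

class SampledSoftmaxLoss(object):
  """ The loss function implements the Dense layer matmul and activation
  when in training mode.
  """
  def __init__(self, model):
    self.model = model
    # The last layer is assumed to be Dense(activation='softmax').
    output_layer = model.layers[-1]
    self.input = output_layer.input      # activations fed into the output layer
    self.weights = output_layer.weights  # [kernel, bias]

  def loss(self, y_true, y_pred, **kwargs):
    # y_pred is deliberately ignored; y_true is one-hot, so convert it
    # to class ids of shape (batch, 1).
    labels = tf.argmax(y_true, axis=1)
    labels = tf.expand_dims(labels, -1)
    loss = tf.nn.sampled_softmax_loss(
        weights=self.weights[0],
        biases=self.weights[1],
        labels=labels,
        inputs=self.input,
        num_sampled = 3,
        num_classes = 4,
        partition_strategy = "div",
    )
    return loss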
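
To make the idea behind this loss concrete, here is a minimal pure-Python sketch of what a sampled softmax computes compared with a full softmax. It is a simplification and not TensorFlow's implementation: negatives are drawn uniformly, and the correction for sampling probability that tf.nn.sampled_softmax_loss applies is omitted. All function and variable names below are hypothetical and chosen for illustration only.

```python
import math
import random


def class_logit(hidden, kernel, bias, c):
    """Logit of class c for a Dense-style kernel of shape (dim, num_classes)."""
    return sum(h * kernel[d][c] for d, h in enumerate(hidden)) + bias[c]


def cross_entropy(logits, true_index):
    """Softmax cross-entropy, computed stably with the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[true_index]


def full_softmax_loss(hidden, kernel, bias, label):
    """Loss over every class: what the Dense softmax output would give."""
    logits = [class_logit(hidden, kernel, bias, c) for c in range(len(bias))]
    return cross_entropy(logits, label)


def sampled_softmax_loss(hidden, kernel, bias, label, num_sampled, rng):
    """Loss over the true class plus num_sampled uniformly drawn negatives.

    Simplified: no correction for the sampling probability is applied.
    """
    negatives = [c for c in range(len(bias)) if c != label]
    candidates = [label] + rng.sample(negatives, num_sampled)
    logits = [class_logit(hidden, kernel, bias, c) for c in candidates]
    return cross_entropy(logits, 0)  # the true class sits at position 0


rng = random.Random(0)
dim, num_classes = 4, 6
kernel = [[rng.uniform(-1, 1) for _ in range(num_classes)] for _ in range(dim)]
bias = [0.0] * num_classes
hidden = [rng.uniform(-1, 1) for _ in range(dim)]
label = 2

full = full_softmax_loss(hidden, kernel, bias, label)
few = sampled_softmax_loss(hidden, kernel, bias, label, 3, rng)
everything = sampled_softmax_loss(hidden, kernel, bias, label, num_classes - 1, rng)
print(full, few, everything)
```

The sampled loss only touches the weight vectors of the candidate classes, which is why the full Dense output is not needed during training. When every other class is sampled, the simplified loss coincides with the full softmax loss.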

The model:

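# Assumed imports; they are not shown in the original snippet.
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def make_model():
  inp = Input(shape=(10,))
  h1 = Dense(16, activation='relu')(inp)
  h2 = Dense(4, activation='linear')(h1)
  # output layer and last hidden layer must have the same dims
  out = Dense(4, activation='softmax')(h2)
  model = Model(inp, out)
  # The loss object reads the output layer's input and weights from the model.
  loss_calculator = SampledSoftmaxLoss(model)
  model.compile('adam', loss_calculator.loss)
  return model

tf.set_random_seed(42)  # TF 1.x API; TF 2.x uses tf.random.set_seed
model = make_model()
model.summary()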

Note that SampledSoftmaxLoss requires the input to the model's last layer to have the same dimension as the number of classes.
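
One likely reason for this constraint is a difference in shape conventions: a Keras Dense layer stores its kernel as (input_dim, units), while tf.nn.sampled_softmax_loss documents its weights argument as [num_classes, dim], that is, one row per class. The pure-Python sketch below imitates that row lookup with hypothetical helpers (transpose, class_logit) to show what happens with and without transposing. It is an illustration under these assumptions, not the code from the answer.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def class_logit(weights, bias, hidden, class_id):
    """Logit of one class when weights holds one row per class: (num_classes, dim)."""
    row = weights[class_id]
    if len(row) != len(hidden):
        raise ValueError("row length %d does not match hidden size %d"
                         % (len(row), len(hidden)))
    return sum(w * h for w, h in zip(row, hidden)) + bias[class_id]


# A Dense-style kernel: shape (dim, num_classes) = (3, 5), one column per class.
dim, num_classes = 3, 5
kernel = [[float(10 * d + c) for c in range(num_classes)] for d in range(dim)]
bias = [0.0] * num_classes
hidden = [1.0, 2.0, 3.0]

# Untransposed, each row has num_classes entries, so the sizes disagree.
try:
    class_logit(kernel, bias, hidden, 0)
    untransposed_works = True
except ValueError:
    untransposed_works = False

# Transposed, row 4 is the weight vector of class 4: [4, 14, 24].
logit_class_4 = class_logit(transpose(kernel), bias, hidden, 4)

# With a square kernel the call succeeds either way, but the values differ.
square = [[1.0, 2.0], [3.0, 4.0]]
as_rows = class_logit(square, [0.0, 0.0], [1.0, 1.0], 0)                # 1 + 2
as_columns = class_logit(transpose(square), [0.0, 0.0], [1.0, 1.0], 0)  # 1 + 3
print(untransposed_works, logit_class_4, as_rows, as_columns)
```

If this reading is right, passing tf.transpose(kernel) as the weights argument should lift the equal-dimension requirement. It would also make the training loss use the same per-class vectors as the Dense layer does at prediction time, since with a square, untransposed kernel the rows and columns appear to swap roles.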

This question and answer, python - Softmax sampling in a Keras model, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/56821654/


