tensorflow - Mask zero-padding embeddings (and return zero gradients) in Tensorflow as in Pytorch

Reposted by: 行者123 · Updated: 2023-12-04 16:02:29


I am trying to recreate the PoolNet from Spotlight with BPR loss in Tensorflow, but I cannot get the same results. Below is the model I am using (it is an estimator model_fn).

import numpy as np
import tensorflow as tf  # written against the TensorFlow 1.x API (tf.contrib, tf.train)


def _pooling_model_fn(features, labels, mode, params):
    with tf.name_scope('inputs'):
        if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
            users_prev_items_inputs_train = features['item_seqs']
        elif mode == tf.estimator.ModeKeys.PREDICT:
            users_prev_items_inputs_train = tf.reshape(features['item_seqs'], [1, -1])

    with tf.device('/cpu:0'):
        prod_embeddings = tf.keras.layers.Embedding(params["num_items"], params["item_emb_size"], mask_zero=True)
        item_biases = tf.keras.layers.Embedding(params["num_items"], 1, mask_zero=True, embeddings_initializer=tf.keras.initializers.Zeros())
        prod_embed = prod_embeddings(users_prev_items_inputs_train)
        # [batch, seq_len, emb] -> [batch, emb, seq_len]
        targets = tf.transpose(prod_embed, [0, 2, 1])

    sequence_embeddings = tf.expand_dims(targets, axis=3)

    # Prepend one zero step along the sequence axis.
    sequence_embeddings = tf.pad(sequence_embeddings, paddings=tf.constant([[0, 0], [0, 0], [1, 0], [0, 0]]))

    # Running sum of the embeddings along the sequence axis.
    sequence_embedding_sum = tf.cumsum(sequence_embeddings, 2)

    # Running count of entries whose embedding value is non-zero.
    non_padding_entries = tf.cumsum(tf.cast(tf.not_equal(sequence_embeddings, tf.constant(0.0)), tf.float32), 2)  # .expand_as(sequence_embedding_sum)

    user_representations = tf.squeeze((sequence_embedding_sum / (non_padding_entries + 1)), [3])

    user_representations_so_far = user_representations[:, :, :-1]
    user_representations_new = user_representations[:, :, -1]

    if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
        global_step = tf.contrib.framework.get_or_create_global_step()

        with tf.name_scope('loss'):
            negative_samples = features['neg_samp']

            with tf.device('/cpu:0'):
                prod_embed_pos = prod_embeddings(users_prev_items_inputs_train)
                target_embedding_positive = tf.squeeze(tf.transpose(prod_embed_pos, [0, 2, 1]))

                prod_bias_pos = item_biases(users_prev_items_inputs_train)
                target_bias_positive = tf.squeeze(prod_bias_pos)

            dot_positive = tf.reduce_sum(user_representations_so_far * target_embedding_positive, 1) + target_bias_positive

            with tf.device('/cpu:0'):
                prod_embed_neg = prod_embeddings(negative_samples)
                target_embedding_negative = tf.squeeze(tf.transpose(prod_embed_neg, [0, 2, 1]))

                prod_bias_neg = item_biases(negative_samples)
                target_bias_negative = tf.squeeze(prod_bias_neg)

            dot_negative = tf.reduce_sum(user_representations_so_far * target_embedding_negative, 1) + target_bias_negative

            # True for real items, False for the zero padding.
            mask = tf.not_equal(users_prev_items_inputs_train, 0)

            loss = bpr_loss(dot_positive, dot_negative, mask)

    if mode == tf.estimator.ModeKeys.TRAIN:
        with tf.name_scope('optimizer'):
            optimizer = tf.train.AdamOptimizer(learning_rate=params["lr"])
        train_op = optimizer.minimize(loss, global_step=global_step)
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    if mode == tf.estimator.ModeKeys.PREDICT:
        # Score every item in the catalogue against the latest user representation.
        item_ids = np.arange(params['num_items']).reshape(-1, 1)
        item_ids_tensor = tf.convert_to_tensor(item_ids, dtype=tf.int64)

        with tf.device('/cpu:0'):
            prod_embed_pos = prod_embeddings(item_ids_tensor)  # tf.nn.embedding_lookup(prod_embeddings, item_ids_tensor)
            target_embedding_positive = tf.squeeze(tf.transpose(prod_embed_pos, [0, 2, 1]))

            prod_bias_pos = item_biases(item_ids_tensor)  # tf.nn.embedding_lookup(item_biases, item_ids_tensor)
            target_bias_positive = tf.squeeze(prod_bias_pos)

        dot_positive = tf.reduce_sum(user_representations_new * target_embedding_positive, 1) + target_bias_positive

        predictions = {
            'products': tf.reshape(dot_positive, [1, -1])
        }
        export_outputs = {
            'prediction': tf.estimator.export.PredictOutput(predictions)
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions, export_outputs=export_outputs)

And the loss function:

def bpr_loss(positive_predictions, negative_predictions, mask):
    # BPR loss: 1 - sigmoid(positive score - negative score), per position.
    loss1 = 1.0 - tf.nn.sigmoid(positive_predictions - negative_predictions)

    if mask is not None:
        # Average only over the positions where mask is True.
        mask = tf.cast(mask, loss1.dtype)
        final_loss = loss1 * mask
        return tf.reduce_sum(final_loss) / tf.reduce_sum(mask)

    return tf.reduce_mean(loss1)
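
To make the masking in bpr_loss concrete, here is a framework-free sketch of the same arithmetic on plain Python lists. The name `masked_bpr_loss` and the sample scores are illustrative assumptions, not part of the original code.

```python
import math


def masked_bpr_loss(positive, negative, mask):
    """Mean of 1 - sigmoid(pos - neg) over the positions where mask is True."""
    total = 0.0
    kept = 0
    for pos, neg, keep in zip(positive, negative, mask):
        if keep:
            total += 1.0 - 1.0 / (1.0 + math.exp(-(pos - neg)))
            kept += 1
    return total / kept


# Two padded positions followed by two real ones (made-up scores).
scores_pos = [9.0, -3.0, 2.0, 0.5]
scores_neg = [0.0, 4.0, 1.0, 0.5]
mask = [False, False, True, True]

loss = masked_bpr_loss(scores_pos, scores_neg, mask)

# Changing the scores at the padded positions leaves the loss untouched.
loss_other_padding = masked_bpr_loss([0.0, 0.0, 2.0, 0.5], scores_neg, mask)
```

The mask only removes padded positions from the loss value. Whether the padding embedding still influences the result through other tensors is a separate question.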

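With the model above, I cannot get the same predictions as with Spotlight on exactly the same dataset (and with the same random seeds). I eventually found that the problem lies in the zero padding. The data is generated in the following form: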

[[0,0,0,5,6,98],
 [0,62,15,4,8,47],
 [0,0,5,9,6,3,41],
 [78,21,2,56,1,3]]

The sequences have leading zero padding so that every input sample has the same length.
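
The post does not show how the padding is produced. As a minimal sketch, assuming item ID 0 is reserved for padding, left-padding could look like this (`left_pad` is a hypothetical helper, and the input rows are taken from the sample above):

```python
def left_pad(sequences, pad_id=0):
    """Left-pad every sequence with pad_id up to the length of the longest one."""
    width = max(len(seq) for seq in sequences)
    return [[pad_id] * (width - len(seq)) + list(seq) for seq in sequences]


raw = [[5, 6, 98], [62, 15, 4, 8, 47], [78, 21, 2, 56, 1, 3]]
padded = left_pad(raw)
# padded == [[0, 0, 0, 5, 6, 98], [0, 62, 15, 4, 8, 47], [78, 21, 2, 56, 1, 3]]
```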

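Based on my code, I believe I have done everything I can to mask these zeros out of the loss, the embedding layers (using Keras' mask_zero argument), and the average of the embeddings that I compute (using cumsum). Nevertheless, after training, the embedding at index zero keeps changing, which means it is being taken into account rather than excluded, so it affects the remaining gradients and adds noise to my results.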
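
One possible reading of this symptom (an interpretation, not something the post states): the pooling step counts non-padding entries by comparing embedding values with 0.0, so it relies on the padding row being exactly zero. Keras' mask_zero produces a mask for downstream layers but does not zero the looked-up vectors. The sketch below mirrors the cumsum pooling with scalar embeddings; `running_user_representation` and the table values are hypothetical.

```python
def running_user_representation(item_ids, table):
    """Mirror the cumsum pooling: running sum / (count of non-zero values + 1)."""
    values = [0.0] + [table[i] for i in item_ids]  # leading zero step, like tf.pad
    reps, total, count = [], 0.0, 0
    for v in values:
        total += v
        count += 1 if v != 0.0 else 0
        reps.append(total / (count + 1))
    return reps


sequence = [0, 0, 5, 6]  # two padding IDs, then two real items

# Padding row is exactly zero: padded steps add nothing to the sum or the count.
clean = running_user_representation(sequence, {0: 0.0, 5: 1.0, 6: 3.0})

# Padding row has drifted away from zero: it now leaks into every average.
leaky = running_user_representation(sequence, {0: 0.3, 5: 1.0, 6: 3.0})
```

With a zero padding row the final representation is (1.0 + 3.0) / 3. With the drifted row both the sum and the count change, so the representation of the real items changes as well.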

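Pytorch's Embedding layer implementation seems to have a nice feature: you can set padding_idx to the ID of the padding, and that row is initialized with zeros. In addition, it keeps the gradient for that index at zero at all times. So basically, I am trying to do the same thing with Tensorflow.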
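
The behaviour described here can be sketched without either framework. The toy class below (`PaddedEmbedding` is a hypothetical name, and plain SGD is assumed for simplicity) zero-initializes the padding row and ignores any gradient aimed at it. It illustrates the semantics only and is not how Pytorch implements them.

```python
class PaddedEmbedding:
    """Toy embedding table whose padding row starts at zero and is never updated."""

    def __init__(self, rows, padding_idx=0):
        self.padding_idx = padding_idx
        self.rows = [list(r) for r in rows]
        self.rows[padding_idx] = [0.0] * len(self.rows[padding_idx])

    def lookup(self, ids):
        return [self.rows[i] for i in ids]

    def sgd_step(self, grads_by_id, lr):
        """Apply per-row gradients, dropping any gradient aimed at the padding row."""
        for idx, grad in grads_by_id.items():
            if idx == self.padding_idx:
                continue  # the gradient for the padding row is treated as zero
            self.rows[idx] = [w - lr * g for w, g in zip(self.rows[idx], grad)]


emb = PaddedEmbedding([[0.7, -0.2], [0.1, 0.4], [0.5, 0.5]])
emb.sgd_step({0: [1.0, 1.0], 2: [1.0, -1.0]}, lr=0.1)
```

After the step, row 0 is still all zeros, row 1 is unchanged, and only row 2 has moved.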

Any help would be appreciated.

Best answer

I solved it using the following solution, which was posted on Tensorflow's GitHub. It seems to work now.

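# Overwrite the padding row of the lookup table with zeros (TF 1.x API).
mask_padding_zero_op = tf.scatter_update(lookup_table,
                                         PADDING_ID,
                                         tf.zeros([EMBEDDING_DIM,], dtype=DTYPE))

# Ops created inside this block run only after the padding row has been reset.
with tf.control_dependencies([mask_padding_zero_op]):
    # do embedding lookup...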
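
The ordering that the control dependency enforces can be illustrated in plain Python. This is a sketch of the idea only, using nested lists instead of a Tensorflow variable; `zero_padding_row` and `embedding_lookup` are hypothetical helpers.

```python
def zero_padding_row(table, padding_id):
    """Overwrite the padding row in place, playing the role of the scatter_update op."""
    table[padding_id] = [0.0] * len(table[padding_id])


def embedding_lookup(table, ids, padding_id=0):
    """Reset the padding row first, then gather; mirrors the control dependency."""
    zero_padding_row(table, padding_id)
    return [list(table[i]) for i in ids]


table = [[0.0, 0.0], [0.1, 0.4], [0.5, 0.5]]
table[0] = [0.03, -0.02]  # pretend an optimizer step moved the padding row
vectors = embedding_lookup(table, [0, 0, 2])
```

As far as can be told from the snippet, this differs from padding_idx in one respect: the gradient for the padding row is not forced to zero. Whatever an update wrote into that row is discarded before the next lookup.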

For "tensorflow - Mask zero-padding embeddings (and return zero gradients) in Tensorflow as in Pytorch", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50103313/
