python - Lightfm : handling user and item cold-start

Reposted by: 太空狗. Updated: 2023-10-29 21:41:41

I recall that one of the advantages of LightFM is that the model does not suffer from the cold-start problem, for either users or items: lightfm original paper

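However, I still don't understand how to use LightFM to address the cold-start problem. I trained my model on user-item interaction data. As I understand it, I can only make predictions for profile_ids that exist in my dataset.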

def predict(self, user_ids, item_ids, item_features=None,
            user_features=None, num_threads=1):
    """
    Compute the recommendation score for user-item pairs.

    Arguments
    ---------

    user_ids: integer or np.int32 array of shape [n_pairs,]
         single user id or an array containing the user ids for the
         user-item pairs for which a prediction is to be computed
    item_ids: np.int32 array of shape [n_pairs,]
         an array containing the item ids for the user-item pairs for which
         a prediction is to be computed.
    user_features: np.float32 csr_matrix of shape [n_users, n_user_features], optional
         Each row contains that user's weights over features.
    item_features: np.float32 csr_matrix of shape [n_items, n_item_features], optional
         Each row contains that item's weights over features.
    num_threads: int, optional
         Number of parallel computation threads to use. Should
         not be higher than the number of physical cores.

    Returns
    -------

    np.float32 array of shape [n_pairs,]
        Numpy array containing the recommendation scores for pairs defined
        by the inputs.
    """

    self._check_initialized()

    if not isinstance(user_ids, np.ndarray):
        user_ids = np.repeat(np.int32(user_ids), len(item_ids))

    assert len(user_ids) == len(item_ids)

    if user_ids.dtype != np.int32:
        user_ids = user_ids.astype(np.int32)
    if item_ids.dtype != np.int32:
        item_ids = item_ids.astype(np.int32)

    n_users = user_ids.max() + 1
    n_items = item_ids.max() + 1

    (user_features,
     item_features) = self._construct_feature_matrices(n_users,
                                                       n_items,
                                                       user_features,
                                                       item_features)

    lightfm_data = self._get_lightfm_data()

    predictions = np.empty(len(user_ids), dtype=np.float64)

    predict_lightfm(CSRMatrix(item_features),
                    CSRMatrix(user_features),
                    user_ids,
                    item_ids,
                    predictions,
                    lightfm_data,
                    num_threads)

    return predictions

Any suggestions or pointers that help me understand this would be appreciated. Thanks.

Best answer

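LightFM, like any other recommender algorithm, cannot make predictions for entirely new users unless it is given additional information about those users. The trick when making recommendations for new users is to describe them in terms of features that the algorithm saw during training.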

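This is probably best explained with an example. Suppose your training set contains users with IDs between 0 and 10, and you want to make predictions for a new user with ID 11. If all you have is the new user's ID, the algorithm cannot make predictions: after all, it knows nothing about user 11's preferences. Suppose, however, that you have some features that describe the users: perhaps during sign-up every user selects a number of interests (horror movies or romantic comedies, for example). If these features are present during training, the algorithm can learn which preferences are, on average, associated with them, and it can then produce recommendations for any new user who can be described with the same features. In this example, you can make predictions for user 11 if you can supply the preferences they chose during sign-up.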
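To make the mechanism above concrete, here is a minimal NumPy sketch of how a feature-based model can score items for a user it has never seen. It assumes the formulation in which a user is represented by the sum of the embeddings (and biases) of their features, and a score is the dot product with the item embedding plus the two biases. The feature names, sizes, and parameter values are hypothetical stand-ins (random numbers, not learned weights), and the code does not call LightFM itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sign-up interests used as user features.
feature_names = ["horror", "romantic_comedy", "sci_fi"]
n_components = 4
n_items = 5

# Stand-ins for parameters a trained model would hold (random here, not learned).
feature_embeddings = rng.normal(size=(len(feature_names), n_components))
feature_biases = rng.normal(size=len(feature_names))
item_embeddings = rng.normal(size=(n_items, n_components))
item_biases = rng.normal(size=n_items)


def score_items(user_feature_weights):
    """Score every item for a user described only by feature weights."""
    weights = np.asarray(user_feature_weights, dtype=np.float64)
    user_vector = weights @ feature_embeddings  # weighted sum of feature embeddings
    user_bias = weights @ feature_biases        # weighted sum of feature biases
    return item_embeddings @ user_vector + user_bias + item_biases


# User 11 never appeared in training but picked "horror" and "sci_fi" at sign-up.
new_user = [1.0, 0.0, 1.0]
scores = score_items(new_user)
ranking = np.argsort(-scores)  # item indices, best first
```

Nothing in `score_items` depends on the user's ID, which is why a user described purely by known features can be scored.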

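In the LightFM implementation, all of these features are encoded in the feature matrices, probably in one-hot form. When making recommendations for user 11, you would construct a new feature matrix for that user: as long as that matrix contains only features that were present during training, you will be able to make predictions.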
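A sketch of what building such a matrix could look like follows. It assumes the model was fitted with a `user_features` matrix whose columns follow the hypothetical `FEATURE_INDEX` below; a model fitted on interactions alone, as in the question, would first have to be re-fitted with user features. `build_user_row` and `score_new_user` are illustrative helpers, not part of LightFM; the only library call is `predict`, with the signature quoted in the question. If the model was also trained with item features, the same `item_features` matrix would need to be passed to `predict` as well.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical feature vocabulary, in the column order used for training.
FEATURE_INDEX = {"horror": 0, "romantic_comedy": 1, "sci_fi": 2}


def build_user_row(interests, feature_index=FEATURE_INDEX):
    """Return a 1 x n_user_features CSR row with 1.0 for each known interest."""
    unknown = [name for name in interests if name not in feature_index]
    if unknown:
        # Features never seen in training have no learned embedding.
        raise ValueError(f"features not seen in training: {unknown}")
    cols = np.array(sorted({feature_index[name] for name in interests}), dtype=np.int32)
    rows = np.zeros(len(cols), dtype=np.int32)
    data = np.ones(len(cols), dtype=np.float32)
    return csr_matrix((data, (rows, cols)), shape=(1, len(feature_index)), dtype=np.float32)


def score_new_user(model, interests, n_items):
    """Score all items for one unseen user with a fitted model exposing predict()."""
    user_row = build_user_row(interests)
    item_ids = np.arange(n_items, dtype=np.int32)
    # The new user is row 0 of the one-row feature matrix, so its id is 0 here.
    return model.predict(0, item_ids, user_features=user_row)


new_user_row = build_user_row(["horror", "sci_fi"])
```

Note that the user ID passed to `predict` indexes rows of the supplied feature matrix, so it is 0 rather than 11 in this sketch.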

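Note that it is usually useful to have features that correspond to a single user only, for example an "is user 0" feature, an "is user 1" feature, and so on. For new users, such a feature is useless, because training contains no information the model could use to learn anything about it.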
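One way to picture this layout is a feature matrix with a block of per-user indicator columns followed by the shared metadata columns. The sketch below uses small dense NumPy arrays for readability and hypothetical metadata columns; LightFM itself expects these matrices in CSR form (for example via `scipy.sparse.csr_matrix`).

```python
import numpy as np

n_train_users = 3

# Hypothetical metadata columns: horror, romantic_comedy, sci_fi.
train_metadata = np.array([[1, 0, 0],
                           [0, 1, 0],
                           [1, 0, 1]], dtype=np.float32)

# Training layout: one "is user k" indicator per known user, then shared metadata.
train_user_features = np.hstack(
    [np.eye(n_train_users, dtype=np.float32), train_metadata]
)

# A new user has no indicator column of their own, so that block is all zeros;
# only the metadata columns carry information about them.
new_metadata = np.array([[1, 0, 1]], dtype=np.float32)
new_user_features = np.hstack(
    [np.zeros((1, n_train_users), dtype=np.float32), new_metadata]
)
```

The new user's row has the same number of columns as the training matrix, which is what keeps it compatible with the parameters learned during training.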

Regarding "python - Lightfm : handling user and item cold-start", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46924119/
