
nlp - CBOW vs. skip-gram: why invert context and target words?

Reposted. Author: 行者123. Updated: 2023-12-03 07:56:18

On this page, it is said that:

[...] skip-gram inverts contexts and targets, and tries to predict each context word from its target word [...]



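However, looking at the training dataset it produces, the contents of the X and Y pairs seem to be interchangeable, since both of these (X, Y) pairs appear: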

(quick, brown), (brown, quick)



So why distinguish between context and target at all, if it is the same thing in the end?

Also, while doing Udacity's Deep Learning course exercise on word2vec, I wondered why they seem to make such a big distinction between the two approaches in this problem:

An alternative to skip-gram is another Word2Vec model called CBOW (Continuous Bag of Words). In the CBOW model, instead of predicting a context word from a word vector, you predict a word from the sum of all the word vectors in its context. Implement and evaluate a CBOW model trained on the text8 dataset.



Won't this yield the same results?

Best answer

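Here is my oversimplified and rather naive understanding of the difference:
As we know, CBOW learns to predict a word from its context. In other words, it maximizes the probability of the target word given the context. This turns out to be a problem for rare words. For example, given the context "yesterday was a really [...] day", a CBOW model will tell you that the most probable word is "beautiful" or "nice". A word like "delightful" will get much less of the model's attention, because the model is designed to predict the most probable word. The rare word gets smoothed over by the many examples that contain more frequent words.
The skip-gram model, on the other hand, is designed to predict the context. Given the word "delightful", it must understand it and tell us that the context is very likely "yesterday was really [...] day", or some other relevant context. With skip-gram, the word "delightful" does not have to compete with the word "beautiful"; instead, each delightful+context pair is treated as a new observation.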
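To make the distinction above more concrete, here is a minimal sketch of how the two models could build their training examples from the same text. The function names `skipgram_pairs` and `cbow_examples`, the four-word sentence, and the window size of 1 are all assumptions chosen for illustration; they are not taken from the Udacity exercise or from any particular word2vec implementation.

```python
def skipgram_pairs(tokens, window=1):
    """Hypothetical helper: one (input_word, output_word) pair per neighbor."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                # Each (center, neighbor) pair is its own training observation.
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=1):
    """Hypothetical helper: one (context_words, target_word) example per position."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        # All neighbors are grouped into a single input for this target.
        context = tuple(tokens[j] for j in range(lo, hi) if j != i)
        examples.append((context, center))
    return examples


tokens = "the quick brown fox".split()  # toy sentence, assumed for illustration
sg = skipgram_pairs(tokens)
cb = cbow_examples(tokens)

print(sg)
print(cb)
```

With a symmetric window, the skip-gram list does contain both (quick, brown) and (brown, quick), which matches the observation in the question. The CBOW examples are different in kind: the whole context is combined into one input and only the center word is predicted, so the two models do not see the same training data.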
UPDATE
Thanks to @0xF for sharing this article

According to Mikolov

Skip-gram: works well with small amount of the training data, represents well even rare words or phrases.

CBOW: several times faster to train than the skip-gram, slightly better accuracy for the frequent words


I found another addition to the subject here:

In the "skip-gram" mode alternative to "CBOW", rather than averaging the context words, each is used as a pairwise training example. That is, in place of one CBOW example such as [predict 'ate' from average('The', 'cat', 'the', 'mouse')], the network is presented with four skip-gram examples [predict 'ate' from 'The'], [predict 'ate' from 'cat'], [predict 'ate' from 'the'], [predict 'ate' from 'mouse']. (The same random window-reduction occurs, so half the time that would just be two examples, of the nearest words.)
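The quoted passage can be sketched in a few lines. This is only an illustration under stated assumptions: the 2-dimensional vectors are made-up values, the `average` helper is hypothetical, and the random window reduction mentioned in the quote is left out.

```python
# Toy 2-dimensional word vectors; the values are invented for illustration.
vectors = {
    "The": [1.0, 0.0],
    "cat": [0.0, 1.0],
    "the": [1.0, 2.0],
    "mouse": [2.0, 1.0],
}
context = ["The", "cat", "the", "mouse"]
target = "ate"


def average(vecs):
    """Hypothetical helper: component-wise mean of equally sized vectors."""
    n = len(vecs)
    return [sum(component) / n for component in zip(*vecs)]


# CBOW: a single training example whose input is the averaged context vector.
cbow_example = (average([vectors[w] for w in context]), target)

# Skip-gram, as the quoted passage describes it: one example per context word.
skipgram_examples = [(vectors[w], target) for w in context]

print(cbow_example)
print(len(skipgram_examples))
```

Here the one CBOW example carries a blended input vector, while the four skip-gram examples each keep a single word's vector intact, so every word pairing produces its own update.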

Regarding "nlp - CBOW vs. skip-gram: why invert context and target words?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38287772/
