Distributed Representations of Words and Phrases and their Compositionality

Popularity: 36 Published: 2023-12-07 01:17:49

Abstract

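The paper first shows that the distributed vector representations learned by the continuous Skip-gram model capture both syntactic and semantic relationships. Subsampling frequent words speeds up training and also yields more regular word representations. Negative sampling is introduced as a simpler alternative to hierarchical softmax.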
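As a rough illustration of the subsampling step, the paper discards each occurrence of a word w with probability 1 - sqrt(t / f(w)), where f(w) is the word's relative frequency and t is a threshold around 1e-5. For negative sampling it draws noise words from the unigram distribution raised to the 3/4 power. The sketch below follows those two formulas; the function names, the seeded random generator, and the toy corpus in the usage note are assumptions made for this example, not code from the paper.

```python
import math
import random
from collections import Counter


def discard_probability(freq, t=1e-5):
    """Probability of dropping one occurrence of a word.

    freq is the word's relative frequency in the corpus, t the threshold.
    Words with freq <= t are never dropped (the value is clipped at 0).
    """
    return max(0.0, 1.0 - math.sqrt(t / freq))


def subsample(tokens, t=1e-5, seed=0):
    """Return a copy of tokens with frequent words randomly thinned out."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    counts = Counter(tokens)
    total = len(tokens)
    kept = []
    for tok in tokens:
        if rng.random() >= discard_probability(counts[tok] / total, t):
            kept.append(tok)
    return kept


def noise_distribution(counts, power=0.75):
    """Unigram distribution raised to `power`, renormalised to sum to 1."""
    weights = {w: c ** power for w, c in counts.items()}
    norm = sum(weights.values())
    return {w: v / norm for w, v in weights.items()}
```

For instance, calling subsample(["the"] * 9000 + ["zebra"], t=1e-3) keeps "zebra" and drops most occurrences of "the"; the threshold is raised here only because the toy corpus is tiny.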

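Word representations are insensitive to word order and cannot represent idiomatic phrases. The paper's example is "Air Canada": its meaning is hard to obtain by combining the meanings of "Canada" and "Air". To address this, the authors propose a simple method for finding phrases in text and show that it is possible to learn good vector representations for millions of phrases.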
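The phrase-finding method scores each bigram as score(wi, wj) = (count(wi wj) - δ) / (count(wi) × count(wj)) and treats bigrams whose score exceeds a threshold as single tokens; δ is a discounting coefficient that keeps very rare word pairs from forming phrases. The following is a minimal sketch of one such pass. The helper names, the underscore used to join words, and the toy sentence in the usage note are assumptions for illustration only.

```python
from collections import Counter


def phrase_scores(tokens, delta=1.0):
    """Score every adjacent word pair: (count(ab) - delta) / (count(a) * count(b))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        (a, b): (n - delta) / (unigrams[a] * unigrams[b])
        for (a, b), n in bigrams.items()
    }


def merge_phrases(tokens, threshold, delta=1.0):
    """Single left-to-right pass that joins bigrams scoring above threshold."""
    scores = phrase_scores(tokens, delta)
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and scores[pair] > threshold:
            merged.append(pair[0] + "_" + pair[1])  # joining with "_" is an assumption
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged
```

On the toy input "air canada flies air canada lands the air is the", a threshold of 0.1 merges both occurrences of "air canada" and leaves the standalone "air" untouched. The paper runs several passes over the data with decreasing thresholds so that longer phrases can form; this sketch shows a single pass.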
