- Patent title: Computing numeric representations of words in a high-dimensional space
- Application number: US14715421
- Filing date: 2015-05-18
- Publication number: US09740680B1
- Publication date: 2017-08-22
- Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
- Applicant: Google Inc.
- Applicant address: US CA Mountain View
- Assignee: Google Inc.
- Current assignee: Google Inc.
- Current assignee address: US CA Mountain View
- Agent: Fish & Richardson P.C.
- Main classification: G10L15/00
- IPC classification: G10L15/00; G06F17/27; G10L15/06
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numeric representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
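The steps in the abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract does not say what the classifier predicts, so the code assumes a skip-gram-style objective in which a softmax classifier predicts a neighboring word from the embedding of the current word. The function name `train_embeddings` and the parameters `dim`, `window`, `epochs`, `lr`, and `seed` are hypothetical, and the toy corpus is invented for illustration.

```python
import math
import random


def train_embeddings(sequences, dim=8, window=1, epochs=50, lr=0.1, seed=0):
    """Jointly train an embedding table and a softmax classifier.

    Assumption (not stated in the abstract): the classifier predicts a
    neighboring word from the embedding of the current word.
    """
    rng = random.Random(seed)
    vocab = sorted({w for seq in sequences for w in seq})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)

    # Embedding function parameters: one dim-sized vector per vocabulary word.
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    # Classifier parameters: one weight vector per candidate output word.
    clf = [[0.0] * dim for _ in range(size)]

    # Build (current word, neighboring word) pairs from the word sequences.
    pairs = []
    for seq in sequences:
        ids = [index[w] for w in seq]
        for pos, center in enumerate(ids):
            start = max(0, pos - window)
            stop = min(len(ids), pos + window + 1)
            pairs.extend((center, ids[j]) for j in range(start, stop) if j != pos)

    for _ in range(epochs):
        rng.shuffle(pairs)
        for center, target in pairs:
            h = emb[center]  # embedding of the current word (updated in place)
            scores = [sum(w * x for w, x in zip(clf[k], h)) for k in range(size)]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            # Cross-entropy gradient w.r.t. the scores: softmax minus one-hot.
            grad = [e / total for e in exps]
            grad[target] -= 1.0
            # Gradient w.r.t. the embedding, computed before clf is changed.
            grad_h = [sum(grad[k] * clf[k][d] for k in range(size)) for d in range(dim)]
            for k in range(size):
                for d in range(dim):
                    clf[k][d] -= lr * grad[k] * h[d]
            for d in range(dim):
                h[d] -= lr * grad_h[d]

    # Associate each vocabulary word with its trained numeric representation.
    return {w: list(emb[index[w]]) for w in vocab}


# Toy training data: sequences of words.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vectors = train_embeddings(corpus, dim=4)
```

After training, `vectors` maps every vocabulary word to a list of `dim` floats. The classifier weights are discarded, which mirrors the abstract's focus on the trained embedding function parameters.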