Granted invention patent
US07356462B2 Automatic clustering of tokens from a corpus for grammar acquisition
Status: In force
- Title: Automatic clustering of tokens from a corpus for grammar acquisition
- Application number: US10662730
- Filing date: 2003-09-15
- Publication number: US07356462B2
- Publication date: 2008-04-08
- Inventors: Srinivas Bangalore, Giuseppe Riccardi
- Applicants: Srinivas Bangalore, Giuseppe Riccardi
- Applicant address: US NY New York
- Assignee: AT&T Corp.
- Current assignee: AT&T Corp.
- Current assignee address: US NY New York
- Main classification: G06F17/27
- IPC classification: G06F17/27
Abstract:
A method of grammar learning from a corpus comprises, for the other non-context words, generating frequency vectors for each non-context token in a corpus based upon counted occurrences of a predetermined relationship of the non-context tokens to identified context tokens. Clusters are grown from the frequency vectors according to a lexical correlation among the non-context tokens.
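The two steps named in the abstract (frequency vectors for non-context tokens, then clusters grown from those vectors) can be illustrated with a minimal Python sketch. This is not the patented method itself. The abstract does not specify the "predetermined relationship" or the correlation measure, so the sketch assumes the relationship is immediate left or right adjacency, uses cosine similarity as a stand-in for the lexical correlation, and grows clusters greedily against a hypothetical `threshold` parameter. The function names and the toy corpus are likewise illustrative.

```python
from math import sqrt


def frequency_vectors(sentences, context_tokens):
    """For each non-context token, count how often each context token
    appears immediately to its left or right (assumed relationship)."""
    context = list(context_tokens)
    context_set = set(context)
    # Two dimensions per context token: "context on the left" and "on the right".
    index = {}
    for i, c in enumerate(context):
        index[("L", c)] = 2 * i
        index[("R", c)] = 2 * i + 1

    vectors = {}
    for sent in sentences:
        for pos, tok in enumerate(sent):
            if tok in context_set:
                continue
            vec = vectors.setdefault(tok, [0] * (2 * len(context)))
            if pos > 0 and sent[pos - 1] in context_set:
                vec[index[("L", sent[pos - 1])]] += 1
            if pos + 1 < len(sent) and sent[pos + 1] in context_set:
                vec[index[("R", sent[pos + 1])]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity, used here as an assumed correlation measure."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def grow_clusters(vectors, threshold=0.9):
    """Greedy single-link growth: a token joins the first cluster that
    contains a sufficiently similar member, else it starts a new cluster."""
    clusters = []
    for tok in sorted(vectors):
        for cluster in clusters:
            if any(cosine(vectors[tok], vectors[m]) >= threshold for m in cluster):
                cluster.append(tok)
                break
        else:
            clusters.append([tok])
    return clusters


# Toy corpus (hypothetical); "to", "from", "with" are the chosen context tokens.
sentences = [
    ["fly", "to", "boston", "from", "denver"],
    ["fly", "to", "denver", "from", "boston"],
    ["pay", "with", "visa"],
    ["pay", "with", "cash"],
]
vectors = frequency_vectors(sentences, ["to", "from", "with"])
clusters = grow_clusters(vectors)
print(clusters)
```

In this toy corpus, "boston" and "denver" occur in the same positions relative to "to" and "from", so their vectors coincide and they fall into one cluster; "visa" and "cash" are grouped in the same way through "with".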