- Patent title: Unsupervised hypernym induction machine learning
- Application number: US16666800
- Filing date: 2019-10-29
- Publication number: US11507828B2
- Publication date: 2022-11-22
- Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
- Applicant: International Business Machines Corporation
- Applicant address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current assignee: International Business Machines Corporation
- Current assignee address: US NY Armonk
- Agency: Scully, Scott, Murphy & Presser, P.C.
- Agent: Anthony R. Curro
- Main classification: G06F40/205
- IPC classification: G06F40/205; G06N3/08; G06K9/62; G06N5/04
Abstract:
Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term—sub-term glossary can be generated from the corpus, the preliminary super-term—sub-term glossary containing one or more super-term—sub-term pairs. A super-term—sub-term pair can be filtered from the preliminary super-term—sub-term glossary, responsive to detecting that the super-term—sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term—sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term—sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.
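The data-preparation steps named in the abstract (parse candidate pairs, build a preliminary glossary, filter it, combine the two sources) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the patented method: the single "X such as Y and Z" lexical pattern, the rule that a one-word term contained in a longer term forms a super-term and sub-term pair, the head-word filter, the toy `SENTENCES` and `TERMS` inputs, and all function names are hypothetical. The neural network training step is omitted.

```python
import re

# Hypothetical toy inputs; the abstract gives no example data.
SENTENCES = ["Fruits such as apples and pears are sold here."]
TERMS = ["car", "door", "apple", "sports car", "car door", "green apple"]


def parse_candidate_pairs(sentences):
    """Preliminary (hyponym, hypernym) candidates from one assumed lexical pattern."""
    pattern = re.compile(r"(\w+) such as (\w+)(?: and (\w+))?")
    pairs = set()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            hypernym = match.group(1).lower()
            for hyponym in match.groups()[1:]:
                if hyponym:  # the second hyponym is optional
                    pairs.add((hyponym.lower(), hypernym))
    return pairs


def build_glossary(terms):
    """Preliminary glossary: (super_term, sub_term) when a one-word term occurs inside a longer term."""
    glossary = set()
    for super_term in terms:
        for sub_term in terms:
            if super_term != sub_term and super_term in sub_term.split():
                glossary.add((super_term, sub_term))
    return glossary


def filter_glossary(glossary):
    """Assumed filter: keep a pair only if the super-term is the last word (head) of the sub-term."""
    return {(sup, sub) for sup, sub in glossary if sub.split()[-1] == sup}


def build_training_pairs(sentences, terms):
    """Combine the candidate list and the filtered glossary into one sorted list of (hyponym, hypernym)."""
    candidates = parse_candidate_pairs(sentences)
    final_glossary = filter_glossary(build_glossary(terms))
    from_glossary = {(sub, sup) for sup, sub in final_glossary}
    return sorted(candidates | from_glossary)


if __name__ == "__main__":
    for hyponym, hypernym in build_training_pairs(SENTENCES, TERMS):
        print(f"{hyponym} -> {hypernym}")
```

In this sketch the pair ("car", "car door") enters the preliminary glossary but is filtered out, because a car door is not a kind of car, while ("car", "sports car") survives. The combined list would then serve as the training data set for the neural network described in the abstract.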
Published/granted documents
- US20210125058A1 UNSUPERVISED HYPERNYM INDUCTION MACHINE LEARNING, publication/grant date: 2021-04-29