-
Publication No.: US20220100772A1
Publication date: 2022-03-31
Application No.: US17039887
Application date: 2020-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Srikanth Doss Kadarundalagi Raghura , Yogarshi Paritosh Vyas , Miguel Ballesteros Martinez , Yahor Pushkin , Sunil Mallya Kasaragod , Yaser Al-Onaizan , Sameer Karnik , Abhinav Goyal , Graham Vintcent Horwood , Kapil Singh Badesara
IPC: G06F16/25 , G06F16/23 , G06F21/62 , G06F40/289
Abstract: Methods, systems, and computer-readable media for context-sensitive linking of entities to private databases are disclosed. An entity linking service stores a plurality of representations of entities. Individual ones of the entities correspond to individual ones of a plurality of records in one or more private databases. The entity linking service determines a mention of an entity in a document. The entity linking service selects, from the plurality of records in the one or more private databases, a record corresponding to the entity. The record is selected based at least in part on the plurality of representations of the entities and based at least in part on a context of the mention of the entity in the document. The entity linking service generates output comprising a reference to the selected record in the one or more private databases.
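Illustrative sketch (not taken from the patent): the abstract describes selecting a private-database record for a mention using both stored entity representations and the mention's context. The Python below is one minimal way to picture that flow, under loud assumptions: the records, the bag-of-words representations, the stop-word list, the exact-name mention detection, and the cosine scoring are all hypothetical stand-ins chosen for brevity, not the method claimed in the filing.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical private database: record id -> (entity name, free-text description).
RECORDS = {
    "emp-001": ("Jordan Lee", "software engineer on the payments platform team"),
    "emp-002": ("Jordan Lee", "clinical nurse in the cardiology ward"),
}

# Assumed stop-word list so that function words do not drive the match.
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Stored representations of entities: here, a bag of words per record (an assumption).
REPRESENTATIONS = {rid: Counter(tokenize(desc)) for rid, (_, desc) in RECORDS.items()}


def link_mentions(document):
    """Return one {mention, record_id} reference per known name found in the document."""
    context = Counter(tokenize(document))  # context of the mention: the whole document
    links = []
    for name in sorted({name for name, _ in RECORDS.values()}):
        if name.lower() not in document.lower():
            continue  # no mention of this entity name
        candidates = [rid for rid, (n, _) in RECORDS.items() if n == name]
        # Pick the candidate whose stored representation best matches the context.
        best = max(candidates, key=lambda rid: cosine(REPRESENTATIONS[rid], context))
        links.append({"mention": name, "record_id": best})
    return links
```

In this sketch, two records share the name "Jordan Lee", so the surrounding words decide which record is referenced; a production system would presumably use learned representations rather than word counts.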
-
Publication No.: US11847406B1
Publication date: 2023-12-19
Application No.: US17217807
Application date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Sunil Mallya Kasaragod , Yahor Pushkin , Saman Zarandioon , Graham Vintcent Horwood , Miguel Ballesteros Martinez , Yogarshi Paritosh Vyas , Yinxiao Zhang , Diego Marcheggiani , Yaser Al-Onaizan , Xuan Zhu , Liutong Zhou , Yusheng Xie , Aruni Roy Chowdhury , Bo Pang
IPC: G06F17/00 , G06F40/143 , G06F40/169 , G06N20/00 , G06F40/154 , G06F40/103 , G06F40/284
CPC classification number: G06F40/143 , G06F40/103 , G06F40/154 , G06F40/169 , G06F40/284 , G06N20/00
Abstract: Techniques for performing natural language processing (NLP) on semi-structured data are described. An exemplary method includes receiving a semi-structured document to perform NLP on using a trained NLP model; converting the semi-structured document into a secondary format, wherein the secondary format includes spatial information for tokens of the semi-structured document; flattening the converted, secondary formatted semi-structured document into a Unicode Transformation Format text file; performing NLP on the Unicode Transformation Format text file using the trained NLP model; and providing a result of the NLP to a requester.
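Illustrative sketch (not taken from the patent): the abstract describes converting a semi-structured document into a secondary format that carries spatial information for tokens, then flattening it into a Unicode Transformation Format text file for an NLP model. The Python below pictures only the flattening step, under stated assumptions: the `Token` class, its `page`/`x`/`y` fields, the `line_tolerance` parameter, and the reading-order heuristic are hypothetical, and UTF-8 is assumed as the target encoding.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical secondary-format token: text plus its position on a page."""
    text: str
    page: int
    x: float
    y: float


def flatten(tokens, line_tolerance=2.0):
    """Flatten positioned tokens into plain text in approximate reading order.

    Returns the text and a list of (start, end, token) character spans, so a
    result found in the flat text can be traced back to its spatial position.
    """
    # Group tokens into lines: same page and a y coordinate close to the line's first token.
    lines = []
    for tok in sorted(tokens, key=lambda t: (t.page, t.y, t.x)):
        last = lines[-1] if lines else None
        if last and last[0].page == tok.page and abs(last[0].y - tok.y) <= line_tolerance:
            last.append(tok)
        else:
            lines.append([tok])

    parts, spans, pos = [], [], 0
    for line in lines:
        # Within a line, read left to right.
        for i, tok in enumerate(sorted(line, key=lambda t: t.x)):
            if i:
                parts.append(" ")
                pos += 1
            spans.append((pos, pos + len(tok.text), tok))
            parts.append(tok.text)
            pos += len(tok.text)
        parts.append("\n")
        pos += 1
    return "".join(parts), spans
```

Calling `flatten(tokens)[0].encode("utf-8")` would give the bytes to write to the text file; the returned spans are an added convenience for mapping model output back to page coordinates, not something the abstract specifies.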
-