-
Publication No.: US20220100963A1
Publication Date: 2022-03-31
Application No.: US17039919
Filing Date: 2020-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Rishita Rajal Anubhai, Yahor Pushkin, Graham Vintcent Horwood, Yinxiao Zhang, Ravindra Manjunatha, Jie Ma, Alessandra Brusadin, Jonathan Steuck, Shuai Wang, Sameer Karnik, Miguel Ballesteros Martinez, Sunil Mallya Kasaragod, Yaser Al-Onaizan
IPC: G06F40/30, G06F40/295, G06N20/00
Abstract: Methods, systems, and computer-readable media for event extraction from documents with co-reference are disclosed. An event extraction service identifies one or more trigger groups in a document comprising text. An individual one of the trigger groups comprises one or more textual references to an occurrence of an event. The one or more trigger groups are associated with one or more semantic roles for entities. The event extraction service identifies one or more entity groups in the document. An individual one of the entity groups comprises one or more textual references to a real-world object. The event extraction service assigns one or more of the entity groups to one or more of the semantic roles. The event extraction service generates an output indicating the one or more trigger groups and one or more entity groups assigned to the semantic roles.
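Illustrative sketch (not part of the patent record): the abstract describes trigger groups, entity groups, and semantic roles, but not a concrete data model. The Python below is one possible way to represent such an output, assuming a made-up example sentence, a hypothetical event type "ACQUISITION", and hypothetical role names "ACQUIRER" and "ACQUIREE". All class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """One textual reference, located by character offsets in the document."""
    text: str
    start: int
    end: int


@dataclass
class EntityGroup:
    """Co-referring mentions of a single real-world object."""
    mentions: list


@dataclass
class TriggerGroup:
    """Co-referring mentions of one event, plus its semantic roles."""
    event_type: str
    mentions: list
    roles: dict = field(default_factory=dict)  # role name -> list of EntityGroup


def mention(doc, phrase, occurrence=0):
    """Locate the n-th occurrence of phrase in doc (hypothetical helper)."""
    start = -1
    for _ in range(occurrence + 1):
        start = doc.find(phrase, start + 1)
        if start == -1:
            raise ValueError(f"{phrase!r} not found")
    return Mention(phrase, start, start + len(phrase))


def to_output(trigger_groups):
    """Render trigger groups and their role assignments as plain dicts."""
    return [
        {
            "event_type": tg.event_type,
            "triggers": [m.text for m in tg.mentions],
            "arguments": {
                role: [[m.text for m in eg.mentions] for eg in groups]
                for role, groups in tg.roles.items()
            },
        }
        for tg in trigger_groups
    ]


# Made-up example document; the groupings are written by hand, not predicted.
doc = "Acme Corp acquired Widget Inc. The acquisition made Acme the largest supplier."

acme = EntityGroup([mention(doc, "Acme Corp"), mention(doc, "Acme", occurrence=1)])
widget = EntityGroup([mention(doc, "Widget Inc")])

event = TriggerGroup(
    event_type="ACQUISITION",
    mentions=[mention(doc, "acquired"), mention(doc, "acquisition")],
)
# Assign entity groups to the semantic roles associated with the trigger group.
event.roles["ACQUIRER"] = [acme]
event.roles["ACQUIREE"] = [widget]

result = to_output([event])
print(result)
```

In this sketch, "acquired" and "the acquisition" fall into one trigger group because both refer to the same event, and "Acme Corp" and "Acme" fall into one entity group because both refer to the same company. How the service actually detects these groups is not stated in the abstract.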
-
Publication No.: US11847406B1
Publication Date: 2023-12-19
Application No.: US17217807
Filing Date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Sunil Mallya Kasaragod, Yahor Pushkin, Saman Zarandioon, Graham Vintcent Horwood, Miguel Ballesteros Martinez, Yogarshi Paritosh Vyas, Yinxiao Zhang, Diego Marcheggiani, Yaser Al-Onaizan, Xuan Zhu, Liutong Zhou, Yusheng Xie, Aruni Roy Chowdhury, Bo Pang
IPC: G06F17/00, G06F40/143, G06F40/169, G06N20/00, G06F40/154, G06F40/103, G06F40/284
CPC classification number: G06F40/143, G06F40/103, G06F40/154, G06F40/169, G06F40/284, G06N20/00
Abstract: Techniques for performing natural language processing (NLP) on semi-structured data are described. An exemplary method includes receiving a semi-structured document to perform NLP on using a trained NLP model; converting the semi-structured document into a secondary format, wherein the secondary format includes spatial information for tokens of the semi-structured document; flattening the converted, secondary formatted semi-structured document into a Unicode Transformation Format text file; performing NLP on the Unicode Transformation Format text file using the trained NLP model; and providing a result of the NLP to a requester.
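Illustrative sketch (not part of the patent record): the abstract names the steps (convert to a format with spatial information, flatten to a Unicode Transformation Format text file, run NLP) but does not say how flattening is done. The Python below assumes one simple strategy: tokens carry x/y page coordinates, tokens with similar y values form a row, and rows are emitted top to bottom as UTF-8 text. The Token class, the row_tolerance parameter, and placeholder_model are all hypothetical; placeholder_model is a regex stand-in, not a trained NLP model.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    """A token with assumed spatial information (page coordinates)."""
    text: str
    x: float
    y: float


def flatten(tokens, row_tolerance=5.0):
    """Flatten spatial tokens into UTF-8 bytes in reading order.

    Tokens whose y coordinate is within row_tolerance of the first token
    of the current row are treated as belonging to that row (an assumption).
    """
    rows = []
    for tok in sorted(tokens, key=lambda t: (t.y, t.x)):
        if rows and abs(tok.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(tok)
        else:
            rows.append([tok])
    lines = [
        " ".join(t.text for t in sorted(row, key=lambda t: t.x))
        for row in rows
    ]
    return "\n".join(lines).encode("utf-8")


def placeholder_model(utf8_bytes):
    """Stand-in for the trained NLP model: returns runs of digits."""
    return re.findall(r"\d+", utf8_bytes.decode("utf-8"))


# Made-up tokens from a small form, deliberately out of reading order.
tokens = [
    Token("€100", 60, 41),
    Token("42", 90, 10),
    Token("Total:", 10, 40),
    Token("Invoice", 10, 10),
    Token("No.", 60, 11),
]

flat_bytes = flatten(tokens)
print(flat_bytes.decode("utf-8"))
print(placeholder_model(flat_bytes))
```

The point of the sketch is the ordering of the steps: spatial information is used to recover a reading order before the text reaches the model, so the model itself only ever sees a flat text file.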
-
Publication No.: US12086548B2
Publication Date: 2024-09-10
Application No.: US17039919
Filing Date: 2020-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Rishita Rajal Anubhai, Yahor Pushkin, Graham Vintcent Horwood, Yinxiao Zhang, Ravindra Manjunatha, Jie Ma, Alessandra Brusadin, Jonathan Steuck, Shuai Wang, Sameer Karnik, Miguel Ballesteros Martinez, Sunil Mallya Kasaragod, Yaser Al-Onaizan
IPC: G06F40/30, G06F40/295, G06N20/00
CPC classification number: G06F40/30, G06F40/295, G06N20/00
Abstract: Methods, systems, and computer-readable media for event extraction from documents with co-reference are disclosed. An event extraction service identifies one or more trigger groups in a document comprising text. An individual one of the trigger groups comprises one or more textual references to an occurrence of an event. The one or more trigger groups are associated with one or more semantic roles for entities. The event extraction service identifies one or more entity groups in the document. An individual one of the entity groups comprises one or more textual references to a real-world object. The event extraction service assigns one or more of the entity groups to one or more of the semantic roles. The event extraction service generates an output indicating the one or more trigger groups and one or more entity groups assigned to the semantic roles.
-