Publication No.: US20230376687A1
Publication Date: 2023-11-23
Application No.: US17746779
Filing Date: 2022-05-17
Applicant: ADOBE INC.
Inventor: Vlad Ion Morariu, Tong Sun, Nikolaos Barmpalios, Zilong Wang, Jiuxiang Gu, Ani Nenkova Nenkova, Christopher Tensmeyer
IPC: G06F40/279, G06N5/02
CPC classification number: G06F40/279, G06N5/022
Abstract: Embodiments are provided for facilitating multimodal extraction across multiple granularities. In one implementation, a set of features of a document is obtained for a plurality of granularities of the document. Via a machine learning model, the set of features is modified to generate a set of modified features, using a set of self-attention values to determine relationships within a first type of feature and a set of cross-attention values to determine relationships between the first type of feature and a second type of feature. Thereafter, the set of modified features is provided to a second machine learning model to perform a classification task.
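
The flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes plain scaled dot-product attention with no learned projections, treats textual features as the "first type" and visual features as the "second type", combines the results with a simple residual sum, and stands in a linear argmax classifier for the second model. The names `attention`, `modify_features`, and `classify`, and all numeric values, are hypothetical. Rows of each feature matrix may come from any granularity (for example, words or regions).

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention (assumed form; the source does not specify it).
    scale = math.sqrt(len(queries[0]))
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        for q in queries
    ]
    dim = len(values[0])
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights

def modify_features(first, second):
    # Self-attention: relationships within the first feature type.
    self_out, self_w = attention(first, first, first)
    # Cross-attention: relationships between the first and second feature types.
    cross_out, cross_w = attention(first, second, second)
    # Residual sum is an illustrative assumption, not taken from the source.
    modified = [
        [f + s + c for f, s, c in zip(f_row, s_row, c_row)]
        for f_row, s_row, c_row in zip(first, self_out, cross_out)
    ]
    return modified, self_w, cross_w

def classify(features, weight_rows, bias):
    # Hypothetical stand-in for the second model: linear scores, then argmax.
    labels = []
    for row in features:
        scores = [sum(x * w for x, w in zip(row, col)) + b
                  for col, b in zip(weight_rows, bias)]
        labels.append(scores.index(max(scores)))
    return labels

# Toy data: three textual features and two visual features, each of dimension 2.
text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visual_feats = [[0.5, 0.5], [1.0, -1.0]]

modified, self_w, cross_w = modify_features(text_feats, visual_feats)
labels = classify(modified, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In this sketch each row of `self_w` weights one textual feature against all textual features, and each row of `cross_w` weights it against all visual features, so the modified features carry both kinds of relationship into the classifier.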