RECOGNITION METHOD AND ELECTRONIC DEVICE
    Invention publication

    Publication No.: US20240013001A1

    Publication date: 2024-01-11

    Application No.: US18349183

    Filing date: 2023-07-10

    Abstract: A recognition method includes the following steps. A text is analyzed by a language recognition network to generate an entity feature, a relation feature and an overall feature. An input image is analyzed by an object detection network to generate candidate regions. Node features, aggregated edge features and compound features are generated by an enhanced cross-modal graph attention network according to the entity feature, the relation feature, the candidate regions and the overall feature. The entity feature and the relation feature are matched to the node features and the aggregated edge features to generate first scores. The overall feature is matched to the compound features to generate second scores. Final scores corresponding to the candidate regions are generated according to the first scores and the second scores.
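
    The final step of this abstract, combining first and second scores into one final score per candidate region, can be illustrated with a minimal sketch. The abstract does not say how the two score sets are combined; the weighted sum, the `weight` parameter and the function names `final_scores` and `best_region` are assumptions made for illustration only.

```python
def final_scores(first_scores, second_scores, weight=0.5):
    """Combine per-region scores with a weighted sum.

    ASSUMPTION: the weighted sum and `weight` are illustrative; the abstract
    only states that final scores depend on both score sets.
    """
    if len(first_scores) != len(second_scores):
        raise ValueError("expected one first and one second score per region")
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(first_scores, second_scores)]


def best_region(first_scores, second_scores, weight=0.5):
    """Return the index of the candidate region with the highest final score."""
    scores = final_scores(first_scores, second_scores, weight)
    return max(range(len(scores)), key=scores.__getitem__)
```

    Under these assumptions, `best_region` selects the candidate region that the text most likely refers to; a learned fusion could replace the fixed weight.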

    Pseudo labelling for key-value extraction from documents

    Publication No.: US11823478B2

    Publication date: 2023-11-21

    Application No.: US17714806

    Filing date: 2022-04-06

    IPC classification: G06V30/414 G06V30/19

    Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
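
    The iterative procedure in this abstract (label with a model, label by graph matching, cross-reference, train, repeat until labels stabilize) can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the agreement rule in `cross_reference`, the callables `model_label_fn`, `graph_label_fn` and `train_fn`, and the `threshold` and `max_rounds` defaults are hypothetical and are not taken from the patent.

```python
def cross_reference(model_labels, graph_labels):
    """Keep a node's label only where both labelings agree, else None.

    ASSUMPTION: agreement is one possible cross-referencing rule.
    """
    return {node: label if graph_labels.get(node) == label else None
            for node, label in model_labels.items()}


def label_change(previous, current):
    """Fraction of nodes whose label differs between two rounds."""
    if not current:
        return 0.0
    changed = sum(1 for node in current if previous.get(node) != current[node])
    return changed / len(current)


def train_until_stable(nodes, model_label_fn, graph_label_fn, train_fn,
                       threshold=0.05, max_rounds=10):
    """Retrain on cross-referenced pseudo-labels until they stop changing."""
    labels = {}
    for round_number in range(1, max_rounds + 1):
        updated = cross_reference(model_label_fn(nodes), graph_label_fn(nodes))
        train_fn(updated)  # hypothetical training step on the updated labels
        if label_change(labels, updated) < threshold:
            return updated, round_number
        labels = updated
    return labels, max_rounds
```

    In this sketch a node on which the two labelings disagree is left unlabeled (`None`); a real system might instead keep the more confident label.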

    Character-based representation learning for table data extraction using artificial intelligence techniques

    Publication No.: US11972625B2

    Publication date: 2024-04-30

    Application No.: US17740695

    Filing date: 2022-05-10

    IPC classification: G06V30/41 G06V30/19

    Abstract: Methods, apparatus, and processor-readable storage media for character-based representation learning for table data extraction using artificial intelligence techniques are provided herein. An example computer-implemented method includes identifying, from unstructured documents comprising tabular data, items of text and corresponding document position information using artificial intelligence-based text extraction techniques; generating an intermediate output by implementing character embedding with respect to the unstructured documents using an artificial intelligence-based encoder; determining structure-related information for the unstructured documents using one or more artificial intelligence-based graph-related techniques by inferring columns from the tabular data; generating a character-based representation of the unstructured documents using an artificial intelligence-based decoder by converting the inferred columns into one or more line items; classifying portions of the character-based representation using artificial intelligence-based statistical modeling techniques; and performing one or more automated actions based on the classifying.
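
    The column-inference and line-item steps named in this abstract can be illustrated with a deliberately simple stand-in. This is not the patented encoder/decoder approach: it swaps in a plain heuristic in which text items are graph nodes, an edge joins items whose horizontal extents overlap, connected components are treated as columns, and items sharing the same `y` value form one line item. The tuple layout `(text, x_left, x_right, y)` and both function names are assumptions.

```python
from collections import defaultdict


def infer_columns(items):
    """Group items into columns via connected components (union-find).

    items: list of (text, x_left, x_right, y) tuples (assumed layout).
    Returns columns as lists of item indices, ordered left to right.
    """
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            _, left_i, right_i, _ = items[i]
            _, left_j, right_j, _ = items[j]
            if left_i < right_j and left_j < right_i:  # horizontal overlap
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(items)):
        groups[find(i)].append(i)
    return sorted(groups.values(), key=lambda g: min(items[i][1] for i in g))


def to_line_items(items, columns):
    """Convert inferred columns into rows, one row per distinct y value."""
    rows = defaultdict(dict)
    for column_index, column in enumerate(columns):
        for i in column:
            text, _, _, y = items[i]
            rows[y][column_index] = text
    return [[rows[y].get(c, "") for c in range(len(columns))]
            for y in sorted(rows)]
```

    For example, four items laid out as two rows and two columns yield two line items, each with one cell per inferred column.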

    CHARACTER-BASED REPRESENTATION LEARNING FOR TABLE DATA EXTRACTION USING ARTIFICIAL INTELLIGENCE TECHNIQUES

    Publication No.: US20230368556A1

    Publication date: 2023-11-16

    Application No.: US17740695

    Filing date: 2022-05-10

    IPC classification: G06V30/41 G06V30/19

    Abstract: Methods, apparatus, and processor-readable storage media for character-based representation learning for table data extraction using artificial intelligence techniques are provided herein. An example computer-implemented method includes identifying, from unstructured documents comprising tabular data, items of text and corresponding document position information using artificial intelligence-based text extraction techniques; generating an intermediate output by implementing character embedding with respect to the unstructured documents using an artificial intelligence-based encoder; determining structure-related information for the unstructured documents using one or more artificial intelligence-based graph-related techniques by inferring columns from the tabular data; generating a character-based representation of the unstructured documents using an artificial intelligence-based decoder by converting the inferred columns into one or more line items; classifying portions of the character-based representation using artificial intelligence-based statistical modeling techniques; and performing one or more automated actions based on the classifying.

    Pseudo labelling for key-value extraction from documents

    Publication No.: US12106595B2

    Publication date: 2024-10-01

    Application No.: US18379091

    Filing date: 2023-10-11

    IPC classification: G06V30/414 G06V30/19

    Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.

    METHOD FOR RECOGNIZING RECEIPT, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication No.: US20230282016A1

    Publication date: 2023-09-07

    Application No.: US17898678

    Filing date: 2022-08-30

    IPC classification: G06V30/19 G06N3/08 G06T9/00

    Abstract: Provided are a method for recognizing a receipt, an electronic device and a storage medium, which relate to the fields of deep learning and pattern recognition. The method may include: acquiring a target receipt to be recognized; encoding two-dimensional position information of each of multiple text blocks on the target receipt, to obtain multiple encoding results; performing graph convolution on each of the multiple encoding results, to obtain multiple convolution results; and recognizing each of the multiple convolution results based on a first conditional random field model, to obtain a first prediction result at the text-block level of the target receipt, wherein the first conditional random field model and a second conditional random field model are co-trained, so as to obtain a second prediction result at the token level of the target receipt.

    METHOD, DEVICE, COMPUTER EQUIPMENT AND STORAGE MEDIUM FOR IDENTIFYING ILLEGAL COMMODITY

    Publication No.: US20240331425A1

    Publication date: 2024-10-03

    Application No.: US18460680

    Filing date: 2023-09-04

    Applicant: ZHEJIANG LAB

    IPC classification: G06V30/19 G06N5/02

    Abstract: A method, a device, computer equipment and a storage medium for identifying an illegal commodity are provided. The method comprises: first, constructing a multi-modal knowledge graph according to a multi-modal knowledge graph data set, and extracting visual features of all visual modality entities and text features of all text modality entities in the knowledge graph; then obtaining a commodity image and a commodity text according to a database; then generating a commodity visual feature according to the commodity image; then generating a commodity text feature according to the commodity text; next, according to the visual features and text features, as well as the commodity visual feature and the commodity text feature, linking the commodity image and the commodity text to the knowledge graph by using an entity linking method; and finally, obtaining the correlation between the commodity image and the commodity text according to the linked knowledge graph to determine the illegality of the commodity.