Systems and methods for generating and using semantic images in deep learning for classification and data extraction

Invention Grant

US11250255B2 Systems and methods for generating and using semantic images in deep learning for classification and data extraction (in force)

Patent Title: Systems and methods for generating and using semantic images in deep learning for classification and data extraction
Application No.: US17081705

Application Date: 2020-10-27
Publication No.: US11250255B2

Publication Date: 2022-02-15
Inventor: Uwe Ast
Applicant: OPEN TEXT SA ULC
Applicant Address: CA Halifax
Assignee: OPEN TEXT SA ULC
Current Assignee: OPEN TEXT SA ULC
Current Assignee Address: CA Halifax
Agency: Sprinkle IP Law Group
Main IPC: G06K9/00
IPC: G06K9/00 ; G06N5/04 ; G06N3/08 ; G06N20/00 ; G06K9/62 ; G06N3/04 ; G06F40/30 ; G06N5/02

Abstract:

Disclosed is a new document processing solution that combines the powers of machine learning and deep learning and leverages the knowledge of a knowledge base. Textual information in an input image of a document can be converted to semantic information utilizing the knowledge base. A semantic image can then be generated utilizing the semantic information and geometries of the textual information. The semantic information can be coded by semantic type determined utilizing the knowledge base and positioned in the semantic image utilizing the geometries of the textual information. A region-based convolutional neural network (R-CNN) can be trained to extract regions from the semantic image utilizing the coded semantic information and the geometries. The regions can be mapped to the textual information for classification/data extraction. With semantic images, the number of samples and time needed to train the R-CNN for document processing can be significantly reduced.
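
The semantic-image step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary `KNOWLEDGE_BASE`, the color table `TYPE_CODES`, the token format (text plus an `(x0, y0, x1, y1)` bounding box), and the helper names are all assumptions made for the example. The R-CNN itself is not shown; a hand-written rectangle stands in for a region it might return.

```python
import numpy as np

# Hypothetical knowledge base: maps lowercase token text to a semantic type.
KNOWLEDGE_BASE = {
    "invoice": "keyword",
    "total": "keyword",
    "eur": "currency",
}

# Hypothetical coding scheme: one RGB color per semantic type.
TYPE_CODES = {
    "keyword": (255, 0, 0),
    "currency": (0, 255, 0),
    "number": (0, 0, 255),
    "unknown": (128, 128, 128),
}


def semantic_type(text):
    """Look up a token's semantic type, falling back to simple number detection."""
    key = text.lower()
    if key in KNOWLEDGE_BASE:
        return KNOWLEDGE_BASE[key]
    # Treat strings such as "1,250.00" as numbers.
    if key.replace(".", "", 1).replace(",", "").isdigit():
        return "number"
    return "unknown"


def build_semantic_image(tokens, height, width):
    """Paint each token's bounding box with the color code of its semantic type."""
    image = np.full((height, width, 3), 255, dtype=np.uint8)  # white background
    for text, (x0, y0, x1, y1) in tokens:
        image[y0:y1, x0:x1] = TYPE_CODES[semantic_type(text)]
    return image


def tokens_in_region(tokens, region):
    """Map a detected region back to the text tokens it fully contains."""
    rx0, ry0, rx1, ry1 = region
    return [
        text
        for text, (x0, y0, x1, y1) in tokens
        if x0 >= rx0 and y0 >= ry0 and x1 <= rx1 and y1 <= ry1
    ]


# OCR-style tokens: (text, bounding box). Values are made up for illustration.
tokens = [
    ("Invoice", (10, 10, 80, 25)),
    ("Total", (10, 60, 50, 75)),
    ("1,250.00", (60, 60, 120, 75)),
    ("EUR", (125, 60, 150, 75)),
]
semantic_image = build_semantic_image(tokens, height=100, width=200)

# Stand-in for a region a trained detector might return (not computed here).
total_line = tokens_in_region(tokens, (0, 55, 200, 80))
```

In this sketch the image keeps only the semantic type and position of each token and discards the glyphs themselves, which is consistent with the abstract's claim that fewer samples are needed to train the detector.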

Public/Granted literature

US20210073533A1 SYSTEMS AND METHODS FOR GENERATING AND USING SEMANTIC IMAGES IN DEEP LEARNING FOR CLASSIFICATION AND DATA EXTRACTION Public/Granted day: 2021-03-11

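IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)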