Invention Grant
- Patent Title: Systems and methods for generating and using semantic images in deep learning for classification and data extraction
- Application No.: US17081705
- Application Date: 2020-10-27
- Publication No.: US11250255B2
- Publication Date: 2022-02-15
- Inventor: Uwe Ast
- Applicant: OPEN TEXT SA ULC
- Applicant Address: CA Halifax
- Assignee: OPEN TEXT SA ULC
- Current Assignee: OPEN TEXT SA ULC
- Current Assignee Address: CA Halifax
- Agency: Sprinkle IP Law Group
- Main IPC: G06K9/00
- IPC: G06K9/00 ; G06N5/04 ; G06N3/08 ; G06N20/00 ; G06K9/62 ; G06N3/04 ; G06F40/30 ; G06N5/02

Abstract:
Disclosed is a new document processing solution that combines the powers of machine learning and deep learning and leverages the knowledge of a knowledge base. Textual information in an input image of a document can be converted to semantic information utilizing the knowledge base. A semantic image can then be generated utilizing the semantic information and geometries of the textual information. The semantic information can be coded by semantic type determined utilizing the knowledge base and positioned in the semantic image utilizing the geometries of the textual information. A region-based convolutional neural network (R-CNN) can be trained to extract regions from the semantic image utilizing the coded semantic information and the geometries. The regions can be mapped to the textual information for classification/data extraction. With semantic images, the number of samples and time needed to train the R-CNN for document processing can be significantly reduced.
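
The semantic-image step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `KNOWLEDGE_BASE` dictionary, the `TYPE_CODES` mapping, the regular expressions, and the function names are all hypothetical stand-ins, and the input is assumed to be OCR output given as `(text, (x, y, width, height))` pairs. Each word is assigned a semantic type, the type is coded as an integer, and the code is painted into the image at the word's bounding box.

```python
import re
import numpy as np

# Hypothetical semantic types and their integer codes (0 is background).
TYPE_CODES = {"unknown": 1, "keyword": 2, "date": 3, "amount": 4}

# Hypothetical stand-in for a knowledge base lookup.
KNOWLEDGE_BASE = {"invoice": "keyword", "total": "keyword"}


def semantic_type(text):
    """Return an assumed semantic type for one OCR token."""
    token = text.strip().lower()
    if token in KNOWLEDGE_BASE:
        return KNOWLEDGE_BASE[token]
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", token):
        return "date"
    if re.fullmatch(r"\$?\d+(\.\d{2})?", token):
        return "amount"
    return "unknown"


def make_semantic_image(words, height, width):
    """Paint each word's type code into its bounding box.

    words: iterable of (text, (x, y, w, h)) with pixel coordinates.
    Returns a (height, width) uint8 array.
    """
    image = np.zeros((height, width), dtype=np.uint8)
    for text, (x, y, w, h) in words:
        image[y:y + h, x:x + w] = TYPE_CODES[semantic_type(text)]
    return image
```

An image built this way keeps the layout of the page but replaces the literal text with a small set of type codes. A region detector such as an R-CNN could then be trained on these images, and the regions it extracts mapped back to the original words through the same bounding boxes.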