Invention Application
US20110258150A1 SYSTEMS AND METHODS FOR TRAINING DOCUMENT ANALYSIS SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM DOCUMENTS Pending (published)

SYSTEMS AND METHODS FOR TRAINING DOCUMENT ANALYSIS SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM DOCUMENTS
Abstract:
A method of training a document analysis system to extract data from documents is provided. The method includes: automatically analyzing images and text features extracted from a document to associate the document with a corresponding document category; comparing the extracted text features with a set of text features associated with the corresponding category of the document, in which the set of text features includes a set of characters, words, and phrases; if the extracted text features are found to consist only of characters, words, and phrases belonging to the set of text features associated with the corresponding document category, storing the extracted text features as the data contained in the corresponding document; and, if the extracted text features are found to include at least one text feature that does not belong to the set of text features associated with the corresponding document category, submitting the unrecognized text features to a training phase.
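The routing logic described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `CategoryVocabulary`, `AnalysisResult`, `classify`, and `route_features` are hypothetical, the classifier is a simple vocabulary-overlap stand-in that uses text features only (the abstract also mentions images), and leaving the result's stored list empty when anything is unrecognized is an assumption, since the abstract does not say what happens to the recognized features in that case.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CategoryVocabulary:
    """Hypothetical container: a document category and its known text features."""
    name: str
    known_features: set[str]


@dataclass
class AnalysisResult:
    """Hypothetical result: what was stored and what goes to the training phase."""
    category: str
    stored: list[str] = field(default_factory=list)
    to_training: list[str] = field(default_factory=list)


def classify(features: list[str], vocabularies: list[CategoryVocabulary]) -> str:
    # Assumption: pick the category whose vocabulary overlaps most with the
    # extracted features. The abstract does not specify the classifier.
    extracted = set(features)
    best = max(vocabularies, key=lambda v: len(v.known_features & extracted))
    return best.name


def route_features(features: list[str],
                   vocabularies: list[CategoryVocabulary]) -> AnalysisResult:
    by_name = {v.name: v for v in vocabularies}
    category = classify(features, vocabularies)
    known = by_name[category].known_features

    # Compare each extracted feature with the category's feature set.
    unrecognized = [f for f in features if f not in known]

    if not unrecognized:
        # Every feature belongs to the category's set: store as document data.
        return AnalysisResult(category, stored=list(features))

    # At least one feature is unknown: submit the unknown ones for training.
    return AnalysisResult(category, to_training=unrecognized)


if __name__ == "__main__":
    vocabularies = [
        CategoryVocabulary("invoice", {"invoice", "total", "due"}),
        CategoryVocabulary("receipt", {"receipt", "cash", "change"}),
    ]
    print(route_features(["invoice", "total"], vocabularies))
    print(route_features(["invoice", "total", "remit"], vocabularies))
```

In this sketch the second call routes only the feature "remit" to the training phase, mirroring the abstract's statement that the unrecognized text features are submitted for training.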