Invention Application
US20110255789A1 SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS CONTAINING MULTIPLE LAYOUT FEATURES
Under examination - Published
- Patent Title: SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS CONTAINING MULTIPLE LAYOUT FEATURES
- Application No.: US13007443
- Application Date: 2011-01-14
- Publication No.: US20110255789A1
- Publication Date: 2011-10-20
- Inventor: Depankar NEOGI, Steven K. LADD, Girish WELLING, Arjun KUMAR, Vartika SINGH, Matthew DUGGAN, Tushar MAHATA, Xiaobin YANG, Jian-Wu XU, Janice O'NEIL, Nirupam SARKAR, Gopal KRISHNA
- Applicant: Depankar NEOGI, Steven K. LADD, Girish WELLING, Arjun KUMAR, Vartika SINGH, Matthew DUGGAN, Tushar MAHATA, Xiaobin YANG, Jian-Wu XU, Janice O'NEIL, Nirupam SARKAR, Gopal KRISHNA
- Applicant Address: US MA Andover
- Assignee: COPANION, INC.
- Current Assignee: COPANION, INC.
- Current Assignee Address: US MA Andover
- Main IPC: G06K9/46
- IPC: G06K9/46

Abstract:
A method of automatically extracting data from an electronic document containing a plurality of layout features through progressive refinement is provided. The method includes: analyzing each document to automatically extract images and text features, wherein each document includes at least two features that are related to each other, and wherein said analyzing compares extracted features with a first search space of candidate features to try to recognize the extracted features; if one of the at least two related features is not recognized and at least one feature is recognized, selecting a second search space of candidate features in response thereto and in response to predefined rules about the relationship between the two features; and comparing the unrecognized feature with said selected second search space.
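
The progressive-refinement idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `recognize`, `extract_pair`, `FIRST_SPACE`, and `RELATIONSHIP_RULES`, the sample data, and the use of case-insensitive exact matching as the comparison step are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical first search space: a general pool of candidate features.
FIRST_SPACE: List[str] = ["Name", "State", "Date"]

# Hypothetical predefined rules: a recognized feature selects the second
# search space in which its related feature is expected to be found.
RELATIONSHIP_RULES: Dict[str, List[str]] = {
    "State": ["MA", "NH", "NY"],
}


def recognize(feature: str, search_space: List[str]) -> Optional[str]:
    """Return the candidate equal to the feature (ignoring case), else None."""
    normalized = feature.strip().lower()
    for candidate in search_space:
        if candidate.lower() == normalized:
            return candidate
    return None


def extract_pair(
    pair: Tuple[str, str],
    first_space: List[str],
    rules: Dict[str, List[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Recognize two related features, refining the search space if needed."""
    # Pass 1: compare both extracted features with the first search space.
    results = [recognize(pair[0], first_space), recognize(pair[1], first_space)]

    # Pass 2: if exactly one feature was recognized, use it together with the
    # relationship rules to select a second search space for the other one.
    for i in (0, 1):
        other = results[1 - i]
        if results[i] is None and other is not None:
            second_space = rules.get(other, [])
            results[i] = recognize(pair[i], second_space)

    return results[0], results[1]


if __name__ == "__main__":
    print(extract_pair(("State", "ma"), FIRST_SPACE, RELATIONSHIP_RULES))
```

In this sketch the label "State" is recognized in the first pass, and that recognition narrows the candidates for the related value to a short list of state codes. If neither feature is recognized in the first pass, no second search space is selected and both results stay unrecognized.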