Granted invention patent
- Patent title: Systems and methods to extract data automatically from a composite electronic document
- Patent title (Chinese): 从复合电子文档自动提取数据的系统和方法
- Application number: US12132845
- Filing date: 2008-06-04
- Publication number: US08140468B2
- Publication date: 2012-03-20
- Inventors: Thomas Yu-Kiu Kwok, Thao Ngoc Nguyen, Kakan Roy
- Applicants: Thomas Yu-Kiu Kwok, Thao Ngoc Nguyen, Kakan Roy
- Applicant address: US NY Armonk
- Patent owner: International Business Machines Corporation
- Current patent owner: International Business Machines Corporation
- Current patent owner address: US NY Armonk
- Agency: Tutunjian & Bitetto, P.C.
- Agent: William J. Stock
- Main classification: G06F7/00
- IPC classification: G06F7/00; G06K9/46
Abstract:
A system and method for automatically extracting contract data from electronic contracts includes an administrator module configured to provide templates for inputting document patterns and a list of contract data tags for each of a plurality of contract document types. A parser is configured to convert an electronic contract document into a contract text document and reformat the contract text document to provide a pattern for the contract text document. A pattern recognition engine is configured to determine a list of contract document types in the electronic contract by comparing and matching patterns of all known contract document types with the pattern of the contract text document. A contract data extraction engine is configured to extract contract data for each contract document type on the list.
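
The flow described in the abstract (templates, pattern derivation, type recognition, per-type extraction) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name here (`DocumentType`, `to_pattern`, `recognize`, `extract`) is hypothetical, a "pattern" is assumed to be the set of normalized text lines, and data tags are assumed to map to regular expressions. The conversion of an electronic document (for example a PDF) into text is skipped, and the sample contract is invented.

```python
import re
from dataclasses import dataclass


@dataclass
class DocumentType:
    """Hypothetical template record an administrator might fill in."""
    name: str
    pattern: frozenset[str]   # normalized lines assumed to identify the type
    tags: dict[str, str]      # contract data tag -> regex with one capture group


def to_pattern(text: str) -> frozenset[str]:
    """Reformat contract text into a pattern: its set of normalized lines."""
    lines = (re.sub(r"\s+", " ", line).strip().lower() for line in text.splitlines())
    return frozenset(line for line in lines if line)


def recognize(text: str, known: list[DocumentType]) -> list[DocumentType]:
    """Return every known type whose pattern is contained in the text's pattern."""
    observed = to_pattern(text)
    return [t for t in known if t.pattern <= observed]


def extract(text: str, types: list[DocumentType]) -> dict[str, dict[str, str]]:
    """Collect the tagged values for each recognized document type."""
    result = {}
    for t in types:
        values = {}
        for tag, regex in t.tags.items():
            m = re.search(regex, text, flags=re.IGNORECASE)
            if m:
                values[tag] = m.group(1).strip()
        result[t.name] = values
    return result


# Invented templates and sample text, for illustration only.
KNOWN_TYPES = [
    DocumentType("master_agreement", frozenset({"master services agreement"}),
                 {"customer": r"Customer:\s*(.+)"}),
    DocumentType("statement_of_work", frozenset({"statement of work"}),
                 {"start_date": r"Start date:\s*(\d{4}-\d{2}-\d{2})"}),
    DocumentType("nda", frozenset({"non-disclosure agreement"}),
                 {"term": r"Term:\s*(.+)"}),
]

SAMPLE = """MASTER SERVICES AGREEMENT
Customer: Example Corp
STATEMENT   OF WORK
Start date: 2024-01-15
"""

matched = recognize(SAMPLE, KNOWN_TYPES)
extracted = extract(SAMPLE, matched)
```

Because the sample is a composite document, `recognize` returns two of the three known types, and `extract` applies only the tag list belonging to each matched type. Subset matching on normalized lines is a deliberately simple stand-in for whatever pattern comparison the patent actually claims.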