System and method for automatic preparation of data repositories from microfilm-type materials

    Publication number: US20040076327A1

    Publication date: 2004-04-22

    Application number: US10272926

    Filing date: 2002-10-18

    CPC classification number: H04N1/2191 G06K9/00993 H04N1/2166 H04N2201/041

    Abstract: A system and a method for the conversion of archived documents to a digital format and storage of the data extracted in repositories which may be easily extracted and searched by a user over a network such as the Internet. The data is preferably stored in the form of microfilm, although optionally the present invention could be operative with other types of physical media, such as microfiche, paper and any type of printed material. The microfilm data is preferably divided and/or grouped into at least one file. Optionally and preferably, each file undergoes the following automatic processing stages: combining files; analyzing image layout; segmentation; OCR; optional segmentation improvement; and output to XML, or another suitable output data format and/or language. In the last stage, the data contained in the files is preferably extracted and then more preferably transmitted to the relevant repository unit.
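The processing stages listed in the abstract can be pictured as a simple chain of functions. The sketch below is illustrative only and is not the patented implementation: every function name, the `Segment` and `Document` types, and the XML element names are hypothetical, and the layout analysis and OCR stages are placeholders that treat each "page" as a plain string.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A region of a page plus the text recovered from it."""
    kind: str       # hypothetical label, e.g. "body"
    text: str = ""


@dataclass
class Document:
    name: str
    pages: list = field(default_factory=list)     # placeholder page "images" (strings)
    segments: list = field(default_factory=list)


def combine_files(files):
    """Stage 1: merge scanned files that belong to one document."""
    pages = [page for f in files for page in f["pages"]]
    return Document(name=files[0]["name"], pages=pages)


def analyze_and_segment(doc):
    """Stages 2-3: placeholder layout analysis, one 'body' segment per page."""
    doc.segments = [Segment(kind="body", text=page) for page in doc.pages]
    return doc


def run_ocr(doc):
    """Stage 4: placeholder OCR; here it only trims whitespace."""
    for seg in doc.segments:
        seg.text = seg.text.strip()
    return doc


def improve_segmentation(doc):
    """Stage 5 (optional): drop segments that came out empty."""
    doc.segments = [s for s in doc.segments if s.text]
    return doc


def to_xml(doc):
    """Stage 6: serialize the result as XML (element names are assumptions)."""
    root = ET.Element("document", name=doc.name)
    for seg in doc.segments:
        ET.SubElement(root, "segment", kind=seg.kind).text = seg.text
    return ET.tostring(root, encoding="unicode")


def process(files, improve=True):
    """Run all stages in order; the improvement stage can be skipped."""
    doc = run_ocr(analyze_and_segment(combine_files(files)))
    if improve:
        doc = improve_segmentation(doc)
    return to_xml(doc)
```

Keeping the segmentation-improvement stage behind a flag mirrors the abstract's description of that step as optional; a real system would replace the placeholder stages with actual image analysis and OCR.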

    2. System and method for data publication through web pages
    Patent application (status: in force)

    Publication number: US20030200507A1

    Publication date: 2003-10-23

    Application number: US10449059

    Filing date: 2003-06-02

    CPC classification number: G06F17/3089 G06F17/211 G06F17/30893

    Abstract: A system and a method for publishing a newspaper page or other data through a Web page, such that the information can be made available more easily through a network such as the Internet. The data is automatically converted to the Web page format by first rendering the newspaper page into a digital format; converting the digital format to a basic internal publishing format; and then publishing the data in any one of a number of different possible publishing formats, including but not limited to a mark-up language document such as a Web page. The present invention supports such advanced features as arrangement of the content of the newspaper according to relationships within the information of the content and/or according to the preference(s) of the user by analyzing the newspaper page as a plurality of objects. Each newspaper object may optionally be a title, an article, a picture and/or other graphic advertisement, and so forth. The different objects may optionally be separated into categories, such that objects in each category are preferably compressed according to a different image format.
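The idea of sorting page objects into categories and compressing each category with a different image format can be sketched as a lookup. This is an illustration under stated assumptions, not the method claimed in the application: the abstract does not name the categories or the formats, so the two mappings below (text-like objects to PNG, picture-like objects to JPEG) and the `PageObject` type are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed mappings; the abstract only says that formats differ per category.
CATEGORY_BY_KIND = {
    "title": "text",
    "article": "text",
    "picture": "picture",
    "advertisement": "picture",
}
FORMAT_BY_CATEGORY = {"text": "png", "picture": "jpeg"}


@dataclass(frozen=True)
class PageObject:
    """One object found on a newspaper page."""
    kind: str        # "title", "article", "picture" or "advertisement"
    object_id: str


def plan_compression(objects):
    """Group object ids by the image format chosen for their category."""
    plan = defaultdict(list)
    for obj in objects:
        category = CATEGORY_BY_KIND[obj.kind]
        plan[FORMAT_BY_CATEGORY[category]].append(obj.object_id)
    return dict(plan)
```

For example, a page with a title `t1`, a picture `p1` and an article `a1` would yield a plan that sends `t1` and `a1` to one format and `p1` to the other.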

