-
1.
Publication No.: US20040076327A1
Publication date: 2004-04-22
Application No.: US10272926
Filing date: 2002-10-18
Applicant: Olive Software Inc.
Inventor: Yonatan P. Stern , Emil Steinvil
IPC: G06K009/34
CPC classification number: H04N1/2191 , G06K9/00993 , H04N1/2166 , H04N2201/041
Abstract: A system and a method for the conversion of archived documents to a digital format and storage of the data extracted in repositories which may be easily extracted and searched by a user over a network such as the Internet. The data is preferably stored in the form of microfilm, although optionally the present invention could be operative with other types of physical media, such as microfiche, paper and any type of printed material. The microfilm data is preferably divided and/or grouped into at least one file. Optionally and preferably, each file undergoes the following automatic processing stages: combining files; analyzing image layout; segmentation; OCR; optional segmentation improvement; and output to XML, or another suitable output data format and/or language. In the last stage, the data contained in the files is preferably extracted and then more preferably transmitted to the relevant repository unit.
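The processing stages listed in the abstract above can be pictured as a chain of functions. The following is a minimal sketch only, not the patented implementation: every name in it (`PageFile`, `Segment`, `combine_files`, `segment`, `ocr`, `to_xml`, `process`) is hypothetical, layout analysis and segmentation are collapsed into one stub, the optional segmentation-improvement stage is omitted, and the OCR stage is a placeholder that merely upper-cases a string.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Segment:
    kind: str        # hypothetical label, e.g. "article"
    text: str = ""   # filled in by the OCR stage


@dataclass
class PageFile:
    name: str
    frames: list[str]  # stand-in for scanned microfilm frames
    segments: list[Segment] = field(default_factory=list)


def combine_files(files):
    """Combining files: merge the frames of files that share a name."""
    merged = {}
    for f in files:
        merged.setdefault(f.name, PageFile(f.name, [])).frames.extend(f.frames)
    return list(merged.values())


def segment(page):
    """Layout analysis and segmentation (stub): one segment per frame."""
    page.segments = [Segment(kind="article") for _ in page.frames]
    return page


def ocr(page):
    """OCR (stub): a real system would call an OCR engine here."""
    for seg, frame in zip(page.segments, page.frames):
        seg.text = frame.upper()
    return page


def to_xml(page):
    """Output stage: serialize the segments of one page as XML."""
    root = ET.Element("page", name=page.name)
    for seg in page.segments:
        ET.SubElement(root, "segment", kind=seg.kind).text = seg.text
    return ET.tostring(root, encoding="unicode")


def process(files):
    """Run every combined file through the stages in order."""
    return [to_xml(ocr(segment(p))) for p in combine_files(files)]
```

Calling `process([PageFile("reel1", ["a"]), PageFile("reel1", ["b"])])` merges the two inputs into one page and returns a single XML string with two `segment` elements.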
-
2.
Publication No.: US20030200507A1
Publication date: 2003-10-23
Application No.: US10449059
Filing date: 2003-06-02
Applicant: Olive Software, Inc.
Inventor: Yonatan P. Stern , Emil Shteinvil
IPC: G06F015/00
CPC classification number: G06F17/3089 , G06F17/211 , G06F17/30893
Abstract: A system and a method for publishing a newspaper page or other data through a Web page, such that the information can be made available more easily through a network such as the Internet. The data is automatically converted to the Web page format by first rendering the newspaper page into a digital format; converting the digital format to a basic internal publishing format; and then publishing the data in any one of a number of different possible publishing formats, including but not limited to a mark-up language document such as a Web page. The present invention supports such advanced features as arrangement of the content of the newspaper according to relationships within the information of the content and/or according to the preference(s) of the user by analyzing the newspaper page as a plurality of objects. Each newspaper object may optionally be a title, an article, a picture and/or other graphic advertisement, and so forth. The different objects may optionally be separated into categories, such that objects in each category are preferably compressed according to a different image format.
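The last idea in the abstract above, separating newspaper objects into categories and compressing each category with a different image format, can be illustrated with a small lookup-based sketch. The abstract names neither the categories nor the formats, so the two mappings below (`CATEGORY_BY_KIND`, `FORMAT_BY_CATEGORY`) and the names `NewspaperObject` and `plan_compression` are assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical mappings: the abstract does not specify categories or formats.
CATEGORY_BY_KIND = {
    "title": "text",
    "article": "text",
    "picture": "image",
    "advertisement": "image",
}
FORMAT_BY_CATEGORY = {"text": "png", "image": "jpeg"}


@dataclass(frozen=True)
class NewspaperObject:
    kind: str   # one of the keys of CATEGORY_BY_KIND
    label: str  # human-readable identifier for the object


def plan_compression(objects):
    """Group object labels by the image format chosen for their category."""
    plan = defaultdict(list)
    for obj in objects:
        category = CATEGORY_BY_KIND[obj.kind]
        plan[FORMAT_BY_CATEGORY[category]].append(obj.label)
    return dict(plan)
```

Given a title, a picture, and an article, `plan_compression` places the title and the article under one format and the picture under the other, preserving input order within each group.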
-