Abstract:
A live video stream captured by an on-device camera is displayed on a screen with an overlaid guideline. Video frames of the live video stream are analyzed to find a video frame of acceptable quality. A text region proximate to the on-screen guideline is identified in the video frame and cropped from it. The cropped image is transmitted to an optical character recognition (OCR) engine, which processes the cropped image and generates text in an editable symbolic form (the OCR'ed text). A confidence score is determined for the OCR'ed text and compared with a threshold value. If the confidence score exceeds the threshold value, the OCR'ed text is output.
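The control flow described in this abstract can be sketched as follows. This is only an illustrative outline, not the patented implementation: the names `OcrResult`, `first_confident_text`, `is_acceptable`, `crop_near_guideline`, and `run_ocr`, the default threshold of 0.8, and the assumption that confidence lies in [0, 1] are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Optional


@dataclass
class OcrResult:
    """Hypothetical container for what an OCR engine returns."""
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def first_confident_text(
    frames: Iterable[Any],
    is_acceptable: Callable[[Any], bool],
    crop_near_guideline: Callable[[Any], Optional[Any]],
    run_ocr: Callable[[Any], OcrResult],
    threshold: float = 0.8,  # assumed value; the abstract gives none
) -> Optional[str]:
    """Return OCR'ed text from the first frame that passes every gate."""
    for frame in frames:
        # Gate 1: skip frames whose quality is not acceptable.
        if not is_acceptable(frame):
            continue
        # Gate 2: find and crop the text region near the guideline.
        cropped = crop_near_guideline(frame)
        if cropped is None:
            continue
        # Gate 3: output only if the confidence exceeds the threshold.
        result = run_ocr(cropped)
        if result.confidence > threshold:
            return result.text
    return None
```

The quality check, the cropping step, and the OCR engine are passed in as callables so that the gating logic stays independent of any particular camera or OCR library.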
Abstract:
The present invention relates to systems and methods for analyzing media material having articles continuing across multiple pages. A media material analyzer includes a segmenter and an article composer. The segmenter identifies block segments associated with columnar body text in the media material. The article composer determines which of the identified block segments belong to a continuing article extending across multiple pages in the media material based on language statistics information and continuation transition information.
Abstract:
A Mixed Media Reality (MMR) system and associated techniques are disclosed. The MMR system provides mechanisms for forming a mixed media document that includes media of at least two types (e.g., printed paper as a first medium and digital content and/or web link as a second medium). In one particular embodiment, the MMR system includes a content-based retrieval database configured with an index table to represent two-dimensional geometric relationships between objects extracted from a printed document in a way that allows look-up using a text-based index. A ranked set of document, page and location hypotheses can be computed given data from the index table. The techniques effectively transform features detected in an image patch into textual terms (or other searchable features) that represent both the features themselves and the geometric relationship between them. A storage facility can be used to store additional characteristics about each document image patch.
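One way to picture "textual terms that represent both the features themselves and the geometric relationship between them" is the sketch below. It is an assumed encoding chosen for illustration, not the scheme the MMR system actually uses: the term format `a>b@dx,dy`, the grid size `cell`, and the function names are all hypothetical. Each pair of words becomes a string holding the two words plus their quantized relative offset, so an ordinary text-style inverted index can answer geometric look-ups. For brevity the sketch ranks only document and page hypotheses and omits the location within a page.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Word = Tuple[str, float, float]  # (text, x, y) of a word centre
Location = Tuple[str, int]       # (document id, page number)


def geometric_terms(words: List[Word], cell: float = 50.0) -> List[str]:
    """Encode each word pair and its quantized offset as one text term."""
    # Sort top-to-bottom, left-to-right so pair order is reproducible.
    ordered = sorted(words, key=lambda w: (w[2], w[1], w[0]))
    terms = []
    for i, (ta, xa, ya) in enumerate(ordered):
        for tb, xb, yb in ordered[i + 1:]:
            dx = round((xb - xa) / cell)  # relative, so translation-invariant
            dy = round((yb - ya) / cell)
            terms.append(f"{ta}>{tb}@{dx},{dy}")
    return terms


def build_index(pages: Dict[Location, List[Word]]) -> Dict[str, Set[Location]]:
    """Inverted index: geometric term -> pages on which it occurs."""
    index: Dict[str, Set[Location]] = defaultdict(set)
    for location, words in pages.items():
        for term in geometric_terms(words):
            index[term].add(location)
    return index


def rank_hypotheses(index: Dict[str, Set[Location]],
                    patch_words: List[Word]) -> List[Tuple[Location, int]]:
    """Rank (document, page) hypotheses by the number of matching terms."""
    votes: Counter = Counter()
    for term in set(geometric_terms(patch_words)):
        for location in index.get(term, ()):
            votes[location] += 1
    return votes.most_common()
```

Because only relative offsets are stored, a patch captured anywhere on the page produces the same terms as the corresponding words in the indexed page, provided scale and rotation are unchanged; handling those would need a richer encoding.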
Abstract:
Techniques for capturing and receiving digital information. The digital information may comprise information of one or more types including video information, audio information, images, documents, whiteboard information, notes information, and the like. Various actions may be performed based upon the digital information. The digital information may be captured by one or more capture devices.
Abstract:
A printing system enables the printing of enhanced documents using a semantic classification scheme. The printing system receives an image to be printed, classifies the image according to the semantic classification scheme and, based on this classification, performs enhancement processing on the image. Depending on the desired application, the printing system may recognize and classify any number of image types and may then perform various enhancement processing functions on the image, where the type of enhancement processing performed is based on the classification of the image.
Abstract:
A system and a method for visually summarizing a document are disclosed. The system comprises a display, a processor coupled to the display, and a memory coupled to the processor. Stored in the memory is a routine which, when executed by the processor, causes the processor to generate display data. The routine causes the processor to generate the data by extracting at least one visual feature from a document having a plurality of pages, ranking the pages in the document, selecting a page to represent the document according to the visual feature, and displaying the selected page as display data.
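The rank-then-select step in this abstract can be illustrated with a short sketch. The abstract does not say which visual feature is used or how it is scored, so the feature is passed in as a hypothetical `feature` callable, and the function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")


def rank_pages(pages: Sequence[P], feature: Callable[[P], float]) -> List[int]:
    """Return page indices ordered from highest to lowest feature score.

    Python's sort is stable even with reverse=True, so pages with equal
    scores keep their original document order.
    """
    return sorted(range(len(pages)),
                  key=lambda i: feature(pages[i]),
                  reverse=True)


def representative_page(pages: Sequence[P],
                        feature: Callable[[P], float]) -> int:
    """Return the index of the top-ranked page."""
    if not pages:
        raise ValueError("document has no pages")
    return rank_pages(pages, feature)[0]
```

As a usage example, if each page were a dictionary with an assumed `image_area` entry (the fraction of the page covered by pictures), `representative_page(pages, lambda p: p["image_area"])` would pick the most picture-heavy page to stand for the document.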
Abstract:
An image is displayed on a touch screen. A user's underline gesture on the displayed image is detected. The area of the image touched by the underline gesture and a surrounding region proximate to the touched area are identified. Skew of the text in the surrounding region is determined and compensated for. A text region including the text is identified in the surrounding region and cropped from the image. The cropped image is transmitted to an optical character recognition (OCR) engine, which processes the cropped image and returns OCR'ed text. The OCR'ed text is output.
Abstract:
Automated techniques for comparing contents of images. For a given image (referred to as an “input image”), a set of images (referred to as “a set of candidate images”) are processed to determine if the set of candidate images comprises an image whose contents or portions thereof match contents included in a region of interest in the input image.
Abstract:
A Mixed Media Reality (MMR) system and associated techniques are disclosed. The MMR system provides mechanisms for forming a mixed media document that includes media of at least two types (e.g., printed paper as a first medium and digital content and/or web link as a second medium). In one particular embodiment, the MMR system provides for position-based image matching.