Publication number: US09191554B1
Publication date: 2015-11-17
Application number: US13677096
Filing date: 2012-11-14
Applicant: Amazon Technologies, Inc.
Inventor: Vasant Manohar, Sridhar Godavarthy, Viswanath Sankaranarayanan
CPC classification number: H04N1/00198, G06F17/21, H04N5/14
Abstract: Some implementations include using a trained classifier to identify page-turn events in a video. The video may be divided into multiple segments based on the page-turn events, with each segment of the multiple segments corresponding to a pair of adjacent pages in a book. Exemplar frames that provide non-redundant data compared to other frames may be chosen from each segment. The exemplar frames may be cropped to include content portions of pages. The exemplar frames may be aligned such that a pixel is located in a same position in each frame. Optical character recognition (OCR) may be performed on exemplar frames and the OCR for exemplar frames in each segment may be combined. The exemplar frames in each segment may be combined to create a composite image for each pair of adjacent pages in the book, and OCR may be performed on the composite image.
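Illustrative sketch: The segmentation and exemplar-selection steps described in the abstract might look roughly like the Python below. This is a minimal sketch under stated assumptions, not the patented method: the per-frame page-turn flags are assumed to come from some trained classifier that is not shown, frames are represented as flat lists of pixel intensities, and "non-redundant" is approximated by a mean absolute difference threshold. The names `split_on_page_turns`, `mean_abs_diff`, `pick_exemplars`, and `min_diff` are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_on_page_turns(is_turn: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) frame-index ranges, end exclusive, for each run
    of frames that lies between page-turn events.

    `is_turn[i]` is assumed to be the output of a page-turn classifier
    for frame i (hypothetical; the classifier itself is not shown).
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, turning in enumerate(is_turn):
        if not turning and start is None:
            start = i                      # a stable run begins
        elif turning and start is not None:
            segments.append((start, i))    # a page turn closes the run
            start = None
    if start is not None:
        segments.append((start, len(is_turn)))
    return segments


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_exemplars(
    frames: Sequence[Sequence[float]],
    segment: Tuple[int, int],
    min_diff: float,
) -> List[int]:
    """Greedily keep frames in `segment` that differ from every frame
    already kept by at least `min_diff` (an assumed redundancy measure)."""
    start, end = segment
    chosen: List[int] = []
    for i in range(start, end):
        if all(mean_abs_diff(frames[i], frames[j]) >= min_diff for j in chosen):
            chosen.append(i)
    return chosen


if __name__ == "__main__":
    flags = [False, False, True, True, False, False, False, True, False]
    print(split_on_page_turns(flags))

    toy_frames = [[0, 0], [0, 1], [10, 10], [10, 10]]
    print(pick_exemplars(toy_frames, (0, 4), min_diff=5))
```

Each returned segment would correspond to one pair of adjacent pages; cropping, alignment, OCR, and compositing of the selected exemplar frames are left out of this sketch.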