-
Publication number: US10043069B1
Publication date: 2018-08-07
Application number: US14196669
Filing date: 2014-03-04
Applicant: Amazon Technologies, Inc.
Inventor: Yue Liu, Utkarsh Prateek, Avnish Sikka, Matthew Daniel Hart, Emilie Noelle McConville, Sonjeev Jahagirdar
IPC: G06K9/00
Abstract: A system for recognizing objects and/or text in image data may use context data to perform object/text recognition. The system may also use context data when determining potential functions to execute in response to recognizing the object/text. Context data may be gathered based on device sensor data, user profile data such as the behavior of a user or the behavior of those in a user's social network, or other factors. Recognition processing and/or function selection may be configured to account for context data when operating to improve output results.
-
Publication number: US09659224B1
Publication date: 2017-05-23
Application number: US14230471
Filing date: 2014-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Matthew Joseph Cole, Sonjeev Jahagirdar, Matthew Daniel Hart, David Paul Ramos, Ankur Datta, Utkarsh Prateek, Emilie Noelle McConville, Prashant Hegde, Avnish Sikka
CPC classification number: G06K9/18, G06K9/00979, G06K9/6292, G06K9/72, G06K2209/01, G06K9/00449, G06K9/00463, G06K9/00442
Abstract: Disclosed are techniques for merging optical character recognized (OCR'd) text from frames of image data. In some implementations, a device sends frames of image data to a server, where each frame includes at least a portion of a captured textual item. The server performs optical character recognition (OCR) on the image data of each frame. When OCR'd text from respective frames is returned to the device from the server, the device can perform matching operations on the text, for instance, using bounding boxes and/or edit distance processing. The device can merge any identified matches of OCR'd text from different frames. The device can then display the merged text with any corrections.
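The edit-distance matching mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the helper names (`edit_distance`, `merge_frames`), the word-list input format, and the `max_dist` threshold are all assumptions made for illustration, and bounding-box matching is left out.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def merge_frames(frame_a, frame_b, max_dist=1):
    """Merge two lists of OCR'd words (hypothetical helper).

    A word from frame_b counts as a duplicate of an already merged word
    when their edit distance is at most max_dist (an assumed threshold);
    otherwise it is appended as new text.
    """
    merged = list(frame_a)
    for word in frame_b:
        if not any(edit_distance(word, seen) <= max_dist for seen in merged):
            merged.append(word)
    return merged


frame_1 = ["Amazon", "Technologies"]
frame_2 = ["Amaz0n", "Technologies", "Inc."]
print(merge_frames(frame_1, frame_2))  # ['Amazon', 'Technologies', 'Inc.']
```

Here the misread "Amaz0n" is one substitution away from "Amazon", so it is treated as the same word, while "Inc." has no close match and is added to the merged text.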
-
Publication number: US09165186B1
Publication date: 2015-10-20
Application number: US14291493
Filing date: 2014-05-30
Applicant: Amazon Technologies, Inc.
Inventor: David Paul Ramos, Matthew Joseph Cole, Matthew Daniel Hart
IPC: G06K9/00
CPC classification number: G06K9/00442, G06F17/241, G06F17/2785, G06K9/00979, G06K9/03, G06K2209/01
Abstract: Disclosed are techniques for providing additional information for text in an image. In some implementations, a computing device receives an image including text. Optical character recognition (OCR) is performed on the image to produce recognized text. One or more topics corresponding to the recognized text are determined. A word or a phrase is selected from the recognized text for providing additional information. One or more potential meanings of the selected word or phrase are determined. One of the potential meanings is selected using the one or more topics. A source of additional information corresponding to the selected meaning is selected for providing the additional information to a user's device.
-
Publication number: US10216989B1
Publication date: 2019-02-26
Application number: US14884068
Filing date: 2015-10-15
Applicant: Amazon Technologies, Inc.
Inventor: David Paul Ramos, Matthew Joseph Cole, Matthew Daniel Hart
Abstract: Disclosed are techniques for providing additional information for text in an image. In some implementations, a computing device receives an image including text. Optical character recognition (OCR) is performed on the image to produce recognized text. A word or a phrase is selected from the recognized text for providing additional information. One or more potential meanings of the selected word or phrase are determined. One of the potential meanings is selected based on other text in the image. A source of additional information corresponding to the selected meaning is selected for providing the additional information to a user's device.
-
Publication number: US09262689B1
Publication date: 2016-02-16
Application number: US14133347
Filing date: 2013-12-18
Applicant: Amazon Technologies, Inc.
Inventor: Avnish Sikka, David Paul Ramos, Matthew Daniel Hart, Yue Liu, Emilie Noelle McConville
CPC classification number: G06K9/34, G06K9/325, G06K2209/01
Abstract: Embodiments of the subject technology provide for determining a region of a first acquired image based at least on a viewing mode and a set of respective positions of graphical elements to decrease the pre-processing time and perceived latency for the first image. One or more regions of text in the first image are detected, and a set of regions of text that overlap with the region of the image is determined and pre-processed. The subject technology may then pre-process an entirety of a subsequent image (e.g., to pick up missing text from the region of the first image). Thus, additional OCR results may be provided to the user by using the subsequent image(s) and merging subsequent results with previous results from the first image.