Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Sonjeev Jahagirdar"

1.

发明授权
Item recognition using context data 有权

公开(公告)号：US10043069B1

公开(公告)日：2018-08-07

申请号：US14196669

申请日：2014-03-04

Applicant: Amazon Technologies, Inc.

Inventor： Yue Liu , Utkarsh Prateek , Avnish Sikka , Matthew Daniel Hart , Emilie Noelle McConville , Sonjeev Jahagirdar

IPC: G06K9/00

Abstract: A system for recognizing objects and/or text in image data may use context data to perform object/text recognition. The system may also use context data when determining potential functions to execute in response to recognizing the object/text. Context data may be gathered based on device sensor data, user profile data such as the behavior of a user or the behavior of those in a user's social network, or other factors. Recognition processing and/or function selection may be configured to account for context data when operating to improve output results.

2.

发明授权
Visual and audio recognition for scene change events 有权
Title translation: 场景变化事件的视觉和音频识别

公开(公告)号：US09536161B1

公开(公告)日：2017-01-03

申请号：US14307090

申请日：2014-06-17

Applicant: Amazon Technologies, Inc.

Inventor： Christopher John Lish , Oleg Rybakov , Sonjeev Jahagirdar , Junxiong Jia , Neil David Cooper , Avnish Sikka

IPC: G06K9/20 , H04N5/232 , G06K9/32

CPC classification number: H04N5/23245 , G01S3/00 , G06K9/00664 , G06K2009/3291 , H04N5/232 , H04N5/23219 , H04N5/247

Abstract: Various embodiments describe systems and methods for utilizing a reduced amount of processing capacity for incoming data over time, and, in response to detecting a scene-change-event, notify one or more data processors that a scene-change-event has occurred, and cause incoming data to be processed as new data. In some embodiments, an incoming frame can be compared with a reference frame to determine a difference between the reference frame and the incoming frame. The reference frame may be correlated to a latest scene-change-event. In response to a determination that the difference does not meet one or more difference criteria, a user interface or at least one processor of the computing device can be notified to reduce processing of incoming data over time. In response to a determination that the difference meets the one or more difference criteria, the user interface or the at least one processor can be notified that a scene-change-event has occurred. Incoming data to the computing device is then treated as new and processed as those under an active condition. The current incoming frame can be selected as a new reference frame for detecting next scene-change-event.

Abstract translation: 各种实施例描述了随着时间的推移对于输入数据利用减少量的处理能力的系统和方法，并且响应于检测到场景改变事件，通知一个或多个数据处理器已经发生场景变化事件，以及将传入的数据作为新数据进行处理。在一些实施例中，输入帧可以与参考帧进行比较，以确定参考帧和输入帧之间的差异。参考帧可以与最新的场景变化事件相关联。响应于差异不符合一个或多个差异标准的确定，可以通知用户界面或计算设备的至少一个处理器以减少输入数据随时间的处理。响应于差异满足一个或多个差异标准的确定，可以向用户界面或至少一个处理器通知场景变化事件已经发生。然后将接收到计算设备的数据视为新的，并处理为处于活动状态的数据。可以将当前输入帧选择为用于检测下一个场景改变事件的新参考帧。

3.

发明授权
Merging optical character recognized text from frames of image data 有权

公开(公告)号：US09659224B1

公开(公告)日：2017-05-23

申请号：US14230471

申请日：2014-03-31

Applicant: Amazon Technologies, Inc.

Inventor： Matthew Joseph Cole , Sonjeev Jahagirdar , Matthew Daniel Hart , David Paul Ramos , Ankur Datta , Utkarsh Prateek , Emilie Noelle McConville , Prashant Hegde , Avnish Sikka

IPC: G06K9/18 , G06K9/00

CPC classification number: G06K9/18 , G06K9/00979 , G06K9/6292 , G06K9/72 , G06K2209/01 , G06K9/00449 , G06K9/00463 , G06K9/00442

Abstract: Disclosed are techniques for merging optical character recognized (OCR'd) text from frames of image data. In some implementations, a device sends frames of image data to a server, where each frame includes at least a portion of a captured textual item. The server performs optical character recognition (OCR) on the image data of each frame. When OCR'd text from respective frames is returned to the device from the server, the device can perform matching operations on the text, for instance, using bounding boxes and/or edit distance processing. The device can merge any identified matches of OCR'd text from different frames. The device can then display the merged text with any corrections.

4.

发明授权
Hybrid optical character recognition 有权
Title translation: 混合光学字符识别

公开(公告)号：US09305227B1

公开(公告)日：2016-04-05

申请号：US14139752

申请日：2013-12-23

Applicant: Amazon Technologies, Inc.

Inventor： Rakesh Madhavan Nambiar , Sonjeev Jahagirdar , Matthew Joseph Cole , Matias Omar Gregorio Benitez , Junxiong Jia , David Paul Ramos

IPC: G06K9/18

CPC classification number: G06K9/18 , G06K9/00979 , G06K9/6292 , G06K2209/01

Abstract: Embodiments of the subject technology provide for a hybrid OCR approach which combines server and device side processing that can offset disadvantages of performing OCR solely on the server side or the device side. More specifically, the subject technology utilizes image characteristics such as glyph details and image quality measurements to opportunistically schedule OCR processing on the mobile device and/or server. In this regard, text extracted by a “faster” OCR engine (e.g., one with less latency) is displayed to a user, which is then updated by the result of a more accurate OCR engine (e.g., an OCR engine provided by the server). This approach allows factoring in additional parameters such as network latency and user preference for making scheduling decisions. Thus, the subject technology may provide significant gains in terms of reduced latency and increased accuracy by implementing one or more techniques associated with this hybrid OCR approach.

Abstract translation: 本技术的实施例提供了一种组合服务器和设备侧处理的混合OCR方法，其可以抵消仅在服务器侧或设备侧执行OCR的缺点。更具体地，本主题技术利用诸如字形细节和图像质量测量的图像特征来机会地在移动设备和/或服务器上调度OCR处理。在这方面，由“更快的”OCR引擎提取的文本（例如，具有较小延迟的引擎）被显示给用户，然后由更准确的OCR引擎（例如，由服务器提供的OCR引擎）的结果来更新）。这种方法允许考虑附加参数，例如网络延迟和用户偏好以进行调度决策。因此，本技术可以通过实施与该混合OCR方法相关联的一种或多种技术在减少的延迟和增加的准确性方面提供显着的增益。

5.

发明授权
Recognizing text from frames of image data using contextual information 有权
Title translation: 使用上下文信息识别来自图像数据帧的文本

公开(公告)号：US09355336B1

公开(公告)日：2016-05-31

申请号：US14259905

申请日：2014-04-23

Applicant: Amazon Technologies, Inc.

Inventor： Sonjeev Jahagirdar , Matthew Joseph Cole , David Paul Ramos , Utkarsh Prateek , Emilie Noelle McConville , Ankur Datta , Laura Varnum Finney , Yue Liu , Bhavesh Anil Doshi , Avnish Sikka , Michael Vanne

IPC: G06K9/00 , G06K9/62

CPC classification number: G06K9/6217 , G06K9/00979 , G06K9/723 , G06K2209/01

Abstract: Disclosed are techniques for recognizing text from one or more frames of image data using contextual information. In some implementations, image data including a captured textual item is processed to identify an entity in the image data. A context can be selected using the entity, where the context corresponds to a dictionary. Text in the captured textual item can be identified using the dictionary. The identified text can be output to a display device.

Abstract translation: 公开了使用上下文信息从一个或多个图像数据帧识别文本的技术。在一些实现中，处理包括捕获的文本项的图像数据以识别图像数据中的实体。可以使用实体选择上下文，其中上下文对应于字典。捕获的文本项目中的文本可以使用字典来识别。识别的文本可以输出到显示设备。

6.

发明授权
Graphical refinement for points of interest 有权
Title translation: 兴趣点的图形细化

公开(公告)号：US09269011B1

公开(公告)日：2016-02-23

申请号：US13764646

申请日：2013-02-11

Applicant: Amazon Technologies, Inc.

Inventor： Avnish Sikka , James Sassano , Sonjeev Jahagirdar , Pengcheng Wu , Nicholas Randal Sovich

IPC: G06K9/32

CPC classification number: G06K9/3233 , G06K9/00671 , G06K9/2081 , G09G5/377 , G09G2340/12

Abstract: Various embodiments crowd source images to cover various angles, zoom levels, and elevations of objects and/or points of interest (POIs) while under various lighting conditions. The crowd sourced images are tagged or associated with a particular POI or geographic location and stored in a database for use by an augmented reality (AR) application to recognize objects appearing in a live view of a scene captured by at least one camera of a computing device. The more comprehensive the database, the more accurately an object or POI in the scene will be recognized and/or tracked by the AR application. Accordingly, the more accurately an object is recognized and tracked by the AR application, the more smoothly and continuous the content and movement transitions thereof can be presented to users in the live view.

Abstract translation: 在各种照明条件下，各种实施例使源图像覆盖对象和/或兴趣点（POI）的各种角度，缩放级别和高程。所拍摄的图像被标记或与特定POI或地理位置相关联并存储在数据库中以供增强现实（AR）应用程序使用，以识别出现在计算机的至少一个照相机拍摄的场景的实时视图中的对象设备。数据库越全面，AR应用程序将识别和/或跟踪场景中的对象或POI越准确。因此，由AR应用程序识别和跟踪对象的准确度越高，在实时视图中，用户可以将其内容和移动转换更为平滑和连续地呈现给用户。

7.

发明授权
Using a front-facing camera to improve OCR with a rear-facing camera 有权
Title translation: 使用前置摄像头，使用后置摄像头改善OCR

公开(公告)号：US09269009B1

公开(公告)日：2016-02-23

申请号：US14283115

申请日：2014-05-20

Applicant: Amazon Technologies, Inc.

Inventor： Yue Liu , Sonjeev Jahagirdar , Matthew Joseph Cole , Utkarsh Prateek , Emilie Noelle McConville , Daniel Makoto Wilenson , Avnish Sikka

IPC: G06K9/18 , G06K9/00

CPC classification number: G06K9/18 , G06K9/00302 , G06K9/00664 , G06K9/033 , G06K2209/01

Abstract: Various embodiments enable a computing device to incorporate frame selection or preprocessing techniques into a text recognition pipeline in an attempt to improve text recognition accuracy in various environments and situations. For example, a mobile computing device can capture images of text using a first camera, such as a rear-facing camera, while capturing images of the environment or a user with a second camera, such as a front-facing camera. Based on the images captured of the environment or user, one or more image preprocessing parameters can be determined and applied to the captured images in an attempt to improve text recognition accuracy.

Abstract translation: 各种实施例使得计算设备能够将帧选择或预处理技术合并到文本识别流水线中，以试图改善各种环境和情况下的文本识别精度。例如，移动计算设备可以使用诸如后置摄像机之类的第一照相机捕获文本的图像，同时利用诸如前置摄像机的第二照相机拍摄环境图像或用户。基于捕获的环境或用户的图像，可以确定一个或多个图像预处理参数并将其应用于捕获的图像，以提高文本识别精度。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification