Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Matthew Joseph Cole"

1.

Granted invention patent
Recognizing text from frames of image data using contextual information
Legal status: Active

Publication number: US09355336B1

Publication date: 2016-05-31

Application number: US14259905

Application date: 2014-04-23

Applicant: Amazon Technologies, Inc.

Inventor: Sonjeev Jahagirdar, Matthew Joseph Cole, David Paul Ramos, Utkarsh Prateek, Emilie Noelle McConville, Ankur Datta, Laura Varnum Finney, Yue Liu, Bhavesh Anil Doshi, Avnish Sikka, Michael Vanne

IPC: G06K9/00, G06K9/62

CPC classification number: G06K9/6217, G06K9/00979, G06K9/723, G06K2209/01

Abstract: Disclosed are techniques for recognizing text from one or more frames of image data using contextual information. In some implementations, image data including a captured textual item is processed to identify an entity in the image data. A context can be selected using the entity, where the context corresponds to a dictionary. Text in the captured textual item can be identified using the dictionary. The identified text can be output to a display device.
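
The flow in this abstract (entity, then context, then dictionary, then text) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the dictionaries, the entity-to-context mapping, and the use of `difflib` fuzzy matching as the correction step are not taken from the patent.

```python
import difflib

# Hypothetical context dictionaries; the abstract does not list any.
CONTEXT_DICTIONARIES = {
    "menu": ["espresso", "latte", "croissant", "sandwich"],
    "pharmacy": ["ibuprofen", "aspirin", "tablet", "dosage"],
}

# Hypothetical mapping from an entity found in the image to a context name.
ENTITY_TO_CONTEXT = {"coffee cup": "menu", "pill bottle": "pharmacy"}


def select_context(entity):
    """Pick a context name for the detected entity, or None if unknown."""
    return ENTITY_TO_CONTEXT.get(entity)


def recognize_with_context(raw_tokens, entity, cutoff=0.6):
    """Snap noisy OCR tokens to the closest word in the context dictionary."""
    dictionary = CONTEXT_DICTIONARIES.get(select_context(entity), [])
    corrected = []
    for token in raw_tokens:
        matches = difflib.get_close_matches(
            token.lower(), dictionary, n=1, cutoff=cutoff
        )
        # Keep the raw token when nothing in the dictionary is close enough.
        corrected.append(matches[0] if matches else token)
    return corrected
```

With an unknown entity no dictionary is selected, so the raw tokens pass through unchanged.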

2.

Granted invention patent
Using a front-facing camera to improve OCR with a rear-facing camera
Legal status: Active

Publication number: US09269009B1

Publication date: 2016-02-23

Application number: US14283115

Application date: 2014-05-20

Applicant: Amazon Technologies, Inc.

Inventor: Yue Liu, Sonjeev Jahagirdar, Matthew Joseph Cole, Utkarsh Prateek, Emilie Noelle McConville, Daniel Makoto Wilenson, Avnish Sikka

IPC: G06K9/18, G06K9/00

CPC classification number: G06K9/18, G06K9/00302, G06K9/00664, G06K9/033, G06K2209/01

Abstract: Various embodiments enable a computing device to incorporate frame selection or preprocessing techniques into a text recognition pipeline in an attempt to improve text recognition accuracy in various environments and situations. For example, a mobile computing device can capture images of text using a first camera, such as a rear-facing camera, while capturing images of the environment or a user with a second camera, such as a front-facing camera. Based on the images captured of the environment or user, one or more image preprocessing parameters can be determined and applied to the captured images in an attempt to improve text recognition accuracy.
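
As a rough illustration of deriving a preprocessing parameter from the second camera, the Python sketch below estimates ambient brightness from a front-camera frame and picks a gamma value for the rear-camera text image. The brightness thresholds, the choice of gamma correction as the preprocessing step, and the grayscale list-of-lists frame format are all assumptions; the abstract does not specify which parameters are used.

```python
def mean_brightness(gray_frame):
    """Average pixel value of a grayscale frame given as rows of 0-255 ints."""
    pixels = [p for row in gray_frame for p in row]
    return sum(pixels) / len(pixels)


def choose_gamma(ambient_brightness):
    """Hypothetical rule: brighten more in dim light, leave bright scenes alone."""
    if ambient_brightness < 60:
        return 0.5
    if ambient_brightness < 120:
        return 0.8
    return 1.0


def apply_gamma(gray_frame, gamma):
    """Apply gamma correction; gamma below 1.0 brightens the image."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in gray_frame]


def preprocess_text_frame(front_frame, rear_frame):
    """Use the front-camera frame to set the parameter applied to the rear frame."""
    gamma = choose_gamma(mean_brightness(front_frame))
    return apply_gamma(rear_frame, gamma)
```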

3.

Granted invention patent
Text recognition near an edge
Legal status: Active

Publication number: US09239961B1

Publication date: 2016-01-19

Application number: US14495589

Application date: 2014-09-24

Applicant: Amazon Technologies, Inc.

Inventor: Matthew Joseph Cole, Yue Liu, David Paul Ramos, Avnish Sikka

IPC: G06K9/00, G06K9/18, G06K9/32

CPC classification number: G06K9/00456, G06K9/2081, G06K9/325

Abstract: The recognition of text in an acquired image is improved by using general and type-specific heuristics that can determine the likelihood that a portion of the text is truncated at an edge of an image, frame, or screen. Truncated text can be filtered such that the user is not provided with an option to perform an undesirable task, such as to dial an incorrect number or connect to an incorrect Web address, based on recognizing an incomplete text string. The general and type-specific heuristics can be combined to improve confidence, and the image data can be pre-processed on the device before processing with an optical character recognition (OCR) engine. Multiple frames can be analyzed to attempt to recognize words or characters that might have been truncated in one or more of the frames.
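
One way to picture combining a general heuristic with a type-specific one is the Python sketch below. The edge margin, the bounding-box format, and the 10-digit phone-number rule are illustrative assumptions, not details from the patent.

```python
import re


def near_edge(box, frame_width, frame_height, margin=5):
    """General heuristic: does the box (left, top, right, bottom) touch an edge?"""
    left, top, right, bottom = box
    return (
        left <= margin
        or top <= margin
        or right >= frame_width - margin
        or bottom >= frame_height - margin
    )


def looks_like_complete_phone(text):
    """Type-specific heuristic (assumed): a complete number has 10 digits."""
    digits = re.sub(r"\D", "", text)
    return len(digits) == 10


def is_probably_truncated(text, box, frame_width, frame_height):
    """Flag text that sits at an edge and also fails the type-specific check."""
    return near_edge(box, frame_width, frame_height) and not looks_like_complete_phone(text)
```

A caller would drop any string flagged this way before offering actions such as dialing, so that an incomplete number is never presented to the user.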

4.

Granted invention patent
Multi-layer keyword detection
Legal status: Active

Publication number: US12249331B2

Publication date: 2025-03-11

Application number: US18144465

Application date: 2023-05-08

Applicant: Amazon Technologies, Inc.

Inventor: Christopher Wayne Lockhart, Matthew Joseph Cole, Xulei Liu

IPC: G10L15/02, G10L15/08, G10L15/22, G10L15/26, G10L15/30, G10L15/32

Abstract: A system and method for temporarily disabling keyword detection to avoid detection of machine-generated keywords. A local device may operate two keyword detectors. The first keyword detector operates on input audio data received by a microphone to capture keywords uttered by a user. In these instances, the keyword may be detected by the first detector and the audio data may be indicated for speech processing. The system may determine output audio data responsive to the input audio data. The local device may process the output audio data to determine that it also includes the keyword. The device may then disable the first keyword detector while the output audio data is played back by an audio speaker of the local device. Thus the local device may avoid detection of a keyword originating from the output audio. The first keyword detector may be reactivated after a time interval during which the keyword might be detectable in the output audio.
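
The two-detector arrangement can be sketched in Python as follows. Plain strings stand in for audio data, timestamps are passed in explicitly, and the class and method names are hypothetical; a real system would run acoustic keyword models rather than substring checks.

```python
class LocalDevice:
    """Toy model: strings stand in for audio, floats for time in seconds."""

    def __init__(self, keyword):
        self.keyword = keyword.lower()
        # Microphone-side detection is ignored until this timestamp.
        self.disabled_until = 0.0

    def _contains_keyword(self, audio_text):
        return self.keyword in audio_text.lower()

    def on_microphone_audio(self, audio_text, now):
        """First detector: listens to the microphone unless temporarily disabled."""
        if now < self.disabled_until:
            return False
        return self._contains_keyword(audio_text)

    def play_output(self, audio_text, now, duration):
        """Second detector: checks the output audio before it is played back."""
        if self._contains_keyword(audio_text):
            # Disable the first detector for the length of the playback.
            self.disabled_until = now + duration
```

After the playback interval passes, `on_microphone_audio` responds to the keyword again, which mirrors the reactivation step in the abstract.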

5.

Invention patent application
MULTI-LAYER KEYWORD DETECTION
Legal status: Active

Publication number: US20220157311A1

Publication date: 2022-05-19

Application number: US17590406

Application date: 2022-02-01

Applicant: Amazon Technologies, Inc.

Inventor: Christopher Wayne Lockhart, Matthew Joseph Cole, Xulei Liu

IPC: G10L15/22, G10L15/02, G10L15/08, G10L15/26

Abstract: A system and method for temporarily disabling keyword detection to avoid detection of machine-generated keywords. A local device may operate two keyword detectors. The first keyword detector operates on input audio data received by a microphone to capture keywords uttered by a user. In these instances, the keyword may be detected by the first detector and the audio data may be indicated for speech processing. The system may determine output audio data responsive to the input audio data. The local device may process the output audio data to determine that it also includes the keyword. The device may then disable the first keyword detector while the output audio data is played back by an audio speaker of the local device. Thus the local device may avoid detection of a keyword originating from the output audio. The first keyword detector may be reactivated after a time interval during which the keyword might be detectable in the output audio.

6.

Invention patent application
MULTI-LAYER KEYWORD DETECTION
Legal status: Under examination (published)

Publication number: US20200175989A1

Publication date: 2020-06-04

Application number: US16783826

Application date: 2020-02-06

Applicant: Amazon Technologies, Inc.

Inventor: Christopher Wayne Lockhart, Matthew Joseph Cole, Xulei Liu

IPC: G10L15/22, G10L15/02, G10L15/08, G10L15/30, G10L15/26

Abstract: A system and method for temporarily disabling keyword detection to avoid detection of machine-generated keywords. A local device may operate two keyword detectors. The first keyword detector operates on input audio data received by a microphone to capture keywords uttered by a user. In these instances, the keyword may be detected by the first detector and the audio data may be transmitted to a remote device for processing. The remote device may generate output audio data to be sent to the local device. The local device may process the output audio data to determine that it also includes the keyword. The device may then disable the first keyword detector while the output audio data is played back by an audio speaker of the local device. Thus the local device may avoid detection of a keyword originating from the output audio. The first keyword detector may be reactivated after a time interval during which the keyword might be detectable in the output audio.

7.

Granted invention patent
Merging optical character recognized text from frames of image data
Legal status: Active

Publication number: US09659224B1

Publication date: 2017-05-23

Application number: US14230471

Application date: 2014-03-31

Applicant: Amazon Technologies, Inc.

Inventor: Matthew Joseph Cole, Sonjeev Jahagirdar, Matthew Daniel Hart, David Paul Ramos, Ankur Datta, Utkarsh Prateek, Emilie Noelle McConville, Prashant Hegde, Avnish Sikka

IPC: G06K9/18, G06K9/00

CPC classification number: G06K9/18, G06K9/00979, G06K9/6292, G06K9/72, G06K2209/01, G06K9/00449, G06K9/00463, G06K9/00442

Abstract: Disclosed are techniques for merging optical character recognized (OCR'd) text from frames of image data. In some implementations, a device sends frames of image data to a server, where each frame includes at least a portion of a captured textual item. The server performs optical character recognition (OCR) on the image data of each frame. When OCR'd text from respective frames is returned to the device from the server, the device can perform matching operations on the text, for instance, using bounding boxes and/or edit distance processing. The device can merge any identified matches of OCR'd text from different frames. The device can then display the merged text with any corrections.
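
A minimal Python sketch of matching and merging OCR'd words across two frames is shown below. It assumes both frames share one coordinate system and that each word carries a `confidence` score; those assumptions, the word record layout, and the rule of keeping the higher-confidence reading are illustrative and not taken from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def boxes_overlap(a, b):
    """True when two (left, top, right, bottom) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_frames(frame_a, frame_b, max_distance=2):
    """Merge word lists from two frames, keeping the higher-confidence reading."""
    merged = []
    unmatched_b = list(frame_b)
    for word_a in frame_a:
        match = None
        for word_b in unmatched_b:
            if (boxes_overlap(word_a["box"], word_b["box"])
                    and edit_distance(word_a["text"], word_b["text"]) <= max_distance):
                match = word_b
                break
        if match is None:
            merged.append(word_a)
        else:
            unmatched_b.remove(match)
            merged.append(max(word_a, match, key=lambda w: w["confidence"]))
    # Words seen only in the second frame are kept as-is.
    merged.extend(unmatched_b)
    return merged
```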

8.

Granted invention patent
Hybrid optical character recognition
Legal status: Active

Publication number: US09305227B1

Publication date: 2016-04-05

Application number: US14139752

Application date: 2013-12-23

Applicant: Amazon Technologies, Inc.

Inventor: Rakesh Madhavan Nambiar, Sonjeev Jahagirdar, Matthew Joseph Cole, Matias Omar Gregorio Benitez, Junxiong Jia, David Paul Ramos

IPC: G06K9/18

CPC classification number: G06K9/18, G06K9/00979, G06K9/6292, G06K2209/01

Abstract: Embodiments of the subject technology provide for a hybrid OCR approach which combines server and device side processing that can offset disadvantages of performing OCR solely on the server side or the device side. More specifically, the subject technology utilizes image characteristics such as glyph details and image quality measurements to opportunistically schedule OCR processing on the mobile device and/or server. In this regard, text extracted by a “faster” OCR engine (e.g., one with less latency) is displayed to a user, which is then updated by the result of a more accurate OCR engine (e.g., an OCR engine provided by the server). This approach allows factoring in additional parameters such as network latency and user preference for making scheduling decisions. Thus, the subject technology may provide significant gains in terms of reduced latency and increased accuracy by implementing one or more techniques associated with this hybrid OCR approach.
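
The scheduling decision can be pictured with a small Python sketch that returns the engines to run, in order: a fast device pass first, then a server pass that updates the displayed text. The thresholds, the `sharpness` score in the range 0 to 1, and the `prefer_speed` flag are hypothetical stand-ins for the glyph details, image quality measurements, and user preference mentioned in the abstract.

```python
# Hypothetical thresholds chosen only for illustration.
MIN_GLYPH_HEIGHT_PX = 12
MIN_SHARPNESS = 0.5
MAX_LATENCY_MS = 500


def schedule_ocr(glyph_height_px, sharpness, network_latency_ms, prefer_speed=False):
    """Return the OCR engines to run, in the order their results are shown."""
    device_usable = glyph_height_px >= MIN_GLYPH_HEIGHT_PX and sharpness >= MIN_SHARPNESS
    server_usable = network_latency_ms <= MAX_LATENCY_MS

    plan = []
    if device_usable:
        # Low-latency result displayed first.
        plan.append("device")
    if server_usable and not (prefer_speed and device_usable):
        # More accurate result that later updates the displayed text.
        plan.append("server")
    # With a poor image and a slow network, fall back to a best-effort device pass.
    return plan or ["device"]
```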

9.

Granted invention patent
Providing additional information for text in an image
Legal status: Active

Publication number: US09165186B1

Publication date: 2015-10-20

Application number: US14291493

Application date: 2014-05-30

Applicant: Amazon Technologies, Inc.

Inventor: David Paul Ramos, Matthew Joseph Cole, Matthew Daniel Hart

IPC: G06K9/00

CPC classification number: G06K9/00442, G06F17/241, G06F17/2785, G06K9/00979, G06K9/03, G06K2209/01

Abstract: Disclosed are techniques for providing additional information for text in an image. In some implementations, a computing device receives an image including text. Optical character recognition (OCR) is performed on the image to produce recognized text. One or more topics corresponding to the recognized text are determined. A word or a phrase is selected from the recognized text for providing additional information. One or more potential meanings of the selected word or phrase are determined. One of the potential meanings is selected using the one or more topics. A source of additional information corresponding to the selected meaning is selected for providing the additional information to a user's device.
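
Selecting a meaning by topic can be sketched in Python as a simple overlap count. The `MEANINGS` table, its topic tags, and the source labels are invented for illustration; the abstract does not say how meanings or sources are stored or scored.

```python
# Hypothetical table of potential meanings, each tagged with topics and a source.
MEANINGS = {
    "java": [
        {"meaning": "programming language",
         "topics": {"software", "computing"},
         "source": "encyclopedia"},
        {"meaning": "island in Indonesia",
         "topics": {"travel", "geography"},
         "source": "travel guide"},
    ],
}


def select_meaning(word, topics):
    """Pick the candidate meaning whose topic tags overlap most with the text's topics."""
    candidates = MEANINGS.get(word.lower(), [])
    if not candidates:
        return None
    topic_set = set(topics)
    return max(candidates, key=lambda c: len(c["topics"] & topic_set))
```

The `source` field of the returned entry then indicates where the additional information would be fetched from.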

10.

Granted invention patent
Multi-layer keyword detection
Legal status: Active

Publication number: US11250851B2

Publication date: 2022-02-15

Application number: US16783826

Application date: 2020-02-06

Applicant: Amazon Technologies, Inc.

Inventor: Christopher Wayne Lockhart, Matthew Joseph Cole, Xulei Liu

IPC: G10L15/02, G10L15/08, G10L15/22, G10L15/26, G10L15/30, G10L15/32

Abstract: A system and method for temporarily disabling keyword detection to avoid detection of machine-generated keywords. A local device may operate two keyword detectors. The first keyword detector operates on input audio data received by a microphone to capture keywords uttered by a user. In these instances, the keyword may be detected by the first detector and the audio data may be transmitted to a remote device for processing. The remote device may generate output audio data to be sent to the local device. The local device may process the output audio data to determine that it also includes the keyword. The device may then disable the first keyword detector while the output audio data is played back by an audio speaker of the local device. Thus the local device may avoid detection of a keyword originating from the output audio. The first keyword detector may be reactivated after a time interval during which the keyword might be detectable in the output audio.
