Language agnostic phonetic entity resolution

    Publication No.: US11157696B1

    Publication date: 2021-10-26

    Application No.: US16017313

    Filing date: 2018-06-25

    Abstract: Techniques for performing entity resolution as part of natural language understanding processing are described. During offline operations, a system may convert text (representing entities known to the system) into audio of various languages. The languages into which the text is converted may depend on the location where the entity is likely to be spoken by users of the system. At runtime, the system processes a user input using text-based entity resolution. If text-based entity resolution fails, the system may identify user speech corresponding to an entity to be resolved, and attempt to phonetically match the user speech to the audio of the known entities. Results of the phonetic entity resolution may then be used by downstream components, such as skills.
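
The two-stage flow in this abstract (text-based resolution first, phonetic matching as a fallback) can be sketched as follows. This is a minimal illustration, not the patented method: the patent compares user speech against synthesized audio, whereas this sketch substitutes a crude spelling-based `phonetic_key` and `difflib.SequenceMatcher`. The names `phonetic_key`, `CATALOG`, `resolve`, the sample entities, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def phonetic_key(text: str) -> str:
    """Crude, hypothetical phonetic normalisation (a stand-in for audio features)."""
    s = text.lower()
    for src, dst in (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s"), ("y", "i")):
        s = s.replace(src, dst)
    out = []
    for ch in s:
        # Keep letters only and collapse immediate repeats ("ll" -> "l").
        if ch.isalpha() and (not out or out[-1] != ch):
            out.append(ch)
    return "".join(out)


# Offline step: precompute a phonetic key for every entity known to the system.
CATALOG = {name: phonetic_key(name) for name in ("Phil Collins", "Beyonce", "Coldplay")}


def resolve(mention: str, threshold: float = 0.8):
    """Return (entity, method); method is 'text', 'phonetic', or 'unresolved'."""
    # Stage 1: text-based resolution (case-insensitive exact match).
    for name in CATALOG:
        if name.lower() == mention.lower():
            return name, "text"
    # Stage 2: fall back to the closest phonetic key, if it is similar enough.
    key = phonetic_key(mention)
    best, best_score = None, 0.0
    for name, known_key in CATALOG.items():
        score = SequenceMatcher(None, key, known_key).ratio()
        if score > best_score:
            best, best_score = name, score
    if best_score >= threshold:
        return best, "phonetic"
    return None, "unresolved"
```

With this catalog, `resolve("Fil Kollins")` fails the text stage but matches "Phil Collins" phonetically, because both spellings reduce to the same key.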

    Text recognition near an edge

    Type: granted invention patent (in force)

    Publication No.: US09239961B1

    Publication date: 2016-01-19

    Application No.: US14495589

    Filing date: 2014-09-24

    CPC classification number: G06K9/00456 G06K9/2081 G06K9/325

    Abstract: The recognition of text in an acquired image is improved by using general and type-specific heuristics that can determine the likelihood that a portion of the text is truncated at an edge of an image, frame, or screen. Truncated text can be filtered such that the user is not provided with an option to perform an undesirable task, such as to dial an incorrect number or connect to an incorrect Web address, based on recognizing an incomplete text string. The general and type-specific heuristics can be combined to improve confidence, and the image data can be pre-processed on the device before processing with an optical character recognition (OCR) engine. Multiple frames can be analyzed to attempt to recognize words or characters that might have been truncated in one or more of the frames.
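
One way to picture the combination of a general heuristic with a type-specific one is the sketch below. It is an assumption-laden illustration rather than the patent's algorithm: `TextBox`, `near_edge`, `looks_complete_phone`, `is_actionable`, the 8-pixel margin, and the 10-digit rule for phone numbers are all hypothetical choices.

```python
import re
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    left: int   # leftmost pixel column of the recognized text
    right: int  # rightmost pixel column of the recognized text


def near_edge(box: TextBox, frame_width: int, margin: int = 8) -> bool:
    """General heuristic: the box lies within `margin` pixels of a vertical frame edge."""
    return box.left <= margin or box.right >= frame_width - margin


def looks_complete_phone(text: str) -> bool:
    """Type-specific heuristic (assumed): exactly 10 digits counts as a complete number."""
    return len(re.sub(r"\D", "", text)) == 10


def is_actionable(box: TextBox, frame_width: int) -> bool:
    """Offer the 'dial' action only if neither heuristic suggests truncation."""
    return looks_complete_phone(box.text) and not near_edge(box, frame_width)
```

This combination is deliberately conservative: a number that hugs the frame edge is held back even when it has ten digits, since more digits could lie outside the frame.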

    Providing additional information for text in an image

    Publication No.: US10216989B1

    Publication date: 2019-02-26

    Application No.: US14884068

    Filing date: 2015-10-15

    Abstract: Disclosed are techniques for providing additional information for text in an image. In some implementations, a computing device receives an image including text. Optical character recognition (OCR) is performed on the image to produce recognized text. A word or a phrase is selected from the recognized text for providing additional information. One or more potential meanings of the selected word or phrase are determined. One of the potential meanings is selected based on other text in the image. A source of additional information corresponding to the selected meaning is selected for providing the additional information to a user's device.
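
The step of selecting one potential meaning based on other text in the image can be sketched as a keyword-overlap vote. The abstract does not say how the selection is made, so this is only one plausible reading; the `MEANINGS` table, its keywords, and `pick_meaning` are hypothetical.

```python
# Hypothetical sense inventory: each meaning is described by a few context keywords.
MEANINGS = {
    "jaguar": {
        "animal": {"cat", "jungle", "wildlife", "zoo"},
        "car": {"engine", "dealer", "sedan", "drive"},
    }
}


def pick_meaning(word: str, other_text: str):
    """Choose the meaning whose keywords overlap most with the other recognized text."""
    context = set(other_text.lower().split())
    candidates = MEANINGS.get(word.lower(), {})
    if not candidates:
        return None
    # On a tie, max() keeps the first meaning listed for the word.
    return max(candidates, key=lambda meaning: len(candidates[meaning] & context))
```

The selected meaning would then determine which source of additional information is consulted.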

    Text detection using features associated with neighboring glyph pairs

    Type: granted invention patent (in force)

    Publication No.: US09367736B1

    Publication date: 2016-06-14

    Application No.: US14842125

    Filing date: 2015-09-01

    Abstract: A multi-orientation text detection method and associated system is disclosed that utilizes orientation-variant glyph features to determine a text line in an image regardless of an orientation of the text line. Glyph features are determined for each glyph in an image with respect to a neighboring glyph. The glyph features are provided to a learned classifier that outputs a glyph pair score for each neighboring glyph pair. Each glyph pair score indicates a likelihood that the corresponding pair of neighboring glyphs form part of a same text line. The glyph pair scores are used to identify candidate text lines, which are then ranked to select a final set of text lines in the image.
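
The path from glyph pair scores to ranked candidate text lines can be sketched as below. The learned classifier is out of scope here, so its scores are passed in directly. Grouping with union-find, the 0.5 threshold, ranking by mean pair score, and the name `candidate_lines` are assumptions made for illustration, not details taken from the patent.

```python
def candidate_lines(num_glyphs, pair_scores, threshold=0.5):
    """Group glyphs into candidate lines and rank them by mean pair score.

    pair_scores maps (glyph_a, glyph_b) index pairs to a classifier score in [0, 1].
    """
    parent = list(range(num_glyphs))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every neighboring pair the classifier considers part of the same line.
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for glyph in range(num_glyphs):
        groups.setdefault(find(glyph), []).append(glyph)
    lines = [sorted(members) for members in groups.values() if len(members) > 1]

    def mean_score(line):
        members = set(line)
        used = [s for (a, b), s in pair_scores.items() if a in members and b in members]
        return sum(used) / len(used)

    return sorted(lines, key=mean_score, reverse=True)
```

For example, with scores `{(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.6}` over five glyphs, the weak link between glyphs 2 and 3 splits the glyphs into two candidate lines.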

    Optimizing pre-processing times for faster response

    Type: granted invention patent (in force)

    Publication No.: US09262689B1

    Publication date: 2016-02-16

    Application No.: US14133347

    Filing date: 2013-12-18

    CPC classification number: G06K9/34 G06K9/325 G06K2209/01

    Abstract: Embodiments of the subject technology provide for determining a region of a first acquired image based at least on a viewing mode and a set of respective positions of graphical elements to decrease the pre-processing time and perceived latency for the first image. One or more regions of text in the first image are detected, and a set of regions of text that overlap with the region of the image is determined and pre-processed. The subject technology may then pre-process an entirety of a subsequent image (e.g., to pick up missing text from the region of the first image). Thus, additional OCR results may be provided to the user by using the subsequent image(s) and merging subsequent results with previous results from the first image.
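
The region selection described here amounts to a rectangle-overlap filter on the first frame and no filter afterwards. A minimal sketch, assuming axis-aligned rectangles given as `(left, top, right, bottom)` tuples; `overlaps` and `regions_to_preprocess` are hypothetical names.

```python
def overlaps(a, b):
    """True if two axis-aligned rectangles (left, top, right, bottom) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def regions_to_preprocess(text_regions, region_of_interest, first_frame):
    """First frame: only text regions overlapping the region of interest.
    Subsequent frames: every detected text region (the whole image)."""
    if not first_frame:
        return list(text_regions)
    return [r for r in text_regions if overlaps(r, region_of_interest)]
```

Restricting the first frame to the region of interest is what reduces the pre-processing time; text skipped there can still be picked up from later frames.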

    Hybrid optical character recognition

    Type: granted invention patent (in force)

    Publication No.: US09305227B1

    Publication date: 2016-04-05

    Application No.: US14139752

    Filing date: 2013-12-23

    CPC classification number: G06K9/18 G06K9/00979 G06K9/6292 G06K2209/01

    Abstract: Embodiments of the subject technology provide for a hybrid OCR approach which combines server and device side processing that can offset disadvantages of performing OCR solely on the server side or the device side. More specifically, the subject technology utilizes image characteristics such as glyph details and image quality measurements to opportunistically schedule OCR processing on the mobile device and/or server. In this regard, text extracted by a “faster” OCR engine (e.g., one with less latency) is displayed to a user, which is then updated by the result of a more accurate OCR engine (e.g., an OCR engine provided by the server). This approach allows factoring in additional parameters such as network latency and user preference for making scheduling decisions. Thus, the subject technology may provide significant gains in terms of reduced latency and increased accuracy by implementing one or more techniques associated with this hybrid OCR approach.
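
The scheduling decision could look roughly like the sketch below, which picks OCR engines from image characteristics, network latency, and a user preference. Every threshold and name here (`schedule_ocr`, `text_to_display`, `blur_score`, the 1500 ms cutoff) is a hypothetical placeholder; the abstract does not specify the decision rule.

```python
def schedule_ocr(glyph_count, blur_score, network_latency_ms, prefer_offline=False):
    """Return the OCR engines to run, in the order their results are shown."""
    # Assumed rule: skip the server when the user opts out or the network is slow.
    if prefer_offline or network_latency_ms > 1500:
        return ["device"]
    # Assumed rule: dense or blurry images benefit from the more accurate server engine.
    if glyph_count > 200 or blur_score > 0.6:
        return ["device", "server"]
    return ["device"]


def text_to_display(device_text, server_text=None):
    """Show the fast device result first; replace it once the server result arrives."""
    return server_text if server_text is not None else device_text
```

Returning both engines models the "display fast, then update" behavior: the device text appears immediately and is overwritten when the server responds.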

    Text detection near display screen edge

    Type: granted invention patent (in force)

    Publication No.: US09286683B1

    Publication date: 2016-03-15

    Application No.: US13865114

    Filing date: 2013-04-17

    Inventor: David Paul Ramos

    CPC classification number: G06T7/004 G06K9/00979 G06K9/03 G06K9/228 G06K9/325

    Abstract: Approaches to enable a computing device, such as a phone or tablet computer, to detect when text contained in an image captured by the camera is sufficiently close to the edge of the screen and to infer whether the text is likely to be cut off by the edge of the screen such that the text contained in the image is incomplete. If the incomplete text corresponds to actionable text associated with a function that can be invoked on the computing device, the computing device may wait until the remaining portion of the actionable text is captured by the camera and made available for processing before invoking the corresponding function on the computing device.
