LOCAL IMAGE ENHANCEMENT FOR TEXT RECOGNITION
    11.
    Invention application
    Status: in force

    Publication No.: US20140270528A1

    Publication date: 2014-09-18

    Application No.: US13800951

    Filing date: 2013-03-13

    Abstract: Various embodiments enable regions of text to be identified in an image captured by a camera of a computing device for preprocessing before being analyzed by a visual recognition engine. For example, each of the identified regions can be analyzed or tested to determine whether a respective region contains a quality associated with poor text recognition results, such as poor contrast, blur, noise, and the like, which can be measured by one or more algorithms. Upon identifying a region with such a quality, an image quality enhancement can be automatically applied to the respective region without user instruction or intervention. Accordingly, once each region has been cleared of the quality associated with poor recognition, the regions of text can be processed with a visual recognition algorithm or engine.
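
The per-region test-then-enhance flow described in this abstract might be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the quality test is reduced to a simple contrast range, the enhancement is a linear contrast stretch, and the names `contrast_range`, `stretch_contrast`, `preprocess_regions`, and the `min_contrast` threshold are all hypothetical.

```python
from typing import List

Region = List[List[int]]  # grayscale pixel rows, values 0..255


def contrast_range(region: Region) -> int:
    """Spread between the darkest and brightest pixel in the region."""
    pixels = [p for row in region for p in row]
    return max(pixels) - min(pixels)


def stretch_contrast(region: Region) -> Region:
    """Linearly rescale pixel values so they span the full 0..255 range."""
    pixels = [p for row in region for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # flat region: nothing to stretch
        return [row[:] for row in region]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in region]


def preprocess_regions(regions: List[Region], min_contrast: int = 100) -> List[Region]:
    """Enhance only regions that fail the contrast test (threshold is assumed)."""
    return [
        stretch_contrast(r) if contrast_range(r) < min_contrast else r
        for r in regions
    ]


low = [[100, 110], [120, 130]]  # range 30: flagged as poor contrast
good = [[0, 255], [30, 200]]    # range 255: passed through untouched
enhanced = preprocess_regions([low, good])
print(enhanced[0])  # [[0, 85], [170, 255]]
```

Only the flagged region is modified; regions that already pass the test go to the recognition engine unchanged, which matches the "apply enhancement only where needed" idea in the abstract. Blur and noise tests would need their own measures and are left out here.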

    Processing complex utterances for natural language understanding

    Publication No.: US11410646B1

    Publication date: 2022-08-09

    Application No.: US16368399

    Filing date: 2019-03-28

    Abstract: A system capable of performing natural language understanding (NLU) on utterances including complex command structures such as sequential commands (e.g., multiple commands in a single utterance), conditional commands (e.g., commands that are only executed if a condition is satisfied), and/or repetitive commands (e.g., commands that are executed until a condition is satisfied). Audio data may be processed using automatic speech recognition (ASR) techniques to obtain text. The text may then be processed using machine learning models that are trained to parse text of incoming utterances. The models may identify complex utterance structures and may identify what command portions of an utterance go with what conditional statements. Machine learning models may also identify what data is needed to determine when the conditionals are true so the system may cause the commands to be executed (and stopped) at the appropriate times.
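
To make the three command structures concrete, here is a toy sketch of the kind of structured output such a parser could produce. It swaps in hand-written rules for the trained machine learning models the abstract describes, so it only illustrates the target data shape. `Command`, `run_if`, `run_until`, and `parse_utterance` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Command:
    action: str
    run_if: Optional[str] = None     # condition that must hold before running
    run_until: Optional[str] = None  # condition that stops a repeated command


def parse_utterance(text: str) -> List[Command]:
    """Rule-based stand-in for the trained parsing models (assumption)."""
    text = text.strip().rstrip(".").lower()
    run_if = None
    # Conditional: "if <condition>, <commands>"
    match = re.match(r"if (.+?), (.+)", text)
    if match:
        run_if, text = match.group(1), match.group(2)
    commands = []
    # Sequential: split on simple connectives between commands.
    for part in re.split(r"\s+and then\s+|\s+then\s+|\s+and\s+", text):
        # Repetitive: "<action> until <condition>"
        action, _, until = part.partition(" until ")
        commands.append(Command(action.strip(), run_if, until.strip() or None))
    return commands


for cmd in parse_utterance(
    "If it is raining, close the windows and play music until I get home."
):
    print(cmd)
```

Each command carries the condition it depends on, which is the association between command portions and conditional statements that the abstract assigns to the models. The rules here break on conditions that themselves contain "and", one reason a learned parser is used in practice.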

    Graphical refinement for points of interest
    14.
    Granted invention patent
    Status: in force

    Publication No.: US09269011B1

    Publication date: 2016-02-23

    Application No.: US13764646

    Filing date: 2013-02-11

    Abstract: Various embodiments crowd source images to cover various angles, zoom levels, and elevations of objects and/or points of interest (POIs) while under various lighting conditions. The crowd sourced images are tagged or associated with a particular POI or geographic location and stored in a database for use by an augmented reality (AR) application to recognize objects appearing in a live view of a scene captured by at least one camera of a computing device. The more comprehensive the database, the more accurately an object or POI in the scene will be recognized and/or tracked by the AR application. Accordingly, the more accurately an object is recognized and tracked by the AR application, the more smoothly and continuously the content and movement transitions thereof can be presented to users in the live view.

    Using a front-facing camera to improve OCR with a rear-facing camera
    15.
    Granted invention patent
    Status: in force

    Publication No.: US09269009B1

    Publication date: 2016-02-23

    Application No.: US14283115

    Filing date: 2014-05-20

    Abstract: Various embodiments enable a computing device to incorporate frame selection or preprocessing techniques into a text recognition pipeline in an attempt to improve text recognition accuracy in various environments and situations. For example, a mobile computing device can capture images of text using a first camera, such as a rear-facing camera, while capturing images of the environment or a user with a second camera, such as a front-facing camera. Based on the images captured of the environment or user, one or more image preprocessing parameters can be determined and applied to the captured images in an attempt to improve text recognition accuracy.
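
One way to picture the two-camera idea is the sketch below. It assumes the front-facing frame is used only to estimate ambient brightness, and that the resulting preprocessing parameter is a gamma value applied to the rear-facing text image. The abstract does not specify either choice, and the thresholds and the names `mean_brightness`, `choose_gamma`, and `apply_gamma` are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale pixel rows, values 0..255


def mean_brightness(frame: Frame) -> float:
    """Average pixel value, used here as a crude ambient-light estimate."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def choose_gamma(ambient: float) -> float:
    """Hypothetical mapping from ambient brightness to a gamma value."""
    if ambient < 64:
        return 0.5  # dark surroundings: brighten the text image
    if ambient > 192:
        return 2.0  # very bright surroundings: darken it
    return 1.0      # otherwise leave the image unchanged


def apply_gamma(frame: Frame, gamma: float) -> Frame:
    """Standard gamma correction on values normalized to 0..1."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in frame]


front_frame = [[20, 40], [30, 30]]  # front camera: dim environment
rear_frame = [[0, 64], [255, 16]]   # rear camera: image containing text

gamma = choose_gamma(mean_brightness(front_frame))
corrected = apply_gamma(rear_frame, gamma)
print(gamma, corrected)
```

The point of the design is that the parameter comes from one camera and is applied to the other camera's image, so the text frame itself does not have to be analyzed to pick the correction.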

    Text recognition near an edge
    16.
    Granted invention patent
    Status: in force

    Publication No.: US09239961B1

    Publication date: 2016-01-19

    Application No.: US14495589

    Filing date: 2014-09-24

    CPC classification number: G06K9/00456 G06K9/2081 G06K9/325

    Abstract: The recognition of text in an acquired image is improved by using general and type-specific heuristics that can determine the likelihood that a portion of the text is truncated at an edge of an image, frame, or screen. Truncated text can be filtered such that the user is not provided with an option to perform an undesirable task, such as to dial an incorrect number or connect to an incorrect Web address, based on recognizing an incomplete text string. The general and type-specific heuristics can be combined to improve confidence, and the image data can be pre-processed on the device before processing with an optical character recognition (OCR) engine. Multiple frames can be analyzed to attempt to recognize words or characters that might have been truncated in one or more of the frames.
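
A rough sketch of combining a general heuristic with a type-specific one could look like this. It assumes the general heuristic is "the text box touches the frame edge" and the type-specific heuristic is a digit count for phone numbers (10 or 11 digits, a North American convention chosen only for illustration). `TextBox`, `touches_edge`, `looks_like_complete_phone`, `is_probably_truncated`, and the `margin` default are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    left: int   # x coordinate of the left edge, in pixels
    right: int  # x coordinate of the right edge, in pixels


def touches_edge(box: TextBox, frame_width: int, margin: int = 5) -> bool:
    """General heuristic: text that reaches the frame border may be cut off."""
    return box.left <= margin or box.right >= frame_width - margin


def looks_like_complete_phone(text: str) -> bool:
    """Type-specific heuristic: assumed complete at 10 or 11 digits."""
    digits = [c for c in text if c.isdigit()]
    return len(digits) in (10, 11)


def is_probably_truncated(box: TextBox, frame_width: int) -> bool:
    """Flag a box only when both heuristics point to truncation."""
    return touches_edge(box, frame_width) and not looks_like_complete_phone(box.text)


frame_width = 640
boxes = [
    TextBox("555-01", 590, 638),        # at the edge and too short
    TextBox("206-555-0100", 200, 400),  # well inside the frame
    TextBox("206-555-0100", 440, 638),  # at the edge but complete
]
actionable = [b.text for b in boxes if not is_probably_truncated(b, frame_width)]
print(actionable)  # ['206-555-0100', '206-555-0100']
```

Filtering before offering an action is what keeps the user from being prompted to dial a partial number. The multi-frame analysis mentioned in the abstract is not covered by this sketch.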
