Method of improving contrast for text extraction and recognition applications
    51.
    Granted invention patent (status: in force)

    Publication No.: US09171224B2

    Publication date: 2015-10-27

    Application No.: US14023306

    Filing date: 2013-09-10

    CPC classification number: G06K9/36

    Abstract: An electronic device and method receive (for example, from a memory) a grayscale image of a real-world scene captured by a camera of a mobile device. The electronic device and method also receive a color image from which the grayscale image is generated, wherein each color pixel is stored as a tuple of multiple components. The electronic device and method determine a new intensity for at least one grayscale pixel in the grayscale image, based on at least one component of a tuple of a color pixel located in correspondence to the at least one grayscale pixel. The determination may be done conditionally, by checking whether a local variance of intensities is below a predetermined threshold in a subset of grayscale pixels located adjacent to the at least one grayscale pixel, and selecting the component that provides the most local variance of intensities.
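
    The conditional step in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the 3x3 window (`radius=1`) and the threshold value are all assumptions, and images are plain nested lists.

```python
from statistics import pvariance


def local_values(plane, row, col, radius=1):
    """Collect the values in the window around (row, col), clipped at the borders."""
    height, width = len(plane), len(plane[0])
    return [
        plane[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]


def improve_contrast(gray, color, threshold=25.0, radius=1):
    """Return a copy of gray in which low-variance pixels take a color component.

    gray  : rows of intensities
    color : rows of tuples, one tuple of components per pixel
    threshold, radius : hypothetical tuning values, not taken from the patent
    """
    n_components = len(color[0][0])
    # Split the color image into one plane per tuple component.
    planes = [[[px[k] for px in row] for row in color] for k in range(n_components)]
    out = [list(row) for row in gray]
    for r in range(len(gray)):
        for c in range(len(gray[0])):
            if pvariance(local_values(gray, r, c, radius)) >= threshold:
                continue  # local contrast is already adequate, keep the pixel
            # Pick the component whose plane has the most local variance here.
            best = max(planes, key=lambda p: pvariance(local_values(p, r, c, radius)))
            out[r][c] = best[r][c]
    return out
```

    In a region where the grayscale conversion has flattened two differently colored areas to the same intensity, the sketch falls back to whichever single component still separates them.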


    Feature extraction and use with a probability density function (PDF) divergence metric
    52.
    Granted invention patent (status: in force)

    Publication No.: US09141874B2

    Publication date: 2015-09-22

    Application No.: US13789549

    Filing date: 2013-03-07

    CPC classification number: G06K9/4671 G06K9/4647 G06K2209/01

    Abstract: An image of the real world is processed to identify blocks as candidates to be recognized. Each block is subdivided into sub-blocks, and each sub-block is traversed to obtain counts, in a group for each sub-block. Each count in the group is a count either of the presence of transitions between intensity values of pixels or of the absence of transitions between intensity values of pixels. Hence, each pixel in a sub-block contributes to at least one of the counts in each group. The counts in a group for a sub-block are normalized, based at least on a total number of pixels in the sub-block. Vector(s) for each sub-block including such normalized counts may be compared with multiple predetermined vectors of corresponding symbols in a set, using any metric of divergence between probability density functions (e.g. Jensen-Shannon divergence metric). Whichever symbol has a predetermined vector that most closely matches the vector(s) is identified and stored.
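
    As a rough illustration of the idea, the sketch below builds one group of two normalized counts per sub-block and compares it to per-symbol template vectors with the Jensen-Shannon divergence. The left-neighbor definition of a transition, the two-element vector, and the names `transition_features` and `classify` are assumptions made for this example; the abstract does not fix them.

```python
import math


def transition_features(sub_block):
    """Return [share of transition pixels, share of non-transition pixels]."""
    transitions = 0
    total = 0
    for row in sub_block:
        for i, value in enumerate(row):
            total += 1
            # Assumption: a pixel is a transition if it differs from its left neighbor.
            if i > 0 and value != row[i - 1]:
                transitions += 1
    # Every pixel lands in exactly one count, so the normalized counts sum to 1.
    return [transitions / total, (total - transitions) / total]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 for identical inputs, at most 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def classify(vector, templates):
    """Return the symbol whose template vector diverges least from vector."""
    return min(templates, key=lambda symbol: js_divergence(vector, templates[symbol]))
```

    Because the counts are normalized by the number of pixels, each vector behaves like a discrete probability distribution, which is what makes a PDF divergence metric applicable.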


    Lower modifier detection and extraction from devanagari text images to improve OCR performance
    53.
    Granted invention patent (status: in force)

    Publication No.: US09064191B2

    Publication date: 2015-06-23

    Application No.: US13791188

    Filing date: 2013-03-08

    CPC classification number: G06K9/78 G06K9/32 G06K2209/01 G06K2209/013

    Abstract: Systems, apparatus and methods are presented for extracting lower modifiers from a word image, before performing optical character recognition (OCR), based on a plurality of tests comprising a first test, a second test and a third test. The method obtains the word image and performs the plurality of tests. The first test determines whether a vertical line spanning the height of the word image exists. The second test determines whether a jump of a number of components in the lower portion of the word image exists. The third test determines sparseness in a lower portion of the word image. The plurality of tests may run sequentially and/or in parallel. Results from the plurality of tests are used to decide whether a lower modifier exists by comparing and accumulating test results from the plurality of tests.
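
    The three tests and the accumulation step might look something like the sketch below on a binary word image (1 = ink). Much of it is assumed for illustration: horizontal ink runs stand in for components, the lower portion is the last `lower_rows` rows, the thresholds are arbitrary, and the way each test votes (including treating a full-height line as evidence against a lower modifier) is a guess rather than something the abstract states.

```python
def has_full_height_line(img):
    """Test 1: True if some column contains ink in every row of the word image."""
    return any(all(row[c] for row in img) for c in range(len(img[0])))


def runs_per_row(row):
    """Number of horizontal ink runs in a row (a cheap stand-in for components)."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))


def has_component_jump(img, lower_rows=3, min_jump=2):
    """Test 2: True if the run count changes sharply between adjacent lower rows."""
    counts = [runs_per_row(row) for row in img[-lower_rows:]]
    return any(abs(a - b) >= min_jump for a, b in zip(counts, counts[1:]))


def is_lower_sparse(img, lower_rows=3, max_density=0.2):
    """Test 3: True if the lower portion contains little ink."""
    lower = img[-lower_rows:]
    ink = sum(sum(row) for row in lower)
    return ink / (len(lower) * len(lower[0])) <= max_density


def detect_lower_modifier(img, min_votes=2):
    """Accumulate the three test results; the vote polarity is an assumption."""
    votes = [
        not has_full_height_line(img),  # assumed: a full-height stroke leaves no room below
        has_component_jump(img),
        is_lower_sparse(img),
    ]
    return sum(votes) >= min_votes
```

    The tests are independent of one another, so they could run in any order or in parallel before the votes are combined.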


    Detecting and correcting skew in regions of text in natural images
    54.
    Granted invention patent (status: in force)

    Publication No.: US08831381B2

    Publication date: 2014-09-09

    Application No.: US13748562

    Filing date: 2013-01-23

    Abstract: An electronic device and method use a camera to capture an image of an environment outside the electronic device followed by identification of regions, based on pixel intensities in the image. At least one processor automatically computes multiple values of an indicator of skew in multiple regions in the image respectively. The multiple values are specific to the multiple regions, and thereafter used to determine whether unacceptable skew is present across the regions, e.g. globally in the image as a whole. When skew is determined to be unacceptable, user input is requested to correct the skew, e.g. by displaying on a screen, a symbol and receiving user input (e.g. by rotating an area of touch or rotating the electronic device) to align a direction of the symbol with a direction of the image, and then the process may repeat (e.g. capture image, detect skew, and if necessary request user input).


    REDUNDANT ASPECT RATIO DECODING OF DEVANAGARI CHARACTERS
    55.
    Invention patent application (status: under examination, published)

    Publication No.: US20140023275A1

    Publication date: 2014-01-23

    Application No.: US13844641

    Filing date: 2013-03-15

    Abstract: An electronic device and method receive a block sliced from a rectangular portion of an image of a real-world scene captured by a camera and use a property of the block to operate one of multiple optical character recognition (OCR) decoders. In an illustrative aspect, a first OCR decoder is configured to recognize characters whose property satisfies the test based on a first limit, the first limit being obtained by reducing a predetermined limit by an overlap amount. In this illustrative aspect, a second OCR decoder is configured to recognize characters whose property does not satisfy the test based on a second limit, the second limit being obtained by increasing the predetermined limit by the overlap amount. When the property of the block satisfies the test, the first OCR decoder is operated; otherwise the second OCR decoder is operated, resulting in candidates for a character being identified.
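
    The overlapping limits can be illustrated with a small sketch. It assumes the property is an aspect ratio and that the test is "aspect ratio greater than the limit"; the function names and the numeric values are hypothetical.

```python
def decoder_coverage(aspect_ratio, limit, overlap):
    """Return which decoders are configured to recognize a character with this ratio."""
    covered = set()
    # First decoder: test applied with the limit reduced by the overlap amount.
    if aspect_ratio > limit - overlap:
        covered.add("first")
    # Second decoder: characters failing the test with the limit increased by the overlap.
    if not aspect_ratio > limit + overlap:
        covered.add("second")
    return covered


def select_decoder(aspect_ratio, limit):
    """At run time, a single decoder is chosen using the unmodified limit."""
    return "first" if aspect_ratio > limit else "second"
```

    Characters whose ratio falls within the overlap band around the limit are covered by both decoders, so a block measured slightly on the wrong side of the limit is still sent to a decoder that knows the character.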


    AUTOMATIC CORRECTION OF SKEW IN NATURAL IMAGES AND VIDEO
    56.
    Invention patent application (status: in force)

    Publication No.: US20140022406A1

    Publication date: 2014-01-23

    Application No.: US13831237

    Filing date: 2013-03-14

    Abstract: An electronic device and method use a camera to capture an image of an environment outside followed by identification of regions therein. A subset of the regions is selected, based on attributes of the regions, such as aspect ratio, height, and variance in stroke width. Next, a number of angles that are candidates for use as skew of the image are determined (e.g. one angle is selected for each region, based on peakiness of a histogram of the region, evaluated at different angles). Then, an angle that is most common among these candidates is identified as the angle of skew of the image. The just-described identification of skew angle is performed prior to classification of any region as text or non-text. After skew identification, at least all regions in the subset are rotated by the negative of the skew angle, to obtain skew-corrected regions for use in optical character recognition.
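
    A minimal sketch of the voting scheme follows, assuming each region is a list of (x, y) ink coordinates, the histogram is a count of pixels per row after rotation, and peakiness is the share of pixels in the tallest bin. These definitions, the candidate angle grid, and the `synthetic_region` helper are illustrative choices, not details from the abstract.

```python
import math
from collections import Counter

CANDIDATE_ANGLES = tuple(range(-30, 31, 5))  # degrees; hypothetical search grid


def peakiness(points, angle_deg):
    """Share of pixels in the tallest row-histogram bin after rotating by -angle_deg."""
    t = math.radians(-angle_deg)
    rows = Counter(round(x * math.sin(t) + y * math.cos(t)) for x, y in points)
    return max(rows.values()) / len(points)


def region_skew(points, candidates=CANDIDATE_ANGLES):
    """Candidate angle for one region: the angle whose histogram is peakiest."""
    return max(candidates, key=lambda angle: peakiness(points, angle))


def image_skew(regions, candidates=CANDIDATE_ANGLES):
    """Skew of the image: the most common per-region candidate angle."""
    votes = Counter(region_skew(points, candidates) for points in regions)
    return votes.most_common(1)[0][0]


def deskew(points, skew_deg):
    """Rotate a region by the negative of the skew angle."""
    t = math.radians(-skew_deg)
    return [
        (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
        for x, y in points
    ]


def synthetic_region(angle_deg, length=40, gap=10):
    """Test data: two parallel strokes drawn at the given skew angle."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, x * slope + dy) for x in range(length) for dy in (0, gap)]
```

    Taking the most common angle rather than an average keeps a single oddly oriented region from pulling the estimate away from the angle most regions agree on.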


    Machine learning based rate-distortion optimizer for video compression

    Publication No.: US11496746B2

    Publication date: 2022-11-08

    Application No.: US17165680

    Filing date: 2021-02-02

    Abstract: Systems and techniques are described for data encoding using a machine learning approach to generate a distortion prediction D̂ and a predicted bit rate R̂, and to use D̂ and R̂ to perform rate-distortion optimization (RDO). For example, a video encoder can generate the distortion prediction D̂ and a bit rate residual prediction based on outputs of one or more neural networks in response to the one or more neural networks receiving a residual portion of a block of a video frame as input. The video encoder can determine a bit rate metadata prediction based on metadata associated with a mode of compression, and determine R̂ to be the sum of the bit rate residual prediction and the bit rate metadata prediction. The video encoder can determine a rate-distortion cost prediction Ĵ as a function of D̂ and R̂, and can determine a prediction mode for compressing the block based on Ĵ.
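
    The mode decision can be sketched as below. The abstract only says the cost is a function of the predicted distortion and predicted bit rate; the Lagrangian form (distortion plus lambda times rate) used here is the conventional RDO cost and is an assumption, as are the function names and the numbers. The neural networks are not modeled: their outputs are passed in as plain values.

```python
def predicted_rate(rate_residual, rate_metadata):
    """Predicted bit rate: residual prediction plus metadata prediction."""
    return rate_residual + rate_metadata


def rd_cost(distortion, rate, lam):
    """Assumed Lagrangian cost: predicted distortion + lam * predicted rate."""
    return distortion + lam * rate


def choose_mode(candidates, lam):
    """Pick the mode with the lowest predicted rate-distortion cost.

    candidates maps a mode name to (distortion, rate_residual, rate_metadata).
    """
    def cost(mode):
        distortion, rate_residual, rate_metadata = candidates[mode]
        return rd_cost(distortion, predicted_rate(rate_residual, rate_metadata), lam)

    return min(candidates, key=cost)
```

    With a larger lambda the rate term dominates and cheaper-to-code modes win; with a smaller lambda the lower-distortion mode is preferred.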
