Methods and apparatus for processing, searching and displaying PDF documents using a browser

    Publication number: US12086197B2

    Publication date: 2024-09-10

    Application number: US17515251

    Filing date: 2021-10-29

    Inventor: Charlie Davis

    CPC classification number: G06F16/9538 G06F9/4881 G06F16/93 G06F40/106

    Abstract: Methods and apparatus are described for retrieving PDF documents, performing text extraction operations on portions or all of a retrieved document, and supporting search operations in a manner that allows search results to be quickly provided for at least portions of a PDF document being viewed. The methods and apparatus are particularly useful in applications, such as many applications executed by a browser, where the application is limited to a single processing thread and thus must perform all or many processing operations sequentially. By prioritizing document pages which are being viewed for text extraction even before a search is initiated, and by performing text extraction in small periods of time and storing the results, in many cases a user can be provided with text search results for a page being viewed in relatively little time and without the program, e.g., a JavaScript browser application, appearing non-responsive.
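
    Example (illustrative sketch, not taken from the patent): the abstract's idea of extracting text in short time slices, with viewed pages moved to the front of the queue, might look roughly like the following. The class name, the extract_page callback, and the time budget are hypothetical, and Python stands in for the browser's JavaScript. In this simplified version one page is the smallest unit of work.

```python
import time
from collections import deque


class TextExtractionScheduler:
    """Cooperative, single-threaded extraction queue (hypothetical helper)."""

    def __init__(self, page_count, extract_page, budget_s=0.01):
        self.pending = deque(range(page_count))  # pages not yet extracted
        self.extract_page = extract_page         # assumed callback: page index -> text
        self.budget_s = budget_s                 # time allowed per slice, in seconds
        self.text_by_page = {}                   # stored extraction results

    def prioritize(self, page):
        """Move the page being viewed to the front of the pending queue."""
        if page in self.text_by_page:
            return  # already extracted, nothing to do
        try:
            self.pending.remove(page)
        except ValueError:
            return  # unknown page index
        self.pending.appendleft(page)

    def run_slice(self):
        """Extract at least one page, stop once the time budget is spent."""
        deadline = time.monotonic() + self.budget_s
        while self.pending:
            page = self.pending.popleft()
            self.text_by_page[page] = self.extract_page(page)
            if time.monotonic() >= deadline:
                break  # hand control back so the UI stays responsive
        return not self.pending  # True once every page has been extracted

    def search(self, term):
        """Search only the pages whose text has been extracted so far."""
        return sorted(p for p, text in self.text_by_page.items() if term in text)
```

    A caller would invoke run_slice() repeatedly from its event loop, calling prioritize() whenever the viewed page changes, so that search() can answer for the visible page before the whole document has been processed.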

    Methods and apparatus for locating lines in images and using located lines to make image adjustments

    Patent type: Granted invention patent (in force)

    Publication number: US09336455B1

    Publication date: 2016-05-10

    Application number: US14618794

    Filing date: 2015-02-10

    Abstract: Methods and apparatus for identifying lines in an image are described. An image to be processed is divided into a plurality of tiles, and processing is performed on a per-tile basis. Lines are identified in tiles, and a weight is assigned to each line based on, among other things, the length of the line. Quantized first and second parameter values, e.g., values defining where a line enters and leaves an area, are used in defining the identified lines. A set of lines is selected based on the weight information and is output or used in processing the image that includes the lines.
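
    Example (illustrative sketch, not taken from the patent): one way to read the abstract is that each candidate line in a tile is described by quantized entry and exit points, weighted by its length, and that the heaviest lines are kept. The function names, the grid step, and the rule of summing the weights of candidates that share the same quantized parameters are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def quantize_point(point, step):
    """Snap an (x, y) point to a grid with the given step size."""
    return tuple(step * round(c / step) for c in point)


def select_lines(segments, step=4, keep=2):
    """Rank candidate lines within one tile and keep the heaviest ones.

    segments: iterable of (entry_point, exit_point) pairs, each an (x, y) tuple.
    Returns a list of ((quantized_entry, quantized_exit), weight) pairs.
    """
    weights = defaultdict(float)
    for entry, exit_ in segments:
        key = (quantize_point(entry, step), quantize_point(exit_, step))
        # Assumption: weight is the segment length; candidates that map to
        # the same quantized parameters reinforce each other.
        weights[key] += math.dist(entry, exit_)
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:keep]
```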

    Methods and apparatus for generating an efficient SVG file

    Publication number: US09886426B1

    Publication date: 2018-02-06

    Application number: US14670128

    Filing date: 2015-03-26

    Inventor: Garland S Taylor

    Abstract: An input SVG file to be processed is accessed. Reusable symbols in the input SVG file are identified, e.g., symbols which satisfy a symbol size requirement. A set of symbols is selected from among the identified reusable symbols for conversion to glyphs of a custom binary font, e.g., based on symbol occurrence frequency. A binary font file is created corresponding to the set of selected identified symbols in the SVG input file. An SVG output file is created including: binary font glyph definitions corresponding to the converted identified symbols, definitions of symbols from the SVG input file which have not been converted to glyphs, and information indicating where the glyphs and the symbols which were not converted are to be placed on an output display page. The generated SVG output file is a more efficient SVG file than the input SVG file. Different custom binary font files are created for different SVG input pages.
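
    Example (illustrative sketch, not taken from the patent): the selection step could be approximated by counting how often each symbol is placed, filtering by size, and keeping the most frequent symbols for conversion to glyphs. The function name, the reading of "reusable" as "placed more than once", and the treatment of the size requirement as an upper bound are assumptions.

```python
from collections import Counter


def select_symbols_for_font(symbol_uses, symbol_sizes, max_size, max_glyphs):
    """Pick symbols to convert into glyphs of a custom font.

    symbol_uses:  iterable of symbol ids, one entry per placement on the page
    symbol_sizes: dict mapping symbol id -> size (e.g. length of its path data)
    """
    counts = Counter(symbol_uses)
    # Assumption: a symbol is "reusable" if it is placed more than once and
    # meets the (hypothetical) maximum-size requirement.
    eligible = [
        s for s in counts
        if counts[s] > 1 and symbol_sizes[s] <= max_size
    ]
    # Most frequent first; ties broken by id so the result is deterministic.
    eligible.sort(key=lambda s: (-counts[s], s))
    return eligible[:max_glyphs]
```

    Symbols not returned by this function would stay in the output file as ordinary SVG symbol definitions.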

    Resource management methods and apparatus

    Publication number: US09860194B1

    Publication date: 2018-01-02

    Application number: US14925854

    Filing date: 2015-10-28

    CPC classification number: H04L47/826 H04L41/5009 H04L43/0817

    Abstract: Methods and apparatus for managing resource utilization in a distributed system are described. Devices, e.g., servers, which use resources, e.g., processing cores, act as individual policy enforcement points. Individual servers retrieve and maintain local copies of resource lease records which are stored in a centralized data storage system. Each server compares its locally stored lease records to the retrieved lease records to check for any tampering in the centralized data storage. Multiple states are supported to take into consideration transitory conditions and/or communications delays. Verification states include, e.g., a Pending Active state and a Pending Inactive state, in addition to an Active state and an Inactive state, to delay licensing enforcement to account for centralized storage system eventual consistency delays.
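
    Example (illustrative sketch, not taken from the patent): the four verification states can be modeled as a small state machine in which a change seen in the centralized store is only committed after it has been observed on several consecutive reads. The transition rules, the function name, and the required_reads parameter are assumptions; the tampering check itself is not shown.

```python
from enum import Enum


class LeaseState(Enum):
    INACTIVE = "Inactive"
    PENDING_ACTIVE = "Pending Active"
    ACTIVE = "Active"
    PENDING_INACTIVE = "Pending Inactive"


def next_state(state, lease_seen, streak, required_reads=3):
    """Advance the state after one read of the centralized store.

    lease_seen: True if the retrieved records contain this server's lease.
    streak:     consecutive reads that contradicted the committed state.
    Returns (new_state, new_streak).
    """
    if state in (LeaseState.INACTIVE, LeaseState.PENDING_ACTIVE):
        if not lease_seen:
            return LeaseState.INACTIVE, 0        # no lease visible: stay inactive
        streak += 1
        if streak >= required_reads:
            return LeaseState.ACTIVE, 0          # seen often enough: commit
        return LeaseState.PENDING_ACTIVE, streak
    # state is ACTIVE or PENDING_INACTIVE
    if lease_seen:
        return LeaseState.ACTIVE, 0              # lease visible again: stay active
    streak += 1
    if streak >= required_reads:
        return LeaseState.INACTIVE, 0            # missing long enough: enforce
    return LeaseState.PENDING_INACTIVE, streak
```

    The pending states give an eventually consistent store time to converge before a server starts or stops enforcing a lease.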

    Methods and apparatus relating to image binarization

    Publication number: US09704057B1

    Publication date: 2017-07-11

    Application number: US14637322

    Filing date: 2015-03-03

    Abstract: Image binarization methods and apparatus are described. A set of input image pixel values, e.g., a set of grayscale values corresponding to an input image, is processed to determine whether to recommend to use local binarization thresholds or a global binarization threshold. Edges including edge pixels are identified. A first histogram corresponding to edge pixel values and a second histogram corresponding to image pixel values are generated, subjected to one or more smoothing operations, and truncated, based on information derived from the edge histogram. Characteristics of the histograms including, e.g., minima, maxima, points of inflection, and hidden peaks, are determined, evaluated, and used to decide between local binarization thresholds and a global threshold. Based on the recommendation, a global threshold is used or local thresholds are used to process the set of input image pixel values and generate a corresponding set of bi-level values.

    Methods and apparatus for identifying labels and/or information associated with a label and/or using identified information

    Patent type: Granted invention patent (in force)

    Publication number: US09443139B1

    Publication date: 2016-09-13

    Application number: US14587858

    Filing date: 2014-12-31

    CPC classification number: G06K9/00469 G06K9/18 G06K9/2072 G06K9/726

    Abstract: Methods and apparatus for detecting labels included in a document or other binarized image, and for extracting and/or using information associated with a label, are described. A nodal structure modeling objects, e.g., characters, character strings or words, which make up various label aliases is described. The nodal structure is used to generate scores for portions of a binarized document, with the scores being used to determine the presence or absence of one or more label aliases. When a label alias is determined to be present, information is extracted from the document and used as information corresponding to the label to which the identified label alias corresponds. Multiple different label aliases may correspond to a single label, allowing multiple different aliases to be used to identify the same information. The label aliases and information extraction can be, and sometimes are, used to extract information from scanned forms.

    METHODS AND APPARATUS FOR DETECTING PARTITIONS IN TABLES AND USING PARTITION INFORMATION

    Publication number: US20220067360A1

    Publication date: 2022-03-03

    Application number: US17006783

    Filing date: 2020-08-28

    Inventor: John Reynolds

    Abstract: Methods and apparatus for training neural networks to identify information table partitions are described. Also described are methods and apparatus for using a trained neural network to process an image and provide partition information in an easy-to-use format. The format of the partition information is simple to interpret, easy to communicate, and uses values which facilitate successful training and recognition of partitions in tables, whether the partitions are implicitly defined by data arrangement or explicitly defined using lines. An image is treated as including a predetermined number of row and column portions. The neural network generates, for each predetermined portion, a partition present indicator value and a partition location value. The partition present value in some embodiments is a value in the range of 0 to 1, and the partition location value in some embodiments is a value in the range of −1 to +1.
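
    Example (illustrative sketch, not taken from the patent): the per-portion outputs could be decoded into pixel positions along one axis as shown below. The function name, the 0.5 presence threshold, and the interpretation of the location value as an offset from the portion's center (−1 at its leading edge, +1 at its trailing edge) are assumptions.

```python
def decode_partitions(present, location, image_extent, threshold=0.5):
    """Convert per-portion network outputs into partition positions in pixels.

    present:      partition-present values in [0, 1], one per portion
    location:     partition-location values in [-1, +1], one per portion
    image_extent: image height (for row portions) or width (for column portions)
    """
    portion = image_extent / len(present)  # size of one portion in pixels
    positions = []
    for i, (p, loc) in enumerate(zip(present, location)):
        if p >= threshold:
            center = (i + 0.5) * portion
            # Assumed mapping: loc scales half a portion around the center.
            positions.append(center + loc * portion / 2)
    return positions
```

    Under these assumptions, a 400-pixel-wide image split into four column portions with outputs present = [0.9, 0.1, 0.8, 0.2] and location = [0.0, 0.3, -1.0, 0.5] yields column partitions at x = 50 and x = 200.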

    Methods and apparatus for improving QR code locator detectability and/or finding the corners of a locator pattern

    Publication number: US10509934B1

    Publication date: 2019-12-17

    Application number: US15886734

    Filing date: 2018-02-01

    Inventor: John Reynolds

    Abstract: Various features relate to processing a scanned image to facilitate accurate locator pattern identification and/or detection of the corner locations of the locator pattern. In some embodiments, to facilitate the identification of corner points, the scanned image is processed to reduce the effect of noise and/or other damage on the subsequent location identification process. Individual white pixels which have black pixels on four sides are converted to black as part of the processing, while multiple white pixels adjacent to each other are left unaltered. In some embodiments, processing does not alter the color of black pixels. Corner points of the locator pattern are identified through additional processing and identification of line segments satisfying an expected black, white, black, white, black segment portion ratio.
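
    Example (illustrative sketch, not taken from the patent): the noise-reduction step and the run-length ratio test might be written as follows. The function names, the decision to skip border pixels, and the tolerance value are assumptions. The 1:1:3:1:1 ratio is the standard QR finder pattern proportion; the abstract only says "expected" ratio.

```python
def fill_isolated_white(img):
    """Turn lone white pixels surrounded by black on four sides into black.

    img: list of rows, where 1 = black and 0 = white. Returns a new image.
    Border pixels are left unchanged (assumption).
    """
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Neighbors are read from the original image, so adjacent white
            # pixels never qualify and black pixels are never altered.
            if (img[y][x] == 0 and img[y - 1][x] == 1 and img[y + 1][x] == 1
                    and img[y][x - 1] == 1 and img[y][x + 1] == 1):
                out[y][x] = 1
    return out


def matches_finder_ratio(runs, tolerance=0.5):
    """Check black, white, black, white, black run lengths against 1:1:3:1:1."""
    if len(runs) != 5 or min(runs) <= 0:
        return False
    module = sum(runs) / 7.0  # the five runs span 7 modules in total
    expected = (1, 1, 3, 1, 1)
    return all(
        abs(run - ratio * module) <= tolerance * module
        for run, ratio in zip(runs, expected)
    )
```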
