Resume Document Parsing using Computer Vision and Optical Character Recognition with Reblocking Feedback

    Publication No.: US20230215206A1

    Publication date: 2023-07-06

    Application No.: US17331463

    Filing date: 2021-05-26

    Applicant: Indeed, Inc.

    Abstract: Systems and methods are disclosed for parsing resume documents using computer vision and optical character recognition technology in combination with a user feedback interface system that facilitates user feedback to improve the overall processing quality of the resumes imported into computer resume processing systems. In at least one embodiment, the system and method prompt a user to upload an input resume document, which is processed with a first parsing pass to generate initial resume data by extracting a plurality of resume text blocks. Further processing identifies an initial set of bounding blocks and visually displays the initial resume data for user review and feedback, so that the user can regroup one or more of the initial set of bounding blocks into a regrouped bounding block. Additional processing consolidates into a group text block each of the resume text blocks corresponding to the regrouped one or more of the initial set of bounding blocks.
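
The regrouping step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `TextBlock` type, the `(left, top, right, bottom)` box layout, the reading-order sort, and the `regroup` helper are all hypothetical names and choices introduced here for illustration. It assumes the user has selected some of the initial bounding blocks, and it merges them into one bounding block while consolidating their text into a single group text block.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom)


@dataclass
class TextBlock:
    """Hypothetical container for one OCR text block and its bounding block."""
    text: str
    box: Box


def union_box(boxes: Sequence[Box]) -> Box:
    """Smallest box that encloses every box in `boxes`."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def regroup(blocks: List[TextBlock], selected: Sequence[int]):
    """Merge the user-selected blocks into one group text block.

    Returns the merged block and the list of blocks that were not selected.
    """
    picked = set(selected)
    chosen = [blocks[i] for i in picked]
    # Assumed reading order: top-to-bottom, then left-to-right.
    chosen.sort(key=lambda b: (b.box[1], b.box[0]))
    merged = TextBlock(
        text="\n".join(b.text for b in chosen),
        box=union_box([b.box for b in chosen]),
    )
    remaining = [b for i, b in enumerate(blocks) if i not in picked]
    return merged, remaining


blocks = [
    TextBlock("Experience", (10, 10, 100, 20)),
    TextBlock("Acme Corp, 2019-2021", (10, 25, 160, 35)),
    TextBlock("Education", (10, 60, 100, 70)),
]
# The user feedback step is simulated by selecting blocks 1 and 0.
merged, remaining = regroup(blocks, [1, 0])
```

Here the merged block carries the text of both selected blocks in reading order, and its box is the union of the two original boxes; the unselected "Education" block is left untouched.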

    SOCIAL MEDIA POST FACILITATION SYSTEMS AND METHODS

    Publication No.: US20190036866A1

    Publication date: 2019-01-31

    Application No.: US15948689

    Filing date: 2018-04-09

    Applicant: Upheaval LLC

    Inventor: David ISEMINGER

    Abstract: Methods and systems are provided in which an improved interface implements a synergistic hybrid of user interactions and automatic operations so that user input is elicited sparingly, making it possible to generate customized social media posts with unexpected speed relative to any art-known techniques. A draft post is pre-populated with a first keyword that identifies a machine-recognized aspect of a photograph, for example, and an event descriptor partly based on the capture location. After adding user text, a complete post is then ready for broadcast.

    COMPUTER, DOCUMENT IDENTIFICATION METHOD, AND SYSTEM

    Publication No.: US20180349693A1

    Publication date: 2018-12-06

    Application No.: US15918830

    Filing date: 2018-03-12

    Applicant: HITACHI, LTD.

    Abstract: A computer, which is configured to extract an attribute being a character string indicating a feature of a paper-based document, stores template information and dictionary information. The computer is configured to: execute character recognition processing on image data on the paper-based document; extract an attribute corresponding to each of the at least one type of attribute, which is defined in each of the plurality of templates, through use of a result of the character recognition processing and the plurality of templates; calculate a score regarding the extracted attribute for each of the plurality of templates; select one of the plurality of templates that has the highest extraction accuracy of the attribute based on the score; and generate output information through use of the selected template.
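
The extract, score, and select flow in this abstract can be sketched in a few lines of Python. The abstract does not say how attributes are extracted or how the score is computed, so everything specific here is an assumption made for illustration: the `TEMPLATES` dictionary, the regular-expression patterns, and a score defined as the fraction of a template's attributes that were found in the OCR text.

```python
import re

# Hypothetical templates: each maps an attribute type to a regex pattern.
TEMPLATES = {
    "invoice": {
        "invoice_no": r"Invoice No[.:]?\s*(\S+)",
        "total": r"Total:?\s*([\d.,]+)",
    },
    "receipt": {
        "receipt_no": r"Receipt No[.:]?\s*(\S+)",
        "amount": r"Amount:?\s*([\d.,]+)",
    },
}


def extract(ocr_text, template):
    """Extract every attribute of `template` that matches the OCR text."""
    found = {}
    for name, pattern in template.items():
        match = re.search(pattern, ocr_text)
        if match:
            found[name] = match.group(1)
    return found


def score(found, template):
    """Assumed score: fraction of the template's attributes that were found."""
    return len(found) / len(template)


def select_template(ocr_text, templates):
    """Return the best-scoring template name and its extracted attributes."""
    results = {name: extract(ocr_text, t) for name, t in templates.items()}
    best = max(results, key=lambda name: score(results[name], templates[name]))
    return best, results[best]


ocr_text = "Invoice No: A-17\nTotal: 1,250.00"
best, attributes = select_template(ocr_text, TEMPLATES)
```

For this sample text the invoice template matches both of its attributes and the receipt template matches none, so the invoice template is selected and its attributes form the output.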

    METHOD FOR LINE AND WORD SEGMENTATION FOR HANDWRITTEN TEXT IMAGES

    Publication No.: US20180330181A1

    Publication date: 2018-11-15

    Application No.: US16043010

    Filing date: 2018-07-23

    Inventor: Duanduan Yang

    Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontal neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.
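
The path-scoring step at the end of this abstract can be sketched as follows. This is only an illustration under stated assumptions: a character string stands in for the line image, the candidate cut positions are given rather than detected from connected regions, and `recognize` is a hypothetical stand-in for a word recognizer that returns a confidence score. The helper names `paths` and `best_segmentation` are likewise invented here.

```python
from itertools import combinations


def paths(length, cuts):
    """Yield every way to split [0, length) at some or all candidate cuts.

    `cuts` must be sorted in ascending order.
    """
    for r in range(len(cuts) + 1):
        for chosen in combinations(cuts, r):
            bounds = [0, *chosen, length]
            yield list(zip(bounds[:-1], bounds[1:]))


def best_segmentation(line, cuts, recognize):
    """Return the path whose segments have the highest average score."""
    def average_score(path):
        return sum(recognize(line[a:b]) for a, b in path) / len(path)

    return max(paths(len(line), cuts), key=average_score)


# Toy recognizer (an assumption): 1.0 for a known word, 0.0 otherwise.
VOCABULARY = {"to", "the", "moon"}


def recognize(segment):
    return 1.0 if segment in VOCABULARY else 0.0


line = "tothemoon"
candidate_cuts = [2, 5, 7]  # potential word segmentation positions
best = best_segmentation(line, candidate_cuts, recognize)
words = [line[a:b] for a, b in best]
```

Enumerating every subset of the candidate cuts grows exponentially with their number, which is workable only because a single line has few potential segmentation positions.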
