- Patent title: Automatic generation of training data for hand-printed text recognition
- Application number: US17562344
- Filing date: 2021-12-27
- Publication number: US11715317B1
- Publication date: 2023-08-01
- Inventor: Jason James Grams
- Applicant: Konica Minolta Business Solutions U.S.A., Inc.
- Applicant address: US CA San Mateo
- Assignee: Konica Minolta Business Solutions U.S.A., Inc.
- Current assignee: Konica Minolta Business Solutions U.S.A., Inc.
- Current assignee address: US CA San Mateo
- Agent: Osha Bergman Watanabe & Burton LLP
- Main classification: G06V30/414
- IPC classification: G06V30/414 ; G06V30/19 ; G06V30/16 ; G06V30/244
Abstract:
A method for generating training data for hand-printed text recognition includes obtaining a structured document, obtaining a set of hand-printed character images and database metadata from a database, generating a modified document page image, and outputting a training file. The structured document includes a document page image that includes text characters and document metadata that associates each of the text characters to a document character label. The database metadata associates each of the set of hand-printed character images to a database character label. The modified document page image is generated by iteratively processing each of the text characters. The iterative processing includes determining whether an individual text character should be replaced, selecting a replacement hand-printed character image from the set of hand-printed character images, scaling the replacement hand-printed character image, and inserting the replacement hand-printed character image into the modified document page image.
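
The iterative processing described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: images are plain 2D lists of grayscale values, each text character carries a bounding box (`x`, `y`, `w`, `h`), a character is replaced whenever the database holds at least one image with a matching label, scaling uses nearest-neighbor resampling, and all names (`TextCharacter`, `scale_nearest`, `generate_modified_page`) are hypothetical. Writing the training file is omitted because the abstract does not specify its format.

```python
import random
from dataclasses import dataclass

Image = list[list[int]]  # grayscale pixels, row-major (assumed representation)


@dataclass
class TextCharacter:
    label: str  # document character label from the document metadata
    x: int      # left edge of the character's bounding box (assumed field)
    y: int      # top edge of the bounding box (assumed field)
    w: int      # bounding box width in pixels
    h: int      # bounding box height in pixels


def scale_nearest(img: Image, w: int, h: int) -> Image:
    """Resize img to w x h pixels using nearest-neighbor sampling."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[r * src_h // h][c * src_w // w] for c in range(w)]
        for r in range(h)
    ]


def generate_modified_page(
    page: Image,
    chars: list[TextCharacter],
    database: dict[str, list[Image]],  # database character label -> images
    rng: random.Random,
) -> Image:
    """Return a copy of page with matching characters replaced by hand-printed images."""
    modified = [row[:] for row in page]  # leave the original page untouched
    for ch in chars:
        # Assumed replacement rule: replace only if a matching label exists.
        candidates = database.get(ch.label, [])
        if not candidates:
            continue
        # Select one hand-printed image, then scale it to the bounding box.
        glyph = scale_nearest(rng.choice(candidates), ch.w, ch.h)
        # Insert the scaled image into the modified page, row by row.
        for r in range(ch.h):
            modified[ch.y + r][ch.x:ch.x + ch.w] = glyph[r]
    return modified
```

Passing a seeded `random.Random` instance makes the selection step reproducible, which is convenient when the same document should yield the same training sample across runs.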