Published Patent Application
- Title: TEXT BLOCK SEGMENTATION
- Application No.: US17814856
- Filing Date: 2022-07-26
- Publication No.: US20240046677A1
- Publication Date: 2024-02-08
- Inventors: Ang Yi, Jing Zhang, Hai Cheng Wang, Jun Hong Zhao, Rajesh M. Desai, Yang Zhong Li, Xue Xu
- Applicant: International Business Machines Corporation
- Applicant Address: Armonk, NY, US
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: Armonk, NY, US
- Main Classification: G06V30/148
- IPC Classification: G06V30/148; G06V30/18
Abstract:
A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
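The three steps in the abstract (pattern matching against a graph, confidence calculation, threshold check) can be illustrated with a minimal Python sketch. This is not the patented implementation. The graph contents, the function names (`match_pattern`, `size_confidence`, `assess_block`), the use of Jaccard similarity as the confidence measure, and the 0.75 threshold are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
from typing import Dict, Set, Tuple

# Hypothetical graph: each key stands for a non-leaf node (a type of
# segmentation pattern); its value holds the semantic entities named by
# the leaf nodes stemming from that node.
PATTERN_GRAPH: Dict[str, Set[str]] = {
    "address": {"street", "city", "state", "postal_code"},
    "invoice_line": {"item", "quantity", "unit_price", "amount"},
}


def match_pattern(block_entities: Set[str], graph: Dict[str, Set[str]]) -> str:
    """Pick the pattern type whose leaf entities overlap the block the most."""
    return max(graph, key=lambda pattern: len(graph[pattern] & block_entities))


def size_confidence(block_entities: Set[str], leaf_entities: Set[str]) -> float:
    """Assumed measure: Jaccard similarity between block and leaf entities.

    Missing entities (block too small) and extra entities (block too large)
    both lower the score.
    """
    union = block_entities | leaf_entities
    if not union:
        return 0.0
    return len(block_entities & leaf_entities) / len(union)


def assess_block(
    block_entities: Set[str],
    graph: Dict[str, Set[str]] = PATTERN_GRAPH,
    threshold: float = 0.75,  # assumed value, not taken from the source
) -> Tuple[str, float, bool]:
    """Return (pattern type, confidence, whether the size is non-optimal)."""
    pattern = match_pattern(block_entities, graph)
    confidence = size_confidence(block_entities, graph[pattern])
    return pattern, confidence, confidence < threshold


if __name__ == "__main__":
    # A block holding only part of an address scores below the threshold.
    print(assess_block({"street", "city"}))
```

Under these assumptions, a block that contains only `street` and `city` matches the `address` pattern with a confidence of 0.5 and is flagged as non-optimal in size, whereas a block containing all four address entities scores 1.0 and is accepted.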