-
Publication number: US12211304B2
Publication date: 2025-01-28
Application number: US17200448
Filing date: 2021-03-12
Inventor: Yulin Li, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang
IPC: G06F16/901, G06N3/047, G06N5/04, G06V10/22, G06V10/80, G06V30/262, G06V30/414, G06V10/24
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
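The last step of this abstract, constructing structured information from the per-line categories and the relationship probability matrix, can be sketched as below. This is only an illustrative sketch, not the patented implementation: the function name `build_structured_info`, the category labels `"key"` and `"value"`, the 0.5 threshold, and the sample data are all assumptions introduced here.

```python
from typing import Dict, List


def build_structured_info(
    contents: List[str],
    categories: List[str],
    rel_prob: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the patent
) -> Dict[str, List[str]]:
    """Link each 'key' line to the 'value' lines related to it.

    rel_prob[i][j] is the predicted probability that text lines i and j
    are related. The 'key'/'value' labels are hypothetical categories.
    """
    structured: Dict[str, List[str]] = {}
    for i, cat_i in enumerate(categories):
        if cat_i != "key":
            continue
        structured[contents[i]] = [
            contents[j]
            for j, cat_j in enumerate(categories)
            if cat_j == "value" and rel_prob[i][j] >= threshold
        ]
    return structured


# Hypothetical output of the detection and reasoning stages for four text lines.
contents = ["Name", "Alice", "Date", "2021-03-12"]
categories = ["key", "value", "key", "value"]
rel_prob = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
structured = build_structured_info(contents, categories, rel_prob)
print(structured)
```

Thresholding the matrix is the simplest possible decoding rule; the abstract does not say how the matrix is converted into relationships, so a real system may use a different rule.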
-
Publication number: US20210264190A1
Publication date: 2021-08-26
Application number: US17206351
Filing date: 2021-03-19
Inventor: Xiameng Qin, Yulin Li, Ju Huang, Qunyi Xie, Junyu Han
IPC: G06K9/46, G06K9/62, G06F40/211, G06F16/53
Abstract: The present application discloses an image questioning and answering method, apparatus, device and storage medium, relating to the technical field of image processing, computer vision, deep learning and natural language processing. The specific implementation solution is as follows: constructing a question graph with a topological structure and extracting a question feature of a query sentence, according to the query sentence; constructing a visual graph with a topological structure and a text graph with a topological structure according to a target image corresponding to the query sentence; performing fusion on the visual graph, the text graph and the question graph by using a fusion model, to obtain a final fusion graph; and determining reply information of the query sentence according to a reasoning feature extracted from the final fusion graph and the question feature.
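The graph-fusion step in this abstract can be illustrated with a toy example. The abstract does not disclose how the fusion model works, so the sketch below substitutes a simple dot-product attention update as a stand-in. The `Graph` class, the `fuse` function, the shared feature dimension, and all sample values are assumptions made for illustration. The sketch covers only the fusion, not the later reasoning and answer steps.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Graph:
    nodes: List[List[float]]  # one feature vector per node
    edges: List[List[float]]  # adjacency weights encoding the topology


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(target: Graph, source: Graph) -> Graph:
    """Add to each target node an attention-weighted mix of source nodes.

    Assumes both graphs use the same feature dimension. The target
    topology is kept unchanged. This is a stand-in for the unspecified
    fusion model, not the patented method.
    """
    fused_nodes = []
    for t in target.nodes:
        scores = [sum(a * b for a, b in zip(t, s)) for s in source.nodes]
        weights = softmax(scores)
        message = [
            sum(w * s[k] for w, s in zip(weights, source.nodes))
            for k in range(len(t))
        ]
        fused_nodes.append([a + b for a, b in zip(t, message)])
    return Graph(fused_nodes, target.edges)


# Hypothetical visual, text and question graphs with 2-dimensional features.
visual = Graph(nodes=[[1.0, 0.0], [0.0, 1.0]], edges=[[0.0, 1.0], [1.0, 0.0]])
text = Graph(nodes=[[0.5, 0.5]], edges=[[0.0]])
question = Graph(nodes=[[1.0, 1.0], [0.0, 2.0]], edges=[[0.0, 1.0], [1.0, 0.0]])

visual_text = fuse(visual, text)             # visual graph enriched with text
final_fusion = fuse(visual_text, question)   # then enriched with the question
print(final_fusion.nodes)
```

The order in which the three graphs are combined here is arbitrary; the abstract states only that all three are fused into a final fusion graph.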
-
Publication number: US11854246B2
Publication date: 2023-12-26
Application number: US17201733
Filing date: 2021-03-15
Inventor: Yulin Li, Ju Huang, Xiameng Qin, Junyu Han
IPC: G06N3/08, G06V30/413, G06V30/414, G06N3/047, G06V10/82, G06V30/18, G06V30/19, G06V30/412, G06V30/16, G06V30/10
CPC classification number: G06V10/82, G06N3/047, G06N3/08, G06V30/18057, G06V30/19173, G06V30/412, G06V30/413, G06V30/414, G06V30/10, G06V30/1607
Abstract: A method, apparatus, device and storage medium for recognizing a bill image may include: performing text detection on a bill image, and determining an attribute information set and a relationship information set of each text box of at least two text boxes in the bill image; determining a type of the text box and an associated text box that has a structural relationship with the text box based on the attribute information set and the relationship information set of the text box; and extracting structured bill data of the bill image, based on the type of the text box and the associated text box that has the structural relationship with the text box.
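To make the first step of this abstract more concrete, the sketch below derives an attribute information set and a relationship information set for each detected text box. The abstract does not list what these sets contain. The geometric fields chosen here (center, size, pairwise offsets and distance) and the names `TextBox`, `attribute_info`, and `relationship_info` are assumptions for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TextBox:
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def attribute_info(box: TextBox) -> Dict[str, object]:
    """Per-box attributes: content, center and size (assumed fields)."""
    return {"text": box.text, "center": box.center, "size": (box.w, box.h)}


def relationship_info(box: TextBox, boxes: List[TextBox]) -> List[Dict[str, object]]:
    """Pairwise geometry between `box` and every other box (assumed fields)."""
    cx, cy = box.center
    relations = []
    for other in boxes:
        if other is box:
            continue
        ox, oy = other.center
        dx, dy = ox - cx, oy - cy
        relations.append(
            {"other": other.text, "dx": dx, "dy": dy, "distance": math.hypot(dx, dy)}
        )
    return relations


# Hypothetical detection result for a bill with two text boxes.
boxes = [TextBox("Total", 0, 0, 4, 2), TextBox("42.00", 6, 0, 4, 2)]
attrs = [attribute_info(b) for b in boxes]
rels = [relationship_info(b, boxes) for b in boxes]
print(attrs[0])
print(rels[0])
```

In a full pipeline, a classifier would consume both sets to predict each box's type and its associated box; that model is outside the scope of this sketch.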
-