-
1.
Publication No.: US20210264190A1
Publication Date: 2021-08-26
Application No.: US17206351
Filing Date: 2021-03-19
Inventor: Xiameng Qin , Yulin Li , Ju Huang , Qunyi Xie , Junyu Han
IPC: G06K9/46 , G06K9/62 , G06F40/211 , G06F16/53
Abstract: The present application discloses an image questioning and answering method, apparatus, device and storage medium, relating to the technical field of image processing, computer vision, deep learning and natural language processing. The specific implementation solution is as follows: constructing a question graph with a topological structure and extracting a question feature of a query sentence, according to the query sentence; constructing a visual graph with a topological structure and a text graph with a topological structure according to a target image corresponding to the query sentence; performing fusion on the visual graph, the text graph and the question graph by using a fusion model, to obtain a final fusion graph; and determining reply information of the query sentence according to a reasoning feature extracted from the final fusion graph and the question feature.
-
2.
Publication No.: US12211304B2
Publication Date: 2025-01-28
Application No.: US17200448
Filing Date: 2021-03-12
Inventor: Yulin Li , Xiameng Qin , Chengquan Zhang , Junyu Han , Errui Ding , Tian Wu , Haifeng Wang
IPC: G06F16/901 , G06N3/047 , G06N5/04 , G06V10/22 , G06V10/80 , G06V30/262 , G06V30/414 , G06V10/24
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
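The last step of this abstract, building structured information from per-line categories and a relationship probability matrix, can be illustrated with a minimal sketch. The abstract does not say how the matrix is decoded, so everything here is an assumption made for illustration: the category labels `"key"` and `"value"`, the fixed `threshold`, and the function name `build_structured_info` are all hypothetical.

```python
from typing import Dict, List


def build_structured_info(
    contents: List[str],
    categories: List[str],
    rel_prob: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the patent
) -> Dict[str, List[str]]:
    """Pair each 'key' text line with the 'value' lines it is linked to."""
    result: Dict[str, List[str]] = {}
    for i, (text, cat) in enumerate(zip(contents, categories)):
        if cat != "key":  # hypothetical category label
            continue
        # Keep every 'value' line whose link probability reaches the threshold.
        result[text] = [
            contents[j]
            for j, p in enumerate(rel_prob[i])
            if j != i and categories[j] == "value" and p >= threshold
        ]
    return result


# Toy input: four text lines with made-up categories and link probabilities.
contents = ["Name", "Alice", "Date", "2020-01-01"]
categories = ["key", "value", "key", "value"]
rel_prob = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
structured = build_structured_info(contents, categories, rel_prob)
print(structured)
```

In a real system the categories and the matrix would come from the reasoning step over the multimodal fusion features; here they are hard-coded so that only the decoding logic is shown.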
-
3.
Publication No.: US20210201182A1
Publication Date: 2021-07-01
Application No.: US17200448
Filing Date: 2021-03-12
Inventor: Yulin Li , Xiameng Qin , Chengquan Zhang , Junyu Han , Errui Ding , Tian Wu , Haifeng Wang
IPC: G06N5/04 , G06N3/04 , G06F16/901
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
-
4.
Publication No.: US11881044B2
Publication Date: 2024-01-23
Application No.: US17353540
Filing Date: 2021-06-21
Inventor: Chengquan Zhang , Mengyi En , Ju Huang , Qunyi Xie , Xiameng Qin , Kun Yao , Junyu Han , Jingtuo Liu , Errui Ding
IPC: G06V30/414 , G06T7/136 , G06T7/11 , G06F18/213 , G06V30/146 , G06V30/18 , G06V10/764 , G06V10/82 , G06V30/10
CPC classification number: G06V30/414 , G06F18/213 , G06T7/11 , G06T7/136 , G06V10/764 , G06V10/82 , G06V30/147 , G06V30/18057 , G06T2207/30176 , G06V30/10
Abstract: A method and apparatus for processing an image, a device and a storage medium are provided. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
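One plausible way to determine a region of interest in the target image "according to the first feature map and the second feature map" is to slide the template's region feature map over the target feature map and keep the best-scoring position. The abstract does not state how the two maps are compared, so the cross-correlation score below is an assumption, as are the single-channel 2D maps and the hypothetical function name `locate_region`.

```python
from typing import List, Tuple

Grid = List[List[float]]


def locate_region(roi_feat: Grid, target_feat: Grid) -> Tuple[int, int, float]:
    """Return (row, col, score) of the window in target_feat that best matches roi_feat."""
    rh, rw = len(roi_feat), len(roi_feat[0])
    th, tw = len(target_feat), len(target_feat[0])
    best = (0, 0, float("-inf"))
    for r in range(th - rh + 1):
        for c in range(tw - rw + 1):
            # Cross-correlation: element-wise product summed over the window.
            score = sum(
                roi_feat[i][j] * target_feat[r + i][c + j]
                for i in range(rh)
                for j in range(rw)
            )
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy feature maps: the diagonal pattern of roi_feat appears at row 1, column 2.
roi_feat = [[1.0, 0.0], [0.0, 1.0]]
target_feat = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
row, col, score = locate_region(roi_feat, target_feat)
print(row, col, score)
```

With several regions of interest in the template, the same search would run once per first feature map, giving one candidate location per region.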
-
5.
Publication No.: US11854246B2
Publication Date: 2023-12-26
Application No.: US17201733
Filing Date: 2021-03-15
Inventor: Yulin Li , Ju Huang , Xiameng Qin , Junyu Han
IPC: G06N3/08 , G06V30/413 , G06V30/414 , G06N3/047 , G06V10/82 , G06V30/18 , G06V30/19 , G06V30/412 , G06V30/16 , G06V30/10
CPC classification number: G06V10/82 , G06N3/047 , G06N3/08 , G06V30/18057 , G06V30/19173 , G06V30/412 , G06V30/413 , G06V30/414 , G06V30/10 , G06V30/1607
Abstract: A method, apparatus, device and storage medium for recognizing a bill image may include: performing text detection on a bill image, and determining an attribute information set and a relationship information set of each text box of at least two text boxes in the bill image; determining a type of the text box and an associated text box that has a structural relationship with the text box based on the attribute information set and the relationship information set of the text box; and extracting structured bill data of the bill image, based on the type of the text box and the associated text box that has the structural relationship with the text box.