-
Publication No.: US11482023B2
Publication Date: 2022-10-25
Application No.: US16710528
Application Date: 2019-12-11
Inventor: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding
IPC: G06V30/262, G06N20/00, G06V10/22, G06V30/148, G06V30/10
Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
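The adjustment step in the abstract above can be illustrated with a minimal sketch. It assumes the text geometry is given as sampled centerline points plus, for each point, a distance to the upper border and a distance to the lower border. The function name `polygon_from_centerline` and the choice to apply offsets vertically (reasonable only for roughly horizontal text) are assumptions made for illustration, not details taken from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_from_centerline(
    centerline: List[Point],
    dist_up: List[float],
    dist_down: List[float],
) -> List[Point]:
    """Build a tighter text polygon from centerline samples.

    Assumption: image y grows downward and offsets are applied
    vertically, which only suits roughly horizontal text.
    """
    if not (len(centerline) == len(dist_up) == len(dist_down)):
        raise ValueError("all inputs must have the same length")
    # Upper border: move each centerline point up by its upper distance.
    upper = [(x, y - u) for (x, y), u in zip(centerline, dist_up)]
    # Lower border: move each centerline point down by its lower distance.
    lower = [(x, y + d) for (x, y), d in zip(centerline, dist_down)]
    # Walk the upper border left to right, then the lower border back.
    return upper + lower[::-1]


poly = polygon_from_centerline(
    [(0.0, 10.0), (5.0, 11.0), (10.0, 13.0)],
    [2.0, 2.0, 3.0],
    [2.0, 3.0, 3.0],
)
```

For curved text, a fuller version would offset each point along the local normal of the centerline rather than vertically.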
-
Publication No.: US20210209395A1
Publication Date: 2021-07-08
Application No.: US17212712
Application Date: 2021-03-25
Inventor: Zihan Ni, Yipeng Sun, Junyu Han
Abstract: The disclosure provides a method for recognizing a license plate. The implementation includes: obtaining a feature map including a plurality of feature vectors of a license plate region; sequentially inputting the plurality of feature vectors based on a first order into a first recurrent neural network for encoding to obtain a first code of each of the plurality of feature vectors; sequentially inputting the plurality of feature vectors based on a second order into a second recurrent neural network for encoding to obtain a second code of each of the plurality of feature vectors; generating a plurality of target codes of the plurality of feature vectors based on the first code of each of the plurality of feature vectors and the second code of each of the plurality of feature vectors; and decoding the plurality of target codes to obtain a plurality of characters in the license plate.
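The two-pass encoding described above resembles a bidirectional recurrent encoder. The sketch below is a toy illustration under several assumptions that the abstract does not state: the first order is taken to be left to right and the second order right to left, each recurrent network is reduced to a scalar-state tanh cell with hand-picked weights, and the target code is formed by pairing the two codes. All function names and parameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rnn_encode(features: Sequence[Sequence[float]], w_x: float, w_h: float) -> List[float]:
    """Toy recurrent cell: h_t = tanh(w_x * sum(x_t) + w_h * h_{t-1})."""
    h = 0.0
    codes = []
    for vec in features:
        h = math.tanh(w_x * sum(vec) + w_h * h)
        codes.append(h)
    return codes


def bidirectional_codes(
    features: Sequence[Sequence[float]],
    first_weights: Tuple[float, float] = (0.5, 0.3),
    second_weights: Tuple[float, float] = (0.4, 0.2),
) -> List[Tuple[float, float]]:
    """Encode in two orders and pair the codes per feature vector."""
    # First order (assumed left to right) through the first network.
    first = rnn_encode(features, *first_weights)
    # Second order (assumed right to left) through the second network,
    # then reversed so index i again refers to feature vector i.
    second = rnn_encode(list(reversed(features)), *second_weights)[::-1]
    # Target code (assumed): the pair of both codes for each position.
    return list(zip(first, second))


feature_map = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]]
target_codes = bidirectional_codes(feature_map)
```

Each target code then carries context from both sides of its position, which is the usual motivation for encoding a sequence in two orders before decoding characters.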
-
Publication No.: US20210201182A1
Publication Date: 2021-07-01
Application No.: US17200448
Application Date: 2021-03-12
Inventor: Yulin LI, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang
IPC: G06N5/04, G06N3/04, G06F16/901
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
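The final construction step above can be sketched as follows, assuming the reasoning stage has already produced a category for each text line and a relationship probability matrix. The category labels `"key"` and `"value"`, the threshold of 0.5, and the function name `build_structure` are all hypothetical choices for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def build_structure(
    texts: List[str],
    categories: List[str],
    rel_prob: List[List[float]],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Pair each 'key' line with its most probable 'value' line.

    Assumption: rel_prob[i][j] is the probability that line i is
    related to line j; pairs below the threshold are discarded.
    """
    result: Dict[str, str] = {}
    for i, cat in enumerate(categories):
        if cat != "key":
            continue
        candidates = [
            (rel_prob[i][j], j)
            for j, other in enumerate(categories)
            if other == "value" and rel_prob[i][j] >= threshold
        ]
        if candidates:
            _, best = max(candidates)
            result[texts[i]] = texts[best]
    return result


texts = ["Name", "Alice", "City", "Paris"]
categories = ["key", "value", "key", "value"]
rel_prob = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
structured = build_structure(texts, categories, rel_prob)
```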
-
Publication No.: US11687779B2
Publication Date: 2023-06-27
Application No.: US17208611
Application Date: 2021-03-22
Inventor: Zhizhi Guo, Yipeng Sun, Jingtuo Liu, Junyu Han
CPC classification number: G06N3/08, G06F18/10, G06F18/253, G06N3/04, G06V10/52, G06V10/806, G06V40/171, G06V40/172
Abstract: An image recognition method is provided, which is related to a technical field of artificial intelligence, and in particular, to a technical field of image processing. An implementation includes: performing five-sense-organ recognition on a preprocessed human face image and marking positions of the human facial five sense organs in the human face image, to obtain the marked human face image; determining human face images at multiple scales of the marked human face image, inputting the human face images of multiple scales into a backbone network model, and performing feature extraction, to obtain a wrinkle feature of the human face image at each of the multiple scales; and fusing the wrinkle feature at each scale that is located in a same area of the human face image, to obtain a wrinkle recognition result of the human face image.
-
Publication No.: US11210546B2
Publication Date: 2021-12-28
Application No.: US16822085
Application Date: 2020-03-18
Inventor: Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding
Abstract: The present disclosure proposes an end-to-end text recognition method and apparatus, computer device and readable medium. The method comprises: obtaining a to-be-recognized picture containing a text region; recognizing a position of the text region in the to-be-recognized picture and text content included in the text region with a pre-trained end-to-end text recognition model; the end-to-end text recognition model comprising a region of interest perspective transformation processing module for performing perspective transformation processing for the text region. The technical solution of the present disclosure does not need to serially arrange a plurality of steps, and may avoid introducing the accumulated errors and may effectively improve the accuracy of the text recognition.
-
Publication No.: US20210357710A1
Publication Date: 2021-11-18
Application No.: US17352668
Application Date: 2021-06-21
Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
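The ranking step above amounts to following a chain of "next character" pointers. The sketch below assumes the reading direction information has already been reduced to one index per character (the index of its successor, or `None` for the last character). That representation and the name `order_characters` are assumptions for illustration.

```python
from typing import List, Optional


def order_characters(chars: List[str], next_index: List[Optional[int]]) -> str:
    """Rank characters by following each character's next pointer."""
    if len(chars) != len(next_index):
        raise ValueError("chars and next_index must have the same length")
    # The first character is the only one that no other character points to.
    targets = {n for n in next_index if n is not None}
    starts = [i for i in range(len(chars)) if i not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting character")
    ordered = []
    seen = set()
    current: Optional[int] = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError("reading order contains a cycle")
        seen.add(current)
        ordered.append(chars[current])
        current = next_index[current]
    return "".join(ordered)


# Characters in detection order, with each one's successor index.
recognized = order_characters(["X", "T", "E", "T"], [3, 2, 0, None])
```

Here the detector found the characters in the order X, T, E, T, and following the pointers from the only character without a predecessor yields "TEXT".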
-
Publication No.: US20210264190A1
Publication Date: 2021-08-26
Application No.: US17206351
Application Date: 2021-03-19
Inventor: Xiameng Qin, Yulin Li, Ju Huang, Qunyi Xie, Junyu Han
IPC: G06K9/46, G06K9/62, G06F40/211, G06F16/53
Abstract: The present application discloses an image questioning and answering method, apparatus, device and storage medium, relating to the technical field of image processing, computer vision, deep learning and natural language processing. The specific implementation solution is as follows: constructing a question graph with a topological structure and extracting a question feature of a query sentence, according to the query sentence; constructing a visual graph with a topological structure and a text graph with a topological structure according to a target image corresponding to the query sentence; performing fusion on the visual graph, the text graph and the question graph by using a fusion model, to obtain a final fusion graph; and determining reply information of the query sentence according to a reasoning feature extracted from the final fusion graph and the question feature.
-
Publication No.: US12211304B2
Publication Date: 2025-01-28
Application No.: US17200448
Application Date: 2021-03-12
Inventor: Yulin Li, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang
IPC: G06F16/901, G06N3/047, G06N5/04, G06V10/22, G06V10/80, G06V30/262, G06V30/414, G06V10/24
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
-
Publication No.: US11861919B2
Publication Date: 2024-01-02
Application No.: US17352668
Application Date: 2021-06-21
Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
IPC: G06V20/00, G06V20/62, G06N3/08, G06V30/262, G06V20/58, G06V30/148, G06N3/045, G06V30/28, G06V30/10
CPC classification number: G06V20/62, G06N3/045, G06N3/08, G06V20/582, G06V20/63, G06V30/153, G06V30/262, G06V30/274, G06V30/10, G06V30/287, G06V30/293
Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
-
Publication No.: US11074437B2
Publication Date: 2021-07-27
Application No.: US15930714
Application Date: 2020-05-13
Inventor: Shihu Li, Xiangda Yan, Yuanzhang Chang, Zhibin Hong, Tianshu Hu, Kun Yao, Junyu Han, Jingtuo Liu, Shengxian Zhu
Abstract: A method, an electronic device and a storage medium for expression driving are disclosed. The method may include: performing facial key point detection on a driven character in a first image to obtain a first facial key point sequence; performing the following processing for each second image of a plurality of second images obtained successively: performing facial key point detection on a driving character in the second image to obtain a second facial key point sequence; obtaining a difference between the second facial key point sequence and an expressionless key point sequence which has been determined previously according to an analysis on the second facial key point sequence for a previous second image, and performing expression drive rendering on the driven character based on the difference and the first facial key point sequence. The technical solution may enhance flexibility, interactivity, accuracy etc.
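The difference-based driving step above can be sketched in a few lines. The sketch assumes all three key point sequences have the same length, use the same landmark ordering, and are already aligned to a common scale. It also treats "driving" as adding the offset to the driven character's key points, leaving out the rendering itself. The name `drive_expression` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def drive_expression(
    driven_kps: List[Point],
    driving_kps: List[Point],
    expressionless_kps: List[Point],
) -> List[Point]:
    """Shift the driven character's key points by the driving offset.

    The offset is the difference between the driving character's
    current key points and its expressionless key points.
    """
    if not (len(driven_kps) == len(driving_kps) == len(expressionless_kps)):
        raise ValueError("key point sequences must have the same length")
    return [
        (fx + (sx - nx), fy + (sy - ny))
        for (fx, fy), (sx, sy), (nx, ny) in zip(
            driven_kps, driving_kps, expressionless_kps
        )
    ]


new_kps = drive_expression(
    [(10.0, 10.0), (20.0, 10.0)],  # driven character (first image)
    [(1.0, 2.0), (3.0, 5.0)],      # driving character (current second image)
    [(1.0, 1.0), (3.0, 3.0)],      # driving character, expressionless
)
```

Using a difference from the expressionless pose, rather than the driving key points directly, keeps the driven character's own face shape while transferring only the motion.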