Patent search: ap:("Beijing Baidu Netcom Science AND Technology Co., Ltd.") AND inv:"Junyu Han"

1.

Granted invention patent
Method and apparatus for detecting text regions in image, device, and medium

Legal status: Valid

Publication No.: US11482023B2

Publication date: 2022-10-25

Application No.: US16710528

Application date: 2019-12-11

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding

IPC: G06V30/262, G06N20/00, G06V10/22, G06V30/148, G06V30/10

Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
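The adjustment step in this abstract can be illustrated with a minimal sketch. It assumes the centerline is given as sampled points and that the upper and lower borders lie directly above and below each point (roughly horizontal text, image coordinates with y pointing down). The names `polygon_from_centerline` and `shoelace_area` are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_from_centerline(
    centerline: List[Point], up: List[float], down: List[float]
) -> List[Point]:
    """Build a tight polygon from centerline points and border distances.

    Assumption: borders are offset vertically from each centerline point,
    which only approximates text that is close to horizontal.
    """
    if not (len(centerline) == len(up) == len(down)):
        raise ValueError("centerline, up and down must have the same length")
    upper = [(x, y - u) for (x, y), u in zip(centerline, up)]
    lower = [(x, y + d) for (x, y), d in zip(centerline, down)]
    # Walk the upper border left to right, then the lower border back.
    return upper + lower[::-1]


def shoelace_area(polygon: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A coarse first region (10 x 10 box) is tightened around the text.
first_region = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
second_region = polygon_from_centerline(
    [(0.0, 5.0), (10.0, 5.0)], up=[2.0, 2.0], down=[2.0, 2.0]
)
```

In this toy case the second region still covers the text band around the centerline but has a smaller area than the first region, which mirrors the relationship the abstract describes.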

2.

Invention patent application
METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM FOR RECOGNIZING LICENSE PLATE

Legal status: Valid

Publication No.: US20210209395A1

Publication date: 2021-07-08

Application No.: US17212712

Application date: 2021-03-25

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Zihan Ni, Yipeng Sun, Junyu Han

IPC: G06K9/32, G06K9/62, G06N3/04, G06N3/08

Abstract: The disclosure provides a method for recognizing a license plate. The implementation includes: obtaining a feature map including a plurality of feature vectors of a license plate region; sequentially inputting the plurality of feature vectors based on a first order into a first recurrent neural network for encoding to obtain a first code of each of the plurality of feature vectors; sequentially inputting the plurality of feature vectors based on a second order into a second recurrent neural network for encoding to obtain a second code of each of the plurality of feature vectors; generating a plurality of target codes of the plurality of feature vectors based on the first code of each of the plurality of feature vectors and the second code of each of the plurality of feature vectors; and decoding the plurality of target codes to obtain a plurality of characters in the license plate.
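The two-pass encoding in this abstract resembles a bidirectional recurrent encoder. The sketch below is a toy illustration only: it assumes the second order is the reverse of the first, uses an untrained element-wise tanh cell as a stand-in for each recurrent network, and forms each target code by concatenation. The function names and weights are hypothetical, and the decoding step is omitted.

```python
import math
from typing import List

Vector = List[float]


def toy_rnn_encode(
    features: List[Vector], weight_in: float, weight_rec: float
) -> List[Vector]:
    """Encode a sequence with a toy element-wise recurrent cell.

    Stand-in for a trained recurrent network: h_t = tanh(w_in*x_t + w_rec*h_{t-1}).
    """
    if not features:
        return []
    hidden = [0.0] * len(features[0])
    codes = []
    for vec in features:
        hidden = [
            math.tanh(weight_in * x + weight_rec * h) for x, h in zip(vec, hidden)
        ]
        codes.append(hidden)
    return codes


def bidirectional_codes(features: List[Vector]) -> List[Vector]:
    """Combine a forward-order code and a reverse-order code per feature vector."""
    first = toy_rnn_encode(features, weight_in=0.8, weight_rec=0.5)
    # Encode in the reversed order, then flip back so codes align by position.
    second = toy_rnn_encode(features[::-1], weight_in=0.6, weight_rec=0.4)[::-1]
    return [f + s for f, s in zip(first, second)]


codes = bidirectional_codes([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each target code then carries context from both sides of its position, which is the usual motivation for encoding a character sequence in two orders.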

3.

Invention patent application
METHOD AND APPARATUS FOR PERFORMING STRUCTURED EXTRACTION ON TEXT, DEVICE AND STORAGE MEDIUM

Legal status: Valid

Publication No.: US20210201182A1

Publication date: 2021-07-01

Application No.: US17200448

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Yulin Li, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang

IPC: G06N5/04, G06N3/04, G06F16/901

Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
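The final step, constructing structured information from per-line categories and a relationship probability matrix, might look like the sketch below. It assumes two hypothetical category labels, "key" and "value", and pairs each key line with the value line that has the highest relationship probability at or above a threshold. The labels, the threshold, and the function name are illustrative assumptions and do not come from the patent.

```python
from typing import Dict, List


def build_structured_info(
    texts: List[str],
    categories: List[str],
    relation_probs: List[List[float]],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Pair each 'key' text line with its most probable 'value' text line."""
    pairs: Dict[str, str] = {}
    for i, cat_i in enumerate(categories):
        if cat_i != "key":
            continue
        best_j, best_p = None, threshold
        for j, cat_j in enumerate(categories):
            # Keep the value line with the highest probability >= threshold.
            if cat_j == "value" and relation_probs[i][j] >= best_p:
                best_j, best_p = j, relation_probs[i][j]
        if best_j is not None:
            pairs[texts[i]] = texts[best_j]
    return pairs


texts = ["Name", "Alice", "Date", "2021-03-12"]
categories = ["key", "value", "key", "value"]
relation_probs = [
    [0.0, 0.9, 0.0, 0.1],
    [0.9, 0.0, 0.2, 0.0],
    [0.0, 0.2, 0.0, 0.8],
    [0.1, 0.0, 0.8, 0.0],
]
structured = build_structured_info(texts, categories, relation_probs)
```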

4.

Granted invention patent
Image recognition method and apparatus, device, and computer storage medium

Legal status: Valid

Publication No.: US11687779B2

Publication date: 2023-06-27

Application No.: US17208611

Application date: 2021-03-22

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Zhizhi Guo, Yipeng Sun, Jingtuo Liu, Junyu Han

IPC: G06N3/08, G06N3/04, G06V40/16, G06F18/10, G06F18/25, G06V10/80, G06V10/52

CPC classification number: G06N3/08, G06F18/10, G06F18/253, G06N3/04, G06V10/52, G06V10/806, G06V40/171, G06V40/172

Abstract: An image recognition method is provided, which is related to a technical field of artificial intelligence, and in particular, to a technical field of image processing. An implementation includes: performing five-sense-organ recognition on a preprocessed human face image and marking positions of the human facial five sense organs in the human face image, to obtain the marked human face image; determining human face images at multiple scales of the marked human face image, inputting the human face images of multiple scales into a backbone network model, and performing feature extraction, to obtain a wrinkle feature of the human face image at each of the multiple scales; and fusing the wrinkle feature at each scale that is located in a same area of the human face image, to obtain a wrinkle recognition result of the human face image.

5.

Granted invention patent
End-to-end text recognition method and apparatus, computer device and readable medium

Legal status: Valid

Publication No.: US11210546B2

Publication date: 2021-12-28

Application No.: US16822085

Application date: 2020-03-18

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding

IPC: G06K9/46, G06K9/32, G06K9/62

Abstract: The present disclosure proposes an end-to-end text recognition method and apparatus, computer device and readable medium. The method comprises: obtaining a to-be-recognized picture containing a text region; recognizing a position of the text region in the to-be-recognized picture and text content included in the text region with a pre-trained end-to-end text recognition model; the end-to-end text recognition model comprising a region of interest perspective transformation processing module for performing perspective transformation processing for the text region. The technical solution of the present disclosure does not need to serially arrange a plurality of steps, and may avoid introducing the accumulated errors and may effectively improve the accuracy of the text recognition.

6.

Invention patent application
TEXT RECOGNITION METHOD AND DEVICE, AND ELECTRONIC DEVICE

Legal status: Valid

Publication No.: US20210357710A1

Publication date: 2021-11-18

Application No.: US17352668

Application date: 2021-06-21

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu

IPC: G06K9/72, G06K9/34, G06K9/00, G06K9/46, G06N3/04, G06N3/08

Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
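The ranking step can be sketched as following "next character" pointers, much like traversing a linked list. The example assumes the reading direction information has already been reduced to one index per character (`None` for the last character). That representation and the name `order_by_reading_direction` are hypothetical simplifications, not the patent's data format.

```python
from typing import List, Optional


def order_by_reading_direction(
    chars: List[str], next_index: List[Optional[int]]
) -> str:
    """Order recognized characters by following next-character pointers.

    next_index[i] is the index of the character read after chars[i],
    or None if chars[i] is the last character.
    """
    if len(chars) != len(next_index):
        raise ValueError("chars and next_index must have the same length")
    pointed_to = {j for j in next_index if j is not None}
    starts = [i for i in range(len(chars)) if i not in pointed_to]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting character")
    order: List[int] = []
    seen = set()
    current: Optional[int] = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError("reading order contains a cycle")
        seen.add(current)
        order.append(current)
        current = next_index[current]
    if len(order) != len(chars):
        raise ValueError("reading order does not cover all characters")
    return "".join(chars[i] for i in order)


# Characters detected in arbitrary spatial order: O, R, D, A.
# Pointers encode R -> O -> A -> D.
result = order_by_reading_direction(["O", "R", "D", "A"], [3, 0, None, 2])
```

Ordering by pointers rather than by position is what lets such a scheme handle curved or vertical text, where sorting by x or y coordinate alone would scramble the characters.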

7.

Invention patent application
IMAGE QUESTIONING AND ANSWERING METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM

Legal status: Valid

Publication No.: US20210264190A1

Publication date: 2021-08-26

Application No.: US17206351

Application date: 2021-03-19

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xiameng Qin, Yulin Li, Ju Huang, Qunyi Xie, Junyu Han

IPC: G06K9/46, G06K9/62, G06F40/211, G06F16/53

Abstract: The present application discloses an image questioning and answering method, apparatus, device and storage medium, relating to the technical field of image processing, computer vision, deep learning and natural language processing. The specific implementation solution is as follows: constructing a question graph with a topological structure and extracting a question feature of a query sentence, according to the query sentence; constructing a visual graph with a topological structure and a text graph with a topological structure according to a target image corresponding to the query sentence; performing fusion on the visual graph, the text graph and the question graph by using a fusion model, to obtain a final fusion graph; and determining reply information of the query sentence according to a reasoning feature extracted from the final fusion graph and the question feature.

8.

Granted invention patent
Method and apparatus for performing structured extraction on text, device and storage medium

Legal status: Valid

Publication No.: US12211304B2

Publication date: 2025-01-28

Application No.: US17200448

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Yulin Li, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang

IPC: G06F16/901, G06N3/047, G06N5/04, G06V10/22, G06V10/80, G06V30/262, G06V30/414, G06V10/24

Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.

9.

Granted invention patent
Text recognition method and device, and electronic device

Legal status: Valid

Publication No.: US11861919B2

Publication date: 2024-01-02

Application No.: US17352668

Application date: 2021-06-21

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu

IPC: G06V20/00, G06V20/62, G06N3/08, G06V30/262, G06V20/58, G06V30/148, G06N3/045, G06V30/28, G06V30/10

CPC classification number: G06V20/62, G06N3/045, G06N3/08, G06V20/582, G06V20/63, G06V30/153, G06V30/262, G06V30/274, G06V30/10, G06V30/287, G06V30/293

Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.

10.

Granted invention patent
Method, apparatus, electronic device and storage medium for expression driving

Legal status: Valid

Publication No.: US11074437B2

Publication date: 2021-07-27

Application No.: US15930714

Application date: 2020-05-13

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Shihu Li, Xiangda Yan, Yuanzhang Chang, Zhibin Hong, Tianshu Hu, Kun Yao, Junyu Han, Jingtuo Liu, Shengxian Zhu

IPC: G06K9/00, G06K9/62

Abstract: A method, an electronic device and a storage medium for expression driving are disclosed. The method may include: performing facial key point detection on a driven character in a first image to obtain a first facial key point sequence; performing the following processing for each second image of a plurality of second images obtained successively: performing facial key point detection on a driving character in the second image to obtain a second facial key point sequence; obtaining a difference between the second facial key point sequence and an expressionless key point sequence which has been determined previously according to an analysis on the second facial key point sequence for a previous second image, and performing expression drive rendering on the driven character based on the difference and the first facial key point sequence. The technical solution may enhance flexibility, interactivity, accuracy etc.
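The core arithmetic of this abstract can be illustrated as follows. The sketch assumes that all key point sequences have the same length and a shared coordinate frame, and that "driving" means adding the per-point offset (driving minus expressionless) to the driven character's key points. The abstract only says the rendering is based on the difference and the first sequence, so the additive rule and the name `drive_expression` are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def drive_expression(
    driven: List[Point], driving: List[Point], expressionless: List[Point]
) -> List[Point]:
    """Shift the driven character's key points by the driving expression offset.

    Assumption: offset = driving - expressionless, applied additively per point.
    """
    if not (len(driven) == len(driving) == len(expressionless)):
        raise ValueError("all key point sequences must have the same length")
    return [
        (dx + (sx - nx), dy + (sy - ny))
        for (dx, dy), (sx, sy), (nx, ny) in zip(driven, driving, expressionless)
    ]


# The second key point of the driving face moved down by 1 relative to neutral,
# so the matching key point of the driven face moves down by 1 as well.
target = drive_expression(
    driven=[(0.0, 0.0), (1.0, 1.0)],
    driving=[(5.0, 5.0), (6.0, 7.0)],
    expressionless=[(5.0, 5.0), (6.0, 6.0)],
)
```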
